
Chapter 09: Token Management and Counting
Overview
Tokens are the currency of the Claude API: they determine both your costs and what you can accomplish. Understanding how tokenization works, accurately counting tokens, optimizing context window usage, and implementing effective budget controls transforms Claude from an unpredictable expense into a cost-effective tool.
This chapter teaches you what tokens are and how they're calculated, how to count tokens before making API calls, strategies for staying within context limits, cost optimization techniques, and budget management systems.
By the end, you'll build production-ready token management systems that prevent cost overruns while maximizing Claude's capabilities.
Prerequisites
Before starting, ensure you understand:
- ✓ Basic Claude API usage (Chapters 00-03)
- ✓ Message structure and conversation flow
- ✓ System prompts and role definition (Chapter 07)
- ✓ Basic mathematics and cost calculation
Estimated Time: 45-60 minutes
What You'll Build
By the end of this chapter, you will have created:
- A TokenCounter class that accurately estimates token counts for text, messages, and API requests
- A TokenTracker system that monitors real-time token usage and calculates costs
- A ConversationContextManager that manages context windows and prunes conversations intelligently
- A SmartContextPruner with multiple pruning strategies (recent, important, balanced, summarize)
- A BudgetManager that enforces spending limits and tracks costs per request
- A CostOptimizer that selects cost-effective models and optimizes requests
- A complete TokenManagementService that combines all components for production use
Objectives
By the end of this chapter, you will:
- Understand what tokens are and how Claude tokenizes text
- Learn Claude's token limits and pricing across different models
- Implement accurate token counting before making API calls
- Build systems to track token usage and costs in real-time
- Manage context windows effectively within Claude's 200K token limit
- Implement strategic conversation pruning to maximize context efficiency
- Create budget management systems to prevent cost overruns
- Optimize costs by choosing the right model for each task
- Build production-ready token management systems
Understanding Tokenization
What Are Tokens?
Tokens are not words; they're chunks of text that language models process.
<?php
# filename: examples/01-token-basics.php
declare(strict_types=1);
/**
* Token examples - approximate tokenization
*
* Claude uses a tokenizer similar to other LLMs.
* Rough rules of thumb:
* - 1 token ≈ 4 characters of English text
* - 1 token ≈ ¾ of a word on average
* - Common words = 1 token
* - Uncommon words = 2-3 tokens
* - Code typically yields more tokens per character than prose
*/
$examples = [
'Hello' => 1, // Common word = 1 token
'Hello world' => 2, // Two common words = 2 tokens
'PHP' => 1, // Acronym = 1 token
'PHP developer' => 2, // 2 tokens
'tokenization' => 2, // Long word = multiple tokens
'antidisestablishmentarianism' => 6, // Very long = many tokens
'function getName() {}' => 6, // Code, roughly 6 tokens
'$user->getName()' => 5, // PHP code with symbols
];
echo "Token Estimation Examples:\n\n";
foreach ($examples as $text => $estimatedTokens) {
$charCount = strlen($text);
$wordCount = str_word_count($text);
$tokensPerChar = $charCount > 0 ? $estimatedTokens / $charCount : 0;
echo "Text: \"{$text}\"\n";
echo " Characters: {$charCount}\n";
echo " Words: {$wordCount}\n";
echo " Estimated tokens: {$estimatedTokens}\n";
echo " Tokens per character: " . round($tokensPerChar, 2) . "\n\n";
}
// General estimation formula
function estimateTokens(string $text): int
{
// Very rough estimate: 1 token per 4 characters
return (int) ceil(strlen($text) / 4);
}
$sampleText = <<<'PHP'
function authenticateUser(string $email, string $password): ?User
{
$user = User::where('email', $email)->first();
if (!$user || !password_verify($password, $user->password_hash)) {
return null;
}
return $user;
}
PHP;
echo "Sample PHP Code:\n";
echo $sampleText . "\n\n";
echo "Estimated tokens: " . estimateTokens($sampleText) . "\n";
echo "Actual tokens would need precise counting...\n";
Claude's Token Limits
Claude models have consistent context windows but vary significantly in pricing. Understanding these limits helps you choose the right model and estimate costs accurately.
<?php
# filename: examples/02-model-limits.php
declare(strict_types=1);
class ClaudeModelLimits
{
public const MODELS = [
'claude-opus-4-20250514' => [
'context_window' => 200_000,
'max_output' => 16_384,
'input_price_per_1m' => 15.00, // USD
'output_price_per_1m' => 75.00, // USD
],
'claude-sonnet-4-20250514' => [
'context_window' => 200_000,
'max_output' => 16_384,
'input_price_per_1m' => 3.00,
'output_price_per_1m' => 15.00,
],
'claude-haiku-4-20250514' => [
'context_window' => 200_000,
'max_output' => 16_384,
'input_price_per_1m' => 0.80,
'output_price_per_1m' => 4.00,
],
];
public static function getLimit(string $model, string $type): int
{
return self::MODELS[$model][$type] ?? 0;
}
public static function getPrice(string $model, string $type): float
{
$key = $type . '_price_per_1m';
return self::MODELS[$model][$key] ?? 0.0;
}
public static function calculateCost(
string $model,
int $inputTokens,
int $outputTokens
): float {
$inputCost = ($inputTokens / 1_000_000) * self::getPrice($model, 'input');
$outputCost = ($outputTokens / 1_000_000) * self::getPrice($model, 'output');
return $inputCost + $outputCost;
}
public static function estimateMaxTokenCost(string $model): float
{
$contextWindow = self::getLimit($model, 'context_window');
$maxOutput = self::getLimit($model, 'max_output');
// Worst case: full context window of input + max output
return self::calculateCost($model, $contextWindow, $maxOutput);
}
}
// Display model information
echo "Claude Model Comparison:\n\n";
foreach (ClaudeModelLimits::MODELS as $model => $specs) {
echo str_pad($model, 30) . "\n";
echo " Context window: " . number_format($specs['context_window']) . " tokens\n";
echo " Max output: " . number_format($specs['max_output']) . " tokens\n";
echo " Input cost: $" . $specs['input_price_per_1m'] . " per 1M tokens\n";
echo " Output cost: $" . $specs['output_price_per_1m'] . " per 1M tokens\n";
echo " Max theoretical cost: $" . number_format(
ClaudeModelLimits::estimateMaxTokenCost($model),
2
) . " per request\n\n";
}
Accurate Token Counting
Accurate token counting is essential for cost estimation and staying within context limits. While exact tokenization requires Claude's tokenizer, we can build reliable estimation systems that get you within 5-10% accuracy.
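Before building estimates, note that the API itself can return exact counts: Anthropic exposes a `POST /v1/messages/count_tokens` endpoint that tokenizes a request without generating a response (and without output cost). A minimal sketch using PHP streams — the `countTokensViaApi` helper name is ours, not part of any SDK:

```php
<?php

declare(strict_types=1);

/**
 * Ask the API for an exact input-token count.
 * Returns null when no API key is configured or the call fails,
 * so callers can fall back to local estimation.
 */
function countTokensViaApi(array $request, ?string $apiKey): ?int
{
    if ($apiKey === null) {
        return null; // offline: fall back to local estimation
    }

    $body = [
        'model' => $request['model'],
        'messages' => $request['messages'],
    ];
    if (isset($request['system'])) {
        $body['system'] = $request['system'];
    }

    $context = stream_context_create(['http' => [
        'method' => 'POST',
        'header' => implode("\r\n", [
            'content-type: application/json',
            "x-api-key: {$apiKey}",
            'anthropic-version: 2023-06-01',
        ]),
        'content' => json_encode($body, JSON_THROW_ON_ERROR),
        'ignore_errors' => true,
    ]]);

    $response = @file_get_contents(
        'https://api.anthropic.com/v1/messages/count_tokens',
        false,
        $context
    );

    if ($response === false) {
        return null;
    }

    return json_decode($response, true)['input_tokens'] ?? null;
}

$request = [
    'model' => 'claude-sonnet-4-20250514',
    'system' => 'You are a PHP expert who reviews code for security issues.',
    'messages' => [['role' => 'user', 'content' => 'Explain PHP generators']],
];

$exact = countTokensViaApi($request, getenv('ANTHROPIC_API_KEY') ?: null);
echo $exact === null
    ? "No API key available; use a local estimate instead\n"
    : "Exact input tokens: {$exact}\n";
```

An extra HTTP round-trip per count isn't always practical, so a fast local estimator is still worth having for planning and budgeting.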
Token Counter Implementation
<?php
# filename: src/TokenCounter.php
declare(strict_types=1);
namespace CodeWithPHP\Claude;
/**
* Token counter for Claude API
*
* Uses approximate counting since exact tokenization requires
* the same tokenizer Claude uses. This provides good estimates
* for planning and budgeting.
*/
class TokenCounter
{
private const CHARS_PER_TOKEN = 4;
private const CODE_MULTIPLIER = 1.3; // Code uses ~30% more tokens
/**
* Estimate token count for text
*/
public function count(string $text): int
{
// Basic estimation
$baseCount = strlen($text) / self::CHARS_PER_TOKEN;
// Adjust for special characters and structure
$adjustedCount = $baseCount;
// Code detection (has common code markers)
if ($this->looksLikeCode($text)) {
$adjustedCount *= self::CODE_MULTIPLIER;
}
return (int) ceil($adjustedCount);
}
/**
* Count tokens in messages array
*/
public function countMessages(array $messages): int
{
$total = 0;
foreach ($messages as $message) {
// Message role overhead (~4 tokens per message)
$total += 4;
// Message content
if (is_string($message['content'])) {
$total += $this->count($message['content']);
} elseif (is_array($message['content'])) {
// Multi-part content
foreach ($message['content'] as $part) {
if (isset($part['text'])) {
$total += $this->count($part['text']);
}
if (isset($part['image'])) {
$total += 1000; // Images ~1000 tokens
}
}
}
}
return $total;
}
/**
* Count tokens in entire API request
*/
public function countRequest(array $request): array
{
$counts = [
'system' => 0,
'messages' => 0,
'overhead' => 10, // API overhead
'total' => 0,
];
// System prompt
if (isset($request['system'])) {
$counts['system'] = $this->count($request['system']);
}
// Messages
if (isset($request['messages'])) {
$counts['messages'] = $this->countMessages($request['messages']);
}
$counts['total'] = $counts['system'] + $counts['messages'] + $counts['overhead'];
return $counts;
}
/**
* Detect if text looks like code
*/
private function looksLikeCode(string $text): bool
{
$codeIndicators = [
'function ', 'class ', 'public ', 'private ', 'protected ',
'return ', 'if (', 'for (', 'while (', 'foreach (',
'{', '}', '=>', '->', '::', '<?php'
];
$indicatorCount = 0;
foreach ($codeIndicators as $indicator) {
if (str_contains($text, $indicator)) {
$indicatorCount++;
}
}
// If 3+ code indicators, likely code
return $indicatorCount >= 3;
}
/**
* Estimate response tokens based on max_tokens
*/
public function estimateResponse(int $maxTokens, float $utilizationRate = 0.8): int
{
return (int) ($maxTokens * $utilizationRate);
}
}
// Usage
$counter = new TokenCounter();
$systemPrompt = 'You are a PHP expert who reviews code for security issues.';
$userMessage = 'Review this code: function login($user, $pass) { /* ... */ }';
echo "Token Estimates:\n";
echo "System prompt: " . $counter->count($systemPrompt) . " tokens\n";
echo "User message: " . $counter->count($userMessage) . " tokens\n\n";
$request = [
'system' => $systemPrompt,
'messages' => [
['role' => 'user', 'content' => $userMessage]
]
];
$breakdown = $counter->countRequest($request);
print_r($breakdown);
Real-Time Token Tracking
Tracking actual token usage after API calls helps you refine your estimates and understand real costs. This system compares estimated vs actual tokens to improve accuracy over time.
<?php
# filename: src/TokenTracker.php
declare(strict_types=1);
namespace CodeWithPHP\Claude;
use Anthropic\Contracts\ClientContract;
class TokenTracker
{
private array $history = [];
public function __construct(
private ClientContract $client,
private TokenCounter $counter
) {}
public function track(array $request): object
{
// Count input tokens
$estimatedInput = $this->counter->countRequest($request);
// Make API call
$response = $this->client->messages()->create($request);
// Get actual token counts from response
$actualInput = $response->usage->inputTokens;
$actualOutput = $response->usage->outputTokens;
// Calculate costs
$model = $request['model'];
$cost = ClaudeModelLimits::calculateCost($model, $actualInput, $actualOutput);
// Track
$record = [
'timestamp' => time(),
'model' => $model,
'estimated_input' => $estimatedInput['total'],
'actual_input' => $actualInput,
'actual_output' => $actualOutput,
'total_tokens' => $actualInput + $actualOutput,
'cost' => $cost,
'accuracy' => $this->calculateAccuracy($estimatedInput['total'], $actualInput),
];
$this->history[] = $record;
return $response;
}
private function calculateAccuracy(int $estimated, int $actual): float
{
if ($actual === 0) return 0.0;
return 100 - (abs($estimated - $actual) / $actual * 100);
}
public function getHistory(): array
{
return $this->history;
}
public function getStats(): array
{
if (empty($this->history)) {
return [];
}
return [
'total_requests' => count($this->history),
'total_input_tokens' => array_sum(array_column($this->history, 'actual_input')),
'total_output_tokens' => array_sum(array_column($this->history, 'actual_output')),
'total_tokens' => array_sum(array_column($this->history, 'total_tokens')),
'total_cost' => array_sum(array_column($this->history, 'cost')),
'average_accuracy' => array_sum(array_column($this->history, 'accuracy')) / count($this->history),
];
}
public function exportCSV(string $filename): void
{
$fp = fopen($filename, 'w');
if ($fp === false) {
throw new \RuntimeException("Cannot open {$filename} for writing");
}
// Header
fputcsv($fp, ['Timestamp', 'Model', 'Input', 'Output', 'Total', 'Cost', 'Accuracy']);
// Data
foreach ($this->history as $record) {
fputcsv($fp, [
date('Y-m-d H:i:s', $record['timestamp']),
$record['model'],
$record['actual_input'],
$record['actual_output'],
$record['total_tokens'],
number_format($record['cost'], 6),
round($record['accuracy'], 2) . '%',
]);
}
fclose($fp);
}
}
// Usage
$tracker = new TokenTracker($client, new TokenCounter());
$response = $tracker->track([
'model' => 'claude-sonnet-4-20250514',
'max_tokens' => 1024,
'messages' => [[
'role' => 'user',
'content' => 'Explain PHP generators'
]]
]);
echo $response->content[0]->text . "\n\n";
$stats = $tracker->getStats();
echo "Token Usage Stats:\n";
print_r($stats);
// Export to CSV for analysis
$tracker->exportCSV('token_usage.csv');
Context Window Management
Claude's 200K token context window is generous, but long conversations can still exceed it. Effective context management involves tracking usage, pruning strategically, and summarizing when needed.
Conversation Context Manager
<?php
# filename: src/ConversationContextManager.php
declare(strict_types=1);
namespace CodeWithPHP\Claude;
use Anthropic\Contracts\ClientContract;
class ConversationContextManager
{
private array $messages = [];
private int $maxContextTokens;
public function __construct(
private TokenCounter $counter,
int $maxContextTokens = 180_000 // Leave room for response
) {
$this->maxContextTokens = $maxContextTokens;
}
public function addMessage(string $role, string $content): void
{
$this->messages[] = [
'role' => $role,
'content' => $content
];
$this->pruneIfNeeded();
}
public function getMessages(): array
{
return $this->messages;
}
public function getCurrentTokenCount(): int
{
return $this->counter->countMessages($this->messages);
}
public function getRemainingTokens(): int
{
return max(0, $this->maxContextTokens - $this->getCurrentTokenCount());
}
public function canFit(string $content): bool
{
$additionalTokens = $this->counter->count($content);
return ($this->getCurrentTokenCount() + $additionalTokens) <= $this->maxContextTokens;
}
/**
* Prune old messages if context is too large
*/
private function pruneIfNeeded(): void
{
while ($this->getCurrentTokenCount() > $this->maxContextTokens && count($this->messages) > 1) {
// Remove oldest message (but keep at least 1)
array_shift($this->messages);
}
}
/**
* Prune strategically - keep important messages
*/
public function pruneStrategic(array $importantIndices = []): void
{
$keptIndices = [];
$tokenCount = 0;
// Always keep important messages
foreach ($importantIndices as $idx) {
if (isset($this->messages[$idx])) {
$keptIndices[$idx] = true;
$tokenCount += $this->counter->countMessages([$this->messages[$idx]]);
}
}
// Add most recent messages until we hit the limit
foreach (array_reverse(array_keys($this->messages)) as $idx) {
if (isset($keptIndices[$idx])) {
continue; // Already kept
}
$messageTokens = $this->counter->countMessages([$this->messages[$idx]]);
if ($tokenCount + $messageTokens <= $this->maxContextTokens) {
$keptIndices[$idx] = true;
$tokenCount += $messageTokens;
} else {
break;
}
}
// Restore chronological order before replacing the conversation
ksort($keptIndices);
$this->messages = array_values(array_intersect_key($this->messages, $keptIndices));
}
/**
* Summarize old messages to save tokens
*/
public function summarizeOldMessages(
ClientContract $client,
int $keepRecentCount = 5
): void {
if (count($this->messages) <= $keepRecentCount) {
return;
}
// Messages to summarize
$toSummarize = array_slice($this->messages, 0, -$keepRecentCount);
// Keep recent messages
$recent = array_slice($this->messages, -$keepRecentCount);
// Create summary
$conversationText = '';
foreach ($toSummarize as $msg) {
$conversationText .= "{$msg['role']}: {$msg['content']}\n\n";
}
$response = $client->messages()->create([
'model' => 'claude-haiku-4-20250514', // Use fast model for summary
'max_tokens' => 500,
'messages' => [[
'role' => 'user',
'content' => "Summarize this conversation concisely:\n\n{$conversationText}"
]]
]);
$summary = $response->content[0]->text;
// Replace old messages with summary
$this->messages = [
['role' => 'user', 'content' => "[Previous conversation summary: {$summary}]"], // first message must use the user role
...$recent
];
}
public function clear(): void
{
$this->messages = [];
}
}
// Usage
$contextManager = new ConversationContextManager(new TokenCounter());
// Add messages
$contextManager->addMessage('user', 'What is PHP?');
$contextManager->addMessage('assistant', 'PHP is a server-side scripting language...');
$contextManager->addMessage('user', 'How do I use arrays?');
$contextManager->addMessage('assistant', 'PHP arrays are versatile...');
echo "Current tokens: " . $contextManager->getCurrentTokenCount() . "\n";
echo "Remaining tokens: " . $contextManager->getRemainingTokens() . "\n";
// Check if new content fits
$newQuestion = 'Can you explain object-oriented programming in PHP?';
if ($contextManager->canFit($newQuestion)) {
$contextManager->addMessage('user', $newQuestion);
echo "Added new message\n";
} else {
echo "Not enough context space; older messages will be pruned automatically\n";
// addMessage() calls pruneIfNeeded() internally (it's private), so no manual prune is needed
$contextManager->addMessage('user', $newQuestion);
}
Smart Context Pruning
<?php
# filename: src/SmartContextPruner.php
declare(strict_types=1);
namespace CodeWithPHP\Claude;
class SmartContextPruner
{
public function __construct(
private TokenCounter $counter
) {}
/**
* Prune conversation using different strategies
*/
public function prune(
array $messages,
int $targetTokens,
string $strategy = 'recent'
): array {
return match($strategy) {
'recent' => $this->pruneKeepRecent($messages, $targetTokens),
'important' => $this->pruneKeepImportant($messages, $targetTokens),
'balanced' => $this->pruneBalanced($messages, $targetTokens),
'summarize' => $this->pruneBySummarizing($messages, $targetTokens),
default => $messages,
};
}
/**
* Keep most recent messages
*/
private function pruneKeepRecent(array $messages, int $targetTokens): array
{
$kept = [];
$tokenCount = 0;
// Work backwards from most recent
for ($i = count($messages) - 1; $i >= 0; $i--) {
$msgTokens = $this->counter->countMessages([$messages[$i]]);
if ($tokenCount + $msgTokens <= $targetTokens) {
array_unshift($kept, $messages[$i]);
$tokenCount += $msgTokens;
} else {
break;
}
}
return $kept;
}
/**
* Keep important messages (first and recent)
*/
private function pruneKeepImportant(array $messages, int $targetTokens): array
{
if (empty($messages)) return [];
$kept = [];
$tokenCount = 0;
// Always keep first message (context/instructions)
$first = $messages[0];
$firstTokens = $this->counter->countMessages([$first]);
if ($firstTokens <= $targetTokens) {
$kept[] = $first;
$tokenCount += $firstTokens;
}
// Add recent messages
for ($i = count($messages) - 1; $i > 0; $i--) {
$msgTokens = $this->counter->countMessages([$messages[$i]]);
if ($tokenCount + $msgTokens <= $targetTokens) {
array_splice($kept, 1, 0, [$messages[$i]]);
$tokenCount += $msgTokens;
} else {
break;
}
}
return $kept;
}
/**
* Balance between old and new
*/
private function pruneBalanced(array $messages, int $targetTokens): array
{
$halfTarget = (int) ($targetTokens / 2);
// Get first half from beginning
$beginning = [];
$beginTokens = 0;
$beginCount = 0;
for ($i = 0; $i < count($messages); $i++) {
$msgTokens = $this->counter->countMessages([$messages[$i]]);
if ($beginTokens + $msgTokens <= $halfTarget) {
$beginning[] = $messages[$i];
$beginTokens += $msgTokens;
$beginCount++;
} else {
break;
}
}
// Get second half from the end, stopping before messages already kept
// from the beginning so nothing is duplicated
$end = [];
$endTokens = 0;
for ($i = count($messages) - 1; $i >= $beginCount; $i--) {
$msgTokens = $this->counter->countMessages([$messages[$i]]);
if ($endTokens + $msgTokens <= $halfTarget) {
array_unshift($end, $messages[$i]);
$endTokens += $msgTokens;
} else {
break;
}
}
// Add placeholder for omitted middle
if (count($beginning) + count($end) < count($messages)) {
$beginning[] = [
'role' => 'assistant',
'content' => '[... middle of conversation omitted to save tokens ...]'
];
}
return array_merge($beginning, $end);
}
/**
* Replace old messages with summary
*/
private function pruneBySummarizing(array $messages, int $targetTokens): array
{
// This is a placeholder - actual implementation would use Claude
// to summarize old messages (see ConversationContextManager)
$summary = [
'role' => 'user', // 'system' is not a valid message role; system prompts are a top-level parameter
'content' => '[Summarized earlier conversation]'
];
return array_merge(
[$summary],
$this->pruneKeepRecent($messages, $targetTokens - 100)
);
}
}
// Usage
$pruner = new SmartContextPruner(new TokenCounter());
$messages = [
['role' => 'user', 'content' => 'Long message 1...'],
['role' => 'assistant', 'content' => 'Long response 1...'],
['role' => 'user', 'content' => 'Long message 2...'],
['role' => 'assistant', 'content' => 'Long response 2...'],
['role' => 'user', 'content' => 'Long message 3...'],
];
$pruned = $pruner->prune($messages, targetTokens: 1000, strategy: 'balanced');
echo "Original messages: " . count($messages) . "\n";
echo "Pruned messages: " . count($pruned) . "\n";
Prompt Caching for Token Savings
Anthropic's native prompt caching reduces input tokens on repeated requests by up to 90%. When you have large, static context (documentation, system instructions, lengthy examples), caching lets Claude reuse processed tokens instead of reprocessing them.
Understanding Prompt Caching
Prompt caching works by flagging blocks of your prompt as cacheable:
<?php
# filename: src/PromptCacheManager.php
declare(strict_types=1);
namespace CodeWithPHP\Claude;
use Anthropic\Contracts\ClientContract;
class PromptCacheManager
{
private const CACHE_CONTROL = ['type' => 'ephemeral']; // 5-minute cache by default
// For the extended 1-hour cache, add a TTL: ['type' => 'ephemeral', 'ttl' => '1h']
// (may require a beta header depending on your API version)
public function __construct(
private ClientContract $client,
private TokenCounter $counter
) {}
/**
* Make a request with prompt caching enabled
*/
public function query(
string $userPrompt,
string $staticContext,
array $examples = [],
string $model = 'claude-sonnet-4-20250514'
): object {
// Build messages with cache control
$systemBlocks = [];
// Static context is always cached
$systemBlocks[] = [
'type' => 'text',
'text' => $staticContext,
'cache_control' => self::CACHE_CONTROL,
];
// Examples (often repeated) are cached
if (!empty($examples)) {
$examplesText = "Examples:\n" . implode("\n\n", $examples);
$systemBlocks[] = [
'type' => 'text',
'text' => $examplesText,
'cache_control' => self::CACHE_CONTROL,
];
}
// Dynamic instruction (not cached)
$systemBlocks[] = [
'type' => 'text',
'text' => 'Respond concisely and accurately.',
];
// Make request with cache-enabled system prompt
$response = $this->client->messages()->create([
'model' => $model,
'max_tokens' => 1024,
'system' => $systemBlocks,
'messages' => [
[
'role' => 'user',
'content' => $userPrompt,
]
],
]);
return $response;
}
/**
* Calculate cache savings
*
* Usage object includes:
* - input_tokens: Actual input tokens used
* - cache_creation_input_tokens: Tokens cached for future use
* - cache_read_input_tokens: Tokens read from cache
*/
public function analyzeCacheSavings(
object $usage,
string $model = 'claude-sonnet-4-20250514'
): array {
$inputTokens = $usage->inputTokens ?? 0;
$cacheCreateTokens = $usage->cacheCreationInputTokens ?? 0;
$cacheReadTokens = $usage->cacheReadInputTokens ?? 0;
$inputPrice = ClaudeModelLimits::getPrice($model, 'input'); // USD per 1M tokens
// First request: uncached input at the base rate, plus cache writes
// at a 25% premium over the base input rate
$firstRequestCost = ($inputTokens * $inputPrice + $cacheCreateTokens * $inputPrice * 1.25) / 1_000_000;
// Subsequent requests: cache reads cost only 10% of the base input rate
$subsequentRequestCost = ($cacheReadTokens * 0.1 * $inputPrice) / 1_000_000;
// Each cached request saves 90% of the base cost of its read tokens
$savingsPerRequest = ($cacheReadTokens * 0.9 * $inputPrice) / 1_000_000;
// Breakeven: cached reads needed to repay the 25% write premium
$breakevenRequests = ($cacheCreateTokens > 0 && $cacheReadTokens > 0)
? (int) ceil(($cacheCreateTokens * 0.25) / ($cacheReadTokens * 0.9))
: 0;
return [
'first_request_cost_usd' => $firstRequestCost,
'subsequent_request_cost_usd' => $subsequentRequestCost,
'savings_per_cached_request_usd' => $savingsPerRequest,
'cache_read_tokens' => $cacheReadTokens,
'breakeven_requests' => $breakevenRequests,
];
}
}
// Usage
$cacheManager = new PromptCacheManager($client, new TokenCounter());
$largeDocumentation = file_get_contents('api-documentation.md');
$examples = [
"Example 1: Extract email\nInput: Contact me at john@example.com\nOutput: john@example.com",
"Example 2: Extract phone\nInput: Call +1-555-0123\nOutput: +1-555-0123",
];
// First request: creates cache (25% overhead)
$response1 = $cacheManager->query(
'Extract email from: "Reach out to alice@company.com"',
$largeDocumentation,
$examples
);
echo "First request tokens: " . $response1->usage->inputTokens . "\n";
echo "Cache creation tokens: " . ($response1->usage->cacheCreationInputTokens ?? 0) . "\n\n";
// Subsequent requests: use cache (90% savings)
$response2 = $cacheManager->query(
'Extract email from: "Contact bob@work.com"',
$largeDocumentation,
$examples
);
echo "Second request tokens: " . $response2->usage->inputTokens . "\n";
echo "Cache read tokens: " . ($response2->usage->cacheReadInputTokens ?? 0) . "\n";
echo "Savings: " . number_format($cacheManager->analyzeCacheSavings($response2->usage)['savings_per_cached_request_usd'], 4) . " USD\n";
When to Use Prompt Caching
✅ Use when:
- You have large, static context (>1024 tokens) that doesn't change frequently
- You make multiple requests with the same system prompt or examples
- Context consists of documentation, API specs, or reference materials
- You need 5-minute or 1-hour cache durations
❌ Avoid when:
- Context changes frequently (defeats cache efficiency)
- Single one-off requests (overhead not worth it)
- You need real-time context updates
Cost-benefit: the 25% cache-write premium is a one-time cost that the first few cached requests repay; after that, every cache hit saves 90% on the cached input tokens.
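The breakeven claim above can be checked with a few lines of arithmetic (illustrative numbers; Sonnet's $3 per 1M input rate assumed, and the whole cached context assumed to be reused on every request):

```php
<?php

declare(strict_types=1);

// Illustrative prompt-cache breakeven arithmetic (Sonnet input pricing assumed)
$inputPricePer1M = 3.00;  // USD per 1M input tokens
$cachedTokens = 50_000;   // static context cached on the first request

// Cache writes cost 25% more than the base input rate (a one-time premium)
$writePremium = $cachedTokens * 0.25 * $inputPricePer1M / 1_000_000;

// Cache reads cost 10% of the base rate, so each hit saves the other 90%
$savingsPerRead = $cachedTokens * 0.90 * $inputPricePer1M / 1_000_000;

// Cached reads needed to repay the write premium
$breakevenReads = (int) ceil($writePremium / $savingsPerRead);

printf("One-time write premium: $%.4f\n", $writePremium);
printf("Savings per cached request: $%.4f\n", $savingsPerRead);
printf("Premium repaid after %d cached read(s)\n", $breakevenReads);
```

With these numbers the premium is repaid almost immediately; the practical breakeven rises when only part of each prompt hits the cache or the cache expires between requests.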
Batch Processing for Cost-Effective Operations
Batch processing reduces Claude API costs by 50% when you need to process multiple requests asynchronously. Perfect for bulk operations that don't need real-time responses.
Batch Processing Strategy
<?php
# filename: src/BatchProcessor.php
declare(strict_types=1);
namespace CodeWithPHP\Claude;
use Anthropic\Contracts\ClientContract;
class BatchProcessor
{
private const BATCH_COST_MULTIPLIER = 0.5; // 50% discount
public function __construct(
private ClientContract $client,
private TokenCounter $counter
) {}
/**
* Submit batch of requests for processing
*/
public function submitBatch(array $requests): object
{
// Format requests for batch API
$batchRequests = array_map(function ($request, $index) {
$params = [
'model' => $request['model'] ?? 'claude-sonnet-4-20250514',
'max_tokens' => $request['max_tokens'] ?? 1024,
'messages' => $request['messages'],
];
// Only include a system prompt when one is provided
if (isset($request['system'])) {
$params['system'] = $request['system'];
}
return [
'custom_id' => "request-{$index}",
'params' => $params,
];
}, $requests, array_keys($requests));
// Submit batch
$batch = $this->client->batches()->create([
'requests' => $batchRequests,
]);
return $batch;
}
/**
* Check batch processing status
*/
public function getBatchStatus(string $batchId): object
{
return $this->client->batches()->retrieve($batchId);
}
/**
* Retrieve batch results when complete
*/
public function getBatchResults(string $batchId): array
{
$batch = $this->client->batches()->retrieve($batchId);
if ($batch->processingStatus !== 'completed') {
throw new \RuntimeException(
"Batch {$batchId} not ready. Status: {$batch->processingStatus}"
);
}
// requestCounts only holds aggregate totals (succeeded, errored, etc.);
// actual results come from the batch results endpoint. The exact method
// name varies by SDK version, so adjust results() to match yours.
$results = [];
foreach ($this->client->batches()->results($batchId) as $result) {
$results[] = $result;
}
return $results;
}
/**
* Calculate cost savings for batch processing
*/
public function calculateBatchSavings(
array $requests,
string $model = 'claude-sonnet-4-20250514'
): array {
$totalInputTokens = 0;
$totalOutputTokens = 0;
// Estimate tokens for each request
foreach ($requests as $request) {
$tokenCounts = $this->counter->countRequest($request);
$totalInputTokens += $tokenCounts['total'];
$totalOutputTokens += $request['max_tokens'] ?? 1024;
}
// Standard API cost
$standardCost = ClaudeModelLimits::calculateCost(
$model,
$totalInputTokens,
$totalOutputTokens
);
// Batch cost (50% discount)
$batchCost = $standardCost * self::BATCH_COST_MULTIPLIER;
$savings = $standardCost - $batchCost;
$savingsPercent = ($savings / $standardCost) * 100;
return [
'request_count' => count($requests),
'total_input_tokens' => $totalInputTokens,
'total_output_tokens' => $totalOutputTokens,
'standard_cost_usd' => $standardCost,
'batch_cost_usd' => $batchCost,
'savings_usd' => $savings,
'savings_percent' => round($savingsPercent, 2),
];
}
}
// Usage
$batchProcessor = new BatchProcessor($client, new TokenCounter());
// Prepare bulk requests
$requests = [
[
'model' => 'claude-sonnet-4-20250514',
'max_tokens' => 200,
'system' => 'Summarize the following text in one sentence.',
'messages' => [
['role' => 'user', 'content' => 'Long article text 1...']
]
],
[
'model' => 'claude-sonnet-4-20250514',
'max_tokens' => 200,
'system' => 'Summarize the following text in one sentence.',
'messages' => [
['role' => 'user', 'content' => 'Long article text 2...']
]
],
// ... more requests
];
// Calculate savings
$savings = $batchProcessor->calculateBatchSavings($requests);
echo "Batch Processing Savings:\n";
echo "Requests: " . $savings['request_count'] . "\n";
echo "Standard cost: $" . number_format($savings['standard_cost_usd'], 4) . "\n";
echo "Batch cost: $" . number_format($savings['batch_cost_usd'], 4) . "\n";
echo "Total savings: $" . number_format($savings['savings_usd'], 4) . " (" . $savings['savings_percent'] . "%)\n\n";
// Submit batch
$batch = $batchProcessor->submitBatch($requests);
echo "Batch ID: " . $batch->id . "\n";
echo "Status: " . $batch->processingStatus . "\n";
// Check status later
// $status = $batchProcessor->getBatchStatus($batch->id);
// if ($status->processingStatus === 'completed') {
// $results = $batchProcessor->getBatchResults($batch->id);
// }
When to Use Batch Processing
✅ Perfect for:
- Daily/weekly bulk analysis (document processing, data extraction)
- Non-time-sensitive operations (content generation, summarization)
- Bulk customer analysis or feedback processing
- Report generation from large datasets
❌ Not suitable for:
- Real-time user interactions (users won't wait 1+ hour)
- Complex workflows with dependencies
- Requests needing immediate responses
Process time: Usually completes within 1 minute, up to 24 hours for large batches.
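Because completion time ranges from under a minute to many hours, poll the batch status with exponential backoff rather than a tight loop. A sketch — the `nextPollDelay` helper is ours, and the commented loop assumes the `BatchProcessor` defined above:

```php
<?php

declare(strict_types=1);

/**
 * Exponential backoff delay for batch polling: 5s, 10s, 20s, ...
 * capped at 5 minutes so long-running batches aren't hammered.
 */
function nextPollDelay(int $attempt, int $baseSeconds = 5, int $maxSeconds = 300): int
{
    return (int) min($maxSeconds, $baseSeconds * (2 ** $attempt));
}

// Polling loop sketch (assumes $batchProcessor and $batch from the section above):
// $attempt = 0;
// do {
//     sleep(nextPollDelay($attempt++));
//     $status = $batchProcessor->getBatchStatus($batch->id);
// } while ($status->processingStatus !== 'completed');
// $results = $batchProcessor->getBatchResults($batch->id);

foreach ([0, 1, 2, 3, 6] as $attempt) {
    echo "Attempt {$attempt}: wait " . nextPollDelay($attempt) . "s\n";
}
```

The cap keeps worst-case latency bounded at five minutes while early attempts stay responsive for batches that finish quickly.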
Enhanced Image Token Calculation
Images consume varying tokens based on their dimensions, not just a flat ~1000 tokens. Here's a more accurate calculation:
<?php
# filename: src/ImageTokenCalculator.php
declare(strict_types=1);
namespace CodeWithPHP\Claude;
class ImageTokenCalculator
{
/**
* Estimate tokens for an image
*
* Heuristic: a fixed base cost plus dimension-based tokens that
* scale with the processed image's pixel count. Treat the result
* as an estimate, not an exact figure.
*/
public static function calculateImageTokens(
int $width,
int $height,
string $mediaType = 'image/jpeg'
): int {
// Base tokens for any image
$baseTokens = 1100;
// Dimension-based tokens: roughly 1 token per 750 pixels of the processed image
$scaledDimensions = self::scaleImageDimensions($width, $height);
$dimensionTokens = (int) ceil(
($scaledDimensions['width'] * $scaledDimensions['height']) / 750
);
return $baseTokens + $dimensionTokens;
}
/**
* Scale image to Claude's processing dimensions
*
* Claude downscales images whose longest edge exceeds roughly 1568 px,
* so larger images don't cost proportionally more tokens.
*/
private static function scaleImageDimensions(int $width, int $height): array
{
$maxDimension = 1568;
if ($width <= $maxDimension && $height <= $maxDimension) {
return ['width' => $width, 'height' => $height];
}
// Scale down larger images
$aspectRatio = $width / $height;
if ($width > $height) {
return [
'width' => $maxDimension,
'height' => (int) ($maxDimension / $aspectRatio),
];
} else {
return [
'width' => (int) ($maxDimension * $aspectRatio),
'height' => $maxDimension,
];
}
}
/**
* Real-world examples
*/
public static function examples(): void
{
$examples = [
['width' => 400, 'height' => 300, 'description' => 'Small thumbnail'],
['width' => 800, 'height' => 600, 'description' => 'Mobile photo'],
['width' => 1920, 'height' => 1080, 'description' => 'HD screenshot'],
['width' => 4000, 'height' => 3000, 'description' => 'High-res camera'],
];
echo "Image Token Costs:\n\n";
foreach ($examples as $image) {
$tokens = self::calculateImageTokens($image['width'], $image['height']);
echo "{$image['description']}: {$image['width']}×{$image['height']} = {$tokens} tokens\n";
}
}
}
// Show examples
ImageTokenCalculator::examples();

Update your TokenCounter to use this improved calculation:
// In TokenCounter class
if (isset($part['image'])) {
// More accurate image token calculation
if (isset($part['image']['width']) && isset($part['image']['height'])) {
$total += ImageTokenCalculator::calculateImageTokens(
$part['image']['width'],
$part['image']['height'],
$part['image']['media_type'] ?? 'image/jpeg'
);
} else {
// Fallback if dimensions not available
$total += 1100;
}
}

Cost Management
Preventing cost overruns requires proactive budget management and cost optimization. These systems help you stay within budget while maximizing Claude's capabilities.
Budget Manager
<?php
# filename: src/BudgetManager.php
declare(strict_types=1);
namespace CodeWithPHP\Claude;
use Anthropic\Contracts\ClientContract;
class BudgetManager
{
private float $spent = 0.0;
private array $transactions = [];
public function __construct(
private ClientContract $client,
private TokenCounter $counter,
private float $budgetUSD,
private ?string $period = 'monthly'
) {}
public function query(array $request): object
{
// Estimate cost before making request
$estimatedCost = $this->estimateRequestCost($request);
if ($this->spent + $estimatedCost > $this->budgetUSD) {
throw new \RuntimeException(
"Budget exceeded: \${$this->budgetUSD} limit. " .
"Spent: \${$this->spent}, Estimated: \${$estimatedCost}"
);
}
// Make request
$response = $this->client->messages()->create($request);
// Calculate actual cost
$actualCost = ClaudeModelLimits::calculateCost(
$request['model'],
$response->usage->inputTokens,
$response->usage->outputTokens
);
// Track
$this->spent += $actualCost;
$this->transactions[] = [
'timestamp' => time(),
'model' => $request['model'],
'input_tokens' => $response->usage->inputTokens,
'output_tokens' => $response->usage->outputTokens,
'cost' => $actualCost,
'estimated_cost' => $estimatedCost,
];
return $response;
}
private function estimateRequestCost(array $request): float
{
$inputTokens = $this->counter->countRequest($request)['total'];
$outputTokens = $request['max_tokens'] ?? 1024; // worst case: output uses the full budget
return ClaudeModelLimits::calculateCost(
$request['model'],
$inputTokens,
$outputTokens
);
}
public function getRemaining(): float
{
return max(0, $this->budgetUSD - $this->spent);
}
public function getSpent(): float
{
return $this->spent;
}
public function getUtilization(): float
{
return ($this->spent / $this->budgetUSD) * 100;
}
public function canAfford(array $request): bool
{
$estimatedCost = $this->estimateRequestCost($request);
return ($this->spent + $estimatedCost) <= $this->budgetUSD;
}
public function getTransactions(): array
{
return $this->transactions;
}
public function getSummary(): array
{
return [
'budget' => $this->budgetUSD,
'spent' => $this->spent,
'remaining' => $this->getRemaining(),
'utilization' => round($this->getUtilization(), 2) . '%',
'transaction_count' => count($this->transactions),
'average_cost_per_request' => count($this->transactions) > 0
? $this->spent / count($this->transactions)
: 0,
];
}
public function reset(): void
{
$this->spent = 0.0;
$this->transactions = [];
}
}
// Usage
$budget = new BudgetManager(
client: $client,
counter: new TokenCounter(),
budgetUSD: 10.00,
period: 'daily'
);
try {
$response = $budget->query([
'model' => 'claude-sonnet-4-20250514',
'max_tokens' => 1024,
'messages' => [[
'role' => 'user',
'content' => 'Explain PHP namespaces'
]]
]);
echo $response->content[0]->text . "\n\n";
$summary = $budget->getSummary();
echo "Budget Summary:\n";
print_r($summary);
} catch (\RuntimeException $e) {
echo "Error: " . $e->getMessage() . "\n";
}

Cost Optimizer
<?php
# filename: src/CostOptimizer.php
declare(strict_types=1);
namespace CodeWithPHP\Claude;
class CostOptimizer
{
/**
* Choose the most cost-effective model for a task
*/
public function chooseModel(
string $task,
int $estimatedInputTokens,
int $maxOutputTokens,
?string $quality = 'balanced'
): string {
$taskLower = strtolower($task);
// Simple tasks -> Haiku
if (str_contains($taskLower, 'extract') ||
str_contains($taskLower, 'classify') ||
str_contains($taskLower, 'simple') ||
$estimatedInputTokens < 1000
) {
return 'claude-haiku-4-20250514';
}
// Complex reasoning -> Opus
if ($quality === 'best' ||
str_contains($taskLower, 'complex') ||
str_contains($taskLower, 'analyze deeply') ||
str_contains($taskLower, 'comprehensive')
) {
return 'claude-opus-4-20250514';
}
// Default: Sonnet (best value)
return 'claude-sonnet-4-20250514';
}
/**
* Optimize request to reduce costs
*/
public function optimizeRequest(array $request): array
{
// 1. Trim whitespace
if (isset($request['system'])) {
$request['system'] = $this->trimExcessWhitespace($request['system']);
}
foreach ($request['messages'] as &$message) {
if (is_string($message['content'])) {
$message['content'] = $this->trimExcessWhitespace($message['content']);
}
}
// 2. Flag oversized max_tokens (most responses don't need more than
// 4096 tokens; consider lowering it unless a long output is expected)
// 3. Default to a lower temperature for deterministic tasks
// (temperature affects output variability, not cost or latency)
if (!isset($request['temperature'])) {
$request['temperature'] = 0.5; // more deterministic output
}
return $request;
}
private function trimExcessWhitespace(string $text): string
{
// Remove extra newlines (more than 2 consecutive)
$text = preg_replace('/\n{3,}/', "\n\n", $text);
// Remove trailing whitespace
$text = preg_replace('/[ \t]+$/m', '', $text);
return trim($text);
}
/**
* Calculate potential savings
*/
public function calculateSavings(
string $originalModel,
string $optimizedModel,
int $inputTokens,
int $outputTokens
): array {
$originalCost = ClaudeModelLimits::calculateCost(
$originalModel,
$inputTokens,
$outputTokens
);
$optimizedCost = ClaudeModelLimits::calculateCost(
$optimizedModel,
$inputTokens,
$outputTokens
);
$savings = $originalCost - $optimizedCost;
$savingsPercent = $originalCost > 0
? ($savings / $originalCost) * 100
: 0;
return [
'original_model' => $originalModel,
'original_cost' => $originalCost,
'optimized_model' => $optimizedModel,
'optimized_cost' => $optimizedCost,
'savings' => $savings,
'savings_percent' => round($savingsPercent, 2),
];
}
}
// Usage
$optimizer = new CostOptimizer();
// Choose cost-effective model
$model = $optimizer->chooseModel(
task: 'Extract email addresses from text',
estimatedInputTokens: 500,
maxOutputTokens: 100
);
echo "Recommended model: {$model}\n";
// Calculate savings
$savings = $optimizer->calculateSavings(
originalModel: 'claude-opus-4-20250514',
optimizedModel: 'claude-haiku-4-20250514',
inputTokens: 1000,
outputTokens: 500
);
echo "Potential savings:\n";
print_r($savings);

Production Token Management System
Complete Token Management Service
<?php
# filename: src/TokenManagementService.php
declare(strict_types=1);
namespace CodeWithPHP\Claude;
use Anthropic\Contracts\ClientContract;
class TokenManagementService
{
private TokenCounter $counter;
private BudgetManager $budget;
private CostOptimizer $optimizer;
private array $stats = [];
public function __construct(
private ClientContract $client,
float $dailyBudget = 50.00
) {
$this->counter = new TokenCounter();
$this->budget = new BudgetManager($client, $this->counter, $dailyBudget);
$this->optimizer = new CostOptimizer();
}
public function query(
string $task,
array $messages,
?string $system = null,
?int $maxTokens = null,
array $options = []
): object {
// Build request
$request = [
'messages' => $messages,
];
if ($system) {
$request['system'] = $system;
}
// Auto-select model if not specified
$inputTokens = $this->counter->countRequest($request)['total'];
$outputTokens = $maxTokens ?? 2048;
$request['model'] = $options['model'] ?? $this->optimizer->chooseModel(
task: $task,
estimatedInputTokens: $inputTokens,
maxOutputTokens: $outputTokens,
quality: $options['quality'] ?? 'balanced'
);
$request['max_tokens'] = $outputTokens;
// Optimize request
$request = $this->optimizer->optimizeRequest($request);
// Check budget
if (!$this->budget->canAfford($request)) {
throw new \RuntimeException(
'Insufficient budget for this request. ' .
'Remaining: $' . number_format($this->budget->getRemaining(), 4)
);
}
// Execute
$startTime = microtime(true);
$response = $this->budget->query($request);
$duration = microtime(true) - $startTime;
// Track stats
$this->trackStats($request, $response, $duration);
return $response;
}
private function trackStats(array $request, object $response, float $duration): void
{
$this->stats[] = [
'timestamp' => time(),
'model' => $request['model'],
'input_tokens' => $response->usage->inputTokens,
'output_tokens' => $response->usage->outputTokens,
'duration' => $duration,
'cost' => ClaudeModelLimits::calculateCost(
$request['model'],
$response->usage->inputTokens,
$response->usage->outputTokens
),
];
}
public function getStats(): array
{
if (empty($this->stats)) {
return [];
}
return [
'total_requests' => count($this->stats),
'total_tokens' => array_sum(array_map(
fn($s) => $s['input_tokens'] + $s['output_tokens'],
$this->stats
)),
'total_cost' => array_sum(array_column($this->stats, 'cost')),
'average_duration' => array_sum(array_column($this->stats, 'duration')) / count($this->stats),
'budget_summary' => $this->budget->getSummary(),
];
}
public function exportReport(string $filename): void
{
$report = [
'generated_at' => date('Y-m-d H:i:s'),
'stats' => $this->getStats(),
'transactions' => $this->budget->getTransactions(),
];
file_put_contents($filename, json_encode($report, JSON_PRETTY_PRINT));
}
}
// Usage
$service = new TokenManagementService($client, dailyBudget: 25.00);
$response = $service->query(
task: 'Extract data from text',
messages: [[
'role' => 'user',
'content' => 'Extract the email from: Contact us at support@example.com'
]],
maxTokens: 100
);
echo $response->content[0]->text . "\n\n";
$stats = $service->getStats();
echo "Token Management Stats:\n";
print_r($stats);
$service->exportReport('token_report.json');

Exercises
Exercise 1: Token Budget Dashboard
Build a web dashboard that displays real-time token usage and budget status.
Requirements:
- Show current budget utilization
- Display token usage trends
- Alert when approaching budget limits
- Export usage reports
Exercise 2: Adaptive Context Window
Create a system that automatically adjusts context window usage based on conversation importance.
Requirements:
- Identify important vs filler messages
- Summarize or prune strategically
- Maintain conversation coherence
- Maximize context efficiency
Exercise 3: Cost Prediction Engine
Build a tool that predicts costs before making requests.
Requirements:
- Estimate token counts accurately
- Calculate cost ranges (min/max)
- Suggest optimizations
- Compare model costs
Solution Hints
For Exercise 1, create a class that stores usage data in a database and provides endpoints for fetching stats. For Exercise 2, implement a scoring system for message importance and use strategic pruning. For Exercise 3, build on the TokenCounter and add confidence intervals for estimates.
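For Exercise 2, the importance scoring can start as a simple heuristic. The sketch below is a hypothetical starting point; the function names and weights are invented for illustration, and you should tune them against your own conversations:

```php
<?php
declare(strict_types=1);

// Hypothetical starting point for Exercise 2: score each message so a
// pruner can drop low-value turns first. Weights are arbitrary.
function scoreImportance(array $message): float
{
    $content = is_string($message['content']) ? $message['content'] : '';
    $score = 0.0;

    // Longer messages tend to carry more context (capped contribution)
    $score += min(strlen($content) / 500, 2.0);

    // Questions and code blocks are usually worth keeping
    if (str_contains($content, '?')) {
        $score += 1.0;
    }
    if (str_contains($content, '```')) {
        $score += 2.0;
    }

    // Short acknowledgements ("ok", "thanks") are prime pruning targets
    if (strlen(trim($content)) < 20) {
        $score -= 1.5;
    }

    return $score;
}

// Keep the N highest-scoring messages, preserving original order
function pruneByImportance(array $messages, int $keep): array
{
    $scored = [];
    foreach ($messages as $i => $msg) {
        $scored[$i] = scoreImportance($msg);
    }
    arsort($scored); // highest score first, keys preserved

    $keepIndexes = array_slice(array_keys($scored), 0, $keep);
    sort($keepIndexes); // restore chronological order

    return array_values(array_intersect_key($messages, array_flip($keepIndexes)));
}
```

A real implementation would also protect the first message (which often sets the task) and pair each assistant reply with the user turn that prompted it, so pruning never orphans half an exchange.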
Cache Key Design Patterns
When combining this chapter with caching strategies (Chapter 18), design cache keys that account for token patterns:
<?php
# filename: src/TokenAwareCacheKey.php
declare(strict_types=1);
namespace CodeWithPHP\Claude;
class TokenAwareCacheKey
{
/**
* Generate cache key that accounts for semantic similarity
* Similar prompts should ideally share cache entries
*/
public static function generate(
array $request,
TokenCounter $counter
): string {
// Extract key components
$model = $request['model'] ?? 'claude-sonnet-4-20250514';
$system = $request['system'] ?? '';
$userMessage = $request['messages'][0]['content'] ?? '';
// Normalize for comparison (remove extra whitespace)
$normalizedSystem = self::normalize($system);
$normalizedMessage = self::normalize($userMessage);
// Create semantic fingerprint
$systemHash = substr(hash('sha256', $normalizedSystem), 0, 8);
$messageTokens = $counter->count($normalizedMessage);
$messageHash = substr(hash('sha256', $normalizedMessage), 0, 8);
// Cache key includes:
// - Model (different models = different caches)
// - System prompt hash (identifies unique instructions)
// - Message token count (similar-length messages = similar complexity)
// - Message hash (exact content)
return "claude:{$model}:system:{$systemHash}:tokens:{$messageTokens}:msg:{$messageHash}";
}
/**
* Normalize text for semantic comparison
*/
private static function normalize(string $text): string
{
// Remove extra whitespace
$text = preg_replace('/\s+/', ' ', trim($text));
// Remove common filler words that don't affect meaning
$fillers = ['please', 'thank you', 'kindly', 'could you'];
foreach ($fillers as $filler) {
$text = preg_replace('/\b' . $filler . '\b/i', '', $text);
}
return strtolower($text);
}
/**
* Estimate if two requests are semantically similar
* (would benefit from shared cached result)
*/
public static function isSimilar(
array $request1,
array $request2,
TokenCounter $counter,
float $similarityThreshold = 0.8
): bool {
$msg1 = $request1['messages'][0]['content'] ?? '';
$msg2 = $request2['messages'][0]['content'] ?? '';
// If token counts differ significantly, not similar
$tokens1 = $counter->count($msg1);
$tokens2 = $counter->count($msg2);
// max(..., 1) guards against division by zero on empty messages
if (abs($tokens1 - $tokens2) / max($tokens1, $tokens2, 1) > 0.2) {
return false;
}
// Calculate similarity score using simple word overlap
$words1 = array_filter(array_unique(preg_split('/\W+/', strtolower($msg1))), fn($w) => $w !== '');
$words2 = array_filter(array_unique(preg_split('/\W+/', strtolower($msg2))), fn($w) => $w !== '');
$intersection = count(array_intersect($words1, $words2));
$union = count(array_unique(array_merge($words1, $words2)));
$similarity = $union > 0 ? $intersection / $union : 0;
return $similarity >= $similarityThreshold;
}
}
// Usage
$key1 = TokenAwareCacheKey::generate([
'model' => 'claude-sonnet-4-20250514',
'system' => 'You are a helpful assistant.',
'messages' => [['role' => 'user', 'content' => 'What is PHP?']]
], $counter);
echo "Cache key: {$key1}\n";
// Check similarity for deduplication
$similar = TokenAwareCacheKey::isSimilar(
[
'model' => 'claude-sonnet-4-20250514',
'messages' => [['role' => 'user', 'content' => 'What is PHP?']]
],
[
'model' => 'claude-sonnet-4-20250514',
'messages' => [['role' => 'user', 'content' => 'Tell me about PHP']]
],
$counter
);
echo "Similar requests: " . ($similar ? "Yes" : "No") . "\n";

Troubleshooting
Token Count Estimates Are Too High
Symptom: Your token counter estimates significantly more tokens than Claude actually uses.
Cause: The estimation formula may be too conservative, especially for code or structured text.
Solution: Adjust the CHARS_PER_TOKEN constant or add language-specific multipliers (the Anthropic API also offers a token-counting endpoint you can use to get exact counts before sending a request):
// More accurate for English prose
private const CHARS_PER_TOKEN = 4.5;
// More accurate for code
private const CODE_MULTIPLIER = 1.2; // Reduce from 1.3

Context Window Exceeded Errors
Symptom: API returns errors about exceeding context window limits.
Cause: Conversation history has grown too large, or a single message is too long.
Solution: Implement proactive pruning before making requests:
// Check before adding message
if (!$contextManager->canFit($newMessage)) {
$contextManager->pruneStrategic([0]); // Keep first message
}

Budget Exceeded Unexpectedly
Symptom: Budget runs out faster than expected.
Cause: Output tokens may be higher than estimated, or multiple requests accumulate quickly.
Solution: Track actual costs and adjust estimates:
// Use actual output tokens for future estimates
$avgOutputTokens = $tracker->getStats()['total_output_tokens'] /
$tracker->getStats()['total_requests'];

Key Takeaways
- ✓ Tokens are chunks of text (~4 chars each), not words
- ✓ Claude has a 200K token context window across all models
- ✓ Count tokens before requests to estimate costs accurately
- ✓ Implement budget management to prevent cost overruns
- ✓ Prune conversation history strategically to stay within limits
- ✓ Choose the right model for each task to optimize costs
- ✓ Track token usage to understand patterns and optimize
- ✓ Use Haiku for simple tasks, Sonnet for most, Opus for complex
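To make the model-selection takeaway concrete, here's a back-of-envelope comparison for one fixed daily workload. The per-million-token prices are assumed example values, not current Anthropic pricing; in practice use the ClaudeModelLimits::calculateCost() helper from this chapter with up-to-date rates:

```php
<?php
declare(strict_types=1);

// Rough per-model daily cost for the same workload, with assumed
// example prices (USD per million tokens). Illustrative only.
function workloadCost(float $inPerMTok, float $outPerMTok, int $in, int $out): float
{
    return ($in / 1_000_000) * $inPerMTok + ($out / 1_000_000) * $outPerMTok;
}

$in = 10_000;  // input tokens per day
$out = 2_000;  // output tokens per day

// Assumed example prices: [input, output] per million tokens
$models = [
    'haiku'  => [0.80, 4.00],
    'sonnet' => [3.00, 15.00],
    'opus'   => [15.00, 75.00],
];

foreach ($models as $name => [$inPrice, $outPrice]) {
    printf("%s: $%.4f/day\n", $name, workloadCost($inPrice, $outPrice, $in, $out));
}
```

Even at these illustrative rates the spread is wide (the most capable model costs over an order of magnitude more than the cheapest for identical token counts), which is why routing simple tasks to a smaller model is usually the single largest cost lever.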
Further Reading
- Anthropic Pricing Documentation — Official pricing information for Claude models
- Anthropic Token Limits — Model specifications and context window limits
- Anthropic Prompt Caching — Reduce costs 90% with cached prompts and examples
- Anthropic Batch Processing — 50% cost reduction for async bulk operations
- Chapter 08: Temperature and Sampling Parameters — Learn how sampling affects token usage
- Chapter 10: Error Handling and Rate Limiting — Build resilient applications with proper error handling
- Chapter 18: Caching Strategies — Response caching, cache invalidation, and semantic matching
- Chapter 39: Cost Optimization and Billing — Complete cost optimization strategies for production
- Anthropic API Reference — Complete API documentation for messages endpoint
Continue to Chapter 10: Error Handling and Rate Limiting to learn about building resilient applications with proper error handling.
💻 Code Samples
All code examples from this chapter are available in the GitHub repository:
Clone and run locally:
git clone https://github.com/dalehurley/codewithphp.git
cd codewithphp/code/claude-php/chapter-09
composer install
export ANTHROPIC_API_KEY="sk-ant-your-key-here"
php examples/02-model-limits.php