Chapter 10: Reflection and Self-Review Loops
Overview
You’ve built agents that react (Chapter 02) and agents that plan (Chapter 09). But what about agents that improve their own work? What if your agent could critique its outputs, identify weaknesses, and refine them iteratively, just as a human reviews and revises a draft?
This is the power of reflection loops. Instead of generating output once and hoping for the best, reflection agents operate in a Generate-Reflect-Refine cycle: they create an initial answer, evaluate its quality, and then improve it based on that evaluation. This pattern is essential for tasks where quality matters more than speed—writing, code generation, analysis, and decision-making.
The claude-php/claude-php-agent framework provides ReflectionLoop for exactly this purpose: a self-improving agent that iteratively refines outputs until they meet quality thresholds.
In this chapter you’ll:
- Master the Generate-Reflect-Refine pattern
- Implement quality scoring and thresholds
- Define custom evaluation criteria
- Validate tool outputs through reflection
- Build production-grade quality control systems
- Monitor reflection metrics and costs
- Optimize cost vs. quality trade-offs
Estimated time: ~120 minutes
::: info Framework Version
This chapter is based on claude-php/claude-php-agent v0.5+. We’ll use ReflectionLoop extensively throughout.
:::
::: info Code examples
Complete, runnable examples for this chapter:
- basic-reflection.php: Simple reflection loop
- custom-criteria.php: Custom evaluation criteria
- quality-thresholds.php: Quality scoring and stopping conditions
- tool-validation.php: Validating tool outputs
- reflection-monitoring.php: Tracking reflection metrics
- code-review-agent.php: Practical code review example
- content-refinement.php: Content writing with reflection
- production-reflection-system.php: Complete production system
All files are in code/10-reflection-and-self-review/.
:::
Understanding Reflection Loops
Before implementing, let’s understand what makes reflection different from other loop strategies.
The Generate-Reflect-Refine Cycle
A reflection loop operates in three distinct phases:

```
┌─────────────────────────────────┐
│  PHASE 1: GENERATE              │
│  Create initial output          │
│  (may use tools)                │
└───────────┬─────────────────────┘
            ↓
┌─────────────────────────────────┐
│  PHASE 2: REFLECT               │
│  Evaluate quality               │
│  Identify issues                │
│  Suggest improvements           │
│  Assign quality score (1-10)    │
└───────────┬─────────────────────┘
            ↓
    Score ≥ Threshold?
            ↓ No
┌─────────────────────────────────┐
│  PHASE 3: REFINE                │
│  Apply improvements             │
│  Address issues                 │
│  (may use tools)                │
└───────────┬─────────────────────┘
            ↓
  Repeat REFLECT → REFINE
  (until threshold or max refinements)
```

Comparison: React vs Plan vs Reflection
| Aspect | ReactLoop | PlanExecuteLoop | ReflectionLoop |
|---|---|---|---|
| Pattern | Reason → Act → Observe | Plan → Execute → Synthesize | Generate → Reflect → Refine |
| Goal | Complete task | Execute plan systematically | Maximize quality |
| Iterations | Unpredictable (1-10+) | Predictable (plan-driven) | Bounded (1 generation + up to N refinements) |
| Quality Focus | Task completion | Plan adherence | Output excellence |
| Best For | General tasks | Multi-step workflows | Quality-critical outputs |
| Token Usage | Medium | High | Very High |
| Latency | Low-Medium | Medium-High | High |
When to Use Reflection Loops
✅ Perfect for:
- Content creation: Blog posts, documentation, emails
- Code generation: Functions, classes, APIs
- Analysis: Research reports, data summaries
- Decision-making: Evaluations, recommendations
- Any task where quality > speed
❌ Not ideal for:
- Simple lookups or calculations
- Real-time interactions
- Budget-constrained applications
- Tasks with external quality validation
Basic Reflection Loop
Let’s start with a simple example to see reflection in action.
Minimal Example
```php
<?php

use ClaudeAgents\Agent;
use ClaudeAgents\Loops\ReflectionLoop;
use ClaudePhp\ClaudePhp;

require 'vendor/autoload.php';

$client = new ClaudePhp(apiKey: getenv('ANTHROPIC_API_KEY'));

// Create reflection loop
$loop = new ReflectionLoop(
    maxRefinements: 3,      // Up to 3 refinement iterations
    qualityThreshold: 8,    // Stop when score ≥ 8/10
    criteria: 'clarity, accuracy, and completeness'
);

// Add callback to monitor reflection progress
$loop->onReflection(function (int $refinement, int $score, string $feedback) {
    echo "Refinement #{$refinement}: Score {$score}/10\n";
    echo "Feedback: " . substr($feedback, 0, 200) . "...\n\n";
});

// Create agent with reflection loop
$agent = Agent::create($client)
    ->withLoopStrategy($loop)
    ->withSystemPrompt('You are a helpful assistant that creates high-quality explanations.')
    ->maxIterations(15);

// Run a quality-critical task
$result = $agent->run(
    'Explain the concept of dependency injection in PHP to a junior developer.'
);

echo "Final Output:\n";
echo str_repeat("=", 80) . "\n";
echo $result->getAnswer() . "\n";
echo str_repeat("=", 80) . "\n";

// Access reflection metadata
$metadata = $result->getMetadata();
echo "\nQuality Metrics:\n";
echo "- Final Score: {$metadata['final_score']}/10\n";
echo "- Total Refinements: " . count($metadata['reflections']) . "\n";
echo "- Iterations: {$result->getIterations()}\n";
echo "- Tokens: " . json_encode($result->getTokenUsage()) . "\n";
```

What Happens Here?
- Generate: Agent creates initial explanation
- Reflect: Agent evaluates clarity, accuracy, completeness
- Score: Agent assigns quality score (e.g., 6/10)
- Refine: Agent improves explanation based on feedback
- Repeat: Continue until score ≥ 8 or 3 refinements done
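The five steps above can be sketched without the framework. The following is a minimal illustration of the control flow only, not ReflectionLoop's actual implementation: `reflectUntilGoodEnough` and its three callables are hypothetical stand-ins for the LLM calls, and it assumes a reflection runs after every draft, including the last refinement.

```php
<?php

/**
 * Minimal Generate-Reflect-Refine loop (illustrative sketch).
 * The callables are hypothetical stand-ins for LLM calls:
 *   $generate(string $task): string
 *   $reflect(string $draft): array with 'score' (int) and 'feedback' (string)
 *   $refine(string $draft, string $feedback): string
 */
function reflectUntilGoodEnough(
    string $task,
    callable $generate,
    callable $reflect,
    callable $refine,
    int $qualityThreshold = 8,
    int $maxRefinements = 3
): array {
    $draft = $generate($task);   // Phase 1: generate
    $scores = [];

    for ($refinement = 0; ; $refinement++) {
        $review = $reflect($draft);   // Phase 2: reflect and score
        $scores[] = $review['score'];

        // Stop when the draft is good enough or the refinement budget is spent.
        if ($review['score'] >= $qualityThreshold || $refinement >= $maxRefinements) {
            break;
        }

        $draft = $refine($draft, $review['feedback']);   // Phase 3: refine
    }

    return ['answer' => $draft, 'scores' => $scores, 'refinements' => $refinement];
}

// Stubs: every refinement appends a marker, and the score rises with each marker.
$result = reflectUntilGoodEnough(
    'Explain dependency injection',
    fn (string $task): string => "Draft for: {$task}",
    fn (string $draft): array => [
        'score' => min(10, 6 + substr_count($draft, '[refined]')),
        'feedback' => 'Add a concrete example.',
    ],
    fn (string $draft, string $feedback): string => $draft . ' [refined]'
);

echo implode(' -> ', $result['scores']) . "\n";   // 6 -> 7 -> 8
```

With these stubs the loop stops after two refinements, because the third reflection reaches the threshold of 8.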
Configuring Reflection Loops
ReflectionLoop provides several configuration options to control behavior.
Constructor Parameters
```php
$loop = new ReflectionLoop(
    logger: $logger,                  // PSR-3 logger (optional)
    maxRefinements: 3,                // Max refinement iterations
    qualityThreshold: 8,              // Quality score (1-10) to stop
    criteria: 'accuracy and clarity'  // Evaluation criteria
);
```

Configuration Trade-offs
| Parameter | Low Value | High Value |
|---|---|---|
| maxRefinements | Faster, cheaper | Better quality |
| qualityThreshold | Stops earlier | More refinements |
| criteria | General evaluation | Task-specific quality |
Example: Strict Quality Requirements
```php
// For mission-critical outputs
$strictLoop = new ReflectionLoop(
    maxRefinements: 5,      // Allow more iterations
    qualityThreshold: 9,    // Demand excellence
    criteria: 'correctness, security, performance, and maintainability'
);
```

Example: Fast Iteration
```php
// For budget-constrained scenarios
$fastLoop = new ReflectionLoop(
    maxRefinements: 1,      // Single refinement pass
    qualityThreshold: 7,    // Lower bar
    criteria: 'basic correctness'
);
```

Custom Evaluation Criteria
The power of reflection comes from custom evaluation criteria tailored to your specific task.
Default Criteria
Without custom criteria, ReflectionLoop uses:
“correctness, completeness, clarity, and quality”
Domain-Specific Criteria
Code Generation
```php
$codeLoop = new ReflectionLoop(
    maxRefinements: 3,
    qualityThreshold: 8,
    criteria: 'correctness, error handling, type safety, PSR-12 compliance, and documentation'
);
```

Content Writing
```php
$contentLoop = new ReflectionLoop(
    maxRefinements: 3,
    qualityThreshold: 8,
    criteria: 'clarity, engagement, accuracy, tone appropriateness, and grammar'
);
```

Data Analysis
```php
$analysisLoop = new ReflectionLoop(
    maxRefinements: 3,
    qualityThreshold: 9,
    criteria: 'statistical accuracy, insight depth, visualization clarity, and actionable recommendations'
);
```

Technical Documentation
```php
$docsLoop = new ReflectionLoop(
    maxRefinements: 3,
    qualityThreshold: 8,
    criteria: 'accuracy, completeness, clarity for target audience, code examples, and navigation'
);
```

Structured Criteria Approach
For complex evaluations, structure your criteria:
```php
$criteria = <<<CRITERIA
Evaluate on these dimensions:

1. Correctness (30%):
   - Factual accuracy
   - No misleading statements
   - Valid code examples

2. Completeness (25%):
   - All requirements addressed
   - Edge cases considered
   - No missing information

3. Clarity (25%):
   - Easy to understand
   - Logical structure
   - Clear examples

4. Professionalism (20%):
   - Appropriate tone
   - Proper grammar
   - Polished presentation

Provide specific feedback for each dimension.
CRITERIA;

$loop = new ReflectionLoop(
    maxRefinements: 3,
    qualityThreshold: 8,
    criteria: $criteria
);
```

Quality Scoring and Thresholds
Understanding how scoring works helps optimize your reflection loops.
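One way to build intuition for the percentage weights in the structured criteria is to treat the overall score as a weighted average of per-dimension scores. The sketch below is only that reading made explicit. In a reflection loop the model produces the single score itself, and nothing in this chapter says ReflectionLoop computes it this way. The `weightedScore` helper and the sample per-dimension scores are hypothetical.

```php
<?php

/**
 * Combine per-dimension scores (1-10) into one weighted score.
 * Weights are percentages keyed by dimension and must sum to 100.
 */
function weightedScore(array $scores, array $weights): float
{
    if (array_sum($weights) != 100) {
        throw new InvalidArgumentException('Weights must sum to 100.');
    }

    $total = 0.0;
    foreach ($weights as $dimension => $weight) {
        if (!isset($scores[$dimension])) {
            throw new InvalidArgumentException("Missing score for {$dimension}.");
        }
        $total += $scores[$dimension] * $weight / 100;
    }

    return round($total, 2);
}

// Weights taken from the structured criteria; the scores are made up.
$weights = ['correctness' => 30, 'completeness' => 25, 'clarity' => 25, 'professionalism' => 20];
$scores  = ['correctness' => 8, 'completeness' => 7, 'clarity' => 9, 'professionalism' => 6];

echo weightedScore($scores, $weights) . "\n";   // 7.6
```

Here 2.4 + 1.75 + 2.25 + 1.2 gives 7.6, which is below a threshold of 8, so a draft scored like this would go through another refinement.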
How Scores Are Extracted
ReflectionLoop uses pattern matching to extract scores from reflection text:
```
// Recognizes these patterns:
"Score: 7/10"
"Quality: 8"
"Rating: 6 out of 10"
"Overall score of 7"
"7/10"
```

Default if no score found: 5/10
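To make the idea concrete, here is a minimal sketch of how such extraction could work. The regular expressions and the `extractScore` function are assumptions written for this illustration. They are not the library's actual patterns, and they only cover the five formats listed above plus the default of 5.

```php
<?php

/**
 * Illustrative score extractor (hypothetical, not the library's code).
 * Returns the first score found, clamped to 1-10, or $default if none matches.
 */
function extractScore(string $reflection, int $default = 5): int
{
    $patterns = [
        // "Score: 7/10", "Quality: 8", "Rating: 6 out of 10", "Overall score of 7"
        '/\b(?:score|quality|rating)\b\s*(?:of|:)?\s*(\d{1,2})\b/i',
        // Bare "7/10" or "6 out of 10"
        '/\b(\d{1,2})\s*(?:\/|out of)\s*10\b/i',
    ];

    foreach ($patterns as $pattern) {
        if (preg_match($pattern, $reflection, $matches) === 1) {
            return max(1, min(10, (int) $matches[1]));
        }
    }

    return $default;
}

echo extractScore('Overall score of 7, mainly due to missing examples.') . "\n";   // 7
echo extractScore('The draft needs work.') . "\n";                                 // 5
```

Because a missing score silently becomes the default, it helps to ask for an explicit "Score: N/10" line in your criteria, as the code review example later in this chapter does.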
Score Interpretation Guidelines
| Score | Meaning | Action |
|---|---|---|
| 1-3 | Poor quality, major issues | Continue refining (likely) |
| 4-5 | Below average, multiple problems | Continue refining |
| 6-7 | Acceptable but improvable | Depends on threshold |
| 8-9 | High quality, minor issues only | Often meets threshold |
| 10 | Perfect (rare) | Exceeds threshold |
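The "Action" column can be turned into an explicit decision rule. The sketch below is a hypothetical helper, not part of the framework: `nextAction`, its return labels, and the plateau rule (stop when the score has not risen since the previous reflection and is already at least 7) are all assumptions chosen for illustration.

```php
<?php

/**
 * Decide what to do after a reflection, given every score so far.
 * The first score belongs to the initial draft, so refinements = count - 1.
 */
function nextAction(array $scores, int $threshold, int $maxRefinements, int $plateauFloor = 7): string
{
    if ($scores === []) {
        throw new InvalidArgumentException('At least one score is required.');
    }

    $count = count($scores);
    $latest = $scores[$count - 1];

    if ($latest >= $threshold) {
        return 'accept';
    }
    if ($count - 1 >= $maxRefinements) {
        return 'stop_max_refinements';
    }
    // Plateau: no gain since the last reflection and the score is already usable.
    if ($count >= 2 && $latest - $scores[$count - 2] < 1 && $latest >= $plateauFloor) {
        return 'stop_plateau';
    }

    return 'refine';
}

echo nextAction([6], 8, 3) . "\n";         // refine
echo nextAction([6, 7, 7], 8, 3) . "\n";   // stop_plateau
echo nextAction([6, 8], 8, 3) . "\n";      // accept
```

Keeping the rule in one small function makes it easy to tune the plateau floor per task type without touching the loop itself.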
Setting Appropriate Thresholds
High-Stakes Tasks (Threshold: 9)
```php
// Medical advice, legal content, financial analysis
$loop = new ReflectionLoop(
    maxRefinements: 5,
    qualityThreshold: 9,
    criteria: 'accuracy, completeness, and legal compliance'
);
```

Production Code (Threshold: 8)
```php
// Code that ships to users
$loop = new ReflectionLoop(
    maxRefinements: 3,
    qualityThreshold: 8,
    criteria: 'correctness, security, and maintainability'
);
```

Internal Documentation (Threshold: 7)
```php
// Internal docs, draft content
$loop = new ReflectionLoop(
    maxRefinements: 2,
    qualityThreshold: 7,
    criteria: 'clarity and completeness'
);
```

Rapid Prototyping (Threshold: 6)
```php
// Quick drafts, proof-of-concepts
$loop = new ReflectionLoop(
    maxRefinements: 1,
    qualityThreshold: 6,
    criteria: 'basic functionality'
);
```

Reflection Callbacks and Monitoring
Monitor reflection progress with callbacks to track quality improvements.
Available Callbacks
```php
$loop = new ReflectionLoop(maxRefinements: 3);

// 1. Iteration callback (fired for every LLM call)
$loop->onIteration(function (int $iteration, $response, $context) {
    echo "Iteration {$iteration}: " . ($response->stop_reason ?? 'unknown') . "\n";
});

// 2. Tool execution callback
$loop->onToolExecution(function (string $tool, array $input, $result) {
    echo "Tool '{$tool}' executed\n";
});

// 3. Reflection callback (fired after each reflection)
$loop->onReflection(function (int $refinement, int $score, string $feedback) {
    echo "Refinement {$refinement}: {$score}/10\n";
    echo "Issues: " . substr($feedback, 0, 150) . "\n\n";
});
```

Building a Reflection Monitor
```php
class ReflectionMonitor
{
    private array $refinements = [];
    private float $startTime;

    public function start(): void
    {
        $this->startTime = microtime(true);
        $this->refinements = [];
    }

    public function recordRefinement(int $num, int $score, string $feedback): void
    {
        $this->refinements[] = [
            'number' => $num,
            'score' => $score,
            'feedback' => $feedback,
            'timestamp' => microtime(true) - $this->startTime,
        ];
    }

    public function getReport(): array
    {
        $scores = array_column($this->refinements, 'score');

        return [
            'total_refinements' => count($this->refinements),
            'initial_score' => $scores[0] ?? 0,
            'final_score' => end($scores) ?: 0,
            'improvement' => (end($scores) ?: 0) - ($scores[0] ?? 0),
            'duration' => microtime(true) - $this->startTime,
            'refinements' => $this->refinements,
        ];
    }
}

// Usage
$monitor = new ReflectionMonitor();
$monitor->start();

$loop = new ReflectionLoop(maxRefinements: 3);
$loop->onReflection([$monitor, 'recordRefinement']);

$agent = Agent::create($client)->withLoopStrategy($loop);
$result = $agent->run($task);

$report = $monitor->getReport();
echo "Improved by {$report['improvement']} points over {$report['duration']}s\n";
```

Tool Validation with Reflection
One powerful use case: validating tool outputs for correctness.
Problem: Tools Can Return Bad Data
```php
// Search tool might return irrelevant results
// API tool might return stale data
// Calculator tool might have edge case bugs
```

Solution: Reflect on Tool Outputs
```php
<?php

use ClaudeAgents\Agent;
use ClaudeAgents\Loops\ReflectionLoop;
use ClaudeAgents\Tools\Tool;

// Create a search tool (simulated)
$searchTool = Tool::create('search')
    ->description('Search for information')
    ->parameter('query', 'string', 'Search query')
    ->required('query')
    ->handler(function (array $input): string {
        // Simulate potentially incomplete/incorrect search results
        return json_encode([
            'results' => [
                'PHP 8.4 was released in November 2024',
                'PHP 8.4 includes property hooks',
                // Intentionally incomplete
            ]
        ]);
    });

// Reflection loop with validation criteria
$loop = new ReflectionLoop(
    maxRefinements: 3,
    qualityThreshold: 8,
    criteria: 'completeness of information, factual accuracy, and coverage of all key features'
);

$loop->onReflection(function (int $ref, int $score, string $feedback) {
    echo "Validation pass {$ref}: {$score}/10\n";
    if ($score < 8) {
        echo "Issues found: " . substr($feedback, 0, 200) . "\n\n";
    }
});

$agent = Agent::create($client)
    ->withLoopStrategy($loop)
    ->withTool($searchTool)
    ->withSystemPrompt('You are a helpful assistant. Use tools when needed.')
    ->maxIterations(15);

$result = $agent->run('What are the major features in PHP 8.4?');

echo "Final answer (validated through reflection):\n";
echo $result->getAnswer() . "\n";
```

What happens:
- Agent calls search tool
- Gets incomplete results
- Generates initial answer
- Reflection identifies missing information
- Agent searches again or synthesizes better answer
- Process repeats until validation passes
Production Reflection System
Let’s build a complete production-grade reflection system with monitoring, logging, and cost controls.
Architecture
```
┌─────────────────────────────────────┐
│  ReflectionOrchestrator             │
│  - Quality settings per task type   │
│  - Cost tracking                    │
│  - Performance monitoring           │
└─────────────┬───────────────────────┘
              ↓
┌─────────────────────────────────────┐
│  ReflectionLoop                     │
│  - Generate → Reflect → Refine      │
└─────────────┬───────────────────────┘
              ↓
┌─────────────────────────────────────┐
│  Reflection Metrics Store           │
│  - Score history                    │
│  - Token usage                      │
│  - Performance data                 │
└─────────────────────────────────────┘
```

Implementation
See the complete implementation in production-reflection-system.php (listed under Code examples at the top of this chapter).
Key components:
```php
class ReflectionOrchestrator
{
    private array $qualityProfiles = [
        'critical' => [
            'maxRefinements' => 5,
            'qualityThreshold' => 9,
            'maxTokens' => 20000,
        ],
        'standard' => [
            'maxRefinements' => 3,
            'qualityThreshold' => 8,
            'maxTokens' => 10000,
        ],
        'draft' => [
            'maxRefinements' => 1,
            'qualityThreshold' => 6,
            'maxTokens' => 5000,
        ],
    ];

    // The API client is injected so executeWithProfile() can build agents.
    public function __construct(private ClaudePhp $client)
    {
    }

    public function executeWithProfile(
        string $task,
        string $profile = 'standard',
        ?string $criteria = null
    ): array {
        // Fail fast on an unknown profile name instead of reading a missing key.
        $config = $this->qualityProfiles[$profile]
            ?? throw new \InvalidArgumentException("Unknown profile: {$profile}");

        $loop = new ReflectionLoop(
            maxRefinements: $config['maxRefinements'],
            qualityThreshold: $config['qualityThreshold'],
            criteria: $criteria
        );

        // Add monitoring
        $metrics = [];
        $loop->onReflection(function ($ref, $score, $feedback) use (&$metrics) {
            $metrics[] = compact('ref', 'score', 'feedback');
        });

        $agent = Agent::create($this->client)
            ->withLoopStrategy($loop)
            ->maxIterations(20);

        $result = $agent->run($task);

        return [
            'result' => $result,
            'metrics' => $metrics,
            'profile' => $profile,
        ];
    }
}
```

Practical Example: Code Review Agent
Let’s build a code review agent that uses reflection to provide high-quality feedback.
```php
<?php

use ClaudeAgents\Agent;
use ClaudeAgents\Loops\ReflectionLoop;
use ClaudePhp\ClaudePhp;

require 'vendor/autoload.php';

$client = new ClaudePhp(apiKey: getenv('ANTHROPIC_API_KEY'));

// Code review criteria
$codeReviewCriteria = <<<CRITERIA
Evaluate this code review on:

1. Issue Detection (30%):
   - Are all bugs/issues identified?
   - Are security concerns noted?
   - Are performance problems flagged?

2. Suggestion Quality (30%):
   - Are suggestions specific and actionable?
   - Are code examples provided?
   - Are alternatives considered?

3. Completeness (25%):
   - Is every part of the code addressed?
   - Are edge cases considered?
   - Is testing feedback included?

4. Communication (15%):
   - Is feedback constructive?
   - Is tone professional?
   - Are priorities clear?

Provide a score (1-10) and specific improvements needed.
CRITERIA;

$loop = new ReflectionLoop(
    maxRefinements: 3,
    qualityThreshold: 8,
    criteria: $codeReviewCriteria
);

$loop->onReflection(function (int $ref, int $score, string $feedback) {
    echo "Review refinement {$ref}: Quality score {$score}/10\n";
    if ($score < 8) {
        echo "Improvements needed:\n";
        echo substr($feedback, 0, 300) . "\n\n";
    }
});

$agent = Agent::create($client)
    ->withLoopStrategy($loop)
    ->withSystemPrompt('You are an expert code reviewer. Provide thorough, constructive feedback.')
    ->maxIterations(15);

// Sample code to review
$codeToReview = <<<'PHP'
<?php

function processPayment($amount, $userId) {
    $pdo = new PDO('mysql:host=localhost;dbname=app', 'root', '');

    $stmt = $pdo->query("SELECT * FROM users WHERE id = $userId");
    $user = $stmt->fetch();

    if ($user['balance'] >= $amount) {
        $newBalance = $user['balance'] - $amount;
        $pdo->query("UPDATE users SET balance = $newBalance WHERE id = $userId");
        return true;
    }

    return false;
}
PHP;

$result = $agent->run(
    "Review this payment processing code for security, correctness, and best practices:\n\n" . $codeToReview
);

echo "\n" . str_repeat("=", 80) . "\n";
echo "FINAL CODE REVIEW:\n";
echo str_repeat("=", 80) . "\n";
echo $result->getAnswer() . "\n";

$metadata = $result->getMetadata();
echo "\nReview Quality: {$metadata['final_score']}/10\n";
echo "Refinements: " . count($metadata['reflections']) . "\n";
```

Expected improvements through reflection:
- Initial review might miss SQL injection
- Reflection identifies missing security analysis
- Refined review includes prepared statements
- Further refinement adds error handling notes
- Final review is comprehensive and actionable
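For reference, this is roughly what the "prepared statements" recommendation looks like when applied to the sample. It is one possible fix sketched under assumptions, not the output of the agent: the `users` table and its columns come from the sample, while passing the PDO connection in and folding the balance check into a single conditional UPDATE are design choices made here. Money is kept as a float only to mirror the sample; integer cents or a decimal type would be safer in real code.

```php
<?php

/**
 * Debit a user's balance using bound parameters.
 * The balance check and the debit happen in one UPDATE statement, so two
 * concurrent requests cannot both pass a stale balance check.
 */
function processPayment(PDO $pdo, float $amount, int $userId): bool
{
    if ($amount <= 0) {
        throw new InvalidArgumentException('Amount must be positive.');
    }

    // Distinct placeholder names, because some drivers reject a name used twice.
    $stmt = $pdo->prepare(
        'UPDATE users SET balance = balance - :amount
         WHERE id = :id AND balance >= :required'
    );
    $stmt->execute([':amount' => $amount, ':id' => $userId, ':required' => $amount]);

    // Exactly one row changes only if the user exists and had enough balance.
    return $stmt->rowCount() === 1;
}
```

The function no longer opens its own connection with hard-coded root credentials, which also addresses the configuration issue a thorough review should flag.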
Cost vs. Quality Trade-offs
Reflection loops can be expensive. Let’s optimize costs while maintaining quality.
Token Usage Pattern
Standard Task (Threshold: 8, Max Refinements: 3):
```
Generation:    ~1,000 tokens
Reflection 1:  ~800 tokens (Score: 6/10)
Refinement 1:  ~1,200 tokens
Reflection 2:  ~800 tokens (Score: 7/10)
Refinement 2:  ~1,200 tokens
Reflection 3:  ~800 tokens (Score: 8/10 ✓)

Total: ~5,800 tokens
```

Cost Comparison
| Loop Strategy | Tokens | Cost (Claude 3.5 Sonnet) | Latency |
|---|---|---|---|
| ReactLoop | ~2,000 | ~$0.006 | 2-3s |
| PlanExecuteLoop | ~3,500 | ~$0.011 | 4-6s |
| ReflectionLoop | ~5,800 | ~$0.017 | 8-12s |
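The figures in the table are consistent with a single flat rate of about $3 per million tokens, and the sketch below reproduces them under that assumption. The rate constant and the `estimateCostUsd` helper are illustrative. Output tokens are typically billed at a higher rate than input tokens, so treat the result as a rough lower bound and check current pricing before budgeting.

```php
<?php

// Token counts from the "Token Usage Pattern" example.
$phases = [
    'generation'   => 1000,
    'reflection_1' => 800,
    'refinement_1' => 1200,
    'reflection_2' => 800,
    'refinement_2' => 1200,
    'reflection_3' => 800,
];

// Assumption: one flat rate for all tokens (input and output priced alike).
const USD_PER_MILLION_TOKENS = 3.0;

function estimateCostUsd(int $tokens, float $usdPerMillion = USD_PER_MILLION_TOKENS): float
{
    return $tokens * $usdPerMillion / 1_000_000;
}

$totalTokens = array_sum($phases);
printf('%d tokens, about $%.4f' . PHP_EOL, $totalTokens, estimateCostUsd($totalTokens));
// 5800 tokens, about $0.0174
```

At this rate the reflection run costs roughly 2.9 times the ReactLoop figure (5,800 versus 2,000 tokens), which matches the "2-3x" rule of thumb used later in the chapter.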
Optimization Strategies
1. Profile-Based Quality
```php
class AdaptiveReflection
{
    public function getConfigForTask(string $taskType): array
    {
        return match($taskType) {
            'user_facing_content' => [
                'maxRefinements' => 3,
                'qualityThreshold' => 8,
            ],
            'internal_doc' => [
                'maxRefinements' => 2,
                'qualityThreshold' => 7,
            ],
            'draft' => [
                'maxRefinements' => 1,
                'qualityThreshold' => 6,
            ],
            default => [
                'maxRefinements' => 2,
                'qualityThreshold' => 7,
            ],
        };
    }
}
```

2. Early Stopping
```php
// Stop early if improvement is minimal
$previousScore = 0;
$loop->onReflection(function ($ref, $score, $feedback) use (&$previousScore) {
    if ($score - $previousScore < 1 && $score >= 7) {
        // Improvement plateaued, consider stopping
        echo "Minimal improvement detected, score acceptable\n";
    }
    $previousScore = $score;
});
```

3. Selective Reflection
```php
function shouldUseReflection(string $task): bool
{
    // Only use reflection for quality-critical tasks
    $qualityCritical = [
        'code_generation',
        'user_communication',
        'data_analysis',
        'decision_making',
    ];

    foreach ($qualityCritical as $pattern) {
        if (str_contains(strtolower($task), $pattern)) {
            return true;
        }
    }

    return false;
}

$loop = shouldUseReflection($task)
    ? new ReflectionLoop(maxRefinements: 3)
    : new ReactLoop();
```

4. Caching Reflections
```php
class ReflectionCache
{
    // $redis is any client exposing get() and setex(), for example phpredis.
    public function __construct(private $redis)
    {
    }

    public function getCachedScore(string $outputHash): ?int
    {
        // If we've scored this exact output before, reuse the quality score.
        // A cache miss comes back as false or null depending on the client.
        $cached = $this->redis->get("reflection:score:{$outputHash}");

        return is_numeric($cached) ? (int) $cached : null;
    }

    public function cacheScore(string $outputHash, int $score): void
    {
        // Keep cached scores for one hour
        $this->redis->setex("reflection:score:{$outputHash}", 3600, $score);
    }
}
```

Best Practices
✅ DO
1. Use for quality-critical tasks
   - Content creation, code generation, analysis
   - When output quality directly impacts users
2. Set realistic thresholds
   - Don’t demand 10/10 (rarely achievable)
   - 8/10 is excellent for most tasks
   - 7/10 is good for internal use
3. Customize criteria
   - Domain-specific evaluation dimensions
   - Measurable, specific criteria
   - Prioritize what matters most
4. Monitor and adjust
   - Track quality improvements
   - Measure cost vs. benefit
   - Tune parameters based on data
5. Use callbacks for visibility
   - Log reflection scores
   - Track token usage
   - Alert on quality issues
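As an illustration of the last point, the sketch below builds a callback that records each score and flags low scores or regressions. Its `(int, int, string)` parameters mirror the `onReflection` examples earlier in this chapter. The `makeReflectionObserver` factory, the alert labels, and the in-memory `$log` array are hypothetical choices for this example, and a real system would send these entries to a logger or metrics store.

```php
<?php

/**
 * Build a reflection callback that logs scores and flags problems.
 * Alerts: 'low_score' when the score is below $alertBelow,
 * 'regression' when the score drops compared to the previous reflection.
 */
function makeReflectionObserver(array &$log, int $alertBelow = 5): callable
{
    $previous = null;

    return function (int $refinement, int $score, string $feedback) use (&$log, &$previous, $alertBelow): void {
        $entry = [
            'refinement' => $refinement,
            'score' => $score,
            'feedback' => substr($feedback, 0, 200),   // keep log entries short
            'alert' => null,
        ];

        if ($score < $alertBelow) {
            $entry['alert'] = 'low_score';
        } elseif ($previous !== null && $score < $previous) {
            $entry['alert'] = 'regression';
        }

        $log[] = $entry;
        $previous = $score;
    };
}

// Simulated reflections: a weak start, an improvement, then a drop.
$log = [];
$observer = makeReflectionObserver($log);
$observer(1, 4, 'Missing examples.');
$observer(2, 7, 'Much clearer.');
$observer(3, 6, 'New section is confusing.');

foreach ($log as $entry) {
    echo "#{$entry['refinement']}: {$entry['score']}/10 " . ($entry['alert'] ?? 'ok') . "\n";
}
```

Since the returned value is an ordinary callable, it can be passed to `$loop->onReflection(...)` in the same way as the closures shown earlier.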
❌ DON’T
1. Use for simple tasks
   - Lookups, calculations, formatting
   - Tasks with external validation
   - Real-time interactions
2. Set unrealistic thresholds
   - Threshold 10: Almost never reached, so every run burns all its refinements
   - Too low a threshold: Stops early and accepts mediocre output
   - Match threshold to task importance
3. Use generic criteria
   - “Quality” alone is too vague
   - Specify what quality means
   - Make criteria actionable
4. Ignore costs
   - Reflection is 2-3x more expensive
   - Monitor token usage
   - Use selectively
5. Forget about latency
   - Reflection adds 5-10s per refinement
   - Not suitable for real-time apps
   - Consider async processing
Common Patterns
Section titled “Common Patterns”Pattern 1: Two-Stage Reflection
Generate multiple options, then reflect to choose the best:
```php
// Stage 1: Generate 3 options (no reflection)
$options = [];
for ($i = 0; $i < 3; $i++) {
    $result = $agentNoReflection->run($task);
    $options[] = $result->getAnswer();
}

// Stage 2: Use reflection to pick best
$loop = new ReflectionLoop(maxRefinements: 2, qualityThreshold: 8);
$evaluator = Agent::create($client)->withLoopStrategy($loop);

$result = $evaluator->run(
    "Choose the best option and refine it:\n\n" . implode("\n\n---\n\n", $options)
);
```

Pattern 2: Conditional Reflection
Only reflect if initial quality is below threshold:
```php
// Quick quality check
$result = $quickAgent->run($task);
$qualityScore = $this->quickQualityCheck($result->getAnswer());

if ($qualityScore < 7) {
    // Quality insufficient, use reflection
    $loop = new ReflectionLoop(maxRefinements: 2);
    $result = Agent::create($client)
        ->withLoopStrategy($loop)
        ->run($task);
}
```

Pattern 3: Reflection with External Validation
Combine AI reflection with programmatic checks:
```php
$loop = new ReflectionLoop(maxRefinements: 3);
$loop->onReflection(function ($ref, $score, $feedback) use ($task) {
    // Also run external validation
    if ($task->type === 'code') {
        $syntaxValid = $this->validatePHPSyntax($task->output);
        if (!$syntaxValid) {
            // Note: $score and $feedback are passed by value, so these changes
            // stay local to the callback unless you record or act on them here.
            $feedback .= "\n\nSYNTAX ERROR: Code contains syntax errors.";
            $score = min($score, 4); // Cap score if syntax invalid
        }
    }
});
```

Debugging Reflection Loops
Common Issues
Issue: Agent never reaches threshold
```php
// Check if criteria are too strict
$loop = new ReflectionLoop(
    maxRefinements: 5,
    qualityThreshold: 9,                 // Try lowering to 8
    criteria: 'perfection in every way'  // Too vague/strict
);
```

Issue: Scores don’t improve
```php
// Add detailed monitoring
$loop->onReflection(function ($ref, $score, $feedback) {
    echo "Refinement {$ref}:\n";
    echo "Score: {$score}/10\n";
    echo "Full feedback:\n{$feedback}\n\n";
    // Look for whether agent understands criteria
});
```

Issue: Too expensive
```php
// Reduce refinements and lower the threshold
$loop = new ReflectionLoop(
    maxRefinements: 1,      // Only one refinement pass
    qualityThreshold: 7,    // Lower bar
);
```

Key Takeaways
1. Reflection = Generate + Reflect + Refine
   - Three-phase cycle for quality improvement
   - Iterative refinement until threshold met
2. Use for quality-critical tasks
   - Content, code, analysis, decisions
   - When quality > speed matters
3. Configure appropriately
   - Max refinements: 1-5 depending on budget
   - Quality threshold: 7-9 depending on stakes
   - Custom criteria: Domain-specific evaluation
4. Monitor costs and quality
   - Reflection is 2-3x more expensive
   - Track score improvements
   - Optimize based on data
5. Combine with other strategies
   - Use selectively (not for all tasks)
   - Combine with external validation
   - Consider two-stage approaches
What’s Next?
In Chapter 11: Multi-Stage Workflows and Agent Graphs, we’ll move beyond single-agent loops to orchestrate multiple agents in complex workflows. You’ll learn to:
- Build DAG-style agent graphs
- Coordinate sequential and parallel execution
- Manage state transitions between stages
- Combine planning, reflection, and specialized agents
After mastering individual agent patterns (React, Plan, Reflect), you’ll be ready to compose them into powerful multi-agent systems.
Additional Resources
- claude-php/claude-php-agent
- ReflectionLoop Documentation
- Reflection Loop Source Code
- Loop Strategies Demo