Chapter 08: Retrieval-Augmented Generation (RAG) for Agents
Overview
In Chapter 07, you built long-term memory systems that let agents store and recall information. But what if you need your agent to answer questions from thousands of documents, technical documentation, or an entire knowledge base? What if the information changes frequently and can’t be memorized?
This is where Retrieval-Augmented Generation (RAG) comes in: the ability to retrieve relevant information from external sources and use it to generate accurate, grounded, cited responses. RAG is the cornerstone of modern AI applications—from customer support bots to research assistants to documentation Q&A systems.
The claude-php/claude-php-agent framework provides a complete RAG system with RAGPipeline, RAGAgent, document loaders, chunking strategies, vector stores, and retrieval algorithms. In this chapter, you’ll learn to build production-grade RAG systems that reduce hallucinations, cite sources, and scale to massive knowledge bases.
In this chapter you’ll:
- Understand RAG architecture and how it differs from pure LLM generation
- Implement document chunking with overlap for better retrieval
- Build semantic search with embeddings and vector stores
- Create citation-style responses that reference source material
- Apply query transformation techniques (multi-query, HyDE, decomposition)
- Use reranking to improve retrieval relevance
- Deploy production RAG pipelines with the framework’s components
Estimated time: ~120 minutes
::: info Framework Version
This chapter is based on claude-php/claude-php-agent v0.5+. We’ll use the framework’s RAG namespace and components extensively.
:::
::: info Code examples
Complete, runnable examples for this chapter:

- basic-rag-pipeline.php: Simple RAG pipeline with keyword retrieval
- document-chunking-strategies.php: Different chunking approaches
- semantic-vector-search.php: Embedding-based semantic retrieval
- citation-generation.php: Citation-style response generation
- query-transformation.php: Query transformation techniques
- reranking-results.php: Reranking for better relevance
- document-loaders.php: Loading documents from various sources
- production-rag-system.php: Complete production RAG system

All files are in code/agentic-ai-php-developers/08-retrieval-augmented-generation/.
:::
Understanding Retrieval-Augmented Generation
Before diving into implementation, let’s understand what RAG is and why it’s essential.
The Hallucination Problem
Pure LLM Generation (without RAG):

```
User: What's the refund policy for ProductX?
Agent: ❌ ProductX offers a 60-day money-back guarantee.
       (Made up; the actual policy is 30 days)
```

With RAG:

```
User: What's the refund policy for ProductX?
Agent: ✅ ProductX offers a 30-day money-back guarantee. [Source: refund-policy.pdf]
       (Retrieved from actual policy document)
```

RAG grounds responses in retrieved source text, which reduces hallucinations because the model answers from the documents you supply rather than from what it happens to remember.
What is RAG?
RAG = Retrieval + Augmentation + Generation
- Retrieval: Find relevant documents/chunks from a knowledge base
- Augmentation: Add retrieved context to the LLM prompt
- Generation: Generate answer based on provided context
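To make the three steps concrete, here is a minimal sketch in plain PHP that does not use the framework. The function names (`keywordScore`, `retrieve`, `buildPrompt`) and the word-overlap scoring are assumptions made for illustration; the generation step is left out because it is simply a call to the model with the prompt built here.

```php
<?php

declare(strict_types=1);

/**
 * Toy relevance score: how many distinct query words appear in the text.
 */
function keywordScore(string $query, string $text): int
{
    $queryWords = array_unique(str_word_count(strtolower($query), 1));
    $textWords = array_flip(str_word_count(strtolower($text), 1));

    return count(array_filter($queryWords, fn(string $w): bool => isset($textWords[$w])));
}

/**
 * Retrieval: keep the $topK best-scoring chunks and drop chunks with no match.
 */
function retrieve(string $query, array $chunks, int $topK): array
{
    $scored = [];
    foreach ($chunks as $chunk) {
        $score = keywordScore($query, $chunk['text']);
        if ($score > 0) {
            $scored[] = $chunk + ['score' => $score];
        }
    }
    usort($scored, fn(array $a, array $b): int => $b['score'] <=> $a['score']);

    return array_slice($scored, 0, $topK);
}

/**
 * Augmentation: place the retrieved chunks in the prompt as numbered sources.
 */
function buildPrompt(string $query, array $retrieved): string
{
    $context = '';
    foreach ($retrieved as $i => $chunk) {
        $context .= "[Source {$i}] {$chunk['title']}: {$chunk['text']}\n";
    }

    return "Answer using only the sources below and cite them as [Source N].\n\n"
        . $context
        . "\nQuestion: {$query}";
}

$chunks = [
    ['title' => 'Refund Policy', 'text' => 'ProductX offers a 30-day money-back guarantee.'],
    ['title' => 'Shipping', 'text' => 'ProductX ships within 1-2 business days.'],
];

$question = 'What guarantee does ProductX offer?';
$retrieved = retrieve($question, $chunks, 1);
$prompt = buildPrompt($question, $retrieved);

// Generation would send $prompt to the model; here we only print it
echo $prompt . "\n";
```

Real pipelines replace the word-overlap score with a stronger retriever, but the shape stays the same: score, select, then build the prompt around the selected text.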
RAG vs Long-Term Memory
| Feature | Long-Term Memory (Chapter 07) | RAG (This Chapter) |
|---|---|---|
| Data source | Extracted facts from conversations | External documents/knowledge bases |
| Scale | 100s to 10,000s of memories | 100,000s to millions of documents |
| Update frequency | Continuous (during conversations) | Batch (periodic re-indexing) |
| Use case | Personal context, user preferences | Documentation, support, research |
| Retrieval | Semantic + recency + importance | Semantic similarity to query |
RAG Architecture
```
┌─────────────────────────────────────────────────────────────────┐
│                          RAG PIPELINE                           │
├─────────────────────────────────────────────────────────────────┤
│                                                                 │
│  ┌──────────────┐     ┌──────────────┐     ┌──────────────┐     │
│  │   INDEXING   │     │  RETRIEVAL   │     │  GENERATION  │     │
│  │  (Offline)   │     │   (Online)   │     │   (Online)   │     │
│  └──────────────┘     └──────────────┘     └──────────────┘     │
│         │                    │                    │             │
│         ▼                    ▼                    ▼             │
│  ┌──────────────┐     ┌──────────────┐     ┌──────────────┐     │
│  │ 1. Load Docs │     │ 1. Query     │     │ 1. Context   │     │
│  │ 2. Chunk     │────▶│ 2. Embed     │────▶│ 2. Prompt    │     │
│  │ 3. Embed     │     │ 3. Search    │     │ 3. Generate  │     │
│  │ 4. Index     │     │ 4. Rank      │     │ 4. Cite      │     │
│  └──────────────┘     └──────────────┘     └──────────────┘     │
│                                                                 │
└─────────────────────────────────────────────────────────────────┘
```

Basic RAG Pipeline
Let’s start with a simple RAG pipeline using the framework’s RAGPipeline.
Simple Document Q&A
```php
<?php

declare(strict_types=1);

require_once __DIR__ . '/../../vendor/autoload.php';

use ClaudeAgents\RAG\RAGPipeline;
use ClaudePhp\ClaudePhp;

// Setup
$client = new ClaudePhp(apiKey: getenv('ANTHROPIC_API_KEY'));

// Create RAG pipeline
$rag = RAGPipeline::create($client);

// Add documents to the knowledge base
echo "=== Indexing Documents ===\n\n";

$rag->addDocument(
    title: 'Product Refund Policy',
    content: 'ProductX offers a 30-day money-back guarantee. Returns must be in original condition. Refunds are processed within 5-7 business days. Shipping costs are non-refundable unless the item is defective.',
    metadata: ['category' => 'policy', 'type' => 'refund']
);

$rag->addDocument(
    title: 'Product Warranty',
    content: 'ProductX includes a 1-year manufacturer warranty covering defects in materials and workmanship. Extended warranty available for purchase. Warranty does not cover accidental damage or normal wear and tear.',
    metadata: ['category' => 'policy', 'type' => 'warranty']
);

$rag->addDocument(
    title: 'Shipping Information',
    content: 'ProductX ships within 1-2 business days. Free shipping on orders over $50. Express shipping available for $15. International shipping available to select countries.',
    metadata: ['category' => 'logistics', 'type' => 'shipping']
);

echo "Indexed {$rag->getDocumentCount()} documents with {$rag->getChunkCount()} chunks\n\n";

// Query the knowledge base
echo "=== Querying Knowledge Base ===\n\n";

$questions = [
    'What is the refund policy?',
    'Does ProductX have a warranty?',
    'How long does shipping take?',
];

foreach ($questions as $question) {
    echo "Q: {$question}\n";

    $result = $rag->query($question, topK: 2);

    echo "A: {$result['answer']}\n";
    echo "\nSources:\n";

    foreach ($result['sources'] as $source) {
        echo "  - {$source['source']}\n";
    }

    if (!empty($result['citations'])) {
        echo "Citations: " . implode(', ', array_map(fn($i) => "[Source {$i}]", $result['citations'])) . "\n";
    }

    echo "\n" . str_repeat('-', 60) . "\n\n";
}
```

Output:
```
=== Indexing Documents ===

Indexed 3 documents with 3 chunks

=== Querying Knowledge Base ===

Q: What is the refund policy?
A: ProductX offers a 30-day money-back guarantee [Source 0]. Returns must be in original condition, and refunds are processed within 5-7 business days [Source 0]. Shipping costs are non-refundable unless the item is defective [Source 0].

Sources:
  - Product Refund Policy
Citations: [Source 0]

------------------------------------------------------------
```

How It Works
- Indexing: Documents are chunked and stored in the pipeline
- Retrieval: Query finds top K most relevant chunks
- Generation: Claude generates answer using retrieved chunks as context
- Citation: Answer includes source references
Document Chunking Strategies
Chunking is critical for RAG: chunks that are too large waste context, and chunks that are too small lose meaning.
The Chunking Challenge
Too large:
- Wastes token budget
- Includes irrelevant information
- Reduces retrieval precision
Too small:
- Loses semantic context
- Fragments information
- Requires more chunks for same question
Overlap:
- Prevents information loss at boundaries
- Improves retrieval recall
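The trade-off is easier to see with a small sliding-window chunker. The sketch below is plain PHP and is not the framework's `Chunker`; `chunkWithOverlap` is a hypothetical helper that counts words rather than tokens to keep the example short.

```php
<?php

declare(strict_types=1);

/**
 * Split text into chunks of at most $chunkSize words. Each chunk starts
 * with the last $overlap words of the previous chunk.
 *
 * @return string[]
 */
function chunkWithOverlap(string $text, int $chunkSize, int $overlap): array
{
    if ($chunkSize <= 0 || $overlap < 0 || $overlap >= $chunkSize) {
        throw new InvalidArgumentException('Require 0 <= overlap < chunkSize');
    }

    $words = preg_split('/\s+/', trim($text), -1, PREG_SPLIT_NO_EMPTY);
    $step = $chunkSize - $overlap; // how far the window advances each time
    $chunks = [];

    for ($start = 0; $start < count($words); $start += $step) {
        $chunks[] = implode(' ', array_slice($words, $start, $chunkSize));

        if ($start + $chunkSize >= count($words)) {
            break; // this window already reached the end of the text
        }
    }

    return $chunks;
}

$text = 'one two three four five six seven eight nine ten';
$chunks = chunkWithOverlap($text, chunkSize: 4, overlap: 1);

foreach ($chunks as $i => $chunk) {
    echo "Chunk {$i}: {$chunk}\n";
}
// Chunk 0: one two three four
// Chunk 1: four five six seven
// Chunk 2: seven eight nine ten
```

The words "four" and "seven" appear in two chunks each. That duplication is the cost of overlap, and it is what keeps a sentence that straddles a boundary retrievable from at least one chunk.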
Framework Chunking Strategies
The framework provides several chunking strategies:
```php
<?php

declare(strict_types=1);

require_once __DIR__ . '/../../vendor/autoload.php';

use ClaudeAgents\RAG\RAGPipeline;
use ClaudeAgents\RAG\Chunker;
use ClaudeAgents\RAG\Splitters\RecursiveCharacterTextSplitter;
use ClaudeAgents\RAG\Splitters\TokenTextSplitter;
use ClaudeAgents\RAG\Splitters\MarkdownTextSplitter;
use ClaudeAgents\RAG\Splitters\CodeTextSplitter;
use ClaudePhp\ClaudePhp;

$client = new ClaudePhp(apiKey: getenv('ANTHROPIC_API_KEY'));

// 1. Basic Chunker (sentence-based)
echo "=== Basic Sentence Chunker ===\n";

$basicChunker = new Chunker(
    chunkSize: 200, // 200 words per chunk
    overlap: 20     // 20 word overlap
);

$rag1 = RAGPipeline::create($client)->withChunker($basicChunker);

$text = "This is a long document. " . str_repeat("It has many sentences. ", 50);
$chunks = $basicChunker->chunk($text);

echo "Created " . count($chunks) . " chunks\n";
echo "First chunk: " . substr($chunks[0], 0, 100) . "...\n\n";

// 2. Recursive Character Text Splitter (best for general text)
echo "=== Recursive Character Splitter ===\n";

$recursiveSplitter = new RecursiveCharacterTextSplitter(
    chunkSize: 1000,   // 1000 characters
    chunkOverlap: 200, // 200 character overlap
    separators: ["\n\n", "\n", ". ", " ", ""]
);

$chunks2 = $recursiveSplitter->split($text);
echo "Created " . count($chunks2) . " chunks with recursive splitting\n\n";

// 3. Token Text Splitter (respects model token limits)
echo "=== Token Text Splitter ===\n";

$tokenSplitter = new TokenTextSplitter(
    chunkSize: 512,  // 512 tokens
    chunkOverlap: 50 // 50 token overlap
);

$chunks3 = $tokenSplitter->split($text);
echo "Created " . count($chunks3) . " chunks with token splitting\n";
echo "Each chunk respects 512 token limit\n\n";

// 4. Markdown Text Splitter (preserves structure)
echo "=== Markdown Text Splitter ===\n";

$markdownText = <<<'MD'
# Main Title

## Section 1

This is section 1 content.

### Subsection 1.1

More detailed content here.

## Section 2

This is section 2 content.
MD;

$markdownSplitter = new MarkdownTextSplitter(
    chunkSize: 500,
    chunkOverlap: 50
);

$chunks4 = $markdownSplitter->split($markdownText);
echo "Created " . count($chunks4) . " chunks preserving markdown structure\n";
foreach ($chunks4 as $i => $chunk) {
    $firstLine = explode("\n", trim($chunk))[0];
    echo "  Chunk {$i}: {$firstLine}\n";
}
echo "\n";

// 5. Code Text Splitter (respects code structure)
echo "=== Code Text Splitter ===\n";

$phpCode = <<<'PHP'
<?php

class UserService
{
    public function createUser(string $name): User
    {
        // Validate name
        if (empty($name)) {
            throw new InvalidArgumentException('Name required');
        }

        // Create user
        $user = new User($name);
        $this->repository->save($user);

        return $user;
    }

    public function deleteUser(int $id): void
    {
        $user = $this->repository->find($id);
        $this->repository->delete($user);
    }
}
PHP;

$codeSplitter = new CodeTextSplitter(
    language: 'php',
    chunkSize: 300,
    chunkOverlap: 30
);

$chunks5 = $codeSplitter->split($phpCode);
echo "Created " . count($chunks5) . " chunks preserving PHP code structure\n";
echo "Each chunk maintains valid code boundaries\n\n";

// Best practices
echo "=== Best Practices ===\n\n";

echo "1. **General text:** Use RecursiveCharacterTextSplitter (respects paragraphs)\n";
echo "2. **Documentation:** Use MarkdownTextSplitter (preserves headers/structure)\n";
echo "3. **Code:** Use CodeTextSplitter (respects function/class boundaries)\n";
echo "4. **Token-constrained:** Use TokenTextSplitter (respects model limits)\n";
echo "5. **Simple needs:** Use basic Chunker (fast, sentence-based)\n";
```

Choosing Chunk Size
General guidelines:
- Small chunks (200-500 tokens): Precise retrieval, good for Q&A
- Medium chunks (500-1000 tokens): Balance precision/context
- Large chunks (1000-2000 tokens): Retain more context, good for summaries
- Overlap (10-20%): Prevents information loss at boundaries
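Overlap has a storage and embedding cost, and a quick estimate helps when picking values from the ranges above. The sketch below assumes a fixed-size sliding window, so the chunk count for a document of `N` tokens is `ceil((N - overlap) / (size - overlap))`. The helper `estimateChunkCount` is hypothetical, and splitters that respect sentence or paragraph boundaries will produce somewhat different counts.

```php
<?php

declare(strict_types=1);

/**
 * Estimate how many fixed-size chunks a document produces when the
 * window advances by ($chunkSize - $overlap) tokens each step.
 */
function estimateChunkCount(int $totalTokens, int $chunkSize, int $overlap): int
{
    if ($chunkSize <= 0 || $overlap < 0 || $overlap >= $chunkSize) {
        throw new InvalidArgumentException('Require 0 <= overlap < chunkSize');
    }
    if ($totalTokens <= 0) {
        return 0;
    }

    return max(1, (int) ceil(($totalTokens - $overlap) / ($chunkSize - $overlap)));
}

// A 10,000-token document split into 500-token chunks
$withTenPercent = estimateChunkCount(10000, 500, 50);     // 10% overlap
$withTwentyPercent = estimateChunkCount(10000, 500, 100); // 20% overlap

echo "10% overlap: {$withTenPercent} chunks\n"; // 23
echo "20% overlap: {$withTwentyPercent} chunks\n"; // 25
```

Without overlap the same document needs 20 chunks, so in this example 10% overlap adds three chunks and 20% adds five.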
Semantic Search with Vector Stores
Keyword matching only finds documents that share words with the query. Semantic search compares embeddings instead, so it matches on meaning: a question about "lists" can surface a document about "arrays".
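At its core, a vector search embeds the query, compares it with every stored vector, and keeps the closest ones. The sketch below shows that comparison with cosine similarity in plain PHP. The three-dimensional vectors are hand-made stand-ins for real embeddings, and `cosineSimilarity` and `searchVectors` are hypothetical helpers, not framework classes.

```php
<?php

declare(strict_types=1);

/**
 * Cosine similarity between two vectors of equal length.
 */
function cosineSimilarity(array $a, array $b): float
{
    $dot = 0.0;
    $normA = 0.0;
    $normB = 0.0;

    foreach ($a as $i => $value) {
        $dot += $value * $b[$i];
        $normA += $value * $value;
        $normB += $b[$i] * $b[$i];
    }

    if ($normA == 0.0 || $normB == 0.0) {
        return 0.0; // a zero vector has no direction to compare
    }

    return $dot / (sqrt($normA) * sqrt($normB));
}

/**
 * Return the ids of the $topK stored vectors most similar to the query.
 *
 * @param array<string, float[]> $store
 * @return string[]
 */
function searchVectors(array $queryVector, array $store, int $topK): array
{
    $scores = [];
    foreach ($store as $id => $vector) {
        $scores[$id] = cosineSimilarity($queryVector, $vector);
    }
    arsort($scores); // highest similarity first, keys preserved

    return array_slice(array_keys($scores), 0, $topK);
}

// Hand-made vectors for illustration; real ones come from an embedding model
$store = [
    'arrays' => [0.9, 0.1, 0.0],
    'types' => [0.1, 0.9, 0.1],
    'functions' => [0.0, 0.2, 0.9],
];

// Stands in for the embedding of a query such as "lists in PHP"
$queryVector = [0.8, 0.2, 0.1];

$top = searchVectors($queryVector, $store, 2);
echo implode(', ', $top) . "\n"; // arrays, types
```

This brute-force scan is fine for a few thousand vectors. Larger collections usually need an approximate nearest-neighbor index, which is what dedicated vector stores provide.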
Vector Store Implementation
```php
<?php

declare(strict_types=1);

require_once __DIR__ . '/../../vendor/autoload.php';

use ClaudeAgents\RAG\RAGPipeline;
use ClaudeAgents\RAG\SemanticRetriever;
use ClaudeAgents\RAG\VectorStore\InMemoryVectorStore;
use ClaudePhp\ClaudePhp;

// Setup
$client = new ClaudePhp(apiKey: getenv('ANTHROPIC_API_KEY'));

// Mock embedding function (in production, use Voyage AI, OpenAI, or Cohere)
$embeddingFunction = function (string $text): array {
    // Simple mock: hash-based deterministic embedding.
    // It lets the example run without an API key but carries no meaning,
    // so retrieval results with this mock are effectively arbitrary.
    $hash = hash('sha256', $text);
    $embedding = [];

    for ($i = 0; $i < 384; $i++) {
        $hexPair = substr($hash, ($i * 2) % 64, 2);
        $value = hexdec($hexPair) / 255.0;
        $embedding[] = ($value - 0.5) * 2.0;
    }

    // Normalize to unit vector
    $norm = sqrt(array_sum(array_map(fn($v) => $v * $v, $embedding)));
    return array_map(fn($v) => $v / $norm, $embedding);
};

// Create semantic retriever
$semanticRetriever = new SemanticRetriever($embeddingFunction);

// Create RAG pipeline with semantic retrieval
$rag = RAGPipeline::create($client)
    ->withRetriever($semanticRetriever);

echo "=== Semantic RAG Pipeline ===\n\n";

// Add technical documentation
$rag->addDocument(
    title: 'PHP Arrays',
    content: 'Arrays in PHP are ordered maps that can hold multiple values. They support numeric and associative keys. Use array functions like array_map, array_filter, and array_reduce for manipulation.',
    metadata: ['topic' => 'arrays', 'language' => 'php']
);

$rag->addDocument(
    title: 'PHP Type System',
    content: 'PHP 8.4 includes property hooks, asymmetric visibility, and enhanced type system features. Scalar types include int, float, string, and bool. Use union types and intersection types for complex type definitions.',
    metadata: ['topic' => 'types', 'language' => 'php']
);

$rag->addDocument(
    title: 'PHP Functions',
    content: 'Functions in PHP support type hints, return types, and named arguments. Arrow functions provide concise syntax. First-class callables allow treating functions as values.',
    metadata: ['topic' => 'functions', 'language' => 'php']
);

// Semantic queries (meaning-based, not keyword-based).
// The expected matches below assume a real embedding model, not the mock.
$queries = [
    'How do I work with lists in PHP?', // Should match "Arrays"
    'Tell me about PHP data types',     // Should match "Type System"
    'What are PHP methods?',            // Should match "Functions"
];

foreach ($queries as $query) {
    echo "Query: {$query}\n";

    $result = $rag->query($query, topK: 1);

    echo "Answer: {$result['answer']}\n";
    echo "Top Source: {$result['sources'][0]['source']}\n";
    echo "\n" . str_repeat('-', 60) . "\n\n";
}

// Demonstrate semantic similarity
echo "=== Semantic Similarity ===\n\n";

$pairs = [
    ['arrays', 'lists'],
    ['functions', 'methods'],
    ['variables', 'constants'],
];

foreach ($pairs as [$term1, $term2]) {
    $emb1 = $embeddingFunction($term1);
    $emb2 = $embeddingFunction($term2);

    // Cosine similarity: the vectors are unit length, so the dot product is enough
    $similarity = 0.0;
    for ($i = 0; $i < count($emb1); $i++) {
        $similarity += $emb1[$i] * $emb2[$i];
    }

    echo "'{$term1}' vs '{$term2}': " . round($similarity, 3) . "\n";
}

echo "\nWith real embeddings, semantic search finds meaning, not just keywords!\n";
```

Production Embedding Services
For production RAG systems, use professional embedding APIs:
1. Voyage AI (recommended for code/technical content)
```php
// Voyage AI embeddings
class VoyageEmbeddings
{
    private string $apiKey;

    public function __construct(string $apiKey)
    {
        $this->apiKey = $apiKey;
    }

    public function embed(string $text): array
    {
        $response = $this->httpPost('https://api.voyageai.com/v1/embeddings', [
            'input' => $text,
            'model' => 'voyage-code-2', // Code-oriented model; check Voyage's docs for current options
        ]);

        return $response['data'][0]['embedding'];
    }

    private function httpPost(string $url, array $data): array
    {
        $ch = curl_init($url);
        curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
        curl_setopt($ch, CURLOPT_POST, true);
        curl_setopt($ch, CURLOPT_POSTFIELDS, json_encode($data));
        curl_setopt($ch, CURLOPT_HTTPHEADER, [
            'Authorization: Bearer ' . $this->apiKey,
            'Content-Type: application/json',
        ]);

        $result = curl_exec($ch);
        $error = curl_error($ch);
        curl_close($ch);

        // curl_exec() returns false when the request itself fails
        if ($result === false) {
            throw new RuntimeException("Embedding request failed: {$error}");
        }

        return json_decode($result, true, 512, JSON_THROW_ON_ERROR);
    }
}
```

2. OpenAI Embeddings
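```php
// OpenAI embeddings
class OpenAIEmbeddings
{
    private string $apiKey;

    public function __construct(string $apiKey)
    {
        $this->apiKey = $apiKey;
    }

    public function embed(string $text): array
    {
        $response = $this->httpPost('https://api.openai.com/v1/embeddings', [
            'input' => $text,
            'model' => 'text-embedding-3-small', // Cheap and fast
        ]);

        return $response['data'][0]['embedding'];
    }

    // Same cURL helper as in VoyageEmbeddings, repeated so the class runs on its own
    private function httpPost(string $url, array $data): array
    {
        $ch = curl_init($url);
        curl_setopt_array($ch, [
            CURLOPT_RETURNTRANSFER => true,
            CURLOPT_POST => true,
            CURLOPT_POSTFIELDS => json_encode($data),
            CURLOPT_HTTPHEADER => [
                'Authorization: Bearer ' . $this->apiKey,
                'Content-Type: application/json',
            ],
        ]);
        $result = curl_exec($ch);
        curl_close($ch);

        if ($result === false) {
            throw new RuntimeException('Embedding request failed');
        }

        return json_decode($result, true, 512, JSON_THROW_ON_ERROR);
    }
}
```

Citation Generation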
Citations ground responses in sources and let users verify information.
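Verification only works if every citation points at a source that was actually retrieved. The following sketch shows one way to check that in plain PHP, assuming answers use the zero-based `[Source N]` markers that appear in this chapter's sample output. The helpers `extractCitations` and `findInvalidCitations` are hypothetical and are not part of the framework.

```php
<?php

declare(strict_types=1);

/**
 * Collect the distinct source indices cited as "[Source N]" in an answer.
 *
 * @return int[]
 */
function extractCitations(string $answer): array
{
    preg_match_all('/\[Source (\d+)\]/', $answer, $matches);

    $indices = array_unique(array_map('intval', $matches[1]));
    sort($indices); // sort() also reindexes the array from 0

    return $indices;
}

/**
 * Return the citations that do not correspond to any retrieved source.
 *
 * @param int[] $citations
 * @return int[]
 */
function findInvalidCitations(array $citations, int $sourceCount): array
{
    return array_values(array_filter(
        $citations,
        fn(int $index): bool => $index < 0 || $index >= $sourceCount
    ));
}

$answer = 'ProductX offers a 30-day guarantee [Source 0]. '
    . 'Refunds take 5-7 business days [Source 0]. '
    . 'An extended plan is available [Source 3].';

$citations = extractCitations($answer);
$invalid = findInvalidCitations($citations, sourceCount: 2);

echo 'Cited: ' . implode(', ', $citations) . "\n";  // Cited: 0, 3
echo 'Invalid: ' . implode(', ', $invalid) . "\n";  // Invalid: 3
```

An invalid citation is a useful signal that the model cited something it was not given, so a pipeline might retry the request or strip the unsupported sentence.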
Citation Patterns
```php
<?php

declare(strict_types=1);

require_once __DIR__ . '/../../vendor/autoload.php';

use ClaudeAgents\RAG\RAGPipeline;
use ClaudePhp\ClaudePhp;

$client = new ClaudePhp(apiKey: getenv('ANTHROPIC_API_KEY'));
$rag = RAGPipeline::create($client);

// Add research papers (sample data for this demo)
$rag->addDocument(
    title: 'Smith et al. (2023) - RAG Performance',
    content: 'Our study shows RAG reduces hallucinations by 67% compared to pure LLM generation. We tested on 1000 factual questions across domains.',
    metadata: ['year' => 2023, 'authors' => 'Smith et al.', 'venue' => 'NeurIPS']
);

$rag->addDocument(
    title: 'Johnson (2024) - Chunking Strategies',
    content: 'Optimal chunk size varies by domain. For Q&A, 300-500 tokens works best. For summarization, 800-1200 tokens is optimal.',
    metadata: ['year' => 2024, 'authors' => 'Johnson', 'venue' => 'EMNLP']
);

// Query with citations
$result = $rag->query('How much does RAG reduce hallucinations?', topK: 2);

echo "=== Answer with Citations ===\n\n";
echo $result['answer'] . "\n\n";

echo "=== Source Details ===\n\n";
foreach ($result['sources'] as $source) {
    echo "Source {$source['index']}: {$source['source']}\n";
    echo "Preview: {$source['text_preview']}\n";
    echo "Metadata: " . json_encode($source['metadata']) . "\n\n";
}

echo "=== Verification ===\n\n";
echo "User can verify claims by checking [Source N] references\n";
echo "Cited sources: " . implode(', ', $result['citations']) . "\n";
```

Output:
```
=== Answer with Citations ===

RAG reduces hallucinations by 67% compared to pure LLM generation [Source 0].
This finding comes from a study testing 1000 factual questions across
various domains [Source 0].

=== Source Details ===

Source 0: Smith et al. (2023) - RAG Performance
Preview: Our study shows RAG reduces hallucinations by 67% compared to pure LLM generation...
Metadata: {"year":2023,"authors":"Smith et al.","venue":"NeurIPS"}

=== Verification ===

User can verify claims by checking [Source N] references
Cited sources: 0
```

Query Transformation
Users rarely phrase a question the way the source documents phrase the answer. Transforming the query before retrieval helps close that gap.
Multi-Query Generation
```php
<?php

declare(strict_types=1);

require_once __DIR__ . '/../../vendor/autoload.php';

use ClaudeAgents\RAG\QueryTransformation\MultiQueryGenerator;
use ClaudePhp\ClaudePhp;

$client = new ClaudePhp(apiKey: getenv('ANTHROPIC_API_KEY'));

$generator = new MultiQueryGenerator($client);

$originalQuery = 'How do I deploy a PHP application?';

echo "=== Multi-Query Generation ===\n\n";
echo "Original Query: {$originalQuery}\n\n";

// Generate multiple query variations
$variations = $generator->generate($originalQuery, numQueries: 3);

echo "Generated Variations:\n";
foreach ($variations as $i => $variation) {
    echo "  " . ($i + 1) . ". {$variation}\n";
}

echo "\nBenefit: Retrieve from multiple perspectives, improve recall\n";
```

HyDE (Hypothetical Document Embeddings)
```php
<?php

declare(strict_types=1);

require_once __DIR__ . '/../../vendor/autoload.php';

use ClaudeAgents\RAG\QueryTransformation\HyDEGenerator;
use ClaudePhp\ClaudePhp;

$client = new ClaudePhp(apiKey: getenv('ANTHROPIC_API_KEY'));

$hyde = new HyDEGenerator($client);

$query = 'What are the benefits of property hooks in PHP 8.4?';

echo "=== HyDE Query Transformation ===\n\n";
echo "Original Query: {$query}\n\n";

// Generate hypothetical document
$hypotheticalDoc = $hyde->generate($query);

echo "Hypothetical Document:\n{$hypotheticalDoc}\n\n";

// Combine for retrieval
$augmentedQuery = $hyde->augmentQuery($query);

echo "Augmented Query (for retrieval):\n{$augmentedQuery}\n\n";

echo "Benefit: Search for document that would answer the question\n";
echo "         (More effective than searching for the question itself)\n";
```

Query Decomposition
```php
<?php

declare(strict_types=1);

require_once __DIR__ . '/../../vendor/autoload.php';

use ClaudeAgents\RAG\QueryTransformation\QueryDecomposer;
use ClaudePhp\ClaudePhp;

$client = new ClaudePhp(apiKey: getenv('ANTHROPIC_API_KEY'));

$decomposer = new QueryDecomposer($client);

$complexQuery = 'Compare the performance and developer experience of Laravel vs Symfony for building REST APIs';

echo "=== Query Decomposition ===\n\n";
echo "Complex Query: {$complexQuery}\n\n";

// Decompose into sub-queries
$subQueries = $decomposer->decompose($complexQuery);

echo "Sub-Queries:\n";
foreach ($subQueries as $i => $subQuery) {
    echo "  " . ($i + 1) . ". {$subQuery}\n";
}

echo "\nBenefit: Answer each sub-query separately, combine results\n";
echo "         (Better for multi-part questions)\n";
```

Reranking Results
Retrieval returns a candidate set quickly; reranking then reorders those candidates using additional signals so the most relevant ones come first.
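One common reranking approach is a weighted sum of several signals. The sketch below shows the arithmetic in plain PHP, assuming every signal is already scaled to the range 0 to 1. The function `rerankByWeightedScore` and the field names are hypothetical and do not describe how the framework's rerankers are implemented.

```php
<?php

declare(strict_types=1);

/**
 * Rerank results by a weighted sum of signals, each expected in 0..1.
 * Missing signals count as 0.
 *
 * @param array<int, array<string, mixed>> $results
 * @param array<string, float> $weights signal name => weight
 */
function rerankByWeightedScore(array $results, array $weights): array
{
    foreach ($results as &$result) {
        $final = 0.0;
        foreach ($weights as $signal => $weight) {
            $final += $weight * ($result[$signal] ?? 0.0);
        }
        $result['final_score'] = $final;
    }
    unset($result); // break the reference left over from the loop

    usort($results, fn(array $a, array $b): int => $b['final_score'] <=> $a['final_score']);

    return $results;
}

$results = [
    ['id' => 'old-getters', 'similarity' => 0.90, 'recency' => 0.2],
    ['id' => 'property-hooks', 'similarity' => 0.85, 'recency' => 0.9],
];

$reranked = rerankByWeightedScore($results, ['similarity' => 0.7, 'recency' => 0.3]);

foreach ($reranked as $result) {
    echo "{$result['id']}: " . round($result['final_score'], 3) . "\n";
}
// property-hooks: 0.865
// old-getters: 0.69
```

The older document wins on similarity alone, but the recency weight moves the newer one to the top. Weights that sum to 1 keep the final score in the same 0 to 1 range as the inputs.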
Score-Based Reranking
```php
<?php

declare(strict_types=1);

require_once __DIR__ . '/../../vendor/autoload.php';

use ClaudeAgents\RAG\Reranking\ScoreReranker;

$reranker = new ScoreReranker();

// Initial retrieval results (by similarity)
$results = [
    [
        'text' => 'PHP 8.4 introduces property hooks for clean getter/setter syntax.',
        'score' => 0.85,
        'metadata' => ['relevance' => 9, 'recency' => 0.9],
    ],
    [
        'text' => 'PHP versions prior to 8.0 required traditional getter/setter methods.',
        'score' => 0.82,
        // relevance belongs in metadata, like the other results
        'metadata' => ['relevance' => 6, 'recency' => 0.3],
    ],
    [
        'text' => 'Property hooks reduce boilerplate code significantly.',
        'score' => 0.80,
        'metadata' => ['relevance' => 8, 'recency' => 0.9],
    ],
];

$query = 'PHP 8.4 property hooks';

echo "=== Score-Based Reranking ===\n\n";

// Rerank by composite score
$reranked = $reranker->rerank($results, $query, weights: [
    'score' => 0.5,     // Embedding similarity
    'relevance' => 0.3, // Metadata relevance
    'recency' => 0.2,   // Recency score
]);

echo "Reranked Results:\n";
foreach ($reranked as $i => $result) {
    echo "  " . ($i + 1) . ". {$result['text']}\n";
    echo "     Final Score: " . round($result['final_score'], 3) . "\n\n";
}
```

LLM-Based Reranking
```php
<?php

declare(strict_types=1);

require_once __DIR__ . '/../../vendor/autoload.php';

use ClaudeAgents\RAG\Reranking\LLMReranker;
use ClaudePhp\ClaudePhp;

$client = new ClaudePhp(apiKey: getenv('ANTHROPIC_API_KEY'));
$reranker = new LLMReranker($client);

$results = [
    ['text' => 'PHP 8.4 property hooks provide clean syntax.'],
    ['text' => 'Laravel framework supports PHP 8.4.'],
    ['text' => 'Property hooks replace traditional getters/setters.'],
];

$query = 'How do property hooks work in PHP 8.4?';

echo "=== LLM-Based Reranking ===\n\n";

// LLM judges relevance of each result to query
$reranked = $reranker->rerank($results, $query);

echo "Reranked by LLM relevance:\n";
foreach ($reranked as $i => $result) {
    echo "  " . ($i + 1) . ". {$result['text']}\n";
    echo "     Relevance: " . round($result['relevance_score'], 3) . "\n\n";
}

echo "Benefit: LLM understands semantic relevance better than keyword/embedding alone\n";
```

Document Loaders
Loaders read content from files, directories, and web pages and turn each item into a document with a title, content, and metadata.
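To show what a loader does under the hood, here is a minimal directory loader written with PHP's standard SPL iterators. It is a sketch, not the framework's `DirectoryLoader`: the function `loadDirectory` and its metadata keys are assumptions, chosen so the result has the same `title`, `content`, and `metadata` keys the chapter's examples read from each document.

```php
<?php

declare(strict_types=1);

/**
 * Recursively load files with the given extensions as document arrays.
 *
 * @param string[] $extensions lowercase extensions without the dot
 */
function loadDirectory(string $path, array $extensions): array
{
    $documents = [];
    $files = new RecursiveIteratorIterator(
        new RecursiveDirectoryIterator($path, FilesystemIterator::SKIP_DOTS)
    );

    foreach ($files as $file) {
        $extension = strtolower($file->getExtension());
        if (!$file->isFile() || !in_array($extension, $extensions, true)) {
            continue;
        }

        $content = file_get_contents($file->getPathname());
        if ($content === false) {
            continue; // unreadable file: skip it rather than index nothing
        }

        $documents[] = [
            'title' => $file->getBasename('.' . $file->getExtension()),
            'content' => $content,
            'metadata' => ['path' => $file->getPathname(), 'extension' => $extension],
        ];
    }

    // Sort by path so the result does not depend on filesystem order
    usort(
        $documents,
        fn(array $a, array $b): int => strcmp($a['metadata']['path'], $b['metadata']['path'])
    );

    return $documents;
}

// Demo on a throwaway directory
$root = sys_get_temp_dir() . '/rag-loader-demo-' . uniqid();
mkdir($root . '/nested', 0777, true);
file_put_contents($root . '/guide.md', '# Guide');
file_put_contents($root . '/nested/notes.txt', 'Some notes');
file_put_contents($root . '/logo.png', 'not text');

$documents = loadDirectory($root, ['md', 'txt']);
foreach ($documents as $doc) {
    echo "{$doc['title']} ({$doc['metadata']['extension']})\n";
}

// Clean up the demo files
unlink($root . '/guide.md');
unlink($root . '/nested/notes.txt');
unlink($root . '/logo.png');
rmdir($root . '/nested');
rmdir($root);
```

Recording the file path in the metadata pays off later, because citations can then point users to the exact file an answer came from.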
Framework Document Loaders
```php
<?php

declare(strict_types=1);

require_once __DIR__ . '/../../vendor/autoload.php';

use ClaudeAgents\RAG\RAGPipeline;
use ClaudeAgents\RAG\Loaders\TextFileLoader;
use ClaudeAgents\RAG\Loaders\JSONLoader;
use ClaudeAgents\RAG\Loaders\CSVLoader;
use ClaudeAgents\RAG\Loaders\DirectoryLoader;
use ClaudeAgents\RAG\Loaders\WebLoader;
use ClaudePhp\ClaudePhp;

$client = new ClaudePhp(apiKey: getenv('ANTHROPIC_API_KEY'));
$rag = RAGPipeline::create($client);

echo "=== Document Loaders ===\n\n";

// 1. Text File Loader
echo "1. Loading text files...\n";
$textLoader = new TextFileLoader();
$documents = $textLoader->load(__DIR__ . '/docs/readme.txt');
foreach ($documents as $doc) {
    $rag->addDocument($doc['title'], $doc['content'], $doc['metadata']);
}
echo "   Loaded " . count($documents) . " text documents\n\n";

// 2. JSON Loader
echo "2. Loading JSON data...\n";
$jsonLoader = new JSONLoader();
$documents = $jsonLoader->load(__DIR__ . '/data/products.json');
foreach ($documents as $doc) {
    $rag->addDocument($doc['title'], $doc['content'], $doc['metadata']);
}
echo "   Loaded " . count($documents) . " JSON documents\n\n";

// 3. CSV Loader
echo "3. Loading CSV data...\n";
$csvLoader = new CSVLoader();
$documents = $csvLoader->load(__DIR__ . '/data/customers.csv');
foreach ($documents as $doc) {
    $rag->addDocument($doc['title'], $doc['content'], $doc['metadata']);
}
echo "   Loaded " . count($documents) . " CSV rows\n\n";

// 4. Directory Loader (recursive)
echo "4. Loading directory...\n";
$dirLoader = new DirectoryLoader([
    'extensions' => ['md', 'txt', 'php'],
    'recursive' => true,
    'exclude_patterns' => ['vendor', 'node_modules'],
]);
$documents = $dirLoader->load(__DIR__ . '/knowledge-base');
foreach ($documents as $doc) {
    $rag->addDocument($doc['title'], $doc['content'], $doc['metadata']);
}
echo "   Loaded " . count($documents) . " files from directory\n\n";

// 5. Web Loader
echo "5. Loading web content...\n";
$webLoader = new WebLoader();
$documents = $webLoader->load('https://www.php.net/releases/8.4/en.php');
foreach ($documents as $doc) {
    $rag->addDocument($doc['title'], $doc['content'], $doc['metadata']);
}
echo "   Loaded " . count($documents) . " web pages\n\n";

echo "Total: {$rag->getDocumentCount()} documents, {$rag->getChunkCount()} chunks\n";
```

Production RAG System
Putting it all together with the framework’s RAGAgent.
Complete Production Implementation
```php
<?php

declare(strict_types=1);

require_once __DIR__ . '/../../vendor/autoload.php';

use ClaudeAgents\Factory\AgentFactory;
use ClaudeAgents\RAG\Loaders\DirectoryLoader;
use ClaudeAgents\RAG\Splitters\RecursiveCharacterTextSplitter;
use ClaudeAgents\RAG\SemanticRetriever;
use ClaudeAgents\RAG\Reranking\ScoreReranker;
use ClaudePhp\ClaudePhp;
use Monolog\Logger;
use Monolog\Handler\StreamHandler;

class ProductionRAGSystem
{
    private AgentFactory $factory;
    private Logger $logger;

    public function __construct(string $apiKey)
    {
        // Setup logger
        $this->logger = new Logger('rag-system');
        $this->logger->pushHandler(new StreamHandler('php://stdout', Logger::INFO));

        // Setup agent factory
        $client = new ClaudePhp(apiKey: $apiKey);
        $this->factory = new AgentFactory($client);

        $this->logger->info('Production RAG system initialized');
    }

    /**
     * Create a RAG agent with production configuration.
     */
    public function createRAGAgent(array $options = [])
    {
        return $this->factory->createRAGAgent(array_merge([
            'name' => 'production_rag_agent',
            'top_k' => 5,
            'logger' => $this->logger,
            'enable_ml_optimization' => true,
        ], $options));
    }

    /**
     * Index a knowledge base directory.
     */
    public function indexKnowledgeBase(string $path, $agent): array
    {
        $this->logger->info("Indexing knowledge base: {$path}");

        // Load documents
        $loader = new DirectoryLoader([
            'extensions' => ['md', 'txt', 'php', 'json'],
            'recursive' => true,
            'exclude_patterns' => ['vendor', 'node_modules', 'tests'],
        ]);

        $documents = $loader->load($path);

        $this->logger->info('Loaded ' . count($documents) . ' documents');

        // Use advanced chunking
        $splitter = new RecursiveCharacterTextSplitter(
            chunkSize: 800,
            chunkOverlap: 100
        );

        // Add to agent
        $indexed = 0;
        foreach ($documents as $doc) {
            // Split large documents
            if (strlen($doc['content']) > 1000) {
                $chunks = $splitter->split($doc['content']);
                foreach ($chunks as $i => $chunk) {
                    $agent->addDocument(
                        title: "{$doc['title']} (Part " . ($i + 1) . ")",
                        content: $chunk,
                        metadata: array_merge($doc['metadata'], ['chunk' => $i])
                    );
                    $indexed++;
                }
            } else {
                $agent->addDocument(
                    title: $doc['title'],
                    content: $doc['content'],
                    metadata: $doc['metadata']
                );
                $indexed++;
            }
        }

        $this->logger->info("Indexed {$indexed} document chunks");

        return [
            'documents' => count($documents),
            'chunks' => $indexed,
        ];
    }

    /**
     * Query with production features.
     */
    public function query(string $question, $agent): array
    {
        $startTime = microtime(true);

        $this->logger->info("Query: {$question}");

        // Run agent
        $result = $agent->run($question);

        $duration = microtime(true) - $startTime;

        if (!$result->isSuccess()) {
            $this->logger->error("Query failed: {$result->getError()}");
            return [
                'success' => false,
                'error' => $result->getError(),
            ];
        }

        $metadata = $result->getMetadata();

        $this->logger->info('Query completed', [
            'duration' => round($duration, 3),
            'sources' => count($metadata['sources']),
            'citations' => count($metadata['citations']),
            'tokens' => $metadata['tokens'],
        ]);

        return [
            'success' => true,
            'answer' => $result->getAnswer(),
            'sources' => $metadata['sources'],
            'citations' => $metadata['citations'],
            'metrics' => [
                'duration' => $duration,
                'tokens' => $metadata['tokens'],
                'document_count' => $metadata['document_count'],
                'chunk_count' => $metadata['chunk_count'],
            ],
        ];
    }
}

// Usage
$system = new ProductionRAGSystem(getenv('ANTHROPIC_API_KEY'));

echo "=== Production RAG System ===\n\n";

// Create agent
$agent = $system->createRAGAgent();

// Index knowledge base
echo "Indexing knowledge base...\n";
$stats = $system->indexKnowledgeBase(__DIR__ . '/knowledge-base', $agent);
echo "Indexed {$stats['documents']} documents into {$stats['chunks']} chunks\n\n";

// Query
$questions = [
    'How do property hooks work in PHP 8.4?',
    'What are the performance benefits of the new JIT compiler?',
    'How do I migrate from PHP 8.3 to 8.4?',
];

foreach ($questions as $question) {
    echo "Q: {$question}\n";

    $result = $system->query($question, $agent);

    if ($result['success']) {
        echo "A: {$result['answer']}\n";
        echo "\nSources: " . count($result['sources']) . " documents\n";
        echo "Citations: " . implode(', ', array_map(fn($i) => "[{$i}]", $result['citations'])) . "\n";
        echo "Tokens: {$result['metrics']['tokens']['input']} in, {$result['metrics']['tokens']['output']} out\n";
        echo "Duration: " . round($result['metrics']['duration'], 3) . "s\n";
    } else {
        echo "Error: {$result['error']}\n";
    }

    echo "\n" . str_repeat('-', 60) . "\n\n";
}
```

Summary
In this chapter, you learned how to build production-grade RAG systems for agents:
✅ RAG architecture — Retrieval, augmentation, and generation pipeline
✅ Document chunking — Strategies for splitting documents effectively
✅ Vector stores — Semantic search with embeddings
✅ Citation generation — Grounded, verifiable responses
✅ Query transformation — Multi-query, HyDE, decomposition
✅ Reranking — Improving retrieval relevance
✅ Document loaders — Loading from files, directories, and the web
✅ Production patterns — Complete RAG system with the framework
With RAG, your agents can now answer questions from massive knowledge bases, cite sources, and dramatically reduce hallucinations.
Practice Exercises
Exercise 1: Build a Documentation Q&A System
Create a RAG system for your project’s documentation:
- Index all markdown files
- Support code examples in chunks
- Generate code-aware responses
- Track most-asked questions
Exercise 2: Implement Hybrid Search
Combine keyword and semantic retrieval:
- Use both KeywordRetriever and SemanticRetriever
- Merge results with RRF (Reciprocal Rank Fusion)
- Compare performance vs single method
- Tune weights for your domain
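As a starting point for the merge step, Reciprocal Rank Fusion scores each document by summing `1 / (k + rank)` over every ranking it appears in, where `rank` is 1-based and `k` is a smoothing constant. The sketch below uses `k = 60`, a commonly used default. The function name `reciprocalRankFusion` and the document ids are hypothetical, and the two input rankings stand in for the output of a keyword retriever and a semantic retriever.

```php
<?php

declare(strict_types=1);

/**
 * Fuse several rankings of document ids with Reciprocal Rank Fusion.
 *
 * @param array<int, string[]> $rankings each ranking lists ids, best first
 * @return array<string, float> fused scores keyed by id, best first
 */
function reciprocalRankFusion(array $rankings, int $k = 60): array
{
    $scores = [];

    foreach ($rankings as $ranking) {
        foreach (array_values($ranking) as $position => $docId) {
            $rank = $position + 1; // ranks are 1-based
            $scores[$docId] = ($scores[$docId] ?? 0.0) + 1.0 / ($k + $rank);
        }
    }

    arsort($scores); // highest fused score first, keys preserved

    return $scores;
}

$keywordRanking = ['doc-a', 'doc-b', 'doc-c'];
$semanticRanking = ['doc-b', 'doc-d', 'doc-a'];

$fused = reciprocalRankFusion([$keywordRanking, $semanticRanking]);

foreach ($fused as $docId => $score) {
    echo "{$docId}: " . round($score, 5) . "\n";
}
// Order: doc-b, doc-a, doc-d, doc-c
```

RRF uses only rank positions, so it needs no score normalization between the two retrievers. To weight one retriever more heavily, multiply its contribution by a per-retriever factor before summing.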
Exercise 3: Add Metadata Filtering
Section titled “Exercise 3: Add Metadata Filtering”Implement filtered retrieval:
- Filter by date range
- Filter by author
- Filter by document type
- Filter by custom tags
Exercise 4: Build Multi-Document Reasoning
Section titled “Exercise 4: Build Multi-Document Reasoning”Answer questions requiring multiple sources:
- Decompose complex queries
- Retrieve for each sub-query
- Synthesize comprehensive answer
- Cite all relevant sources
Next Steps
Section titled “Next Steps”Now that you have RAG for grounding agent responses in external knowledge, you’re ready to build planning systems. In Chapter 09: Planning: From Tasks to Steps, you’ll implement task decomposition using PlanExecuteLoop, generate plans, track progress, and handle replanning when things change.
Continue to Chapter 09 →