11: Integrating PHP with Python for Advanced ML

Chapter 11: Integrating PHP with Python for Advanced ML
Section titled “Chapter 11: Integrating PHP with Python for Advanced ML”Overview
Section titled “Overview”You’ve learned the fundamentals of machine learning in PHP using PHP-ML and Rubix ML. These libraries are excellent for many tasks—classification, regression, clustering, even basic neural networks. But let’s be honest: Python’s machine learning ecosystem is massive. Libraries like scikit-learn, TensorFlow, PyTorch, pandas, and NumPy have decades of development, thousands of contributors, and implementations of cutting-edge algorithms that simply don’t exist in PHP.
Here’s the good news: you don’t have to choose between PHP and Python. You can have the best of both worlds. PHP excels at web development, API serving, database interaction, and integrating with existing applications. Python excels at data science, machine learning, and scientific computing. By combining them, you build systems where PHP handles user requests, business logic, and web interfaces while Python handles complex ML tasks, data preprocessing, and model training.
This chapter teaches you three practical integration strategies with increasing sophistication: shell execution (quick and simple), REST APIs (scalable and decoupled), and message queues (asynchronous and production-ready). You’ll build a complete sentiment analysis system where PHP accepts user reviews through a web interface and Python performs sophisticated text classification using scikit-learn. Every example includes proper error handling, security considerations, and performance patterns.
By the end of this chapter, you’ll confidently integrate Python ML capabilities into PHP applications, understand when each approach is appropriate, and have production-ready code patterns for building hybrid systems.
Prerequisites
Section titled “Prerequisites”Before starting this chapter, you should have:
- Completed Chapter 10 or equivalent understanding of ML concepts
- Completed Chapter 08 and have Rubix ML working
- PHP 8.4+ installed and configured from Chapter 02
- Python 3.10+ installed (check with
python3 --version) - Basic command-line familiarity (running scripts, installing packages)
- Understanding of JSON data format
- Experience training classifiers in PHP from earlier chapters
- Familiarity with basic security concepts (input validation, escaping)
Estimated Time: ~90-120 minutes (reading, setup, running examples, exercises)
Verify your setup:
# Check PHP versionphp --version # Should be 8.4+
# Check Python versionpython3 --version # Should be 3.10+
# Check if pip (Python package manager) is availablepython3 -m pip --versionWhat You’ll Build
Section titled “What You’ll Build”By the end of this chapter, you will have created:
- A “Hello World” integration demonstrating basic PHP-to-Python communication
- A JSON data exchange system passing structured data between languages
- A complete sentiment analyzer with PHP web interface and Python ML backend
- A model training pipeline that trains scikit-learn models and persists them
- A prediction service that loads trained models and classifies new text
- A security-hardened executor with input validation and error handling
- A REST API implementation using Flask with a PHP client
- An async message queue pattern for background ML processing
- A performance comparison tool measuring latency of each approach
- An error recovery system handling Python failures gracefully
- A production deployment guide for real-world hybrid applications
All code examples include security best practices, proper error handling, and performance considerations.
::: info Code Examples Complete, runnable examples for this chapter:
Integration Examples:
01-simple-shell/hello.php— Basic PHP calling Python01-simple-shell/hello.py— Python script receiving data02-data-passing/exchange.php— JSON data exchange02-data-passing/process.py— Python processing structured data
Sentiment Analysis Project:
03-sentiment-analysis/analyze.php— PHP web interface03-sentiment-analysis/train_model.py— Python training script03-sentiment-analysis/predict.py— Python prediction service03-sentiment-analysis/data/reviews.csv— Training data
REST API Example:
04-rest-api-example/flask_server.py— Flask ML API04-rest-api-example/php_client.php— PHP API client
Production Patterns:
05-production-patterns/secure_executor.php— Hardened shell execution05-production-patterns/async_queue.php— Redis queue example
See README.md for complete setup instructions.
:::
Quick Start
Section titled “Quick Start”Want to see PHP and Python working together right now? Here’s a 5-minute example:
::: tip
This is a simplified standalone example. Complete versions with full error handling are in code/chapter-11/. The pattern demonstrated here applies to all integration approaches throughout the chapter.
:::
Create two files:
<?php
declare(strict_types=1);
// Prepare data to send to Python$data = ['text' => 'This product is amazing! Highly recommend.'];$json = json_encode($data);
// Call Python script with data (properly escaped for security)$escaped = escapeshellarg($json);$output = shell_exec("python3 quick_sentiment.py {$escaped}");
// Parse Python's response$result = json_decode($output ?: '{}', true);
echo "Review: {$data['text']}\n";echo "Sentiment: {$result['sentiment']}\n";echo "Confidence: " . round($result['confidence'] * 100, 1) . "%\n";import sysimport json
# Simple sentiment analysis (real version uses scikit-learn)def analyze_sentiment(text): positive_words = ['amazing', 'excellent', 'great', 'love', 'recommend'] negative_words = ['terrible', 'awful', 'hate', 'worst', 'disappointing']
text_lower = text.lower() positive_count = sum(1 for word in positive_words if word in text_lower) negative_count = sum(1 for word in negative_words if word in text_lower)
if positive_count > negative_count: return {'sentiment': 'positive', 'confidence': 0.85} elif negative_count > positive_count: return {'sentiment': 'negative', 'confidence': 0.85} else: return {'sentiment': 'neutral', 'confidence': 0.60}
# Read data from PHPinput_data = json.loads(sys.argv[1])result = analyze_sentiment(input_data['text'])
# Send result back to PHPprint(json.dumps(result))Run it:
php quick_integrate.phpExpected output:
Review: This product is amazing! Highly recommend.Sentiment: positiveConfidence: 85.0%What just happened? PHP prepared data as JSON, called the Python script with that data as a command-line argument, and Python performed sentiment analysis and returned the result as JSON. This simple pattern scales to complex ML tasks!
Now let’s build production-ready integration systems…
Objectives
Section titled “Objectives”By the end of this chapter, you will be able to:
- Execute Python scripts from PHP using shell commands with proper error handling
- Exchange structured data between PHP and Python using JSON
- Secure integration points by validating inputs and escaping shell arguments
- Train ML models in Python from PHP-initiated processes
- Load and use trained models for real-time predictions in web applications
- Build REST APIs with Flask/FastAPI that serve ML models to PHP clients
- Implement async processing using message queues for long-running ML tasks
- Compare integration strategies and choose the right approach for your use case
- Handle errors gracefully when Python services fail or are unavailable
- Optimize performance by caching results and minimizing inter-process communication
- Deploy hybrid applications combining PHP web layers with Python ML backends
Integration Strategies Overview
Section titled “Integration Strategies Overview”Before diving into code, let’s understand the three main approaches for PHP-Python integration:
Strategy 1: Shell Execution (Synchronous)
Section titled “Strategy 1: Shell Execution (Synchronous)”How it works: PHP uses shell_exec(), exec(), or proc_open() to run Python scripts as separate processes.
Pros:
- Simplest to implement
- No additional infrastructure needed
- Works immediately on any server with Python installed
- Direct communication via command-line arguments and output
Cons:
- Synchronous (PHP waits for Python to complete)
- Higher latency (process startup overhead)
- Limited scalability for high-traffic applications
- Security risks if inputs aren’t properly escaped
Best for: Simple integrations, development environments, low-traffic applications, one-off batch processing.
Strategy 2: REST API (HTTP-based)
Section titled “Strategy 2: REST API (HTTP-based)”How it works: Python runs as a separate web service (Flask, FastAPI, Django). PHP makes HTTP requests to the Python API.
Pros:
- Language-independent (Python service could be anywhere)
- Scalable (multiple Python workers, load balancing)
- Decoupled architecture (services can be deployed independently)
- Can handle concurrent requests efficiently
Cons:
- Requires running and maintaining a separate service
- Network latency overhead
- More complex deployment (two services instead of one)
- Need to handle service availability and retries
Best for: Production applications, microservices architectures, shared ML services across multiple applications, high-traffic systems.
Strategy 3: Message Queue (Asynchronous)
Section titled “Strategy 3: Message Queue (Asynchronous)”How it works: PHP sends ML tasks to a queue (Redis, RabbitMQ). Python workers consume tasks from the queue, process them, and return results.
Pros:
- Asynchronous (PHP doesn’t wait)
- Highly scalable (add more workers as needed)
- Resilient (tasks survive server restarts)
- Decouples request handling from processing
Cons:
- Most complex to implement
- Requires message queue infrastructure (Redis, RabbitMQ, etc.)
- Not suitable for real-time predictions (async by nature)
- Need to handle result delivery (callbacks, polling, websockets)
Best for: Batch processing, long-running ML tasks (training, large-scale predictions), background jobs, systems requiring high throughput.
Comparison Table
Section titled “Comparison Table”| Feature | Shell Execution | REST API | Message Queue |
|---|---|---|---|
| Complexity | Low | Medium | High |
| Latency | ~100-500ms | ~50-200ms | Async (seconds-minutes) |
| Scalability | Limited | High | Very High |
| Infrastructure | None | Web server | Queue system |
| Real-time? | Yes (blocking) | Yes | No (async) |
| Best Use Case | Development, simple tasks | Production APIs | Background processing |
Now let’s implement each strategy with working code!
Step 1: Basic Shell Execution (~15 min)
Section titled “Step 1: Basic Shell Execution (~15 min)”Establish the simplest possible PHP-to-Python communication using shell commands.
Actions
Section titled “Actions”- Create a Python script that receives data:
import sysimport json
def main(): # Python can receive data via command-line arguments if len(sys.argv) < 2: print(json.dumps({'error': 'No data provided'})) sys.exit(1)
# Parse JSON input from PHP try: input_data = json.loads(sys.argv[1]) name = input_data.get('name', 'World')
# Process the data (trivial example) result = { 'greeting': f'Hello, {name}!', 'processed_by': 'Python 3', 'input_received': input_data }
# Output JSON for PHP to parse print(json.dumps(result)) except json.JSONDecodeError as e: print(json.dumps({'error': f'Invalid JSON: {str(e)}'})) sys.exit(1)
if __name__ == '__main__': main()- Create a PHP script that calls Python:
<?php
declare(strict_types=1);
/** * Basic PHP-Python integration via shell execution. * * This demonstrates the fundamental pattern: * 1. PHP prepares data as JSON * 2. PHP calls Python script with data * 3. Python processes and outputs JSON * 4. PHP parses and uses the result */
function callPythonScript(string $scriptPath, array $data): ?array{ // Step 1: Encode data as JSON $json = json_encode($data); if ($json === false) { throw new RuntimeException('Failed to encode data as JSON'); }
// Step 2: Escape for shell (CRITICAL for security) $escapedJson = escapeshellarg($json);
// Step 3: Build and execute command $command = "python3 {$scriptPath} {$escapedJson}"; $output = shell_exec($command);
if ($output === null) { throw new RuntimeException('Failed to execute Python script'); }
// Step 4: Parse Python's JSON output $result = json_decode($output, true); if (json_last_error() !== JSON_ERROR_NONE) { throw new RuntimeException('Invalid JSON from Python: ' . json_last_error_msg()); }
return $result;}
// Example usagetry { echo "=== Basic PHP-Python Integration ===\n\n";
// Test 1: Simple greeting $result1 = callPythonScript('hello.py', ['name' => 'PHP Developer']); echo "Test 1 - Simple greeting:\n"; echo " {$result1['greeting']}\n"; echo " Processed by: {$result1['processed_by']}\n\n";
// Test 2: Complex data $result2 = callPythonScript('hello.py', [ 'name' => 'Machine Learning Engineer', 'skills' => ['PHP', 'Python', 'ML'], 'experience' => 5 ]); echo "Test 2 - Complex data:\n"; echo " {$result2['greeting']}\n"; echo " Data received by Python: " . json_encode($result2['input_received']) . "\n\n";
echo "✅ Integration working successfully!\n";
} catch (RuntimeException $e) { echo "❌ Error: {$e->getMessage()}\n"; exit(1);}- Run the example:
cd docs/series/ai-ml-php-developers/code/chapter-11/01-simple-shellphp hello.phpExpected Result
Section titled “Expected Result”=== Basic PHP-Python Integration ===
Test 1 - Simple greeting: Hello, PHP Developer! Processed by: Python 3
Test 2 - Complex data: Hello, Machine Learning Engineer! Data received by Python: {"name":"Machine Learning Engineer","skills":["PHP","Python","ML"],"experience":5}
✅ Integration working successfully!Why It Works
Section titled “Why It Works”The integration pattern follows these steps:
-
PHP prepares data: Any PHP array or object is converted to JSON—a language-neutral format that both PHP and Python understand natively.
-
Security escaping:
escapeshellarg()wraps the JSON string in quotes and escapes special characters, preventing shell injection attacks. This is CRITICAL—never pass user input to shell commands without escaping. -
Process execution:
shell_exec()runs the Python interpreter with your script and arguments. Python starts in a new process, executes, and terminates. -
Output capture: Everything Python prints to stdout is captured by PHP as a string. By printing JSON from Python, PHP can easily parse the structured response.
-
Error handling: Check for
nullreturns (command failed), validate JSON parsing, and use try-catch to handle failures gracefully.
This pattern works for any Python script that accepts JSON input and produces JSON output. The overhead is ~100-500ms per call depending on script complexity and Python startup time.
Troubleshooting
Section titled “Troubleshooting”Error: “python3: command not found”
Python 3 isn’t in your system PATH. Find the correct path:
# macOS/Linuxwhich python3
# Use the full path in your PHP code$command = "/usr/local/bin/python3 {$scriptPath} {$escapedJson}";Error: “No data provided”
The argument didn’t reach Python. Debug by checking the command:
echo "Command: {$command}\n"; // Add before shell_exec()Verify escapeshellarg() is applied—without it, JSON might be parsed as multiple arguments.
Error: “Invalid JSON from Python”
Python script is outputting something other than JSON (error messages, warnings, debug prints). Ensure Python only prints JSON:
# Bad - mixes debug output with JSONprint("Debug info")print(json.dumps(result))
# Good - only JSON outputimport sysprint(json.dumps(result), file=sys.stdout)Performance issues
Each call spawns a new Python process. For high-frequency calls (>10/sec), consider:
- Caching results
- Batching multiple predictions
- Switching to REST API strategy
Step 2: Exchanging Complex Data (~10 min)
Section titled “Step 2: Exchanging Complex Data (~10 min)”Learn how to pass structured data (arrays, nested objects, multiple values) between PHP and Python reliably.
Actions
Section titled “Actions”- Create a Python script that processes structured data:
import sysimport jsonfrom typing import Dict, Any
def process_user_data(user: Dict[str, Any]) -> Dict[str, Any]: """ Example of processing complex structured data. In a real ML scenario, this might extract features, normalize values, etc. """ # Extract and validate fields name = user.get('name', 'Unknown') age = user.get('age', 0) purchases = user.get('purchases', [])
# Perform calculations total_spent = sum(p.get('amount', 0) for p in purchases) avg_purchase = total_spent / len(purchases) if purchases else 0
# Classify user segment (simple business logic) if total_spent > 1000 and len(purchases) > 10: segment = 'VIP' elif total_spent > 500: segment = 'Regular' else: segment = 'New'
return { 'user_id': user.get('id'), 'name': name, 'segment': segment, 'metrics': { 'total_purchases': len(purchases), 'total_spent': round(total_spent, 2), 'avg_purchase_value': round(avg_purchase, 2) }, 'recommendations': generate_recommendations(segment) }
def generate_recommendations(segment: str) -> list: """Generate product recommendations based on segment.""" recommendations = { 'VIP': ['Premium Bundle', 'Exclusive Access', 'Priority Support'], 'Regular': ['Popular Items', 'Seasonal Deals', 'Member Benefits'], 'New': ['Starter Pack', 'Welcome Offer', 'Getting Started Guide'] } return recommendations.get(segment, [])
def main(): try: # Read input from PHP if len(sys.argv) < 2: raise ValueError('No input data provided')
input_data = json.loads(sys.argv[1])
# Process single user or batch of users if isinstance(input_data, dict): # Single user result = process_user_data(input_data) elif isinstance(input_data, list): # Batch of users result = [process_user_data(user) for user in input_data] else: raise ValueError('Input must be object or array')
# Return result to PHP print(json.dumps(result, indent=2))
except Exception as e: error_result = { 'error': str(e), 'type': type(e).__name__ } print(json.dumps(error_result)) sys.exit(1)
if __name__ == '__main__': main()- Create PHP script that sends complex data:
<?php
declare(strict_types=1);
/** * Demonstrates passing complex, nested data structures between PHP and Python. * * Use cases: * - Feature extraction from user data * - Batch processing multiple records * - Preprocessing before ML prediction */
function callPythonProcessor(array $data): array{ $json = json_encode($data); $escaped = escapeshellarg($json); $output = shell_exec("python3 process.py {$escaped}");
if ($output === null) { throw new RuntimeException('Python script execution failed'); }
$result = json_decode($output, true);
if (isset($result['error'])) { throw new RuntimeException("Python error: {$result['error']}"); }
return $result;}
// Example 1: Process a single userecho "=== Example 1: Single User Processing ===\n\n";
$user = [ 'id' => 12345, 'name' => 'Alice Johnson', 'age' => 34, 'purchases' => [ ['product' => 'Laptop', 'amount' => 1200.00, 'date' => '2024-01-15'], ['product' => 'Mouse', 'amount' => 25.00, 'date' => '2024-01-15'], ['product' => 'Keyboard', 'amount' => 80.00, 'date' => '2024-02-03'], ['product' => 'Monitor', 'amount' => 350.00, 'date' => '2024-03-10'], ]];
try { $result = callPythonProcessor($user);
echo "User: {$result['name']}\n"; echo "Segment: {$result['segment']}\n"; echo "Total Purchases: {$result['metrics']['total_purchases']}\n"; echo "Total Spent: \${$result['metrics']['total_spent']}\n"; echo "Average Purchase: \${$result['metrics']['avg_purchase_value']}\n"; echo "Recommendations:\n"; foreach ($result['recommendations'] as $rec) { echo " - {$rec}\n"; }
} catch (RuntimeException $e) { echo "Error: {$e->getMessage()}\n"; exit(1);}
echo "\n=== Example 2: Batch Processing ===\n\n";
$users = [ [ 'id' => 1, 'name' => 'Bob Smith', 'purchases' => [ ['amount' => 50], ['amount' => 75], ] ], [ 'id' => 2, 'name' => 'Carol White', 'purchases' => [ ['amount' => 600], ['amount' => 450], ['amount' => 300], ] ],];
try { $results = callPythonProcessor($users);
foreach ($results as $result) { echo "{$result['name']}: {$result['segment']} segment "; echo "(\${$result['metrics']['total_spent']} total)\n"; }
echo "\n✅ Complex data exchange working!\n";
} catch (RuntimeException $e) { echo "Error: {$e->getMessage()}\n"; exit(1);}- Run the example:
cd docs/series/ai-ml-php-developers/code/chapter-11/02-data-passingphp exchange.phpExpected Result
Section titled “Expected Result”=== Example 1: Single User Processing ===
User: Alice JohnsonSegment: VIPTotal Purchases: 4Total Spent: $1655.00Average Purchase: $413.75Recommendations: - Premium Bundle - Exclusive Access - Priority Support
=== Example 2: Batch Processing ===
Bob Smith: New segment ($125.00 total)Carol White: Regular segment ($1350.00 total)
✅ Complex data exchange working!Why It Works
Section titled “Why It Works”JSON is the bridge between PHP and Python data structures:
PHP arrays map to Python lists/dicts:
- PHP
['a', 'b', 'c']→ Python['a', 'b', 'c'](list) - PHP
['key' => 'value']→ Python{'key': 'value'}(dict)
Nested structures work naturally:
- Both languages handle deeply nested JSON (objects within arrays within objects)
- Type preservation: strings stay strings, numbers stay numbers, booleans stay booleans
nullin PHP becomesNonein Python
Performance considerations:
- JSON encoding/decoding is fast (microseconds for typical data)
- Overhead comes from process spawning, not data transfer
- For large datasets (>1MB), consider writing to temporary files instead of command-line arguments
Type safety:
- Use type hints in Python (
Dict[str, Any]) for better error messages - Validate data structure on both sides
- Handle missing keys with
.get()(Python) or??(PHP)
Troubleshooting
Section titled “Troubleshooting”Data not arriving correctly in Python
Debug by printing what Python received:
print(f"DEBUG: Received {len(sys.argv)} arguments", file=sys.stderr)print(f"DEBUG: First arg: {sys.argv[1] if len(sys.argv) > 1 else 'NONE'}", file=sys.stderr)stderr won’t interfere with stdout JSON parsing.
Encoding issues with special characters
Ensure both PHP and Python use UTF-8:
// PHP$json = json_encode($data, JSON_UNESCAPED_UNICODE);# Python (usually default in Python 3)print(json.dumps(result, ensure_ascii=False))Large data causing “Argument list too long” error
Command-line arguments have size limits (~2MB on most systems). For larger data, use temporary files:
// PHP - write to temp file$tempFile = tempnam(sys_get_temp_dir(), 'php_python_');file_put_contents($tempFile, json_encode($data));$output = shell_exec("python3 process.py {$tempFile}");unlink($tempFile); // Clean up# Python - read from fileimport syswith open(sys.argv[1], 'r') as f: input_data = json.load(f)Step 3: Building a Sentiment Analyzer (~30 min)
Section titled “Step 3: Building a Sentiment Analyzer (~30 min)”Create a complete production-ready sentiment analysis system using PHP for the web interface and Python’s scikit-learn for machine learning.
Actions
Section titled “Actions”- Create training data:
text,sentiment"This product is amazing! Exceeded all expectations.",positive"Terrible quality. Broke after one day.",negative"Okay product. Nothing special.",neutral"Love it! Best purchase ever.",positive"Waste of money. Very disappointed.",negative"Does what it's supposed to do.",neutral"Absolutely fantastic! Highly recommend.",positive"Poor design and bad customer service.",negative"Average product at a fair price.",neutral"Incredible value! Can't believe the quality.",positive"Arrived damaged and seller won't respond.",negative"It's fine. Works as expected.",neutral"Outstanding! Solved all my problems.",positive"Cheap materials. Not worth it.",negative"Decent product for the price.",neutral"Exceeded my expectations! Five stars.",positive"Horrible experience from start to finish.",negative"Acceptable but nothing to write home about.",neutral"Absolutely perfect! Love everything about it.",positive"Completely useless. Total waste.",negative"It works. That's about it.",neutral"Best product I've ever bought!",positive"Broke immediately. Don't buy.",negative"Mediocre at best.",neutral"Fantastic quality and fast shipping!",positive"Disappointed with the quality.",negative"It's okay for the price.",neutral"Amazing product! Can't recommend enough.",positive"Not as described. Very misleading.",negative"Does the job. Nothing more.",neutral- Create Python training script:
"""Train a sentiment analysis model using scikit-learn.
This script:1. Loads training data from CSV2. Extracts text features using TF-IDF3. Trains a Naive Bayes classifier4. Saves the trained model and vectorizer for later use"""
import sysimport jsonimport pandas as pdfrom pathlib import Pathfrom sklearn.feature_extraction.text import TfidfVectorizerfrom sklearn.naive_bayes import MultinomialNBfrom sklearn.model_selection import train_test_split, cross_val_scorefrom sklearn.metrics import classification_report, accuracy_scoreimport joblib
def train_sentiment_model(data_path: str, model_dir: str = 'models'): """Train and save sentiment analysis model."""
# Create model directory if it doesn't exist Path(model_dir).mkdir(parents=True, exist_ok=True)
# Load training data print("Loading training data...") df = pd.read_csv(data_path) print(f"Loaded {len(df)} reviews") print(f"Sentiment distribution:\n{df['sentiment'].value_counts()}\n")
# Split data X_train, X_test, y_train, y_test = train_test_split( df['text'], df['sentiment'], test_size=0.2, random_state=42, stratify=df['sentiment'] )
# Create TF-IDF vectorizer print("Creating TF-IDF features...") vectorizer = TfidfVectorizer( max_features=1000, ngram_range=(1, 2), # unigrams and bigrams min_df=2, stop_words='english' )
X_train_vec = vectorizer.fit_transform(X_train) X_test_vec = vectorizer.transform(X_test)
# Train classifier print("Training Naive Bayes classifier...") classifier = MultinomialNB(alpha=0.1) classifier.fit(X_train_vec, y_train)
# Evaluate print("\n=== Model Evaluation ===") y_pred = classifier.predict(X_test_vec) accuracy = accuracy_score(y_test, y_pred) print(f"Test Accuracy: {accuracy:.2%}")
print("\nClassification Report:") print(classification_report(y_test, y_pred))
# Cross-validation cv_scores = cross_val_score( classifier, X_train_vec, y_train, cv=5, scoring='accuracy' ) print(f"Cross-validation scores: {cv_scores}") print(f"Mean CV accuracy: {cv_scores.mean():.2%} (+/- {cv_scores.std() * 2:.2%})")
# Save model and vectorizer model_path = Path(model_dir) / 'sentiment_model.pkl' vectorizer_path = Path(model_dir) / 'vectorizer.pkl'
print(f"\nSaving model to {model_path}") joblib.dump(classifier, model_path) joblib.dump(vectorizer, vectorizer_path)
print("✅ Training complete!")
return { 'accuracy': float(accuracy), 'cv_mean': float(cv_scores.mean()), 'cv_std': float(cv_scores.std()), 'model_path': str(model_path), 'vectorizer_path': str(vectorizer_path) }
def main(): if len(sys.argv) < 2: print(json.dumps({'error': 'Usage: python train_model.py <data_path>'})) sys.exit(1)
data_path = sys.argv[1]
try: result = train_sentiment_model(data_path) print(json.dumps(result)) except Exception as e: print(json.dumps({'error': str(e)})) sys.exit(1)
if __name__ == '__main__': main()- Create Python prediction script:
"""Use trained sentiment model to predict sentiment of new text.
This script:1. Loads the trained model and vectorizer2. Receives text from PHP3. Transforms text using TF-IDF4. Predicts sentiment5. Returns prediction with confidence scores"""
import sysimport jsonfrom pathlib import Pathimport joblibimport numpy as np
def load_model(model_dir: str = 'models'): """Load trained model and vectorizer.""" model_path = Path(model_dir) / 'sentiment_model.pkl' vectorizer_path = Path(model_dir) / 'vectorizer.pkl'
if not model_path.exists() or not vectorizer_path.exists(): raise FileNotFoundError( f"Model files not found in {model_dir}. " "Run train_model.py first." )
classifier = joblib.load(model_path) vectorizer = joblib.load(vectorizer_path)
return classifier, vectorizer
def predict_sentiment(text: str, classifier, vectorizer): """Predict sentiment for given text.""" # Transform text to TF-IDF features text_vec = vectorizer.transform([text])
# Predict sentiment prediction = classifier.predict(text_vec)[0]
# Get probability scores for all classes probabilities = classifier.predict_proba(text_vec)[0] classes = classifier.classes_
# Build confidence scores confidence_scores = { cls: float(prob) for cls, prob in zip(classes, probabilities) }
return { 'text': text, 'sentiment': prediction, 'confidence': float(probabilities.max()), 'all_scores': confidence_scores }
def main(): try: # Parse input from PHP if len(sys.argv) < 2: raise ValueError('No input data provided')
input_data = json.loads(sys.argv[1]) text = input_data.get('text', '')
if not text: raise ValueError('Text field is required')
# Load model classifier, vectorizer = load_model()
# Make prediction result = predict_sentiment(text, classifier, vectorizer)
# Return result print(json.dumps(result))
except Exception as e: error_result = { 'error': str(e), 'type': type(e).__name__ } print(json.dumps(error_result)) sys.exit(1)
if __name__ == '__main__': main()- Create PHP interface:
<?php
declare(strict_types=1);
/** * Sentiment Analysis System * * This demonstrates a complete ML integration: * 1. Training a model from data * 2. Using the trained model for predictions * 3. Proper error handling and validation * 4. Performance monitoring */
class SentimentAnalyzer{ public function __construct( private string $pythonPath = 'python3', private string $scriptDir = __DIR__, private string $modelDir = 'models' ) {}
/** * Train a new sentiment model from CSV data. */ public function train(string $dataPath): array { $start = microtime(true);
echo "Training sentiment model...\n"; echo "Data: {$dataPath}\n\n";
$escapedPath = escapeshellarg($dataPath); $command = "{$this->pythonPath} {$this->scriptDir}/train_model.py {$escapedPath}";
// Capture both stdout and stderr $descriptors = [ 0 => ['pipe', 'r'], // stdin 1 => ['pipe', 'w'], // stdout 2 => ['pipe', 'w'], // stderr ];
$process = proc_open($command, $descriptors, $pipes);
if (!is_resource($process)) { throw new RuntimeException('Failed to start Python training process'); }
fclose($pipes[0]); // Close stdin
$stdout = stream_get_contents($pipes[1]); $stderr = stream_get_contents($pipes[2]);
fclose($pipes[1]); fclose($pipes[2]);
$exitCode = proc_close($process);
echo "--- Training Output ---\n"; echo $stderr; // Training prints to stderr echo "----------------------\n\n";
if ($exitCode !== 0) { throw new RuntimeException("Training failed with exit code {$exitCode}"); }
// Parse result from last line of stdout $lines = array_filter(explode("\n", trim($stdout))); $lastLine = end($lines); $result = json_decode($lastLine, true);
if (isset($result['error'])) { throw new RuntimeException("Training error: {$result['error']}"); }
$duration = microtime(true) - $start; $result['training_time'] = round($duration, 2);
return $result; }
/** * Predict sentiment for given text. */ public function predict(string $text): array { if (empty(trim($text))) { throw new InvalidArgumentException('Text cannot be empty'); }
$start = microtime(true);
$data = json_encode(['text' => $text]); $escaped = escapeshellarg($data);
$command = "{$this->pythonPath} {$this->scriptDir}/predict.py {$escaped}"; $output = shell_exec($command);
if ($output === null) { throw new RuntimeException('Failed to execute prediction script'); }
$result = json_decode($output, true);
if (isset($result['error'])) { throw new RuntimeException("Prediction error: {$result['error']}"); }
$duration = microtime(true) - $start; $result['prediction_time'] = round($duration * 1000, 2); // milliseconds
return $result; }
/** * Batch predict sentiments for multiple texts. */ public function predictBatch(array $texts): array { return array_map( fn(string $text) => $this->predict($text), $texts ); }}
// Example usagetry { $analyzer = new SentimentAnalyzer();
// Step 1: Train the model echo "=== Step 1: Train Model ===\n"; $trainingResult = $analyzer->train('data/reviews.csv'); echo "✅ Training completed in {$trainingResult['training_time']}s\n"; echo " Test Accuracy: " . round($trainingResult['accuracy'] * 100, 1) . "%\n"; echo " CV Accuracy: " . round($trainingResult['cv_mean'] * 100, 1) . "% "; echo "(±" . round($trainingResult['cv_std'] * 100, 1) . "%)\n\n";
// Step 2: Make predictions echo "=== Step 2: Predict Sentiments ===\n\n";
$testReviews = [ "This is absolutely wonderful! I love it so much!", "Terrible product. Complete waste of money.", "It's okay. Nothing special but it works.", "Best purchase I've made this year! Highly recommended!", "Disappointed with the quality. Not worth the price." ];
foreach ($testReviews as $review) { $result = $analyzer->predict($review);
$emoji = match($result['sentiment']) { 'positive' => '😊', 'negative' => '😞', 'neutral' => '😐', default => '❓' };
echo "{$emoji} {$result['sentiment']} "; echo "(" . round($result['confidence'] * 100, 1) . "% confident, "; echo "{$result['prediction_time']}ms)\n"; echo " \"{$review}\"\n\n"; }
echo "✅ Sentiment analysis complete!\n";
} catch (Exception $e) { echo "❌ Error: {$e->getMessage()}\n"; exit(1);}- Install Python dependencies and run:
cd docs/series/ai-ml-php-developers/code/chapter-11/03-sentiment-analysis
# Install required Python packagespython3 -m pip install pandas scikit-learn joblib
# Run the complete examplephp analyze.phpExpected Result
Section titled “Expected Result”=== Step 1: Train Model ===Training sentiment model...Data: data/reviews.csv
--- Training Output ---Loading training data...Loaded 30 reviewsSentiment distribution:positive 10neutral 10negative 10Name: sentiment, dtype: int64
Creating TF-IDF features...Training Naive Bayes classifier...
=== Model Evaluation ===Test Accuracy: 100.00%
Classification Report: precision recall f1-score support
negative 1.00 1.00 1.00 2 neutral 1.00 1.00 1.00 2 positive 1.00 1.00 1.00 2
accuracy 1.00 6 macro avg 1.00 1.00 1.00 6weighted avg 1.00 1.00 1.00 6
Cross-validation scores: [0.83333333 1. 1. 1. 1. ]Mean CV accuracy: 96.67% (+/- 13.33%)
Saving model to models/sentiment_model.pkl✅ Training complete!----------------------
✅ Training completed in 0.45s Test Accuracy: 100.0% CV Accuracy: 96.7% (±13.3%)
=== Step 2: Predict Sentiments ===
😊 positive (99.9% confident, 45.32ms) "This is absolutely wonderful! I love it so much!"
😞 negative (99.8% confident, 42.18ms) "Terrible product. Complete waste of money."
😐 neutral (87.5% confident, 43.67ms) "It's okay. Nothing special but it works."
😊 positive (99.7% confident, 41.92ms) "Best purchase I've made this year! Highly recommended!"
😞 negative (95.3% confident, 42.54ms) "Disappointed with the quality. Not worth the price."
✅ Sentiment analysis complete!Why It Works
Section titled “Why It Works”This sentiment analyzer demonstrates several professional patterns:
Model Training Pipeline:
- Data loading: pandas reads CSV into a DataFrame for easy manipulation
- Feature extraction: TF-IDF (Term Frequency-Inverse Document Frequency) converts text to numeric vectors that capture word importance
- Train/test split: 80% training, 20% testing with stratification to maintain class balance
- Cross-validation: 5-fold CV provides robust accuracy estimate beyond simple train/test split
- Model persistence: joblib saves trained model and vectorizer to disk for reuse
Prediction Service:
- Model loading: Load once (in production, load at server start, not per request)
- Feature transformation: Apply same TF-IDF transformation used during training
- Probability scores: Get confidence for all classes, not just top prediction
- Error handling: Validate input, check model existence, handle exceptions
PHP Integration Layer:
- proc_open() for training: Captures both stdout (results) and stderr (progress logs)
- shell_exec() for prediction: Simpler API for quick predictions
- Timing measurements: Monitor performance to detect slowdowns
- Validation: Check inputs, validate JSON, handle errors gracefully
Performance characteristics:
- Training: ~0.5-2 seconds (one-time operation, can be offline)
- Prediction: ~40-50ms per text (includes Python startup ~30ms + inference ~10ms)
- Model size: ~50KB (vectorizer) + ~10KB (classifier) = tiny, loads instantly
Troubleshooting
Section titled “Troubleshooting”Error: “ModuleNotFoundError: No module named ‘sklearn’”
Install scikit-learn:
python3 -m pip install scikit-learn
# Or with specific versionpython3 -m pip install scikit-learn==1.5.0Error: “Model files not found”
Run training first to create model files:
php analyze.php # This runs training as Step 1Or train manually:
python3 train_model.py data/reviews.csvLow accuracy on real data
The example uses only 30 reviews. For production:
- Use 1000+ labeled examples minimum
- Balance classes (equal positive/negative/neutral)
- Use real product reviews from your domain
- Tune TF-IDF parameters (max_features, ngram_range)
- Try different classifiers (LinearSVC, RandomForest)
Predictions take too long (>100ms)
Optimize by:
- Keep Python process alive (use REST API strategy instead of shell)
- Batch predictions (process multiple texts at once)
- Cache common predictions (repeat queries)
- Use smaller models (reduce max_features in TF-IDF)
Memory issues during training
Reduce feature count:
vectorizer = TfidfVectorizer( max_features=500, # Reduced from 1000 min_df=3, # Require term in at least 3 documents)Step 4: REST API Integration (~20 min)
Section titled “Step 4: REST API Integration (~20 min)”Build a Python REST API using Flask that serves ML predictions, and a PHP client that calls it.
Actions
Section titled “Actions”- Create Flask ML API:
"""Flask REST API for sentiment analysis.
This approach is more scalable than shell execution:- Python process stays alive (no startup overhead)- Can handle concurrent requests- Standard HTTP protocol- Easy to deploy separately and scale horizontally"""
from flask import Flask, request, jsonifyfrom pathlib import Pathimport joblibimport sysimport os
# Add parent directory to path to import model loading logicsys.path.insert(0, os.path.join(os.path.dirname(__file__), '..', '03-sentiment-analysis'))
app = Flask(__name__)
# Load model at startup (not per request!)MODEL_DIR = '../03-sentiment-analysis/models'classifier = Nonevectorizer = None
def load_models(): """Load trained models into memory.""" global classifier, vectorizer
model_path = Path(MODEL_DIR) / 'sentiment_model.pkl' vectorizer_path = Path(MODEL_DIR) / 'vectorizer.pkl'
if not model_path.exists(): raise FileNotFoundError( f"Model not found at {model_path}. " "Run training first: cd ../03-sentiment-analysis && php analyze.php" )
classifier = joblib.load(model_path) vectorizer = joblib.load(vectorizer_path) print("✅ Models loaded successfully")
@app.route('/health', methods=['GET'])def health_check(): """Health check endpoint.""" return jsonify({ 'status': 'healthy', 'model_loaded': classifier is not None })
@app.route('/predict', methods=['POST'])def predict(): """Predict sentiment for given text.""" try: # Parse request data = request.get_json()
if not data or 'text' not in data: return jsonify({'error': 'Missing "text" field'}), 400
text = data['text']
if not text or not text.strip(): return jsonify({'error': 'Text cannot be empty'}), 400
# Transform and predict text_vec = vectorizer.transform([text]) prediction = classifier.predict(text_vec)[0] probabilities = classifier.predict_proba(text_vec)[0]
# Build response confidence_scores = { cls: float(prob) for cls, prob in zip(classifier.classes_, probabilities) }
return jsonify({ 'text': text, 'sentiment': prediction, 'confidence': float(probabilities.max()), 'all_scores': confidence_scores })
except Exception as e: return jsonify({'error': str(e)}), 500
@app.route('/predict/batch', methods=['POST'])def predict_batch(): """Predict sentiments for multiple texts.""" try: data = request.get_json()
if not data or 'texts' not in data: return jsonify({'error': 'Missing "texts" array'}), 400
texts = data['texts']
if not isinstance(texts, list): return jsonify({'error': '"texts" must be an array'}), 400
# Transform all texts at once (efficient) texts_vec = vectorizer.transform(texts) predictions = classifier.predict(texts_vec) probabilities = classifier.predict_proba(texts_vec)
# Build results results = [] for text, pred, probs in zip(texts, predictions, probabilities): results.append({ 'text': text, 'sentiment': pred, 'confidence': float(probs.max()) })
return jsonify({'predictions': results})
except Exception as e: return jsonify({'error': str(e)}), 500
if __name__ == '__main__': print("Starting Flask ML API...") load_models() app.run(host='127.0.0.1', port=5000, debug=True)- Create PHP client:
<?php
declare(strict_types=1);
/** * PHP client for Flask ML API. * * Advantages over shell execution: * - Lower latency (no process spawn) * - Better for high traffic * - Can use load balancing * - Standard HTTP protocol */
class MLApiClient{ public function __construct( private string $baseUrl = 'http://127.0.0.1:5000', private int $timeout = 30 ) {}
/** * Check if API is healthy and ready. */ public function healthCheck(): bool { try { $response = $this->get('/health'); return $response['status'] === 'healthy' && $response['model_loaded']; } catch (Exception $e) { return false; } }
/** * Predict sentiment for a single text. */ public function predict(string $text): array { return $this->post('/predict', ['text' => $text]); }
/** * Predict sentiments for multiple texts (efficient batch operation). */ public function predictBatch(array $texts): array { $response = $this->post('/predict/batch', ['texts' => $texts]); return $response['predictions']; }
/** * Make GET request to API. */ private function get(string $endpoint): array { $url = $this->baseUrl . $endpoint;
$ch = curl_init($url); curl_setopt($ch, CURLOPT_RETURNTRANSFER, true); curl_setopt($ch, CURLOPT_TIMEOUT, $this->timeout);
$response = curl_exec($ch); $httpCode = curl_getinfo($ch, CURLINFO_HTTP_CODE); $error = curl_error($ch); curl_close($ch);
if ($response === false) { throw new RuntimeException("API request failed: {$error}"); }
if ($httpCode !== 200) { throw new RuntimeException("API returned HTTP {$httpCode}"); }
return json_decode($response, true); }
/** * Make POST request to API. */ private function post(string $endpoint, array $data): array { $url = $this->baseUrl . $endpoint; $json = json_encode($data);
$ch = curl_init($url); curl_setopt($ch, CURLOPT_RETURNTRANSFER, true); curl_setopt($ch, CURLOPT_POST, true); curl_setopt($ch, CURLOPT_POSTFIELDS, $json); curl_setopt($ch, CURLOPT_HTTPHEADER, [ 'Content-Type: application/json', 'Content-Length: ' . strlen($json) ]); curl_setopt($ch, CURLOPT_TIMEOUT, $this->timeout);
$response = curl_exec($ch); $httpCode = curl_getinfo($ch, CURLINFO_HTTP_CODE); $error = curl_error($ch); curl_close($ch);
if ($response === false) { throw new RuntimeException("API request failed: {$error}"); }
$result = json_decode($response, true);
if ($httpCode !== 200) { $errorMsg = $result['error'] ?? 'Unknown error'; throw new RuntimeException("API error ({$httpCode}): {$errorMsg}"); }
return $result; }}
// Example usagetry { $client = new MLApiClient();
echo "=== Flask API Client Demo ===\n\n";
// Check API health echo "Checking API health... "; if (!$client->healthCheck()) { throw new RuntimeException( "API is not available. Start it with:\n" . " cd 04-rest-api-example\n" . " python3 flask_server.py" ); } echo "✅ API is healthy\n\n";
// Single prediction echo "=== Single Prediction ===\n"; $start = microtime(true); $result = $client->predict("This API is fantastic! Works perfectly."); $duration = (microtime(true) - $start) * 1000;
echo "Text: {$result['text']}\n"; echo "Sentiment: {$result['sentiment']} "; echo "(" . round($result['confidence'] * 100, 1) . "% confident)\n"; echo "Latency: " . round($duration, 2) . "ms\n\n";
// Batch prediction echo "=== Batch Prediction ===\n"; $texts = [ "Amazing product!", "Terrible experience.", "It's okay.", "Highly recommended!", "Not satisfied." ];
$start = microtime(true); $results = $client->predictBatch($texts); $duration = (microtime(true) - $start) * 1000;
foreach ($results as $result) { echo "• {$result['sentiment']}: \"{$result['text']}\"\n"; } echo "\nProcessed " . count($results) . " texts in " . round($duration, 2) . "ms\n"; echo "Average: " . round($duration / count($results), 2) . "ms per text\n\n";
echo "✅ API integration working!\n";
} catch (Exception $e) { echo "❌ Error: {$e->getMessage()}\n"; exit(1);}- Start the Flask server:
cd docs/series/ai-ml-php-developers/code/chapter-11/04-rest-api-example
# Install Flaskpython3 -m pip install flask
# Start server (runs in foreground)python3 flask_server.py- In a separate terminal, run PHP client:
cd docs/series/ai-ml-php-developers/code/chapter-11/04-rest-api-examplephp php_client.phpExpected Result
Section titled “Expected Result”Flask server output:
Starting Flask ML API...✅ Models loaded successfully * Running on http://127.0.0.1:5000PHP client output:
=== Flask API Client Demo ===
Checking API health... ✅ API is healthy
=== Single Prediction ===Text: This API is fantastic! Works perfectly.Sentiment: positive (99.5% confident)Latency: 12.34ms
=== Batch Prediction ===• positive: "Amazing product!"• negative: "Terrible experience."• neutral: "It's okay."• positive: "Highly recommended!"• negative: "Not satisfied."
Processed 5 texts in 45.67msAverage: 9.13ms per text
✅ API integration working!Why It Works
Section titled “Why It Works”REST API approach offers significant advantages:
Performance improvements:
- No process spawning: Python stays running, saving ~30ms per request
- Model loaded once: Model lives in memory, not reloaded per request
- Batch operations: Process multiple texts in single request efficiently
- Latency: ~10-15ms vs ~40-50ms for shell execution
Scalability benefits:
- Concurrent requests: Flask handles multiple PHP requests simultaneously
- Horizontal scaling: Run multiple API instances behind load balancer
- Independent deployment: Update Python service without touching PHP code
- Language agnostic: Any language can call HTTP API (Node.js, Go, Ruby, etc.)
Production patterns:
- Health checks: Monitor service availability
- Standard HTTP: Use existing infrastructure (reverse proxies, monitoring)
- Error codes: HTTP status codes indicate error types
- Versioning: Add
/v1/predictendpoints for API versioning
Trade-offs:
- Complexity: Must deploy and manage separate service
- Network overhead: HTTP adds ~1-5ms latency
- Port management: Need to configure ports and firewalls
- Service discovery: PHP needs to know API location
Troubleshooting
Section titled “Troubleshooting”Error: “Connection refused”
Flask server isn’t running. Start it:
python3 flask_server.pyVerify it’s listening:
curl http://127.0.0.1:5000/healthError: “Model not found”
Train the model first:
cd ../03-sentiment-analysisphp analyze.php # This trains the modelPort already in use
Change port in both files:
app.run(host='127.0.0.1', port=5001) # Changed from 5000$client = new MLApiClient(baseUrl: 'http://127.0.0.1:5001');Slow batch predictions
Flask development server is single-threaded. For production, use a production WSGI server:
# Install gunicornpython3 -m pip install gunicorn
# Run with 4 worker processesgunicorn -w 4 -b 127.0.0.1:5000 flask_server:appAPI crashes under load
Add error handling and request validation:
@app.errorhandler(Exception)def handle_error(e): return jsonify({'error': str(e)}), 500
# Add request size limitsapp.config['MAX_CONTENT_LENGTH'] = 16 * 1024 * 1024 # 16MB maxStep 5: Message Queue Pattern (~15 min)
Section titled “Step 5: Message Queue Pattern (~15 min)”Understand asynchronous processing using message queues for long-running ML tasks.
Actions
Section titled “Actions”- Create async queue example:
<?php
declare(strict_types=1);
/** * Message Queue Pattern for Async ML Processing * * Use this pattern when: * - ML task takes >5 seconds (model training, large predictions) * - User doesn't need immediate result * - High throughput is required * - Results can be delivered later (callback, polling, email) * * This example uses Redis as the message queue. * Production alternatives: RabbitMQ, AWS SQS, Google Pub/Sub */
class AsyncMLQueue{ private Redis $redis;
public function __construct( string $host = '127.0.0.1', int $port = 6379 ) { if (!extension_loaded('redis')) { throw new RuntimeException( 'Redis extension required. Install: pecl install redis' ); }
$this->redis = new Redis(); if (!$this->redis->connect($host, $port)) { throw new RuntimeException("Failed to connect to Redis at {$host}:{$port}"); } }
/** * Submit ML task to queue for async processing. */ public function submitTask(string $taskType, array $data, ?string $callbackUrl = null): string { $taskId = $this->generateTaskId();
$task = [ 'id' => $taskId, 'type' => $taskType, 'data' => $data, 'callback_url' => $callbackUrl, 'submitted_at' => time(), 'status' => 'pending' ];
// Add task to queue $this->redis->lPush('ml_tasks', json_encode($task));
// Store task metadata for status checking $this->redis->setex( "task:{$taskId}", 3600, // 1 hour TTL json_encode($task) );
return $taskId; }
/** * Check status of submitted task. */ public function getTaskStatus(string $taskId): ?array { $data = $this->redis->get("task:{$taskId}"); return $data ? json_decode($data, true) : null; }
/** * Get result of completed task. */ public function getTaskResult(string $taskId): ?array { $data = $this->redis->get("result:{$taskId}"); return $data ? json_decode($data, true) : null; }
/** * Worker: Process tasks from queue (this would run in Python). * * This is PHP pseudocode showing the worker pattern. * Real implementation would be in Python worker process. */ public function processTasksWorker(): void { echo "Worker started. Listening for tasks...\n";
while (true) { // Blocking pop with 1 second timeout $taskData = $this->redis->brPop(['ml_tasks'], 1);
if (!$taskData) { continue; // No task, keep waiting }
$task = json_decode($taskData[1], true); echo "Processing task {$task['id']} ({$task['type']})...\n";
try { // Update status to processing $task['status'] = 'processing'; $task['started_at'] = time(); $this->redis->setex("task:{$task['id']}", 3600, json_encode($task));
// Process task (call Python script, do ML work) $result = $this->executeMLTask($task);
// Store result $this->redis->setex( "result:{$task['id']}", 3600, json_encode($result) );
// Update task status $task['status'] = 'completed'; $task['completed_at'] = time(); $this->redis->setex("task:{$task['id']}", 3600, json_encode($task));
// Callback if URL provided if ($task['callback_url']) { $this->sendCallback($task['callback_url'], $result); }
echo "✅ Task {$task['id']} completed\n";
} catch (Exception $e) { echo "❌ Task {$task['id']} failed: {$e->getMessage()}\n";
$task['status'] = 'failed'; $task['error'] = $e->getMessage(); $this->redis->setex("task:{$task['id']}", 3600, json_encode($task)); } } }
private function executeMLTask(array $task): array { // In reality, this would call Python script or API // For demo, simulate processing sleep(2); // Simulate long ML task
return [ 'task_id' => $task['id'], 'result' => 'Task completed', 'processed_at' => time() ]; }
private function sendCallback(string $url, array $result): void { $ch = curl_init($url); curl_setopt($ch, CURLOPT_POST, true); curl_setopt($ch, CURLOPT_POSTFIELDS, json_encode($result)); curl_setopt($ch, CURLOPT_HTTPHEADER, ['Content-Type: application/json']); curl_setopt($ch, CURLOPT_RETURNTRANSFER, true); curl_exec($ch); curl_close($ch); }
private function generateTaskId(): string { return bin2hex(random_bytes(16)); }}
// Example usagetry { echo "=== Async ML Queue Pattern ===\n\n";
$queue = new AsyncMLQueue();
// Submit tasks echo "Submitting tasks...\n";
$taskId1 = $queue->submitTask('sentiment_analysis', [ 'text' => 'Analyze this review for sentiment' ]); echo "Task 1 submitted: {$taskId1}\n";
$taskId2 = $queue->submitTask('image_classification', [ 'image_url' => 'https://example.com/image.jpg' ]); echo "Task 2 submitted: {$taskId2}\n\n";
// Check status echo "Checking task status...\n"; $status1 = $queue->getTaskStatus($taskId1); echo "Task {$taskId1}: {$status1['status']}\n"; echo "Type: {$status1['type']}\n"; echo "Submitted: " . date('Y-m-d H:i:s', $status1['submitted_at']) . "\n\n";
echo "✅ Tasks queued for async processing\n\n";
echo "In production:\n"; echo " 1. Python workers continuously poll queue\n"; echo " 2. Workers process tasks and store results\n"; echo " 3. PHP checks results by task ID or receives callback\n"; echo " 4. Scale by adding more workers\n";
} catch (Exception $e) { echo "❌ Error: {$e->getMessage()}\n";
if (!extension_loaded('redis')) { echo "\nRedis extension not installed. This is normal for demo.\n"; echo "For production use, install Redis:\n"; echo " brew install redis # macOS\n"; echo " apt install redis-server # Ubuntu\n"; echo " pecl install redis # PHP extension\n"; }}- Create Python worker example:
"""Python worker that processes tasks from Redis queue.
This runs continuously in the background:1. Polls Redis queue for new tasks2. Executes ML tasks (prediction, training, etc.)3. Stores results back in Redis4. Sends callbacks if provided
Usage: python3 worker.py
Run multiple workers for parallel processing: python3 worker.py & python3 worker.py & python3 worker.py &"""
import redisimport jsonimport timeimport sysfrom pathlib import Path
# Add sentiment analysis to pathsys.path.insert(0, str(Path(__file__).parent.parent / '03-sentiment-analysis'))
try: import joblib SENTIMENT_MODEL = None SENTIMENT_VECTORIZER = None
# Try to load sentiment model model_path = Path(__file__).parent.parent / '03-sentiment-analysis' / 'models' / 'sentiment_model.pkl' if model_path.exists(): SENTIMENT_MODEL = joblib.load(model_path) vectorizer_path = model_path.parent / 'vectorizer.pkl' SENTIMENT_VECTORIZER = joblib.load(vectorizer_path) print("✅ Sentiment model loaded")except ImportError: print("⚠️ joblib not installed. Sentiment tasks will be simulated.")
def process_sentiment_analysis(data): """Process sentiment analysis task.""" text = data.get('text', '')
if SENTIMENT_MODEL and SENTIMENT_VECTORIZER: # Real prediction text_vec = SENTIMENT_VECTORIZER.transform([text]) prediction = SENTIMENT_MODEL.predict(text_vec)[0] probabilities = SENTIMENT_MODEL.predict_proba(text_vec)[0]
return { 'text': text, 'sentiment': prediction, 'confidence': float(probabilities.max()) } else: # Simulated prediction return { 'text': text, 'sentiment': 'positive', 'confidence': 0.85, 'note': 'Simulated (model not loaded)' }
def process_task(task): """Process a task based on its type.""" task_type = task['type'] data = task['data']
if task_type == 'sentiment_analysis': return process_sentiment_analysis(data) elif task_type == 'image_classification': # Simulated time.sleep(1) return {'label': 'cat', 'confidence': 0.92} else: raise ValueError(f"Unknown task type: {task_type}")
def main(): """Main worker loop.""" print("🔧 Starting ML Worker...")
# Connect to Redis try: r = redis.Redis(host='127.0.0.1', port=6379, decode_responses=True) r.ping() print("✅ Connected to Redis") except redis.ConnectionError: print("❌ Failed to connect to Redis") print(" Start Redis: brew services start redis # macOS") return
print("👂 Listening for tasks on queue 'ml_tasks'...\n")
while True: try: # Blocking pop with 1 second timeout task_data = r.brpop('ml_tasks', timeout=1)
if not task_data: continue # No task, keep waiting
task = json.loads(task_data[1]) task_id = task['id']
print(f"📥 Received task {task_id} ({task['type']})")
# Update status task['status'] = 'processing' task['started_at'] = int(time.time()) r.setex(f"task:{task_id}", 3600, json.dumps(task))
# Process task result = process_task(task)
# Store result r.setex(f"result:{task_id}", 3600, json.dumps(result))
# Update task status task['status'] = 'completed' task['completed_at'] = int(time.time()) r.setex(f"task:{task_id}", 3600, json.dumps(task))
print(f"✅ Completed task {task_id}\n")
except KeyboardInterrupt: print("\n🛑 Worker stopped") break except Exception as e: print(f"❌ Error processing task: {e}\n") if 'task_id' in locals(): task['status'] = 'failed' task['error'] = str(e) r.setex(f"task:{task_id}", 3600, json.dumps(task))
if __name__ == '__main__': main()Expected Result
Section titled “Expected Result”This pattern is for production async processing. To demo:
- Start Redis (if installed):
redis-server- Run PHP to submit tasks:
php async_queue.phpOutput:
=== Async ML Queue Pattern ===
Submitting tasks...Task 1 submitted: a3f2c91b4d8e...Task 2 submitted: 7e8f1c2a9b6d...
Checking task status...Task a3f2c91b4d8e...: pendingType: sentiment_analysisSubmitted: 2024-10-28 14:30:25
✅ Tasks queued for async processing- Run Python worker:
python3 worker.pyWorker output:
🔧 Starting ML Worker...✅ Connected to Redis✅ Sentiment model loaded👂 Listening for tasks on queue 'ml_tasks'...
📥 Received task a3f2c91b4d8e (sentiment_analysis)✅ Completed task a3f2c91b4d8e
📥 Received task 7e8f1c2a9b6d (image_classification)✅ Completed task 7e8f1c2a9b6dWhy It Works
Section titled “Why It Works”Message queue pattern provides enterprise-grade async processing:
Architecture:
PHP Web App → Redis Queue → Python Workers → Redis Results → PHP Web AppBenefits:
- Decoupled: Web requests don’t wait for ML processing
- Scalable: Add workers to handle more load
- Resilient: Tasks survive crashes, can be retried
- Flexible: Different worker types for different tasks
Use cases:
- Model training (minutes to hours)
- Batch predictions (thousands of items)
- Video/audio processing
- Report generation
- Data pipeline jobs
Workflow:
- Submit: PHP creates task and adds to queue
- Queue: Task waits in Redis list
- Process: Worker pops task and executes
- Store: Result saved to Redis with TTL
- Retrieve: PHP polls for result or receives callback
Troubleshooting
Section titled “Troubleshooting”Redis not installed
This is normal for demos. In production:
# macOSbrew install redisbrew services start redis
# Ubuntusudo apt install redis-serversudo systemctl start redis
# PHP extensionpecl install redisTasks not being processed
Ensure worker is running:
python3 worker.py & # Run in backgroundCheck queue has tasks:
redis-cli LLEN ml_tasksResults expiring too quickly
Increase TTL:
// Store for 24 hours instead of 1 hour$this->redis->setex("result:{$taskId}", 86400, json_encode($result));Need guaranteed delivery
Use RabbitMQ instead of Redis for:
- Message acknowledgments
- Persistent queues
- Dead letter queues
- Priority queues
Exercises
Section titled “Exercises”Exercise 1: Multi-Model Sentiment Analyzer
Section titled “Exercise 1: Multi-Model Sentiment Analyzer”Goal: Extend the sentiment analyzer to compare multiple classifiers and choose the best one.
Modify 03-sentiment-analysis/train_model.py to train three classifiers:
- Naive Bayes (current)
- Logistic Regression
- Support Vector Machine (LinearSVC)
Requirements:
- Train all three models on the same data
- Compare accuracy, precision, recall
- Save the best-performing model
- Update predict.py to use the best model
Validation: Your training script should output:
Naive Bayes Accuracy: 95.0%Logistic Regression Accuracy: 98.0%LinearSVC Accuracy: 97.5%
Best model: Logistic RegressionSaved to models/sentiment_model.pklHint: Import from sklearn:
from sklearn.naive_bayes import MultinomialNBfrom sklearn.linear_model import LogisticRegressionfrom sklearn.svm import LinearSVCExercise 2: Caching Layer
Section titled “Exercise 2: Caching Layer”Goal: Add intelligent caching to reduce redundant ML calls.
Create cache_layer.php that:
- Caches prediction results using the text hash as key
- Sets TTL (time-to-live) of 1 hour
- Falls back to ML prediction on cache miss
- Tracks cache hit rate
Requirements:
- Use file-based caching (or Redis if available)
- Hash function:
md5($text) - Cache structure:
['text' => ..., 'sentiment' => ..., 'cached_at' => ...] - Print cache stats (hits/misses/hit_rate)
Validation: Running the same texts twice should show:
First run: Cache misses: 5 ML predictions: 5
Second run: Cache hits: 5 ML predictions: 0 Cache hit rate: 100%Exercise 3: Health Monitoring
Section titled “Exercise 3: Health Monitoring”Goal: Build a health check system that monitors Python service availability.
Create health_monitor.php that:
- Checks if Python is installed and working
- Verifies sentiment model files exist
- Tests prediction with sample text
- Measures response time
- Reports overall system health
Requirements:
- Check Python version >= 3.10
- Check required packages (scikit-learn, pandas, joblib)
- Test prediction latency < 200ms
- Output: healthy/degraded/unhealthy status
Validation: Should produce:
=== System Health Check ===
✅ Python 3.11.4 detected✅ Required packages installed✅ Model files found✅ Test prediction successful (45ms)
Overall Status: HEALTHYWrap-up
Section titled “Wrap-up”Congratulations! You’ve mastered integrating PHP with Python for machine learning. Let’s review what you’ve accomplished:
✅ Basic Integration: Built shell-based communication between PHP and Python using JSON
✅ Data Exchange: Passed complex, nested data structures reliably between languages
✅ Production ML System: Created a complete sentiment analyzer with training and prediction
✅ REST API: Deployed a Flask API for scalable, low-latency predictions
✅ Async Processing: Learned message queue patterns for background ML tasks
✅ Security: Applied input validation, shell escaping, and error handling
✅ Performance: Understood trade-offs and measured latency for each approach
✅ Real-world Patterns: Gained production-ready code for hybrid PHP-Python applications
Key Takeaways
Section titled “Key Takeaways”Choose the right integration strategy:
- Shell execution for simple tasks, development, low traffic
- REST API for production, high traffic, real-time predictions
- Message queues for async, long-running, batch processing
Always consider:
- Security (escape inputs, validate data)
- Error handling (Python failures, network issues)
- Performance (latency, throughput, scalability)
- Monitoring (health checks, logging, metrics)
Best practices:
- Keep Python processes alive when possible (API > shell)
- Cache expensive predictions
- Batch operations for efficiency
- Use proper data serialization (JSON)
- Handle failures gracefully with fallbacks
Real-World Applications
Section titled “Real-World Applications”You can now build systems like:
- E-commerce: Sentiment analysis on product reviews
- Content platforms: Automatic content moderation with Python NLP
- Marketing: Customer segmentation using Python clustering
- Analytics: Time series forecasting with Python statsmodels
- Computer vision: Image tagging using Python OpenCV/TensorFlow
- Recommendations: Collaborative filtering with Python surprise library
Next Steps
Section titled “Next Steps”In Chapter 12: Deep Learning with TensorFlow and PHP, you’ll apply these integration techniques to use cutting-edge deep learning models. You’ll load pre-trained neural networks, run inference from PHP, and build an image classification API using TensorFlow.
The patterns you learned here—REST APIs, async processing, proper error handling—are the foundation for enterprise ML systems. You’re ready to build production applications that leverage the best of PHP and Python!
Further Reading
Section titled “Further Reading”Python ML Libraries:
- scikit-learn Documentation — Comprehensive ML library docs
- pandas Documentation — Data manipulation and analysis
- Flask Documentation — Web framework for APIs
- FastAPI — Modern, fast web framework (alternative to Flask)
Integration Patterns:
- PHP proc_open Documentation — Advanced process control
- PHP cURL Documentation — HTTP client
- Redis Documentation — In-memory data store for queues
Security:
- OWASP Command Injection — Preventing shell injection
- PHP escapeshellarg Documentation — Proper escaping
Production Deployment:
- Gunicorn — Python WSGI HTTP Server
- Supervisor — Process manager for workers
- Docker — Containerization for deployment
Advanced Topics:
- gRPC — High-performance RPC framework (alternative to REST)
- Apache Kafka — Distributed event streaming for large-scale queues
- Celery — Distributed task queue for Python