Skip to content

16: Computer Vision Essentials for PHP Developers

Computer Vision Essentials for PHP Developers

Chapter 16: Computer Vision Essentials for PHP Developers

Section titled “Chapter 16: Computer Vision Essentials for PHP Developers”

Computer vision is the field of teaching computers to “see” and understand images. While humans instantly recognize objects, faces, and scenes in photos, computers see only grids of numbers representing pixel colors. Transforming these numeric arrays into meaningful information—identifying a cat in a photo, detecting text in a document, or recognizing a person’s face—requires specialized techniques that bridge the gap between raw pixels and high-level understanding.

For PHP developers building modern web applications, computer vision capabilities are increasingly essential. Users upload profile photos that need cropping and resizing. Content moderation systems must identify inappropriate images. E-commerce sites want to enable visual search. Analytics dashboards visualize data with charts that need processing. All these scenarios require working with image data programmatically.

This chapter introduces the fundamentals of computer vision using PHP’s built-in capabilities. You’ll learn how images are represented as data, how to manipulate them (resizing, cropping, filtering), and—most importantly—how to extract numeric features that machine learning algorithms can understand. Unlike later chapters that will use pre-trained deep learning models, this chapter focuses on the foundational skills: working with pixels, color channels, and statistical features using PHP’s GD extension.

By mastering these essentials, you’ll be prepared to tackle image classification in Chapter 17, implement object detection in Chapter 18, and integrate computer vision features into production PHP applications. Whether you’re building a photo gallery, implementing visual search, or preparing images for machine learning, the techniques you learn here form the foundation of every computer vision project.

Before starting this chapter, you should have:

  • PHP 8.4+ installed and confirmed working with php --version
  • GD extension enabled (check with php -m | grep gd)
  • Completion of Chapter 15 or equivalent understanding of preprocessing data for ML
  • Understanding of PHP arrays, classes, and object-oriented programming
  • Basic familiarity with ML concepts from Chapter 3
  • A text editor or IDE with PHP support
  • Command line access for running examples

Estimated Time: ~85-100 minutes

Verify your setup:

Terminal window
# Check PHP version
php --version
# Should show PHP 8.4.x
# Check if GD extension is loaded
php -m | grep gd
# Should output: gd
# Check GD support
php -r "var_dump(gd_info());"
# Should show array with version and format support

If GD is not installed:

Terminal window
# Ubuntu/Debian
sudo apt-get install php8.4-gd
# macOS (Homebrew)
brew install php@8.4
# Windows: Enable extension=gd in php.ini

By the end of this chapter, you will have created:

  • An ImageLoader class supporting JPEG, PNG, GIF, and WEBP formats with error handling
  • An ImageProcessor class for resizing, cropping, rotating, and transforming images
  • A ColorConverter class for grayscale conversion, channel extraction, and color analysis
  • A FeatureExtractor class extracting numeric features (statistics, histograms, edge density)
  • An ImageFilter class applying blur, sharpen, edge detection, and artistic effects
  • An ImageAugmentor class generating training variations through data augmentation
  • 9 complete working examples demonstrating each computer vision technique including augmentation
  • 3 practical exercises with solutions: image analyzer, thumbnail generator, feature comparison
  • A comprehensive understanding of how to prepare images for machine learning
  • Reusable, production-ready code following PHP 8.4 best practices

All code examples are fully functional, tested, and include sample images you can run immediately.

::: info Code Examples Complete, runnable examples for this chapter:

Core Classes:

Step-by-Step Examples:

Supporting Files:

All files in docs/series/ai-ml-php-developers/code/chapter-16/ :::

Want to see image processing in action? Run this 5-minute example:

Terminal window
cd docs/series/ai-ml-php-developers/code/chapter-16
# Generate sample images
php generate-sample-images.php
# Verify your setup
php 02-check-extensions.php
# Try image analysis
php 01-image-representation.php

Expected output:

============================================================
Understanding Images as Data
============================================================
✓ GD extension is loaded
GD Version: bundled (2.1.0 compatible)
Supported formats:
JPEG: ✓
PNG: ✓
GIF: ✓
WEBP: ✓
Loading image: data/sample.jpg
Image Information:
Dimensions: 400 × 300 pixels
Total pixels: 120,000
Format: JPEG
MIME type: image/jpeg
Color depth: 8 bits
File size: 15.2 KB

This demonstrates that your environment is set up correctly and you can load and inspect images!

By completing this chapter, you will:

  • Understand how images are represented as 2D arrays of pixels with RGB color channels
  • Implement classes for loading, saving, and manipulating images in multiple formats
  • Master common image operations: resizing, cropping, rotating, and color conversions
  • Extract numeric features from images suitable for machine learning algorithms
  • Apply filters and effects for preprocessing and enhancement
  • Generate augmented training variations to expand limited datasets
  • Prepare standardized image datasets ready for training ML models
  • Build practical tools like thumbnail generators and image analyzers
  • Recognize when to use PHP for computer vision vs. when to integrate Python/OpenCV

Step 1: Understanding Images as Data (~10 min)

Section titled “Step 1: Understanding Images as Data (~10 min)”

Understand how images are represented as 2D grids of pixels, each with RGB color values, and how PHP’s GD extension provides access to this data.

  1. Run the first example to see image representation in action:
Terminal window
cd docs/series/ai-ml-php-developers/code/chapter-16
php 01-image-representation.php
  1. Observe the output showing:

    • Image dimensions (width × height)
    • Total number of pixels
    • File format and MIME type
    • Color depth (bits per channel)
    • Sample pixel colors from different regions
  2. Understanding the numbers: The example loads data/sample.jpg and samples pixels from various locations, showing their RGB values (0-255 for each channel).

============================================================
Understanding Images as Data
============================================================
✓ GD extension is loaded
GD Version: bundled (2.1.0 compatible)
...
Image Information:
Dimensions: 400 × 300 pixels
Total pixels: 120,000
Format: JPEG
File size: 15.2 KB
============================================================
Sampling Pixel Colors
============================================================
Top-left region (x:50, y:50):
Red channel: 255 (100.0%)
Green channel: 0 (0.0%)
Blue channel: 0 (0.0%)
Center (x:200, y:150):
Red channel: 0 (0.0%)
Green channel: 255 (100.0%)
Blue channel: 0 (0.0%)
...

Images are 2D grids where each pixel stores color information. For RGB images, each pixel has three values (Red, Green, Blue), each ranging from 0 to 255. A 400×300 pixel image contains 120,000 pixels, which means 360,000 individual color values (120,000 × 3 channels).

PHP’s GD extension represents images as GdImage objects (PHP 8.0+). The imagecolorat() function retrieves a pixel’s color index, and imagecolorsforindex() converts that to RGB values. This low-level access lets you analyze and manipulate every pixel programmatically.

The memory footprint is significant: uncompressed, our 400×300 image requires ~469 KB (400 × 300 × 4 bytes per pixel in RGBA). JPEG compression reduces this to ~15 KB on disk—a 31:1 compression ratio.

“Call to undefined function imagecolorat()”

  • GD extension is not loaded. Run php -m | grep gd to verify. Install with sudo apt-get install php8.4-gd (Linux) or enable in php.ini (Windows).

“Image file not found”

  • Run php generate-sample-images.php first to create the sample images in the data/ directory.

Memory errors with large images

  • Large images consume significant memory. A 4000×3000 pixel image needs ~46 MB uncompressed. Increase memory_limit in php.ini or use ini_set('memory_limit', '256M').

Step 2: Setting Up PHP Image Processing (~10 min)

Section titled “Step 2: Setting Up PHP Image Processing (~10 min)”

Verify that your PHP installation has everything needed for image processing and understand what capabilities are available.

  1. Run the setup verification script:
Terminal window
php 02-check-extensions.php
  1. Review the output to confirm:

    • PHP 8.4+ is installed
    • GD extension is loaded
    • All required image formats are supported (JPEG, PNG, GIF, WEBP)
    • Memory limit is adequate
    • Sample images exist
    • Output directory is writable
  2. If any checks fail, follow the installation instructions provided by the script.

============================================================
PHP Image Processing Setup Check
============================================================
PHP Version Check:
Current: 8.4.10
Required: 8.4.0+
Status: ✓ OK
============================================================
GD Extension Check
============================================================
✓ GD extension is installed
GD Library Information:
Version: bundled (2.1.0 compatible)
FreeType Support: ✓
Image Format Support:
JPEG: ✓ Yes
PNG: ✓ Yes
GIF: ✓ Yes
WEBP: ✓ Yes
BMP: ✓ Yes
...
============================================================
Setup Summary
============================================================
🎉 Your environment is ready for Chapter 16!
You can now:
✓ Load and save images in multiple formats
✓ Process images with GD library
✓ Run all chapter examples

The GD extension is PHP’s built-in image manipulation library. Most PHP installations include GD by default, but it’s compiled as a shared module that must be enabled. The gd_info() function returns an array describing GD’s capabilities, including supported formats.

Format support depends on how GD was compiled:

  • JPEG support requires libjpeg
  • PNG support requires libpng and zlib
  • WEBP support requires libwebp (PHP 7.0+)
  • GIF support is usually built-in

The script also checks PHP’s memory_limit setting. Image processing is memory-intensive: loading a 10 MP image requires ~38 MB (10,000,000 pixels × 4 bytes). A limit of 128M+ is recommended for typical use.

GD extension not found

Ubuntu/Debian:

Terminal window
sudo apt-get install php8.4-gd
sudo systemctl restart php8.4-fpm # If using PHP-FPM

macOS (Homebrew):

Terminal window
brew install php@8.4
# GD is included by default

Windows:

  1. Open php.ini
  2. Find ;extension=gd
  3. Remove the semicolon: extension=gd
  4. Restart web server

WEBP support missing

WEBP requires GD 2.0+ and libwebp. On older systems, you may need to compile PHP from source or use a PPA/repository with modern libraries.

Memory limit warnings

Edit php.ini:

memory_limit = 256M

Or set per-script:

ini_set('memory_limit', '256M');

Step 3: Loading and Saving Images (~8 min)

Section titled “Step 3: Loading and Saving Images (~8 min)”

Learn to load images from files, save them in different formats with quality settings, and understand format tradeoffs.

  1. Run the load/save example:
Terminal window
php 03-load-save-images.php
  1. Examine the ImageLoader class in ImageLoader.php:
# filename: ImageLoader.php (excerpt)
public function load(string $filepath): \GdImage
{
if (!file_exists($filepath)) {
throw new \RuntimeException("Image file not found: {$filepath}");
}
$imageInfo = getimagesize($filepath);
if ($imageInfo === false) {
throw new \RuntimeException("Invalid image file: {$filepath}");
}
[$width, $height, $type] = $imageInfo;
$image = match ($type) {
IMAGETYPE_JPEG => imagecreatefromjpeg($filepath),
IMAGETYPE_PNG => imagecreatefrompng($filepath),
IMAGETYPE_GIF => imagecreatefromgif($filepath),
IMAGETYPE_WEBP => imagecreatefromwebp($filepath),
default => throw new \RuntimeException("Unsupported image type")
};
return $image;
}
  1. Notice the pattern: getimagesize() identifies the format, then the appropriate imagecreatefrom*() function loads it. The match expression (PHP 8.0+) makes this elegant and type-safe.

  2. Check the output directory to see saved files in various formats and quality levels.

============================================================
Loading and Saving Images
============================================================
Loading: data/landscape.jpg
Image loading completed in 3.45 ms
Image Information:
width : 600
height : 400
type : JPEG
mime : image/jpeg
...
============================================================
Saving in Multiple Formats
============================================================
Saving JPEG (quality 100) completed in 12.34 ms
Size: 89.4 KB
✓ Saved: output/landscape_q100.jpg
Saving JPEG (quality 75) completed in 10.12 ms
Size: 45.2 KB
✓ Saved: output/landscape_q75.jpg
...
============================================================
Format Comparison
============================================================
Original JPEG: 65.3 KB (100%)
JPEG Q75: 45.2 KB (69%)
PNG: 234.5 KB (359%)
WEBP: 38.7 KB (59%)

GD provides separate functions for each format because they have different requirements:

  • imagejpeg($image, $filename, $quality): Quality 0-100, lossy compression
  • imagepng($image, $filename, $compression): Compression 0-9 (inverted: 9 = max), lossless
  • imagegif($image, $filename): No quality parameter, lossless for ≤256 colors
  • imagewebp($image, $filename, $quality): Quality 0-100, lossy or lossless

The save() method in ImageLoader uses file extension to determine the output format, making it easy to convert between formats:

$loader->save($image, 'output/photo.webp', 80); // Converts to WEBP

Format choice matters:

  • JPEG: Best for photos, lossy, small file sizes, no transparency
  • PNG: Best for graphics/logos, lossless, supports transparency, larger files
  • WEBP: Modern format, smaller than JPEG/PNG, supports transparency, not universally supported
  • GIF: Animated images, limited to 256 colors, large for photos

“Failed to create image from file”

  • File may be corrupted. Try opening in an image viewer first.
  • Format may not be supported. Run 02-check-extensions.php to verify format support.

“Unsupported save format”

  • Check file extension. Only .jpg, .jpeg, .png, .gif, .webp are supported.
  • Ensure the format is supported by your GD installation.

Saved images are blank or corrupted

  • Verify you call imagedestroy($image) AFTER saving, not before.
  • Check file permissions on output directory: chmod 755 output/
  • Ensure output directory exists: mkdir -p output

PNG quality seems backwards

  • PNG compression is inverted: 0 = no compression (large), 9 = maximum compression (small, slower). The save() method handles this conversion automatically.

Step 4: Basic Image Manipulations (~10 min)

Section titled “Step 4: Basic Image Manipulations (~10 min)”

Master resizing, cropping, rotating, and other transformations essential for preparing images for ML and web display.

  1. Run the manipulations example:
Terminal window
php 04-image-manipulations.php

This demonstrates 7 different manipulation techniques and saves each result.

  1. Study the ImageProcessor class in ImageProcessor.php. Key methods:
# filename: ImageProcessor.php (excerpt)
public function resize(
\GdImage $image,
int $newWidth,
int $newHeight,
bool $maintainAspectRatio = true
): \GdImage {
$originalWidth = imagesx($image);
$originalHeight = imagesy($image);
if ($maintainAspectRatio) {
$aspectRatio = $originalWidth / $originalHeight;
if ($newWidth / $newHeight > $aspectRatio) {
$newWidth = (int)($newHeight * $aspectRatio);
} else {
$newHeight = (int)($newWidth / $aspectRatio);
}
}
$resized = imagecreatetruecolor($newWidth, $newHeight);
imagecopyresampled(
$resized, $image,
0, 0, 0, 0,
$newWidth, $newHeight,
$originalWidth, $originalHeight
);
return $resized;
}
  1. Understand imagecopyresampled(): This is GD’s high-quality resizing function. It performs bilinear interpolation, producing smooth results by averaging surrounding pixels. The simpler imagecopyresized() uses nearest-neighbor sampling (faster but lower quality).

  2. Try chaining operations:

$processor = new ImageProcessor();
$result = $processor->resize($image, 800, 600, true);
$result = $processor->cropCenter($result, 400, 400);
$result = $processor->rotate($result, 90);
============================================================
Basic Image Manipulations
============================================================
Original image: 600×400
============================================================
1. Resizing (Maintaining Aspect Ratio)
============================================================
Target size: 300×200
Actual size: 300×200
(Aspect ratio maintained)
✓ Saved: output/landscape_resized.jpg
...
============================================================
4. Rotating
============================================================
Rotated 45°: 707×707
✓ Saved: output/landscape_rotated_45.jpg
Rotated 90°: 400×600
✓ Saved: output/landscape_rotated_90.jpg
Rotated 180°: 600×400
✓ Saved: output/landscape_rotated_180.jpg
...
============================================================
7. Chaining Operations
============================================================
Creating a profile picture: resize → crop center → save
✓ Saved: output/landscape_profile.jpg (300×300 square)

Image manipulations create new GdImage objects rather than modifying the original. This preserves the source and lets you create multiple variants.

Resizing uses imagecopyresampled() which performs pixel interpolation. When reducing an image from 600×400 to 300×200, each new pixel averages 4 original pixels (2×2 grid). This antialiasing prevents jagged edges.

Aspect ratio preservation calculates which dimension (width or height) should be adjusted to maintain the original proportions. For a 600×400 image (1.5:1 aspect ratio) fitted to 300×200:

  • Target aspect: 300/200 = 1.5
  • Original aspect: 600/400 = 1.5
  • They match, so no adjustment needed
  • If target was 300×300 (1:1), we’d resize to 300×200 to maintain 1.5:1

Cropping extracts a rectangle from the source image using imagecopy(). Center cropping calculates the starting position: x = (originalWidth - cropWidth) / 2.

Rotation uses imagerotate($image, $degrees, $bgColor). Positive degrees rotate counterclockwise. The canvas expands to fit the rotated image (45° rotation increases dimensions to fit the diagonal).

These operations are fundamental for:

  • Responsive images: Create multiple sizes for different devices
  • Thumbnails: Generate preview images for galleries
  • ML preprocessing: Standardize input dimensions
  • User uploads: Normalize user-submitted content
  • Storage optimization: Reduce file sizes

Resized images are blurry

  • This is expected when significantly upscaling (enlarging). You cannot add detail that wasn’t in the original.
  • For small upscaling (10-20%), the blur is minimal
  • Use higher quality source images when possible
  • Consider using imagescale() for simpler cases (PHP 5.5+)

Rotated images have black corners

  • Default background is black. Specify a custom color:
    $bgColor = imagecolorallocate($image, 255, 255, 255); // White
    $rotated = imagerotate($image, 45, $bgColor);
  • For transparency: imagecolorallocatealpha($image, 255, 255, 255, 127)

Images lose quality after multiple operations

  • Each JPEG save loses quality (lossy compression)
  • Perform all operations first, then save once
  • Use PNG during intermediate steps if quality is critical

Memory errors with large images

  • Each image operation creates a new image in memory
  • Destroy intermediate images: imagedestroy($tempImage)
  • Process images at appropriate sizes (resize large images first)

Convert images between color spaces (RGB, grayscale), extract individual color channels, and analyze color properties—essential preprocessing steps for many ML algorithms.

  1. Run the color conversions example:
Terminal window
php 05-color-conversions.php
  1. Understand why grayscale matters for ML: Converting RGB (3 channels) to grayscale (1 channel) reduces input dimensions by 66%, speeding up training and reducing model complexity without significant information loss for many tasks.

  2. Study the ColorConverter class in ColorConverter.php:

# filename: ColorConverter.php (excerpt)
public function toGrayscaleLuminosity(\GdImage $image): \GdImage
{
$width = imagesx($image);
$height = imagesy($image);
$gray = imagecreatetruecolor($width, $height);
for ($y = 0; $y < $height; $y++) {
for ($x = 0; $x < $width; $x++) {
$rgb = imagecolorat($image, $x, $y);
$colors = imagecolorsforindex($image, $rgb);
// Luminosity formula: weighted RGB values
$luminosity = (int)(
$colors['red'] * 0.299 +
$colors['green'] * 0.587 +
$colors['blue'] * 0.114
);
$grayColor = imagecolorallocate($gray, $luminosity, $luminosity, $luminosity);
imagesetpixel($gray, $x, $y, $grayColor);
}
}
return $gray;
}
  1. Compare grayscale methods: The luminosity method (above) weighs green more heavily than red or blue, matching human eye sensitivity. The simpler IMG_FILTER_GRAYSCALE averages all channels equally.
============================================================
Color Space Conversions
============================================================
Loading: data/sample.jpg
============================================================
1. Grayscale Conversion
============================================================
Converting to grayscale...
✓ Saved: output/sample_grayscale.jpg
✓ Saved: output/sample_grayscale_luminosity.jpg (luminosity method)
Grayscale is useful for:
• Reducing data dimensions (3 channels → 1)
• Simplifying ML models
• Edge detection preprocessing
============================================================
2. Color Channel Extraction
============================================================
✓ Saved: output/sample_red_channel.jpg
✓ Saved: output/sample_green_channel.jpg
✓ Saved: output/sample_blue_channel.jpg
Channel extraction reveals:
• Which colors dominate the image
• Color distribution patterns
============================================================
3. Average Color Analysis
============================================================
Average Color: RGB(127, 95, 89)
This represents the overall color tone of the image.
...
============================================================
6. Color Comparison Across Images
============================================================
sample.jpg:
Average: RGB(127, 95, 89)
Dominant: RGB(255, 0, 0)
landscape.jpg:
Average: RGB(87, 138, 89)
Dominant: RGB(34, 139, 34)
face.jpg:
Average: RGB(198, 175, 152)
Dominant: RGB(255, 220, 177)

Grayscale conversion discards color information while retaining luminance (brightness). The luminosity method uses weights (0.299R + 0.587G + 0.114B) based on human perception research—our eyes are more sensitive to green than red, and more to red than blue.

For ML, grayscale is advantageous when color isn’t meaningful (e.g., handwritten digit recognition, text OCR). A 28×28 RGB image has 2,352 features (28×28×3); grayscale reduces this to 784 (28×28×1), requiring 1/3 the memory and training time.

Channel extraction isolates each color component. This reveals:

  • Dominant channel: Which color predominates (green in landscapes, red/yellow in skin tones)
  • Channel patterns: Sky is high blue, vegetation is high green
  • Feature engineering: Use channel histograms or statistics as ML features

Color analysis extracts statistical features:

  • Average color: Mean RGB across all pixels (overall tone)
  • Dominant color: Most frequent color after quantization (primary hue)
  • Color distance: Euclidean distance in RGB space measures similarity

These metrics enable:

  • Image categorization: Landscapes (green), sunsets (orange), underwater (blue)
  • Duplicate detection: Similar average colors suggest similar content
  • Color-based search: Find images matching a color palette

Grayscale conversion is slow for large images

  • The luminosity method iterates every pixel (nested loops)
  • For 4000×3000 images, that’s 12 million iterations
  • Use the faster imagefilter($image, IMG_FILTER_GRAYSCALE) if luminosity weighting isn’t critical
  • Or resize images before conversion

Colors look different in extracted channels

  • This is expected: red channel shows only red intensity (displayed as grayscale)
  • Pure red (255, 0, 0) appears white in red channel, black in green/blue
  • These images are for visualization; actual data is just the intensity values

Average color doesn’t match what I see

  • Average color is mathematical mean, not perceptual dominance
  • A few bright pixels can skew the average
  • Use getDominantColor() for the most frequent color instead

Memory errors during color analysis

  • Color analysis loads full images into memory
  • For large batches, resize images first: $processor->resize($image, 800, 600)
  • Process images sequentially, destroying each before loading the next

Step 6: Extracting Image Features (~12 min)

Section titled “Step 6: Extracting Image Features (~12 min)”

Extract numeric features from images (statistics, histograms, edge density) that machine learning algorithms can use for classification and analysis.

  1. Run the feature extraction example:
Terminal window
php 06-feature-extraction.php
  1. Study the FeatureExtractor class in FeatureExtractor.php:
# filename: FeatureExtractor.php (excerpt)
public function extractBasicFeatures(\GdImage $image): array
{
$width = imagesx($image);
$height = imagesy($image);
$stats = $this->calculateColorStatistics($image);
return [
'width' => $width,
'height' => $height,
'aspect_ratio' => $width / $height,
'avg_red' => $stats['avg_red'],
'avg_green' => $stats['avg_green'],
'avg_blue' => $stats['avg_blue'],
'avg_brightness' => $stats['avg_brightness'],
'std_red' => $stats['std_red'],
'std_green' => $stats['std_green'],
'std_blue' => $stats['std_blue'],
];
}
  1. Understand the two feature types:

    • Statistical features: 11 values summarizing the image (fast, interpretable)
    • Raw pixel features: Every pixel as a feature (1024+ values for 32×32, detailed but large)
  2. Try feature extraction on your own images by modifying the image paths in the script.

============================================================
Extracting Image Features for ML
============================================================
============================================================
1. Basic Statistical Features
============================================================
Analyzing sample.jpg:
width : 400
height : 300
aspect_ratio : 1.3333
avg_red : 127.35
avg_green : 95.21
avg_blue : 89.47
avg_brightness : 104.01
std_red : 72.14
std_green : 68.93
std_blue : 61.28
...
============================================================
4. Edge Density Analysis
============================================================
Sample : 0.287 (28.7% edge pixels)
Landscape : 0.156 (15.6% edge pixels)
Face : 0.092 (9.2% edge pixels)
Edge density indicates:
• Higher values = more complex/detailed images
• Lower values = smooth/simple images
============================================================
5. Flattening Images to Feature Vectors
============================================================
Original image dimensions: 300×300
Flattened to 32x32 (grayscale): 1024 features
Expected: 1024 features ✓
Sample values: [0.800, 0.863, 0.855, ..., 0.612]
Flattened to 16x16 (grayscale): 256 features
Expected: 256 features ✓
Sample values: [0.798, 0.862, 0.854, ..., 0.614]
Flattened to 16x16 (RGB): 768 features
Expected: 768 features ✓
Sample values: [1.000, 0.863, 0.867, ..., 0.671]
Flattened vectors can be used as input to ML algorithms:
• Each pixel becomes a feature
• Values normalized to 0-1 range
• Smaller sizes = faster training
• Grayscale = 1/3 the features

Feature extraction transforms images (2D pixel grids) into numeric vectors that ML algorithms can process. Two approaches exist:

1. Statistical Features (Compact Representation)

Extract summary statistics describing the image:

  • Dimensions: Width, height, aspect ratio
  • Color statistics: Mean and standard deviation of each channel
  • Brightness: Overall luminance
  • Edge density: Proportion of edge pixels (texture measure)

Advantages:

  • Small: 11 values regardless of image size
  • Fast: Constant time extraction
  • Interpretable: Each feature has clear meaning
  • Effective: Sufficient for many classification tasks

2. Raw Pixel Features (Detailed Representation)

Flatten the entire image into a 1D array:

32×32 grayscale image → 1024 features (32 × 32 × 1)
32×32 RGB image → 3072 features (32 × 32 × 3)

The flattenToVector() method:

  1. Resizes image to target dimensions (e.g., 32×32)
  2. Converts to grayscale (optional)
  3. Normalizes pixel values to 0-1 range
  4. Concatenates all pixels into a 1D array

Advantages:

  • Detailed: Preserves spatial information
  • Flexible: Works with neural networks
  • No feature engineering: Let the model learn patterns

Trade-offs:

  • Large: 1024+ features vs. 11
  • Slower: More data to process
  • Less interpretable: Features are individual pixels

Edge Density measures image complexity using edge detection. Edges represent boundaries between regions (object edges, texture patterns). High edge density suggests:

  • Complex scenes (urban environments, forests)
  • High texture (fabric, terrain)
  • Multiple objects

Low edge density suggests:

  • Simple scenes (clear sky, smooth surfaces)
  • Portraits (smooth skin tones)
  • Minimal texture

ML use cases for features:

  • K-Nearest Neighbors: Distance-based on feature vectors
  • Decision Trees/Random Forests: Split on feature thresholds
  • SVM: Find separating hyperplane in feature space
  • Neural Networks: Learn features automatically from raw pixels

Feature extraction is very slow

  • Large images take longer to process
  • Use sampling: the extractor samples every nth pixel for images >100K pixels
  • Resize images first: $processor->resize($image, 640, 480)

Standard deviation is zero or very low

  • Image has uniform color (solid color background)
  • This is valid—low std means low variation
  • Consider this as information: smooth images vs. textured images

Flattened vectors are too large for my algorithm

  • Reduce target size: flattenToVector($image, 16, 16) → 256 features
  • Use grayscale instead of RGB: 768 → 256 features
  • Use statistical features instead: 3072 → 11 features
  • Apply dimensionality reduction (PCA) after extraction

Histograms have uneven distributions

  • This is normal and informative!
  • Outdoor images: high in mid-range greens
  • Indoor images: peaks at specific lighting levels
  • High-contrast images: peaks at extremes (0 and 255)
  • Histogram shape itself is useful data for classification

Step 7: Image Filters and Effects (~10 min)

Section titled “Step 7: Image Filters and Effects (~10 min)”

Apply filters (blur, sharpen, edge detection) and effects (sepia, sketch) for preprocessing, enhancement, and feature extraction.

  1. Run the filters example:
Terminal window
php 07-image-filters.php

This applies 20+ different filters and saves the results.

  1. Study the ImageFilter class in ImageFilter.php:
# filename: ImageFilter.php (excerpt)
public function edgeDetect(\GdImage $image): \GdImage
{
imagefilter($image, IMG_FILTER_EDGEDETECT);
return $image;
}
public function convolution(\GdImage $image, array $matrix, float $divisor = 1, float $offset = 0): \GdImage
{
imageconvolution($image, $matrix, $divisor, $offset);
return $image;
}
  1. Understand convolution: Filters work by applying a convolution matrix (kernel) to each pixel. The kernel is a small matrix (typically 3×3) that defines how surrounding pixels influence the result.

Edge detection kernel:

[-1, -1, -1]
[-1, 8, -1]
[-1, -1, -1]

This emphasizes pixels that differ from their neighbors (edges).

  1. Try creating custom filters using the convolution() method with your own matrices.
============================================================
Image Filters and Effects
============================================================
============================================================
1. Blur Effects
============================================================
✓ Saved: output/landscape_blur_1.jpg (1 blur passes)
✓ Saved: output/landscape_blur_3.jpg (3 blur passes)
✓ Saved: output/landscape_blur_5.jpg (5 blur passes)
✓ Saved: output/landscape_selective_blur.jpg (preserves edges better)
Blur is useful for:
• Noise reduction
• Background effects
• Privacy (blurring faces/plates)
============================================================
2. Sharpening
============================================================
✓ Saved: output/landscape_sharpen.jpg (built-in sharpen)
✓ Saved: output/landscape_sharpen_custom.jpg (custom sharpen)
============================================================
3. Edge Detection
============================================================
✓ Saved: output/landscape_edges.jpg
✓ Saved: output/landscape_edge_enhance.jpg (enhanced, not isolated)
Edge detection is crucial for:
• Object detection preprocessing
• Feature extraction
• Shape recognition
...
============================================================
7. Custom Convolution Filters
============================================================
✓ Saved: output/landscape_custom_edge.jpg (custom edge matrix)
Convolution allows custom filters for:
• Edge detection variants
• Custom sharpening/blurring
• Special effects

Image filters modify pixel values based on their neighbors using convolution. For each pixel, the filter:

  1. Extracts a 3×3 neighborhood of surrounding pixels
  2. Multiplies each neighbor by the corresponding kernel value
  3. Sums the results and divides by the divisor
  4. Adds the offset and sets the new pixel value

Example: Gaussian blur kernel (simplified):

[1, 2, 1]
[2, 4, 2]
[1, 2, 1]
Divisor: 16 (sum of all values)

This averages each pixel with its neighbors, smoothing the image.

Built-in filters (via imagefilter()) include:

  • IMG_FILTER_GAUSSIAN_BLUR: Reduces noise and detail
  • IMG_FILTER_EDGEDETECT: Highlights edges, used in CV pipelines
  • IMG_FILTER_MEAN_REMOVAL: Sharpens (emphasizes differences)
  • IMG_FILTER_EMBOSS: Creates 3D-like appearance
  • IMG_FILTER_GRAYSCALE: Converts to grayscale
  • IMG_FILTER_BRIGHTNESS: Adjusts overall lightness
  • IMG_FILTER_CONTRAST: Adjusts light/dark differences

ML preprocessing uses filters for:

  • Noise reduction: Blur before feature extraction
  • Edge detection: Detect shapes and boundaries
  • Data augmentation: Create training variations (rotated, blurred, sharpened)
  • Normalization: Standardize image appearance

Filters make images look worse

  • Some filters are destructive (edge detection discards most information)
  • Save filtered versions separately; don’t overwrite originals
  • Multiple blur passes compound (3 passes ≈ 1 strong blur)

Custom convolution has weird artifacts

  • Kernel values must be balanced
  • Divisor should usually equal sum of positive values
  • Negative values can cause unexpected results (use carefully)
  • Test kernels on small regions first

Edge detection output is mostly black

  • This is correct! Most pixels aren’t edges
  • The white pixels show detected edges
  • Invert with imagefilter($image, IMG_FILTER_NEGATE) if needed

Filters are slow on large images

  • Convolution is O(width × height × kernel_size²)
  • Resize images first if appropriate
  • Use simpler filters (Gaussian blur vs. multiple selective blurs)

Step 8: Preparing Images for Machine Learning (~12 min)

Section titled “Step 8: Preparing Images for Machine Learning (~12 min)”

Create a complete preprocessing pipeline that standardizes images, extracts features, normalizes values, and outputs ML-ready datasets.

  1. Run the ML preparation example:
Terminal window
php 08-ml-preparation.php
  1. Understand the preprocessing pipeline:
# filename: 08-ml-preparation.php (excerpt)
function preprocessImageForML(
string $imagePath,
ImageLoader $loader,
ImageProcessor $processor,
ColorConverter $converter,
FeatureExtractor $extractor,
int $targetSize = 32,
bool $useStatistical = false
): array {
// 1. Load image
$img = $loader->load($imagePath);
// 2. Standardize size
$img = $processor->resize($img, $targetSize, $targetSize, false);
// 3. Convert to grayscale
$img = $converter->toGrayscaleLuminosity($img);
// 4. Extract features
if ($useStatistical) {
$features = $extractor->extractAllFeatures($img);
} else {
$features = $extractor->flattenToVector($img, $targetSize, $targetSize, true);
}
imagedestroy($img);
return $features;
}
  1. Try both feature extraction methods (statistical vs. raw pixels) and understand when to use each.

  2. Create your own preprocessing pipeline for a specific use case (face detection, object classification, etc.).

============================================================
Preparing Images for Machine Learning
============================================================
============================================================
1. Standardizing Image Dimensions
============================================================
ML models require consistent input dimensions.
Let's standardize all images to 128×128:
sample.jpg : 400× 300 → 128×128
landscape.jpg : 600× 400 → 128×128
face.jpg : 300× 300 → 128×128
Benefits of standardization:
• All images have same dimensions
• Fixed input size for neural networks
• Batch processing becomes possible
• Consistent feature vector length
============================================================
2. Converting to Grayscale
============================================================
✓ sample.jpg → grayscale
✓ landscape.jpg → grayscale
✓ face.jpg → grayscale
Data reduction: 128×128×3 = 49152 values → 128×128×1 = 16384 values
That's 66% less data!
============================================================
3. Flattening to Feature Vectors
============================================================
Converting images to 1D vectors for ML algorithms:
sample.jpg : 1024 features [0.804, 0.863, ..., 0.612]
landscape.jpg : 1024 features [0.337, 0.459, ..., 0.576]
face.jpg : 1024 features [0.765, 0.678, ..., 0.596]
These vectors can be used with:
• K-Nearest Neighbors (KNN)
• Support Vector Machines (SVM)
• Neural Networks
...
============================================================
6. Feature Normalization
============================================================
Before normalization (sample.jpg):
[400.0, 300.0, 1.3, 127.4, 95.2, ...]
After normalization (sample.jpg):
[0.667, 0.500, 0.500, 0.500, 0.374, ...]
Why normalize?
• Features on same scale (0-1)
• Prevents large values from dominating
• Improves ML algorithm convergence
• Required for many algorithms (SVM, neural networks)

ML preprocessing transforms raw images into standardized, numeric representations that algorithms can process. The pipeline consists of:

1. Standardization

  • All images resized to same dimensions (e.g., 128×128)
  • Enables batch processing (process multiple images together)
  • Fixed input size required by neural networks
  • Consistent feature vector length for distance-based algorithms

2. Color Reduction

  • Convert RGB to grayscale (3 channels → 1 channel)
  • Reduces dimensionality by 66%
  • Faster training, lower memory usage
  • Sufficient when color isn’t discriminative (digits, text, shapes)

3. Feature Extraction

  • Raw pixels: Flatten 32×32 grayscale = 1024 features
  • Statistical: Extract 11 summary features
  • Choice depends on algorithm and task complexity

4. Normalization

  • Scale all features to same range (typically 0-1)
  • Formula: normalized = (value - min) / (max - min)
  • Prevents features with large values (like width=1920) from dominating small values (like average brightness=0.5)
  • Essential for:
    • Gradient descent (neural networks)
    • Distance metrics (KNN, SVM)
    • Regularization effectiveness

5. Dataset Format

$dataset = [
['features' => [0.5, 0.7, ...], 'label' => 'cat'],
['features' => [0.2, 0.9, ...], 'label' => 'dog'],
// ...
];

Complete ML workflow:

  1. Collect images with labels
  2. Preprocess using pipeline
  3. Split into train/test sets (80%/20%)
  4. Train model on training set
  5. Evaluate on test set
  6. Deploy for predictions on new images

When to use each feature type:

Feature TypeBest ForAdvantagesDisadvantages
Statistical (11 features)Simple classification, quick prototypingFast, interpretable, smallLimited detail
Raw Pixels (1024+ features)Complex patterns, neural networksDetailed, spatial infoLarger, slower

Normalization causes all features to become 0 or 1

  • Features have no variation (all same value)
  • Check if images are identical or very similar
  • Verify feature extraction is working correctly
  • Some features naturally have low range (edge density 0-0.3)

Different images produce identical feature vectors

  • Images may be too similar after preprocessing
  • Increase target size (32×32 → 64×64) for more detail
  • Use raw pixels instead of statistical features
  • Check if grayscale conversion removes discriminative color info

Model training is very slow

  • Large feature vectors (use smaller resize dimensions)
  • Too many features (use statistical instead of raw pixels)
  • No normalization (causes slow convergence)
  • Try dimensionality reduction (PCA)

Poor classification accuracy

  • Images may need more preprocessing:
    • Increase contrast
    • Apply edge detection
    • Extract different features (color histograms)
  • Target size may be too small (information loss)
  • Labels may be incorrect or inconsistent
  • Need more training data

Step 9: Image Augmentation for ML Training (~10 min)

Section titled “Step 9: Image Augmentation for ML Training (~10 min)”

Learn how to generate multiple variations of training images to expand limited datasets and improve model generalization through data augmentation.

An ImageAugmentor class that creates realistic variations of images through random transformations, significantly expanding training datasets without collecting new images.

When training machine learning models from scratch, you often face the cold start problem: not enough data. Collecting thousands of labeled images is expensive and time-consuming. Data augmentation solves this by generating new training examples from existing ones.

Real-world impact:

  • Turn 100 images into 1,000+ training samples
  • Teach models that a cat is still a cat when flipped, rotated, or in different lighting
  • Reduce overfitting (when model memorizes training data instead of learning patterns)
  • Improve model robustness to real-world variations

When to use augmentation:

✓ Training models from scratch with limited data
✓ Fine-tuning pre-trained models on custom datasets
✓ Creating robust models for varied real-world conditions
✓ Balancing imbalanced datasets (generate more examples of rare classes)

When NOT to use:

✗ Inference/prediction (always use original images)
✗ Testing/validation sets (need unmodified data to measure true performance)
✗ When augmented variations aren’t realistic (e.g., don’t flip text images)
✗ Tasks where orientation is critical (medical scans, document OCR)

Augmentation techniques simulate real-world variations you’d encounter naturally:

1. Geometric Transformations

  • Flips: Horizontal (common), vertical (rare in natural photos)
  • Rotations: 90°, 180°, 270° (preserve content), or small random angles (±15°)
  • Crops: Random sections of image (simulates different viewing distances)
  • Zoom: Scale in/out (simulates different focal lengths)

2. Color/Brightness Adjustments

  • Brightness: Simulate different lighting conditions (±20-40 levels)
  • Contrast: Enhance or reduce differences between light/dark areas
  • Saturation: Adjust color intensity
  • Hue shifts: Slight color variations (use sparingly)

3. Filtering Effects

  • Blur: Simulate motion or focus issues
  • Noise: Add random pixel variations (sensor noise)
  • Sharpening: Enhance edges slightly

Strategy considerations:

TechniqueUse ForAvoid For
Horizontal flipMost natural objects, scenesText, asymmetric objects
Vertical flipAbstract patternsMost natural scenes (rare)
90° rotationsObjects without natural “up”Faces, buildings, landscapes
BrightnessOutdoor scenes, lightingMedical images (diagnostic)
Random cropsObject detection, zoomFull-scene classification

Let’s build a complete augmentation pipeline:

09-data-augmentation.php
<?php
declare(strict_types=1);
require_once __DIR__ . '/helpers.php';
require_once __DIR__ . '/ImageLoader.php';
require_once __DIR__ . '/ImageProcessor.php';
require_once __DIR__ . '/ColorConverter.php';
require_once __DIR__ . '/ImageAugmentor.php';
$loader = new ImageLoader();
$processor = new ImageProcessor();
$converter = new ColorConverter();
$augmentor = new ImageAugmentor($processor, $converter);
// Load original image
$original = $loader->load(__DIR__ . '/data/face.jpg');
// 1. Generate standard augmentation set (reproducible)
$standardSet = $augmentor->generateStandardSet($original);
// Returns: original, flip_horizontal, flip_vertical, rotate_90,
// rotate_180, brightness_+30, brightness_-30, contrast_high
echo "Generated " . count($standardSet) . " standard variations\n";
foreach ($standardSet as $name => $augmentedImage) {
$loader->save($augmentedImage, "output/augmented/face_{$name}.jpg");
imagedestroy($augmentedImage);
}
// 2. Generate random augmentations with custom config
$config = [
'flip' => true,
'rotate' => true,
'rotation_angles' => [0, 15, 30, 45, 90, 180, 270],
'rotate_probability' => 70, // 70% chance
'brightness' => true,
'brightness_range' => ['min' => -40, 'max' => 40],
'brightness_probability' => 60,
'contrast' => true,
'contrast_range' => ['min' => -25, 'max' => 25],
'contrast_probability' => 60,
'crop' => true,
'crop_scale_range' => ['min' => 0.8, 'max' => 1.0], // 80-100%
'crop_probability' => 50,
'zoom' => true,
'zoom_range' => ['min' => 0.9, 'max' => 1.2], // 90-120%
'zoom_probability' => 40,
];
// Generate 10 random augmented variations
$randomAugmented = $augmentor->augment($original, 10, $config);
foreach ($randomAugmented as $i => $augmentedImage) {
$filename = sprintf("face_random_%02d.jpg", $i + 1);
$loader->save($augmentedImage, "output/augmented/{$filename}");
imagedestroy($augmentedImage);
}
echo "Generated 10 random augmented variations\n";
// 3. Batch processing multiple images
$allImages = glob(__DIR__ . '/data/*.jpg');
$totalGenerated = 0;
foreach ($allImages as $imagePath) {
$img = $loader->load($imagePath);
$augmented = $augmentor->augment($img, 5); // 5 per image
$basename = basename($imagePath, '.jpg');
foreach ($augmented as $i => $augImg) {
$filename = sprintf("%s_aug_%02d.jpg", $basename, $i + 1);
$loader->save($augImg, "output/batch/{$filename}");
imagedestroy($augImg);
$totalGenerated++;
}
imagedestroy($img);
}
echo "Batch processed " . count($allImages) . " images\n";
echo "Generated {$totalGenerated} augmented training samples\n";
imagedestroy($original);

Run the example:

Terminal window
php code/chapter-16/09-data-augmentation.php

Expected output:

═══════════════════════════════════════════
Image Augmentation for ML Training
═══════════════════════════════════════════
Loading: /path/to/face.jpg
Original image: 400×400
───────────────────────────────────────────
1. Standard Augmentation Set
───────────────────────────────────────────
Generating standard augmentation variations...
Standard set generation: 0.12s
Generated 8 variations:
✓ face_original.jpg (400×400)
✓ face_flip_horizontal.jpg (400×400)
✓ face_flip_vertical.jpg (400×400)
✓ face_rotate_90.jpg (400×400)
✓ face_rotate_180.jpg (400×400)
✓ face_brightness_+30.jpg (400×400)
✓ face_brightness_-30.jpg (400×400)
✓ face_contrast_high.jpg (400×400)
Standard augmentations include:
• Original (baseline)
• Horizontal and vertical flips
• 90° and 180° rotations
• Brightness adjustments (±30)
• Contrast variations
───────────────────────────────────────────
2. Random Augmentation Pipeline
───────────────────────────────────────────
Generating 10 random augmented variations...
Random augmentation (10 images): 0.28s
Saving random augmentations...
✓ face_random_01.jpg
✓ face_random_02.jpg
✓ face_random_03.jpg
...
───────────────────────────────────────────
3. Dataset Expansion Example
───────────────────────────────────────────
Original dataset: 1 image
After augmentation: 18 images
Expansion factor: 18x
Benefits of augmentation:
• Increases effective dataset size
• Teaches model to recognize objects from different angles
• Reduces overfitting by introducing variations
• Improves model generalization
• Makes models robust to lighting changes

1. ImageAugmentor Class

final class ImageAugmentor
{
public function augment(
\GdImage $image,
int $count = 5,
array $config = []
): array {
$augmented = [];
for ($i = 0; $i < $count; $i++) {
$variant = $this->cloneImage($image);
// Apply random transformations based on probability
if ($config['flip'] && rand(0, 1)) {
$variant = $this->randomFlip($variant);
}
if ($config['rotate'] &&
rand(0, 100) < $config['rotate_probability']) {
$variant = $this->randomRotation(
$variant,
$config['rotation_angles']
);
}
// ... more transformations
$augmented[] = $variant;
}
return $augmented;
}
}

2. Random Transformations

Each transformation applies with a specified probability, creating natural variation:

// 50% chance of flip, 70% chance of rotation, etc.
if (rand(0, 100) < $probability) {
$image = applyTransformation($image);
}

3. Preserving Image Quality

Important: Clone images before transforming to preserve original:

private function cloneImage(\GdImage $image): \GdImage
{
$width = imagesx($image);
$height = imagesy($image);
$clone = imagecreatetruecolor($width, $height);
// Preserve transparency
imagealphablending($clone, false);
imagesavealpha($clone, true);
imagecopy($clone, $image, 0, 0, 0, 0, $width, $height);
return $clone;
}

Data augmentation is effective because:

1. Simulates Real-World Variations

  • Photos taken from different angles → rotations
  • Different camera settings → brightness/contrast
  • Various distances → crops and zoom
  • Different lighting conditions → brightness adjustments

Models trained on augmented data perform better on real-world images because they’ve “seen” similar variations during training.

2. Regularization Effect

Augmentation acts as a form of regularization:

  • Model can’t memorize augmented images (they’re different each epoch)
  • Forces learning of robust features that work across variations
  • Reduces overfitting without reducing model capacity
  • Similar effect to dropout but at the data level

3. Class Balance

If you have 1,000 cat images but only 100 dog images:

  • Augment dogs 10x → 1,000 dog variations
  • Balanced training prevents model bias toward cats
  • Better performance on minority classes

4. Computational Efficiency

Strategy A: Collect more data

  • Expensive (time, cost, labeling effort)
  • May require months to gather
  • Limited by availability

Strategy B: Augment existing data

  • Free (computation cost only)
  • Instant dataset expansion
  • Under your control

Example: Training with augmentation

// Training loop with on-the-fly augmentation
$epochs = 10;
$augmentations_per_image = 5;
for ($epoch = 0; $epoch < $epochs; $epoch++) {
foreach ($trainingImages as $image) {
// Generate augmented variations each epoch
$augmented = $augmentor->augment($image, $augmentations_per_image);
foreach ($augmented as $augImg) {
$features = extractFeatures($augImg);
$model->trainOnSample($features, $image['label']);
}
}
}
// Result: Model trained on original_count × augmentations_per_image × epochs
// distinct variations, learning robust patterns

Conservative (Medical, Scientific Images)

$conservativeConfig = [
'flip' => true, // Only horizontal
'rotate' => false, // Orientation matters
'brightness' => true,
'brightness_range' => ['min' => -10, 'max' => 10], // Minimal
'brightness_probability' => 30,
'contrast' => false, // Diagnostic info
'crop' => false, // Need full image
'zoom' => false,
];

Aggressive (Object Detection, Robust Recognition)

$aggressiveConfig = [
'flip' => true,
'rotate' => true,
'rotation_angles' => range(0, 360, 15), // Every 15 degrees
'rotate_probability' => 90,
'brightness' => true,
'brightness_range' => ['min' => -50, 'max' => 50],
'brightness_probability' => 80,
'contrast' => true,
'contrast_range' => ['min' => -30, 'max' => 30],
'contrast_probability' => 80,
'crop' => true,
'crop_probability' => 70,
'zoom' => true,
'zoom_probability' => 60,
];

Balanced (General Purpose)

$balancedConfig = [
'flip' => true,
'rotate' => true,
'rotation_angles' => [0, 90, 180, 270],
'rotate_probability' => 50,
'brightness' => true,
'brightness_range' => ['min' => -30, 'max' => 30],
'brightness_probability' => 50,
'contrast' => true,
'contrast_range' => ['min' => -20, 'max' => 20],
'contrast_probability' => 50,
'crop' => true,
'crop_probability' => 40,
'zoom' => false,
];

1. Apply During Training Only

// ✓ CORRECT
$trainingImages = augment($originalTrainingSet);
$testImages = $originalTestSet; // NO augmentation
// ✗ WRONG
$trainingImages = augment($originalTrainingSet);
$testImages = augment($originalTestSet); // Don't augment test data!

2. Keep Validation/Test Sets Pristine

  • Augmentation artificially improves metrics if applied to test data
  • Test on unmodified images to measure real-world performance
  • Split data BEFORE augmentation: train (80%) → augment, test (20%) → no augmentation

3. Save or Generate On-The-Fly?

Pre-generated (save augmented images):

✓ Faster training (no computation during training)
✓ Reproducible experiments
✗ More disk space (10x the images)
✗ Fixed variations (same each epoch)

On-the-fly (generate during training):

✓ Less disk space (store only originals)
✓ Different variations each epoch (better regularization)
✗ Slower training (augmentation overhead)
✗ Requires careful random seed management

Recommendation: Pre-generate for small datasets (<10K images), on-the-fly for large datasets.

4. Match Augmentation to Domain

DomainAugmentation StrategyAvoid
Natural photosAll techniques (flips, rotate, lighting)Extreme rotations (>30°)
Text/DocumentsSlight rotation (±5°), brightnessFlips, large rotations
Medical scansMinimal (brightness only)Most transformations
Satellite imageryRotations, flips, zoomColor adjustments
Product photosLighting, zoom, slight rotationVertical flips
FacesFlips, lighting, slight rotation (±15°)Vertical flips, extreme rotation

5. Validate Augmentations Visually

Always inspect augmented images to ensure they look realistic:

// Save first few augmented samples for manual review
$samples = $augmentor->augment($image, 5);
foreach ($samples as $i => $sample) {
$loader->save($sample, "review/sample_{$i}.jpg");
}
// Open review/ directory and check if augmentations look natural

Augmented images look unrealistic

  • Reduce transformation intensity (smaller brightness range, fewer rotation angles)
  • Lower probabilities (apply transformations less frequently)
  • Disable problematic transformations (e.g., vertical flips for natural scenes)
  • Check if rotations are creating black borders (increase fill color)

Model accuracy doesn’t improve with augmentation

  • You may already have sufficient data
  • Augmentations may not match real-world variations your model encounters
  • Try different augmentation strategies (e.g., focus on lighting if that’s the main variation)
  • Ensure test set isn’t augmented (would hide problems)
  • Model may be underfitting (too simple for data complexity)

Training takes much longer

  • Reduce augmentations per image (10 → 5)
  • Pre-generate augmented dataset instead of on-the-fly
  • Disable expensive transformations (complex rotations, large crops)
  • Consider using GPU for augmentation if available

Memory errors during batch augmentation

  • Process images one at a time instead of loading all
  • Destroy images immediately after saving: imagedestroy($augmentedImage)
  • Reduce output quality (JPEG quality 85 instead of 95)
  • Check PHP memory_limit (increase if needed)

Augmented images have quality loss

  • Use PNG for lossless storage instead of JPEG
  • Increase JPEG quality parameter (85-95)
  • Minimize cascading transformations (each operation degrades quality)
  • Apply all transformations in one pass when possible

Test your understanding with these practical exercises. Solutions are in the solutions/ directory.

Goal: Create a comprehensive image analysis tool that extracts and displays all available information about an image.

Create a command-line script called image-analyzer.php that:

Requirements:

  1. Accepts an image filepath as a command-line argument
  2. Loads the image and displays:
    • Basic information (dimensions, format, file size)
    • Color analysis (average color, dominant color)
    • Statistical features (mean and std dev of each channel)
    • Edge density (texture complexity)
    • Color histogram for each channel
    • Brightness analysis and classification
    • Color balance assessment
  3. Provides an assessment of image type (landscape, portrait, abstract, etc.) based on features
  4. Suggests preprocessing steps for ML based on image characteristics
  5. Handles errors gracefully (file not found, unsupported format)

Validation: Test with all three sample images:

Terminal window
php image-analyzer.php data/landscape.jpg
php image-analyzer.php data/face.jpg
php image-analyzer.php data/sample.jpg

Expected output should include all metrics and a summary assessment.

Solution: solutions/exercise1-image-analyzer.php

Goal: Build a production-ready thumbnail generator that creates multiple sizes while maintaining aspect ratio.

Create thumbnail-generator.php that:

Requirements:

  1. Accepts an image path and optional output directory
  2. Generates thumbnails in these sizes:
    • Small: 150×150 (maintaining aspect ratio)
    • Medium: 300×300 (maintaining aspect ratio)
    • Large: 600×600 (maintaining aspect ratio)
    • Square Small: 100×100 (center crop)
    • Square Medium: 250×250 (center crop)
  3. Saves thumbnails with descriptive filenames (e.g., landscape_small.jpg)
  4. Displays file sizes and compression ratios
  5. Optionally generates an HTML preview page showing all thumbnails
  6. Reports processing time and space savings

Validation: Run with a sample image:

Terminal window
php thumbnail-generator.php data/landscape.jpg output/

Verify that 5 thumbnail files are created with correct dimensions.

Solution: solutions/exercise2-thumbnail-generator.php

Goal: Compare two images by extracting and analyzing their features to calculate a similarity score.

Create feature-comparison.php that:

Requirements:

  1. Accepts two image paths as command-line arguments
  2. Extracts features from both images:
    • Dimensions and aspect ratio
    • Average and dominant colors
    • Color histograms
    • Edge density
    • Statistical features
  3. Calculates similarity scores for each feature category:
    • Aspect ratio similarity
    • Color similarity (Euclidean distance in RGB space)
    • Histogram correlation
    • Texture similarity
  4. Computes an overall weighted similarity score
  5. Provides interpretation (very similar, somewhat similar, different)
  6. Suggests use cases based on similarity level

Validation: Compare similar and dissimilar images:

Terminal window
# Compare different images (should be low similarity)
php feature-comparison.php data/sample.jpg data/landscape.jpg
# Compare resized versions of same image (should be high similarity)
php feature-comparison.php data/sample.jpg output/sample_resized.jpg

Solution: solutions/exercise3-feature-comparison.php

Build a tool that processes an entire directory of images, applying standardized preprocessing:

Requirements:

  • Scan directory for image files
  • Resize all to standard dimensions (e.g., 1920×1080)
  • Optimize quality (JPEG 85%)
  • Convert to consistent format (WEBP)
  • Generate thumbnails automatically
  • Create a manifest file with metadata
  • Show progress and statistics

This exercise combines all chapter concepts into a practical production tool!

Common issues and solutions for image processing in PHP:

Problem: “Call to undefined function imagecreatefromjpeg()”

Cause: GD extension is not installed or not enabled.

Solution:

Ubuntu/Debian:

Terminal window
sudo apt-get update
sudo apt-get install php8.4-gd
sudo systemctl restart apache2 # or php8.4-fpm

macOS (Homebrew):

Terminal window
brew install php@8.4 # Includes GD
brew services restart php@8.4

Windows:

  1. Edit php.ini
  2. Uncomment: extension=gd
  3. Restart web server

Verify:

Terminal window
php -m | grep gd
php -r "var_dump(gd_info());"

Problem: “Allowed memory size exhausted”

Cause: Image processing is memory-intensive. A 4000×3000 image requires ~46 MB uncompressed in memory.

Solution:

Temporary (per-script):

ini_set('memory_limit', '256M');

Permanent (php.ini):

memory_limit = 256M

Best practices:

  • Resize images before processing: smaller dimensions = less memory
  • Destroy images when done: imagedestroy($image)
  • Process images sequentially, not all at once
  • Use streaming for very large files

Problem: Images become blurry or lose quality after processing

Cause: Multiple JPEG saves compound lossy compression; low quality settings; excessive resizing.

Solution:

  • Use PNG for intermediate steps (lossless)
  • Save JPEG once at the end
  • Use quality 85-95 for final saves
  • Don’t upscale significantly (adds no detail)
// Good: Save once with high quality
$processor->resize($image, 800, 600);
$converter->toGrayscale($image);
$loader->save($image, 'output.jpg', 90);
// Bad: Multiple saves lose quality
$loader->save($image, 'temp.jpg', 75);
$image = $loader->load('temp.jpg');
$loader->save($image, 'final.jpg', 75);

Problem: “Failed to create image from file” or “Unsupported image type”

Cause: File corrupted, wrong extension, unsupported format.

Solution:

  1. Verify file is valid: open in image viewer
  2. Check format support: php -r "var_dump(gd_info());"
  3. Convert if needed: convert input.tiff output.jpg (ImageMagick)
  4. Validate with getimagesize():
$info = @getimagesize($filepath);
if ($info === false) {
throw new \RuntimeException("Not a valid image file");
}

Problem: “Failed to save image” or “Permission denied”

Cause: Output directory doesn’t exist or isn’t writable.

Solution:

Terminal window
# Create directory with proper permissions
mkdir -p output
chmod 755 output
# Or in PHP
if (!is_dir('output')) {
mkdir('output', 0755, true);
}
if (!is_writable('output')) {
throw new \RuntimeException('Output directory not writable');
}

Problem: PNG images with transparency show black background after processing

Cause: Need to preserve alpha channel.

Solution:

// Enable alpha blending and save alpha channel
imagealphablending($newImage, false);
imagesavealpha($newImage, true);
// Then copy/process the image
imagecopyresampled(/* ... */);

Problem: Image processing is very slow

Solutions:

  • Resize first: Process smaller dimensions when possible
  • Sample pixels: For analysis, sample every nth pixel
  • Batch efficiently: Reuse objects, avoid repeated initialization
  • Use appropriate formats: JPEG faster than PNG for photos
  • Consider alternatives: For heavy CV, integrate Python/OpenCV
// Efficient: Resize before expensive operations
$small = $processor->resize($image, 640, 480);
$features = $extractor->extractAllFeatures($small);
// Inefficient: Extract from full resolution
$features = $extractor->extractAllFeatures($image); // Slow!

Congratulations! You’ve mastered the fundamentals of computer vision in PHP. Let’s recap what you’ve accomplished:

✓ Core Concepts

  • Understood images as 2D arrays of RGB pixels
  • Learned how PHP’s GD extension provides low-level image access
  • Recognized the memory and performance implications of image processing

✓ Image Manipulation

  • Loaded and saved images in multiple formats (JPEG, PNG, GIF, WEBP)
  • Resized images while maintaining or ignoring aspect ratios
  • Cropped, rotated, flipped, and scaled images
  • Applied filters for blur, sharpen, and edge detection
  • Created artistic effects (sepia, sketch, emboss)

✓ Color Processing

  • Converted images to grayscale using luminosity weighting
  • Extracted individual color channels (R, G, B)
  • Analyzed average and dominant colors
  • Adjusted brightness and contrast
  • Understood why grayscale reduces ML complexity

✓ Data Augmentation

  • Generated training variations through random transformations
  • Applied flips, rotations, brightness, contrast, crops, and zoom
  • Expanded limited datasets (1 image → 18+ variations)
  • Learned when to use augmentation (training) vs. when not (testing)
  • Implemented conservative vs. aggressive augmentation strategies

✓ Feature Extraction

  • Extracted statistical features (mean, std dev, aspect ratio)
  • Calculated edge density as a texture measure
  • Generated color histograms showing distribution
  • Flattened images to feature vectors for ML algorithms
  • Compared statistical vs. raw pixel features

✓ ML Preparation

  • Standardized image dimensions for consistent input
  • Normalized feature values to 0-1 range
  • Created complete preprocessing pipelines
  • Built ML-ready datasets with features and labels
  • Understood when to use each feature type

✓ Practical Skills

  • Built an image analyzer extracting comprehensive metrics
  • Created a thumbnail generator with multiple sizes
  • Implemented feature comparison for similarity detection
  • Learned to debug common GD extension issues
  • Recognized PHP’s capabilities and limitations for CV

You can now build:

Content Management

  • User photo uploads with automatic resizing
  • Thumbnail generation for galleries
  • Image optimization for web delivery
  • Automatic format conversion

Machine Learning Preparation

  • Image preprocessing pipelines
  • Feature extraction for classification
  • Dataset standardization
  • Batch processing for training data

Image Analysis

  • Color-based categorization
  • Duplicate image detection
  • Similarity search
  • Quality assessment

Content Moderation

  • Blur detection (blurry images)
  • Brightness analysis (over/underexposed)
  • Dimension validation
  • Format verification

Chapter 17 builds on these foundations:

  • Use pre-trained deep learning models for classification
  • Implement object detection in images
  • Integrate cloud vision APIs
  • Build production CV pipelines

But the skills you’ve learned here—loading images, extracting features, preprocessing for ML—underpin everything in computer vision. Whether you use PHP’s GD, integrate Python/OpenCV, or call cloud APIs, understanding pixels, channels, and features is essential.

1. Images are just data: 2D grids of numbers that you can analyze, transform, and extract information from programmatically.

2. Preprocessing matters: Standardizing dimensions, converting to grayscale, and normalizing values significantly impacts ML performance.

3. Feature engineering is powerful: Even simple statistical features (11 values) can enable effective classification without deep learning.

4. PHP has limitations: For heavy computer vision (real-time object detection, complex CNNs), integrate Python/OpenCV or use cloud services. But PHP excels at web-oriented image tasks.

5. Understand tradeoffs: Memory vs. quality, speed vs. accuracy, statistical vs. raw features, PHP vs. Python—choose based on requirements.

You’re now ready to tackle image classification, apply pre-trained models, and build intelligent image-processing features into your PHP applications!

  • Deep Learning for CV: Use Chapter 17’s pre-trained models
  • OpenCV with PHP: FFI or subprocess integration patterns
  • Cloud Vision APIs: Google Cloud Vision, AWS Rekognition, Azure Computer Vision
  • Real-time Processing: Video frame analysis, webcam integration
  • ImageMagick: More powerful than GD, CLI and PHP extension available
  • GraphicsMagick: Fork of ImageMagick, optimized for batch processing
  • VIPS: High-performance image processing library
  • WebP tools: cwebp, dwebp for WEBP conversion