# ADR-046: Image Processing Pipeline

**Date**: 2026-01-09

**Status**: Accepted

**Implemented**: 2026-01-09

## Context

The application handles significant image processing for flyer uploads:

1. **Privacy Protection**: Strip EXIF metadata (location, device info).
2. **Optimization**: Resize, compress, and convert images for web delivery.
3. **Icon Generation**: Create thumbnails for listing views.
4. **Format Support**: Handle JPEG, PNG, WebP, and PDF inputs.
5. **Storage Management**: Organize processed images on disk.

These operations must be:

- **Performant**: Large images should not block the request.
- **Secure**: Prevent malicious file uploads.
- **Consistent**: Produce predictable output quality.
- **Testable**: Support unit testing without real files.

## Decision

We will implement a modular image processing pipeline using:

1. **Sharp**: For image resizing, compression, and format conversion.
2. **EXIF Parsing**: For metadata extraction and stripping.
3. **UUID Naming**: For unique, non-guessable file names.
4. **Directory Structure**: Organized storage for originals and derivatives.

### Design Principles

- **Pipeline Pattern**: Chain processing steps in a predictable order.
- **Fail-Fast Validation**: Reject invalid files before processing.
- **Idempotent Operations**: Same input produces same output.
- **Resource Cleanup**: Delete temp files on error.
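
The design principles above can be sketched as a small generic runner. This is a minimal illustration, not the project's implementation: `runPipeline`, `PipelineStep`, and the `validate` and `cleanup` callbacks are hypothetical names introduced here. It shows validation running before any step (fail-fast), steps chained in order (pipeline pattern), and cleanup invoked when anything throws (resource cleanup).

```typescript
// Hypothetical types and names, for illustration only.
type PipelineStep<T> = {
  name: string;
  run: (input: T) => Promise<T>;
};

export async function runPipeline<T>(
  input: T,
  validate: (input: T) => string | null, // error message, or null when valid
  steps: PipelineStep<T>[],
  cleanup: (input: T) => Promise<void>,
): Promise<T> {
  try {
    // Fail fast: reject invalid input before any expensive step runs.
    const validationError = validate(input);
    if (validationError !== null) {
      throw new Error(`Validation failed: ${validationError}`);
    }

    // Pipeline pattern: each step receives the previous step's output.
    let current = input;
    for (const step of steps) {
      current = await step.run(current);
    }
    return current;
  } catch (error) {
    // Resource cleanup: release the original input (for example a temp
    // upload), then rethrow the error that caused the failure.
    await cleanup(input);
    throw error;
  }
}
```

In this sketch `T` would typically be a file path, each step would wrap one operation such as resizing or icon generation, and `cleanup` would delete the temporary upload. Cleanup runs only on failure here; deleting the upload after a successful run is left to the caller.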
## Implementation Details

### Image Processor Module

Located in `src/utils/imageProcessor.ts`:

```typescript
import sharp from 'sharp';
import path from 'path';
import { v4 as uuidv4 } from 'uuid';
import fs from 'fs/promises';
import type { Logger } from 'pino';

// ============================================
// CONFIGURATION
// ============================================

const IMAGE_CONFIG = {
  maxWidth: 2048,
  maxHeight: 2048,
  quality: 85,
  iconSize: 200,
  allowedFormats: ['jpeg', 'png', 'webp', 'avif'],
  outputFormat: 'webp' as const,
};

// ============================================
// MAIN PROCESSING FUNCTION
// ============================================

export async function processAndSaveImage(
  inputPath: string,
  outputDir: string,
  originalFileName: string,
  logger: Logger,
): Promise<string> {
  const outputFileName = `${uuidv4()}.${IMAGE_CONFIG.outputFormat}`;
  const outputPath = path.join(outputDir, outputFileName);

  logger.info({ inputPath, outputPath }, 'Processing image');

  try {
    // sharp omits EXIF and other metadata from its output unless
    // withMetadata() is called, so re-encoding strips it.
    await sharp(inputPath)
      .rotate() // Auto-rotate based on EXIF orientation
      .resize(IMAGE_CONFIG.maxWidth, IMAGE_CONFIG.maxHeight, {
        fit: 'inside',
        withoutEnlargement: true,
      })
      .webp({ quality: IMAGE_CONFIG.quality })
      .toFile(outputPath);

    logger.info({ outputPath }, 'Image processed successfully');
    return outputFileName;
  } catch (error) {
    logger.error({ error, inputPath }, 'Image processing failed');
    throw error;
  }
}
```

### Icon Generation

```typescript
export async function generateFlyerIcon(
  inputPath: string,
  iconsDir: string,
  logger: Logger,
): Promise<string> {
  // Ensure icons directory exists
  await fs.mkdir(iconsDir, { recursive: true });

  const iconFileName = `${uuidv4()}-icon.webp`;
  const iconPath = path.join(iconsDir, iconFileName);

  logger.info({ inputPath, iconPath }, 'Generating icon');

  await sharp(inputPath)
    .resize(IMAGE_CONFIG.iconSize, IMAGE_CONFIG.iconSize, {
      fit: 'cover',
      position: 'top', // Flyers usually have store name at top
    })
    .webp({ quality: 80 })
    .toFile(iconPath);

  logger.info({ iconPath }, 'Icon generated successfully');
  return iconFileName;
}
```

### EXIF Metadata Extraction

For audit/logging purposes before stripping:

```typescript
import ExifParser from 'exif-parser';

export async function extractExifMetadata(
  filePath: string,
  logger: Logger,
): Promise<ExifMetadata | null> {
  try {
    const buffer = await fs.readFile(filePath);
    const parser = ExifParser.create(buffer);
    const result = parser.parse();

    const metadata: ExifMetadata = {
      make: result.tags?.Make,
      model: result.tags?.Model,
      dateTime: result.tags?.DateTimeOriginal,
      gpsLatitude: result.tags?.GPSLatitude,
      gpsLongitude: result.tags?.GPSLongitude,
      orientation: result.tags?.Orientation,
    };

    // Log if GPS data was present (privacy concern)
    if (metadata.gpsLatitude || metadata.gpsLongitude) {
      logger.info({ filePath }, 'GPS data found in image, will be stripped during processing');
    }

    return metadata;
  } catch (error) {
    logger.debug({ error, filePath }, 'No EXIF data found or parsing failed');
    return null;
  }
}
```

### PDF to Image Conversion

```typescript
import * as pdfjs from 'pdfjs-dist';
// Assumption: createCanvas comes from the node-canvas package; the original
// snippet omits this import.
import { createCanvas } from 'canvas';

export async function convertPdfToImages(
  pdfPath: string,
  outputDir: string,
  logger: Logger,
): Promise<string[]> {
  const pdfData = await fs.readFile(pdfPath);
  const pdf = await pdfjs.getDocument({ data: pdfData }).promise;

  const outputPaths: string[] = [];

  for (let i = 1; i <= pdf.numPages; i++) {
    const page = await pdf.getPage(i);
    const viewport = page.getViewport({ scale: 2.0 }); // 2x for quality

    // Create canvas and render
    const canvas = createCanvas(viewport.width, viewport.height);
    const context = canvas.getContext('2d');

    await page.render({
      canvasContext: context,
      viewport: viewport,
    }).promise;

    // Save as image
    const outputFileName = `${uuidv4()}-page-${i}.png`;
    const outputPath = path.join(outputDir, outputFileName);
    const buffer = canvas.toBuffer('image/png');
    await fs.writeFile(outputPath, buffer);

    outputPaths.push(outputPath);
    logger.info({ page: i, outputPath }, 'PDF page converted to image');
  }

  return outputPaths;
}
```

### File Validation

```typescript
import { fileTypeFromBuffer } from 'file-type';

export async function validateImageFile(
  filePath: string,
  logger: Logger,
): Promise<{ valid: boolean; mimeType: string | null; error?: string }> {
  try {
    // Read the header only. fs.readFile has no length option, so open the
    // file and read the first 4100 bytes explicitly.
    const handle = await fs.open(filePath, 'r');
    let header: Buffer;
    try {
      const { buffer, bytesRead } = await handle.read(Buffer.alloc(4100), 0, 4100, 0);
      header = buffer.subarray(0, bytesRead);
    } finally {
      await handle.close();
    }

    const type = await fileTypeFromBuffer(header);

    if (!type) {
      return { valid: false, mimeType: null, error: 'Unknown file type' };
    }

    const allowedMimes = ['image/jpeg', 'image/png', 'image/webp', 'image/avif', 'application/pdf'];

    if (!allowedMimes.includes(type.mime)) {
      return {
        valid: false,
        mimeType: type.mime,
        error: `File type ${type.mime} not allowed`,
      };
    }

    return { valid: true, mimeType: type.mime };
  } catch (error) {
    logger.error({ error, filePath }, 'File validation failed');
    return { valid: false, mimeType: null, error: 'Validation error' };
  }
}
```

### Storage Organization

```
flyer-images/
├── originals/          # Uploaded files (if kept)
│   └── {uuid}.{ext}
├── processed/          # Optimized images (or root level)
│   └── {uuid}.webp
├── icons/              # Thumbnails
│   └── {uuid}-icon.webp
└── temp/               # Temporary processing files
    └── {uuid}.tmp
```

### Cleanup Utilities

```typescript
export async function cleanupTempFiles(
  tempDir: string,
  maxAgeMs: number,
  logger: Logger,
): Promise<number> {
  const files = await fs.readdir(tempDir);
  const now = Date.now();
  let deletedCount = 0;

  for (const file of files) {
    const filePath = path.join(tempDir, file);
    const stats = await fs.stat(filePath);
    const age = now - stats.mtimeMs;

    // Delete files last modified more than maxAgeMs ago
    if (age > maxAgeMs) {
      await fs.unlink(filePath);
      deletedCount++;
    }
  }

  logger.info({ deletedCount, tempDir }, 'Cleaned up temp files');
  return deletedCount;
}
```

### Integration with Flyer Processing

```typescript
// In flyerProcessingService.ts
export async function processUploadedFlyer(
  file: Express.Multer.File,
  logger: Logger,
): Promise<{ imageUrl: string; iconUrl: string }> {
  const flyerImageDir = 'flyer-images';
  const iconsDir = path.join(flyerImageDir, 'icons');

  // 1. Validate file
  const validation = await validateImageFile(file.path, logger);
  if (!validation.valid) {
    throw new ValidationError([{ path: 'file', message: validation.error! }]);
  }

  // 2. Extract and log EXIF before stripping
  await extractExifMetadata(file.path, logger);

  // 3. Process and optimize image
  const processedFileName = await processAndSaveImage(
    file.path,
    flyerImageDir,
    file.originalname,
    logger,
  );

  // 4. Generate icon
  const processedImagePath = path.join(flyerImageDir, processedFileName);
  const iconFileName = await generateFlyerIcon(processedImagePath, iconsDir, logger);

  // 5. Construct URLs
  const baseUrl = process.env.BACKEND_URL || 'http://localhost:3001';
  const imageUrl = `${baseUrl}/flyer-images/${processedFileName}`;
  const iconUrl = `${baseUrl}/flyer-images/icons/${iconFileName}`;

  // 6. Delete original upload (privacy)
  await fs.unlink(file.path);

  return { imageUrl, iconUrl };
}
```

## Consequences

### Positive

- **Privacy**: EXIF metadata (including GPS) is stripped automatically.
- **Performance**: WebP output typically reduces file sizes by 25-35% compared with JPEG at similar quality.
- **Consistency**: All images processed to standard format and dimensions.
- **Security**: File type validation prevents malicious uploads.
- **Organization**: Clear directory structure for storage management.

### Negative

- **CPU Intensive**: Image processing can be slow for large files.
- **Storage**: Keeping originals doubles storage requirements.
- **Dependency**: Sharp requires native binaries.

### Mitigation

- Process images in background jobs (BullMQ queue).
- Configure whether to keep originals based on requirements.
- Use pre-built Sharp binaries via npm.
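
To make the consistency claim concrete, the following sketch computes the output dimensions that `fit: 'inside'` combined with `withoutEnlargement: true` is expected to produce for the configured 2048 by 2048 limit. `fitInside` and `Size` are hypothetical names introduced for illustration, and the rounding is an assumption; sharp's own rounding may differ by a pixel in edge cases.

```typescript
// Hypothetical helper, for illustration only; not part of imageProcessor.ts.
interface Size {
  width: number;
  height: number;
}

export function fitInside(source: Size, maxWidth: number, maxHeight: number): Size {
  // The scale factor that makes the image fit within both limits while
  // preserving its aspect ratio.
  const scale = Math.min(maxWidth / source.width, maxHeight / source.height);

  // withoutEnlargement: an image already within the limits is left as-is.
  if (scale >= 1) {
    return { width: source.width, height: source.height };
  }

  // Assumed rounding: nearest pixel, never below 1.
  return {
    width: Math.max(1, Math.round(source.width * scale)),
    height: Math.max(1, Math.round(source.height * scale)),
  };
}

// Usage with the limits from IMAGE_CONFIG (2048 x 2048):
const landscape = fitInside({ width: 4096, height: 2048 }, 2048, 2048); // 2048 x 1024
const portrait = fitInside({ width: 3000, height: 6000 }, 2048, 2048); // 1024 x 2048
const small = fitInside({ width: 1000, height: 500 }, 2048, 2048); // 1000 x 500 (unchanged)
```

Because the longer edge is capped and smaller images are never upscaled, every processed flyer ends up with both dimensions at or below the configured maximum.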
## Key Files

- `src/utils/imageProcessor.ts` - Core image processing functions
- `src/services/flyer/flyerProcessingService.ts` - Integration with flyer workflow
- `src/middleware/fileUpload.middleware.ts` - Multer configuration

## Related ADRs

- [ADR-033](./0033-file-upload-and-storage-strategy.md) - File Upload Strategy
- [ADR-006](./0006-background-job-processing-and-task-queues.md) - Background Jobs
- [ADR-041](./0041-ai-gemini-integration-architecture.md) - AI Integration (uses processed images)