ADR-046: Image Processing Pipeline
Date: 2026-01-09
Status: Accepted
Implemented: 2026-01-09
Context
The application handles significant image processing for flyer uploads:
- Privacy Protection: Strip EXIF metadata (location, device info).
- Optimization: Resize, compress, and convert images for web delivery.
- Icon Generation: Create thumbnails for listing views.
- Format Support: Handle JPEG, PNG, WebP, AVIF, and PDF inputs.
- Storage Management: Organize processed images on disk.
These operations must be:
- Performant: Large images should not block the request.
- Secure: Prevent malicious file uploads.
- Consistent: Produce predictable output quality.
- Testable: Support unit testing without real files.
Decision
We will implement a modular image processing pipeline using:
- Sharp: For image resizing, compression, and format conversion.
- EXIF Parsing: For metadata extraction before stripping (Sharp omits metadata from its output by default).
- UUID Naming: For unique, non-guessable file names.
- Directory Structure: Organized storage for originals and derivatives.
Design Principles
- Pipeline Pattern: Chain processing steps in a predictable order.
- Fail-Fast Validation: Reject invalid files before processing.
- Idempotent Operations: Same input produces equivalent image output (file names differ between runs because they are UUID-based).
- Resource Cleanup: Delete temp files on error.
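As a rough illustration of how the pipeline, fail-fast, and cleanup principles fit together, the sketch below chains async steps in order and runs a cleanup callback as soon as any step throws. The names `PipelineContext`, `Step`, and `runPipeline` are hypothetical and are not part of the module described in this ADR.

```typescript
// Hypothetical types for illustration only.
interface PipelineContext {
  inputPath: string;
  completed: string[]; // names of the steps that finished, in order
}

interface Step {
  name: string;
  run: (ctx: PipelineContext) => Promise<void>;
}

// Runs steps in order. The first step that throws stops the chain
// (fail-fast), triggers cleanup, and the error is rethrown to the caller.
async function runPipeline(
  steps: Step[],
  ctx: PipelineContext,
  cleanup: (ctx: PipelineContext) => Promise<void>,
): Promise<void> {
  try {
    for (const step of steps) {
      await step.run(ctx);
      ctx.completed.push(step.name);
    }
  } catch (error) {
    await cleanup(ctx); // resource cleanup on error
    throw error;
  }
}
```

In this sketch a validation step placed first would prevent any later (and more expensive) step from running on an invalid file.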
Implementation Details
Image Processor Module
Located in src/utils/imageProcessor.ts:
import sharp from 'sharp';
import path from 'path';
import { v4 as uuidv4 } from 'uuid';
import fs from 'fs/promises';
import type { Logger } from 'pino';
// ============================================
// CONFIGURATION
// ============================================
const IMAGE_CONFIG = {
maxWidth: 2048,
maxHeight: 2048,
quality: 85,
iconSize: 200,
allowedFormats: ['jpeg', 'png', 'webp', 'avif'],
outputFormat: 'webp' as const,
};
// ============================================
// MAIN PROCESSING FUNCTION
// ============================================
export async function processAndSaveImage(
inputPath: string,
outputDir: string,
originalFileName: string,
logger: Logger,
): Promise<string> {
const outputFileName = `${uuidv4()}.${IMAGE_CONFIG.outputFormat}`;
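const outputPath = path.join(outputDir, outputFileName);
// Ensure the output directory exists, mirroring generateFlyerIcon
await fs.mkdir(outputDir, { recursive: true });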
logger.info({ inputPath, outputPath }, 'Processing image');
try {
// Sharp omits EXIF and other metadata from the output unless withMetadata() is called,
// so no explicit strip step is needed
await sharp(inputPath)
.rotate() // Auto-rotate based on EXIF orientation
.resize(IMAGE_CONFIG.maxWidth, IMAGE_CONFIG.maxHeight, {
fit: 'inside',
withoutEnlargement: true,
})
.webp({ quality: IMAGE_CONFIG.quality })
.toFile(outputPath);
logger.info({ outputPath }, 'Image processed successfully');
return outputFileName;
} catch (error) {
logger.error({ error, inputPath }, 'Image processing failed');
throw error;
}
}
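To make the effect of `fit: 'inside'` combined with `withoutEnlargement: true` concrete, the sketch below computes the approximate output dimensions for a given source size. It is an illustration of the arithmetic only, not Sharp's implementation; `fitInside` and `Size` are hypothetical names, and Sharp's own rounding may differ by a pixel.

```typescript
interface Size {
  width: number;
  height: number;
}

// Scale the image down so it fits within maxWidth x maxHeight while
// preserving aspect ratio. The scale is capped at 1 so that small
// images are never enlarged.
function fitInside(source: Size, maxWidth: number, maxHeight: number): Size {
  const scale = Math.min(maxWidth / source.width, maxHeight / source.height, 1);
  return {
    width: Math.round(source.width * scale),
    height: Math.round(source.height * scale),
  };
}

// A 4000x3000 photo is limited by its width.
const landscape = fitInside({ width: 4000, height: 3000 }, 2048, 2048);
// landscape is { width: 2048, height: 1536 }

// An 800x600 image is already within bounds and keeps its size.
const small = fitInside({ width: 800, height: 600 }, 2048, 2048);
// small is { width: 800, height: 600 }
```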
Icon Generation
export async function generateFlyerIcon(
inputPath: string,
iconsDir: string,
logger: Logger,
): Promise<string> {
// Ensure icons directory exists
await fs.mkdir(iconsDir, { recursive: true });
const iconFileName = `${uuidv4()}-icon.webp`;
const iconPath = path.join(iconsDir, iconFileName);
logger.info({ inputPath, iconPath }, 'Generating icon');
await sharp(inputPath)
.resize(IMAGE_CONFIG.iconSize, IMAGE_CONFIG.iconSize, {
fit: 'cover',
position: 'top', // Flyers usually have store name at top
})
.webp({ quality: 80 })
.toFile(iconPath);
logger.info({ iconPath }, 'Icon generated successfully');
return iconFileName;
}
EXIF Metadata Extraction
For audit/logging purposes before stripping:
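import ExifParser from 'exif-parser';
// Assumed shape, inferred from the fields populated below; the actual type may live elsewhere
interface ExifMetadata {
make?: string;
model?: string;
dateTime?: number;
gpsLatitude?: number;
gpsLongitude?: number;
orientation?: number;
}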
export async function extractExifMetadata(
filePath: string,
logger: Logger,
): Promise<ExifMetadata | null> {
try {
const buffer = await fs.readFile(filePath);
const parser = ExifParser.create(buffer);
const result = parser.parse();
const metadata: ExifMetadata = {
make: result.tags?.Make,
model: result.tags?.Model,
dateTime: result.tags?.DateTimeOriginal,
gpsLatitude: result.tags?.GPSLatitude,
gpsLongitude: result.tags?.GPSLongitude,
orientation: result.tags?.Orientation,
};
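// Log if GPS data was present (privacy concern).
// Compare against null/undefined so that a coordinate of 0 still counts as present.
if (metadata.gpsLatitude != null || metadata.gpsLongitude != null) {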
logger.info({ filePath }, 'GPS data found in image, will be stripped during processing');
}
return metadata;
} catch (error) {
logger.debug({ error, filePath }, 'No EXIF data found or parsing failed');
return null;
}
}
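One subtlety when checking for GPS data is that a coordinate of 0 (on the equator or the prime meridian) is valid but falsy in JavaScript. The sketch below shows a presence check that treats 0 as real data; `GpsTags` and `hasGpsData` are hypothetical names used only for illustration.

```typescript
interface GpsTags {
  gpsLatitude?: number;
  gpsLongitude?: number;
}

// Returns true when either coordinate is present, including the value 0.
function hasGpsData(tags: GpsTags): boolean {
  return tags.gpsLatitude !== undefined || tags.gpsLongitude !== undefined;
}

const onEquator = hasGpsData({ gpsLatitude: 0, gpsLongitude: 32.5 }); // true
const noGps = hasGpsData({}); // false
```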
PDF to Image Conversion
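import * as pdfjs from 'pdfjs-dist';
// Assumes the node-canvas package ('canvas') provides the rendering surface
import { createCanvas } from 'canvas';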
export async function convertPdfToImages(
pdfPath: string,
outputDir: string,
logger: Logger,
): Promise<string[]> {
const pdfData = await fs.readFile(pdfPath);
// pdf.js expects a plain Uint8Array; recent versions reject a Node Buffer
const pdf = await pdfjs.getDocument({ data: new Uint8Array(pdfData) }).promise;
const outputPaths: string[] = [];
for (let i = 1; i <= pdf.numPages; i++) {
const page = await pdf.getPage(i);
const viewport = page.getViewport({ scale: 2.0 }); // 2x for quality
// Create canvas and render
const canvas = createCanvas(viewport.width, viewport.height);
const context = canvas.getContext('2d');
await page.render({
canvasContext: context,
viewport: viewport,
}).promise;
// Save as image
const outputFileName = `${uuidv4()}-page-${i}.png`;
const outputPath = path.join(outputDir, outputFileName);
const buffer = canvas.toBuffer('image/png');
await fs.writeFile(outputPath, buffer);
outputPaths.push(outputPath);
logger.info({ page: i, outputPath }, 'PDF page converted to image');
}
return outputPaths;
}
File Validation
import { fileTypeFromBuffer } from 'file-type';
export async function validateImageFile(
filePath: string,
logger: Logger,
): Promise<{ valid: boolean; mimeType: string | null; error?: string }> {
try {
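// Read only the header; fs.readFile has no length option, so use a file handle
const handle = await fs.open(filePath, 'r');
let buffer: Buffer;
try {
const { bytesRead, buffer: chunk } = await handle.read(Buffer.alloc(4100), 0, 4100, 0);
buffer = chunk.subarray(0, bytesRead);
} finally {
await handle.close();
}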
const type = await fileTypeFromBuffer(buffer);
if (!type) {
return { valid: false, mimeType: null, error: 'Unknown file type' };
}
const allowedMimes = ['image/jpeg', 'image/png', 'image/webp', 'image/avif', 'application/pdf'];
if (!allowedMimes.includes(type.mime)) {
return {
valid: false,
mimeType: type.mime,
error: `File type ${type.mime} not allowed`,
};
}
return { valid: true, mimeType: type.mime };
} catch (error) {
logger.error({ error, filePath }, 'File validation failed');
return { valid: false, mimeType: null, error: 'Validation error' };
}
}
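Content-based validation works by comparing the first bytes of a file against known signatures rather than trusting the file extension. The sketch below shows the idea for four of the accepted types. It is a simplified stand-in for what the `file-type` package does far more thoroughly (AVIF is omitted here because its detection is more involved); `sniffMimeType` and its helpers are hypothetical names.

```typescript
// True when `bytes` contains `signature` starting at `offset`.
function matchesAt(bytes: Uint8Array, signature: number[], offset = 0): boolean {
  if (bytes.length < offset + signature.length) return false;
  return signature.every((value, i) => bytes[offset + i] === value);
}

// Converts an ASCII string to its byte values.
const ascii = (text: string): number[] => Array.from(text, (ch) => ch.charCodeAt(0));

// Returns a MIME type for a few well-known signatures, or null if none match.
function sniffMimeType(bytes: Uint8Array): string | null {
  if (matchesAt(bytes, [0xff, 0xd8, 0xff])) return 'image/jpeg';
  if (matchesAt(bytes, [0x89, 0x50, 0x4e, 0x47, 0x0d, 0x0a, 0x1a, 0x0a])) return 'image/png';
  // WebP is a RIFF container with "WEBP" at byte offset 8.
  if (matchesAt(bytes, ascii('RIFF')) && matchesAt(bytes, ascii('WEBP'), 8)) return 'image/webp';
  if (matchesAt(bytes, ascii('%PDF-'))) return 'application/pdf';
  return null;
}
```

A file named `flyer.jpg` that actually contains a script would return `null` here, which is why the check is done on bytes and not on the name.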
Storage Organization
flyer-images/
├── originals/ # Uploaded files (if kept)
│ └── {uuid}.{ext}
├── processed/ # Optimized images (or root level)
│ └── {uuid}.webp
├── icons/ # Thumbnails
│ └── {uuid}-icon.webp
└── temp/ # Temporary processing files
└── {uuid}.tmp
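The layout above can be expressed as a small path helper, sketched below. This is an illustration under assumptions: `buildStoragePaths` and `StoragePaths` are hypothetical names, Node's built-in `randomUUID` stands in for the `uuid` package, and a single ID is shared across all derivatives, which the functions in this ADR do not necessarily do.

```typescript
import { join } from 'node:path';
import { randomUUID } from 'node:crypto';

interface StoragePaths {
  original: string;
  processed: string;
  icon: string;
  temp: string;
}

// Builds the four locations for one upload under the given root directory.
function buildStoragePaths(
  rootDir: string,
  originalExt: string,
  id: string = randomUUID(),
): StoragePaths {
  return {
    original: join(rootDir, 'originals', `${id}.${originalExt}`),
    processed: join(rootDir, 'processed', `${id}.webp`),
    icon: join(rootDir, 'icons', `${id}-icon.webp`),
    temp: join(rootDir, 'temp', `${id}.tmp`),
  };
}
```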
Cleanup Utilities
export async function cleanupTempFiles(
tempDir: string,
maxAgeMs: number,
logger: Logger,
): Promise<number> {
const files = await fs.readdir(tempDir);
const now = Date.now();
let deletedCount = 0;
for (const file of files) {
const filePath = path.join(tempDir, file);
const stats = await fs.stat(filePath);
const age = now - stats.mtimeMs;
// Skip directories: fs.unlink only removes files
if (stats.isFile() && age > maxAgeMs) {
await fs.unlink(filePath);
deletedCount++;
}
}
logger.info({ deletedCount, tempDir }, 'Cleaned up temp files');
return deletedCount;
}
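To support the requirement that the pipeline be unit-testable without real files, the age check can be separated from the file system calls. The sketch below is one possible way to do that; `FileInfo` and `selectExpired` are hypothetical names, and the caller would be responsible for gathering the entries and deleting the selected ones.

```typescript
interface FileInfo {
  name: string;
  mtimeMs: number; // last modification time in milliseconds since the epoch
  isFile: boolean;
}

// Pure selection logic: returns the names of regular files older than maxAgeMs.
function selectExpired(files: FileInfo[], nowMs: number, maxAgeMs: number): string[] {
  return files
    .filter((file) => file.isFile && nowMs - file.mtimeMs > maxAgeMs)
    .map((file) => file.name);
}

// Example with in-memory data and a one-hour limit.
const expired = selectExpired(
  [
    { name: 'old.tmp', mtimeMs: 0, isFile: true },
    { name: 'fresh.tmp', mtimeMs: 9_000_000, isFile: true },
    { name: 'subdir', mtimeMs: 0, isFile: false },
  ],
  10_000_000,
  3_600_000,
);
// expired is ['old.tmp']
```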
Integration with Flyer Processing
// In flyerProcessingService.ts
export async function processUploadedFlyer(
file: Express.Multer.File,
logger: Logger,
): Promise<{ imageUrl: string; iconUrl: string }> {
const flyerImageDir = 'flyer-images';
const iconsDir = path.join(flyerImageDir, 'icons');
try {
// 1. Validate file
const validation = await validateImageFile(file.path, logger);
if (!validation.valid) {
throw new ValidationError([{ path: 'file', message: validation.error! }]);
}
// 2. Extract and log EXIF before stripping
await extractExifMetadata(file.path, logger);
// 3. Process and optimize image
const processedFileName = await processAndSaveImage(
file.path,
flyerImageDir,
file.originalname,
logger,
);
// 4. Generate icon
const processedImagePath = path.join(flyerImageDir, processedFileName);
const iconFileName = await generateFlyerIcon(processedImagePath, iconsDir, logger);
// 5. Construct URLs
const baseUrl = process.env.BACKEND_URL || 'http://localhost:3001';
const imageUrl = `${baseUrl}/flyer-images/${processedFileName}`;
const iconUrl = `${baseUrl}/flyer-images/icons/${iconFileName}`;
return { imageUrl, iconUrl };
} finally {
// 6. Delete original upload (privacy), on success and on failure alike
await fs.unlink(file.path).catch((error) => {
logger.warn({ error, filePath: file.path }, 'Failed to delete uploaded file');
});
}
}
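The URL construction step concatenates the base URL and file names directly, which yields a double slash if `BACKEND_URL` ends with `/`. A small helper along the following lines could normalize that; `buildPublicUrl` is a hypothetical name and this is only a sketch of one option.

```typescript
// Joins a base URL and path segments with exactly one slash between parts.
// Each segment is percent-encoded so unusual file names stay valid in a URL.
function buildPublicUrl(baseUrl: string, ...segments: string[]): string {
  const base = baseUrl.replace(/\/+$/, ''); // drop trailing slashes
  const path = segments.map((segment) => encodeURIComponent(segment)).join('/');
  return `${base}/${path}`;
}

const iconUrlExample = buildPublicUrl('http://localhost:3001/', 'flyer-images', 'icons', 'abc-icon.webp');
// iconUrlExample is 'http://localhost:3001/flyer-images/icons/abc-icon.webp'
```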
Consequences
Positive
- Privacy: EXIF metadata (including GPS) is stripped automatically.
- Performance: WebP output is typically 25-35% smaller than a comparable JPEG at similar visual quality.
- Consistency: All images processed to standard format and dimensions.
- Security: Content-based file type validation rejects uploads whose bytes do not match an allowed type, reducing the risk of malicious files.
- Organization: Clear directory structure for storage management.
Negative
- CPU Intensive: Image processing can be slow for large files.
- Storage: Keeping originals doubles storage requirements.
- Dependency: Sharp requires native binaries.
Mitigation
- Process images in background jobs (BullMQ queue).
- Configure whether to keep originals based on requirements.
- Use pre-built Sharp binaries via npm.
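The first mitigation relies on BullMQ, which needs a Redis connection and is not shown here. As a dependency-free stand-in that illustrates the same goal of bounding how much image work runs at once, the sketch below limits concurrent async tasks inside a single process. It is not BullMQ and gives no persistence or retries; `createLimiter` is a hypothetical name.

```typescript
// Returns a function that runs tasks with at most `concurrency` in flight.
function createLimiter(concurrency: number) {
  let active = 0;
  const waiting: Array<() => void> = [];

  return async function run<T>(task: () => Promise<T>): Promise<T> {
    if (active >= concurrency) {
      // Wait until a finishing task hands its slot over to this caller.
      await new Promise<void>((resolve) => waiting.push(resolve));
    } else {
      active++;
    }
    try {
      return await task();
    } finally {
      const next = waiting.shift();
      if (next) {
        next(); // pass the slot directly to the next waiter
      } else {
        active--; // nobody is waiting, so free the slot
      }
    }
  };
}
```

A caller could wrap each image job in the returned function, for example with a limit of 2, so that a burst of uploads does not start an unbounded number of CPU-heavy conversions at once.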
Key Files
- src/utils/imageProcessor.ts - Core image processing functions
- src/services/flyer/flyerProcessingService.ts - Integration with flyer workflow
- src/middleware/fileUpload.middleware.ts - Multer configuration