ADR-006: Background Job Processing and Task Queues
Date: 2025-12-12
Status: Accepted
Context
The application's core purpose involves long-running, resource-intensive tasks such as OCR processing of flyers. Executing these tasks within the lifecycle of an API request is unreliable and does not scale: without a durable job system, work that fails mid-processing can be lost with no record and no retry.
Decision
We will implement a dedicated background job processing system using a task queue library like BullMQ (with Redis). All flyer ingestion, OCR processing, and other long-running tasks (e.g., sending bulk emails) MUST be dispatched as jobs to this queue.
Consequences
Positive: Decouples the API from heavy processing, allows for retries on failure, and enables scaling the processing workers independently. Increases application reliability and resilience.
Negative: Introduces a new dependency (Redis) into the infrastructure. Requires refactoring of the flyer processing logic to work within a job queue structure.
Implementation Details
Queue Infrastructure
The implementation uses BullMQ v5.65.1 with ioredis v5.8.2 for Redis connectivity. Six distinct queues handle different job types:
| Queue Name | Purpose | Retry Attempts | Backoff Strategy |
|---|---|---|---|
| `flyer-processing` | OCR/AI processing of flyers | 3 | Exponential (5s base) |
| `email-sending` | Email delivery | 5 | Exponential (10s base) |
| `analytics-reporting` | Daily report generation | 2 | Exponential (60s base) |
| `weekly-analytics-reporting` | Weekly report generation | 2 | Exponential (1h base) |
| `file-cleanup` | Temporary file cleanup | 3 | Exponential (30s base) |
| `token-cleanup` | Expired token removal | 2 | Exponential (1h base) |
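The retry settings in the queue table can be sketched as plain option objects. This is a minimal illustration, not the contents of `queues.server.ts`: the `attempts` and `backoff: { type, delay }` fields follow the shape of BullMQ job options, while `RetryPolicy`, `retryPolicies`, and `retryDelays` are hypothetical names. The delay schedule assumes the base delay doubles on each retry, the usual meaning of exponential backoff; BullMQ's exact schedule should be checked against its documentation.

```typescript
// Shape mirrors BullMQ job options (`attempts`, `backoff: { type, delay }`),
// kept as plain objects so the sketch runs without Redis.
interface RetryPolicy {
  attempts: number;
  backoff: { type: "exponential"; delay: number }; // delay in milliseconds
}

const SECOND = 1000;
const HOUR = 60 * 60 * SECOND;

// Values copied from the queue table.
const retryPolicies: Record<string, RetryPolicy> = {
  "flyer-processing": { attempts: 3, backoff: { type: "exponential", delay: 5 * SECOND } },
  "email-sending": { attempts: 5, backoff: { type: "exponential", delay: 10 * SECOND } },
  "analytics-reporting": { attempts: 2, backoff: { type: "exponential", delay: 60 * SECOND } },
  "weekly-analytics-reporting": { attempts: 2, backoff: { type: "exponential", delay: HOUR } },
  "file-cleanup": { attempts: 3, backoff: { type: "exponential", delay: 30 * SECOND } },
  "token-cleanup": { attempts: 2, backoff: { type: "exponential", delay: HOUR } },
};

// Assumed schedule: base * 2^(n-1) milliseconds before the n-th retry.
function retryDelays(policy: RetryPolicy): number[] {
  const retries = policy.attempts - 1; // the first attempt is not a retry
  const delays: number[] = [];
  for (let n = 1; n <= retries; n++) {
    delays.push(policy.backoff.delay * Math.pow(2, n - 1));
  }
  return delays;
}
```

Under that assumption, a `flyer-processing` job that keeps failing would wait 5 seconds before its second attempt and 10 seconds before its third, after which it is marked failed.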
Key Files
- `src/services/queues.server.ts` - Queue definitions and configuration
- `src/services/workers.server.ts` - Worker implementations with configurable concurrency
- `src/services/redis.server.ts` - Redis connection management
- `src/services/queueService.server.ts` - Queue lifecycle and graceful shutdown
- `src/services/flyerProcessingService.server.ts` - 5-stage flyer processing pipeline
- `src/types/job-data.ts` - TypeScript interfaces for all job data types
API Design
Endpoints for long-running tasks return 202 Accepted immediately with a job ID:
POST /api/ai/upload-and-process → 202 { jobId: "..." }
GET /api/ai/jobs/:jobId/status → { state: "...", progress: ... }
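The accept-then-poll pattern above might look roughly like the following. This is a framework-free sketch under stated assumptions: `JobQueue`, `InMemoryQueue`, `acceptUpload`, the `"process-flyer"` job name, and the `filePath` payload are all hypothetical stand-ins, since the real route handlers and job data types are not shown in this ADR.

```typescript
// Hypothetical minimal queue interface, loosely modelled on an `add(name, data)` call.
interface JobQueue<T> {
  add(name: string, data: T): Promise<{ id: string }>;
}

interface HttpResult<B> {
  status: number;
  body: B;
}

// In-memory stand-in so the sketch runs without Redis.
class InMemoryQueue<T> implements JobQueue<T> {
  private nextId = 1;
  readonly jobs = new Map<string, T>();

  async add(_name: string, data: T): Promise<{ id: string }> {
    const id = String(this.nextId++);
    this.jobs.set(id, data);
    return { id };
  }
}

// Enqueue the work and respond immediately; no processing happens in the request.
async function acceptUpload(
  queue: JobQueue<{ filePath: string }>,
  filePath: string,
): Promise<HttpResult<{ jobId: string }>> {
  const job = await queue.add("process-flyer", { filePath });
  return { status: 202, body: { jobId: job.id } };
}
```

The client would then poll the status endpoint with the returned `jobId` until the job reaches a terminal state.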
Worker Configuration
Workers are configured via environment variables:
- `WORKER_CONCURRENCY` - Flyer processing parallelism (default: 1)
- `EMAIL_WORKER_CONCURRENCY` - Email worker parallelism (default: 10)
- `ANALYTICS_WORKER_CONCURRENCY` - Analytics worker parallelism (default: 1)
- `CLEANUP_WORKER_CONCURRENCY` - Cleanup worker parallelism (default: 10)
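One way to read these variables with their documented defaults is sketched below. The variable names and default values come from the list above; the helper name `readConcurrency` and the rule of falling back to the default on missing, non-numeric, or non-positive values are assumptions, not the behavior of `workers.server.ts`.

```typescript
// Defaults taken from the documented list; the parsing rules are an assumption.
const CONCURRENCY_DEFAULTS = {
  WORKER_CONCURRENCY: 1,
  EMAIL_WORKER_CONCURRENCY: 10,
  ANALYTICS_WORKER_CONCURRENCY: 1,
  CLEANUP_WORKER_CONCURRENCY: 10,
} as const;

type ConcurrencyVar = keyof typeof CONCURRENCY_DEFAULTS;

function readConcurrency(
  env: Record<string, string | undefined>,
  name: ConcurrencyVar,
): number {
  const parsed = Number.parseInt(env[name] ?? "", 10);
  // Fall back to the default when the variable is unset, non-numeric, or below 1.
  return Number.isInteger(parsed) && parsed >= 1 ? parsed : CONCURRENCY_DEFAULTS[name];
}
```

In a Node process this would typically be called as `readConcurrency(process.env, "EMAIL_WORKER_CONCURRENCY")` and the result passed as the worker's concurrency option.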
Monitoring
- Bull Board UI available at `/api/admin/jobs` for admin users
- Worker status endpoint: `GET /api/admin/workers/status`
- Queue status endpoint: `GET /api/admin/queues/status`
Graceful Shutdown
Both API and worker processes implement graceful shutdown: in-flight jobs are given up to 30 seconds to complete before the process terminates.
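The bounded-wait idea can be sketched as a race between closing all resources and a timer. This is an illustration only: `Closable` and `shutdownWithTimeout` are hypothetical names, and treating a failed `close()` as finished (so one failure does not block the others) is an assumption rather than the documented behavior of `queueService.server.ts`.

```typescript
// Hypothetical shape for anything that can be closed (workers, queues, Redis clients).
interface Closable {
  close(): Promise<void>;
}

// Resolves "clean" if every resource finishes closing in time, "timed-out" otherwise.
function shutdownWithTimeout(
  resources: Closable[],
  timeoutMs = 30000,
): Promise<"clean" | "timed-out"> {
  return new Promise((resolve) => {
    const timer = setTimeout(() => resolve("timed-out"), timeoutMs);

    // Assumption: a close() that rejects still counts as finished.
    const closing = resources.map((r) => r.close().catch(() => undefined));

    Promise.all(closing).then(() => {
      clearTimeout(timer); // do not keep the event loop alive after a clean close
      resolve("clean");
    });
  });
}
```

A process would typically call this from its `SIGTERM` handler and exit once the promise settles, using a non-zero exit code when the result is `"timed-out"`.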
Compliance Notes
Deprecated Synchronous Endpoints
The following endpoints process flyers synchronously and are deprecated:
- `POST /api/ai/upload-legacy` - For integration testing only
- `POST /api/ai/flyers/process` - Legacy workflow, should migrate to queue-based approach
New integrations MUST use POST /api/ai/upload-and-process for queue-based processing.
Email Handling
- Bulk emails (deal notifications): Enqueued via `emailQueue`
- Transactional emails (password reset): Sent synchronously for immediate user feedback
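The split between queued and synchronous delivery could be expressed as a small dispatcher. This is a sketch under assumptions: `EmailMessage`, `EmailDeps`, and `dispatchEmail` are hypothetical names, with `enqueue` standing in for adding a job to `emailQueue` and `sendNow` for a direct call to the mail provider.

```typescript
type EmailKind = "bulk" | "transactional";

interface EmailMessage {
  kind: EmailKind;
  to: string;
  subject: string;
}

// Hypothetical ports: `enqueue` would add a job to emailQueue, `sendNow` would deliver directly.
interface EmailDeps {
  enqueue(msg: EmailMessage): Promise<void>;
  sendNow(msg: EmailMessage): Promise<void>;
}

async function dispatchEmail(
  deps: EmailDeps,
  msg: EmailMessage,
): Promise<"queued" | "sent"> {
  if (msg.kind === "bulk") {
    await deps.enqueue(msg); // retried by the queue on failure
    return "queued";
  }
  await deps.sendNow(msg); // the caller learns immediately whether delivery succeeded
  return "sent";
}
```

Keeping the decision in one function makes it harder for a new bulk email type to bypass the queue by accident.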
Future Enhancements
Potential improvements for consideration:
- Dead Letter Queue (DLQ): Move permanently failed jobs to a dedicated queue for analysis
- Job Priority Levels: Allow priority-based processing for different job types
- Real-time Progress: WebSocket/SSE for live job progress updates to clients
- Per-Queue Rate Limiting: Throttle job processing based on external API limits
- Job Dependencies: Support for jobs that depend on completion of other jobs
- Prometheus Metrics: Export queue metrics for observability dashboards