integration test fixes - claude for the win? try 3
All checks were successful
Deploy to Test Environment / deploy-to-test (push) Successful in 30m3s
@@ -2,7 +2,7 @@

**Date**: 2025-12-12

-**Status**: Proposed
+**Status**: Accepted

## Context

@@ -16,3 +16,82 @@ We will implement a dedicated background job processing system using a task queu

**Positive**: Decouples the API from heavy processing, allows for retries on failure, and enables scaling the processing workers independently. Increases application reliability and resilience.
**Negative**: Introduces a new dependency (Redis) into the infrastructure. Requires refactoring of the flyer processing logic to work within a job queue structure.

## Implementation Details

### Queue Infrastructure

The implementation uses **BullMQ v5.65.1** with **ioredis v5.8.2** for Redis connectivity. Six distinct queues handle different job types:

| Queue Name                   | Purpose                     | Retry Attempts | Backoff Strategy       |
| ---------------------------- | --------------------------- | -------------- | ---------------------- |
| `flyer-processing`           | OCR/AI processing of flyers | 3              | Exponential (5s base)  |
| `email-sending`              | Email delivery              | 5              | Exponential (10s base) |
| `analytics-reporting`        | Daily report generation     | 2              | Exponential (60s base) |
| `weekly-analytics-reporting` | Weekly report generation    | 2              | Exponential (1h base)  |
| `file-cleanup`               | Temporary file cleanup      | 3              | Exponential (30s base) |
| `token-cleanup`              | Expired token removal       | 2              | Exponential (1h base)  |

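To make the backoff column concrete, the sketch below computes the delay before each retry, assuming the delay doubles on every retry starting from the base value. The names `backoffDelayMs`, `retryPolicies`, and `retrySchedule` are hypothetical and not taken from `queues.server.ts`. The table does not say whether its count includes the first run (BullMQ's `attempts` option counts total tries), so the sketch simply treats the number as the count of retries.

```typescript
// Hypothetical helper: delay before retry number `retry` (1-based),
// assuming the delay doubles on each retry starting from `baseMs`.
function backoffDelayMs(baseMs: number, retry: number): number {
  if (!Number.isInteger(retry) || retry < 1) {
    throw new RangeError("retry must be a positive integer");
  }
  return baseMs * Math.pow(2, retry - 1);
}

// Retry counts and base delays copied from the table above.
const retryPolicies = {
  "flyer-processing": { retries: 3, baseMs: 5_000 },
  "email-sending": { retries: 5, baseMs: 10_000 },
  "analytics-reporting": { retries: 2, baseMs: 60_000 },
  "weekly-analytics-reporting": { retries: 2, baseMs: 3_600_000 },
  "file-cleanup": { retries: 3, baseMs: 30_000 },
  "token-cleanup": { retries: 2, baseMs: 3_600_000 },
} as const;

type QueueName = keyof typeof retryPolicies;

// Returns the list of delays (in ms) a failing job would wait through.
function retrySchedule(queue: QueueName): number[] {
  const { retries, baseMs } = retryPolicies[queue];
  const delays: number[] = [];
  for (let retry = 1; retry <= retries; retry++) {
    delays.push(backoffDelayMs(baseMs, retry));
  }
  return delays;
}
```

Under these assumptions a failing `flyer-processing` job would wait 5s, 10s, and 20s between tries, while the hourly queues back off over hours rather than seconds.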
### Key Files

- `src/services/queues.server.ts` - Queue definitions and configuration
- `src/services/workers.server.ts` - Worker implementations with configurable concurrency
- `src/services/redis.server.ts` - Redis connection management
- `src/services/queueService.server.ts` - Queue lifecycle and graceful shutdown
- `src/services/flyerProcessingService.server.ts` - 5-stage flyer processing pipeline
- `src/types/job-data.ts` - TypeScript interfaces for all job data types

### API Design

Endpoints for long-running tasks return **202 Accepted** immediately with a job ID:

```text
POST /api/ai/upload-and-process → 202 { jobId: "..." }
GET /api/ai/jobs/:jobId/status → { state: "...", progress: ... }
```

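The accept-then-poll contract above can be sketched without any framework. In the sketch below an in-memory store stands in for the real queue, and everything named here (`InMemoryJobStore`, `handleUpload`, `handleStatus`, the four job states, the 404 response for unknown IDs) is an assumption made for illustration, not the project's actual route code.

```typescript
// Subset of job states chosen for illustration.
type JobState = "waiting" | "active" | "completed" | "failed";

interface JobRecord {
  state: JobState;
  progress: number;
}

interface HttpResult<T> {
  status: number;
  body: T;
}

// Hypothetical in-memory stand-in for the queue.
class InMemoryJobStore {
  private jobs = new Map<string, JobRecord>();
  private nextId = 1;

  enqueue(): string {
    const jobId = String(this.nextId++);
    this.jobs.set(jobId, { state: "waiting", progress: 0 });
    return jobId;
  }

  update(jobId: string, patch: Partial<JobRecord>): void {
    const job = this.jobs.get(jobId);
    if (job) Object.assign(job, patch);
  }

  get(jobId: string): JobRecord | undefined {
    return this.jobs.get(jobId);
  }
}

// POST /api/ai/upload-and-process: enqueue the work and answer at once.
function handleUpload(store: InMemoryJobStore): HttpResult<{ jobId: string }> {
  return { status: 202, body: { jobId: store.enqueue() } };
}

// GET /api/ai/jobs/:jobId/status: report the current state and progress.
function handleStatus(
  store: InMemoryJobStore,
  jobId: string,
): HttpResult<JobRecord | { error: string }> {
  const job = store.get(jobId);
  return job
    ? { status: 200, body: job }
    : { status: 404, body: { error: "Job not found" } };
}
```

A client would call the upload endpoint once, keep the returned `jobId`, and poll the status endpoint until the state is `completed` or `failed`.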
### Worker Configuration

Workers are configured via environment variables:

- `WORKER_CONCURRENCY` - Flyer processing parallelism (default: 1)
- `EMAIL_WORKER_CONCURRENCY` - Email worker parallelism (default: 10)
- `ANALYTICS_WORKER_CONCURRENCY` - Analytics worker parallelism (default: 1)
- `CLEANUP_WORKER_CONCURRENCY` - Cleanup worker parallelism (default: 10)

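One way to read these variables with the defaults listed above is sketched here. The helper names and the choice to fall back to the default on invalid input (rather than failing at startup) are assumptions. The environment is passed in as a plain object so the function is easy to test; in the worker process it would typically be `process.env`.

```typescript
type Env = Record<string, string | undefined>;

// Hypothetical helper: parse a positive integer or fall back to the default.
function readConcurrency(env: Env, name: string, fallback: number): number {
  const raw = env[name];
  if (raw === undefined || raw.trim() === "") return fallback;
  const parsed = Number(raw);
  // Anything that is not a positive integer is ignored in favour of the default.
  return Number.isInteger(parsed) && parsed > 0 ? parsed : fallback;
}

// Variable names and defaults are the ones listed above.
function loadWorkerConcurrency(env: Env) {
  return {
    flyer: readConcurrency(env, "WORKER_CONCURRENCY", 1),
    email: readConcurrency(env, "EMAIL_WORKER_CONCURRENCY", 10),
    analytics: readConcurrency(env, "ANALYTICS_WORKER_CONCURRENCY", 1),
    cleanup: readConcurrency(env, "CLEANUP_WORKER_CONCURRENCY", 10),
  };
}
```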
### Monitoring

- **Bull Board UI** available at `/api/admin/jobs` for admin users
- Worker status endpoint: `GET /api/admin/workers/status`
- Queue status endpoint: `GET /api/admin/queues/status`

### Graceful Shutdown

Both API and worker processes implement graceful shutdown with a 30-second timeout, giving in-flight jobs up to 30 seconds to complete before the process terminates.

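A shutdown with a deadline can be sketched as a race between "everything closed" and a timer. The function below is a minimal illustration, not the code in `queueService.server.ts`; `shutdownWithTimeout` and `Closer` are hypothetical names. In practice each closer would wrap something like a BullMQ `worker.close()` call or an HTTP server close, and the function would be invoked from a `SIGTERM` handler.

```typescript
type Closer = () => Promise<void>;

// Resolves to "clean" if every closer finishes within the deadline and to
// "timeout" otherwise. A closer that rejects makes the returned promise reject.
async function shutdownWithTimeout(
  closers: Closer[],
  timeoutMs: number = 30_000,
): Promise<"clean" | "timeout"> {
  let cancelTimer: () => void = () => {};
  const deadline = new Promise<"timeout">((resolve) => {
    const timer = setTimeout(() => resolve("timeout"), timeoutMs);
    cancelTimer = () => clearTimeout(timer);
  });

  const closeAll = Promise.all(closers.map((close) => close())).then(
    () => "clean" as const,
  );

  try {
    return await Promise.race([closeAll, deadline]);
  } finally {
    // Avoid keeping the process alive for the rest of the timeout.
    cancelTimer();
  }
}
```

The caller decides what to do with a `"timeout"` result, for example logging the unfinished work and exiting with a non-zero code.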
## Compliance Notes

### Deprecated Synchronous Endpoints

The following endpoints process flyers synchronously and are **deprecated**:

- `POST /api/ai/upload-legacy` - For integration testing only
- `POST /api/ai/flyers/process` - Legacy workflow, should migrate to queue-based approach

New integrations MUST use `POST /api/ai/upload-and-process` for queue-based processing.

### Email Handling

- **Bulk emails** (deal notifications): Enqueued via `emailQueue`
- **Transactional emails** (password reset): Sent synchronously for immediate user feedback

## Future Enhancements

Potential improvements for consideration:

1. **Dead Letter Queue (DLQ)**: Move permanently failed jobs to a dedicated queue for analysis
2. **Job Priority Levels**: Allow priority-based processing for different job types
3. **Real-time Progress**: WebSocket/SSE for live job progress updates to clients
4. **Per-Queue Rate Limiting**: Throttle job processing based on external API limits
5. **Job Dependencies**: Support for jobs that depend on completion of other jobs
6. **Prometheus Metrics**: Export queue metrics for observability dashboards

@@ -2,7 +2,7 @@

**Date**: 2025-12-12

-**Status**: Proposed
+**Status**: Accepted

## Context

@@ -20,3 +20,107 @@ We will implement a multi-layered caching strategy using an in-memory data store

**Positive**: Directly addresses application performance and scalability. Reduces database load and improves API response times for common requests.
**Negative**: Introduces Redis as a dependency if not already used. Adds complexity to the data-fetching logic and requires careful management of cache invalidation to prevent stale data.

## Implementation Details

### Cache Service

A centralized cache service (`src/services/cacheService.server.ts`) provides reusable caching functionality:

- **`getOrSet<T>(key, fetcher, options)`**: Cache-aside pattern implementation
- **`get<T>(key)`**: Retrieve cached value
- **`set<T>(key, value, ttl)`**: Store value with TTL
- **`del(key)`**: Delete specific key
- **`invalidatePattern(pattern)`**: Delete keys matching a pattern

All cache operations are fail-safe: cache failures do not break the application.

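The cache-aside and fail-safe behaviour described above can be illustrated as follows. This is a simplified sketch, not the contents of `cacheService.server.ts`: the signature differs from the documented `getOrSet<T>(key, fetcher, options)` (the store is passed in and the TTL is a plain number), and `CacheStore` and `MemoryStore` are hypothetical stand-ins for the Redis client.

```typescript
// Minimal store interface; an in-memory Map stands in for Redis here.
interface CacheStore {
  get(key: string): Promise<string | null>;
  set(key: string, value: string, ttlSeconds: number): Promise<void>;
}

class MemoryStore implements CacheStore {
  private data = new Map<string, { value: string; expiresAt: number }>();

  async get(key: string): Promise<string | null> {
    const entry = this.data.get(key);
    if (!entry) return null;
    if (entry.expiresAt <= Date.now()) {
      this.data.delete(key); // expired entries behave like a miss
      return null;
    }
    return entry.value;
  }

  async set(key: string, value: string, ttlSeconds: number): Promise<void> {
    this.data.set(key, { value, expiresAt: Date.now() + ttlSeconds * 1000 });
  }
}

// Cache-aside: try the cache, fall back to the fetcher, then store the result.
async function getOrSet<T>(
  store: CacheStore,
  key: string,
  fetcher: () => Promise<T>,
  ttlSeconds: number,
): Promise<T> {
  try {
    const cached = await store.get(key);
    if (cached !== null) return JSON.parse(cached) as T;
  } catch {
    // Fail-safe: a broken cache read falls through to the fetcher.
  }

  const fresh = await fetcher();

  try {
    await store.set(key, JSON.stringify(fresh), ttlSeconds);
  } catch {
    // Fail-safe: a failed cache write must not fail the request.
  }
  return fresh;
}
```

Errors thrown by the fetcher itself are deliberately not swallowed, since they indicate a real failure in the underlying data source rather than in the cache.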
### TTL Configuration

Different data types use different TTL values based on volatility:

| Data Type         | TTL        | Rationale                             |
| ----------------- | ---------- | ------------------------------------- |
| Brands/Stores     | 1 hour     | Rarely changes, safe to cache longer  |
| Flyer lists       | 5 minutes  | Changes when new flyers are added     |
| Individual flyers | 10 minutes | Stable once created                   |
| Flyer items       | 10 minutes | Stable once created                   |
| Statistics        | 5 minutes  | Can be slightly stale                 |
| Frequent sales    | 15 minutes | Aggregated data, updated periodically |
| Categories        | 1 hour     | Rarely changes                        |

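Expressed as configuration, the table might look like the constant below. The values come from the table; the object name, key names, and the `ttlFor` helper are hypothetical. Seconds are used because that is the unit Redis expiry options such as `EX` take.

```typescript
const MINUTE_S = 60;
const HOUR_S = 60 * MINUTE_S;

// TTLs in seconds, taken from the table above. Key names are illustrative.
const CACHE_TTL_SECONDS = {
  brands: HOUR_S,
  stores: HOUR_S,
  flyerLists: 5 * MINUTE_S,
  flyer: 10 * MINUTE_S,
  flyerItems: 10 * MINUTE_S,
  stats: 5 * MINUTE_S,
  frequentSales: 15 * MINUTE_S,
  categories: HOUR_S,
} as const;

type CachedDataType = keyof typeof CACHE_TTL_SECONDS;

// Single lookup point, so call sites never hard-code a TTL.
function ttlFor(type: CachedDataType): number {
  return CACHE_TTL_SECONDS[type];
}
```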
### Cache Key Strategy

Cache keys follow a consistent prefix pattern for pattern-based invalidation:

- `cache:brands` - All brands list
- `cache:flyers:{limit}:{offset}` - Paginated flyer lists
- `cache:flyer:{id}` - Individual flyer data
- `cache:flyer-items:{flyerId}` - Items for a specific flyer
- `cache:stats:*` - Statistics data
- `geocode:{address}` - Geocoding results (30-day TTL)

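Building keys through small helper functions keeps the formats above in one place. The `cacheKeys` object below is a hypothetical sketch; it assumes numeric IDs and uses the address string as given, since the source does not describe any normalization.

```typescript
// Hypothetical key builders mirroring the formats listed above.
const cacheKeys = {
  brands: (): string => "cache:brands",
  flyerList: (limit: number, offset: number): string =>
    `cache:flyers:${limit}:${offset}`,
  flyer: (id: number): string => `cache:flyer:${id}`,
  flyerItems: (flyerId: number): string => `cache:flyer-items:${flyerId}`,
  geocode: (address: string): string => `geocode:${address}`,
};
```

Note that list keys start with `cache:flyers` while single-flyer keys start with `cache:flyer:`, so a prefix pattern such as `cache:flyers*` matches the paginated lists without touching individual flyer entries.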
### Cached Endpoints

The following repository methods implement server-side caching:

| Method                            | Cache Key Pattern               | TTL        |
| --------------------------------- | ------------------------------- | ---------- |
| `FlyerRepository.getAllBrands()`  | `cache:brands`                  | 1 hour     |
| `FlyerRepository.getFlyers()`     | `cache:flyers:{limit}:{offset}` | 5 minutes  |
| `FlyerRepository.getFlyerItems()` | `cache:flyer-items:{flyerId}`   | 10 minutes |

### Cache Invalidation

**Event-based invalidation** is triggered on write operations:

- **Flyer creation** (`FlyerPersistenceService.saveFlyer`): Invalidates all `cache:flyers*` keys
- **Flyer deletion** (`FlyerRepository.deleteFlyer`): Invalidates specific flyer and flyer items cache, plus flyer lists

**Manual invalidation** via admin endpoints:

- `POST /api/admin/system/clear-cache` - Clears all application cache (flyers, brands, stats)
- `POST /api/admin/system/clear-geocode-cache` - Clears geocoding cache

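Pattern-based invalidation, as used for the `cache:flyers*` keys on flyer creation, can be illustrated against an in-memory `Map`. This sketch supports only the `*` wildcard and is not the project's `invalidatePattern`; against Redis the same idea would typically iterate keys with `SCAN ... MATCH` and delete the matches.

```typescript
// Convert a glob where `*` matches any run of characters into a RegExp.
// Other glob features (`?`, character classes) are not handled in this sketch.
function globToRegExp(pattern: string): RegExp {
  const escaped = pattern
    .split("*")
    .map((part) => part.replace(/[.+?^${}()|[\]\\]/g, "\\$&"))
    .join(".*");
  return new RegExp(`^${escaped}$`);
}

// Deletes every matching key and returns how many entries were removed.
function invalidatePattern(cache: Map<string, unknown>, pattern: string): number {
  const matcher = globToRegExp(pattern);
  let removed = 0;
  // Copy the keys first so deletion does not interfere with iteration.
  for (const key of Array.from(cache.keys())) {
    if (matcher.test(key)) {
      cache.delete(key);
      removed += 1;
    }
  }
  return removed;
}
```

Returning the number of removed keys makes the effect of an invalidation easy to log or assert on.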
### Client-Side Caching

TanStack React Query provides client-side caching with configurable stale times:

| Query Type     | Stale Time |
| -------------- | ---------- |
| Categories     | 1 hour     |
| Master Items   | 10 minutes |
| Flyer Items    | 5 minutes  |
| Flyers         | 2 minutes  |
| Shopping Lists | 1 minute   |
| Activity Log   | 30 seconds |

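TanStack Query takes `staleTime` in milliseconds, so the table could be expressed as the constant below. The object and key names are hypothetical, and `isStale` is only a simplified model of the idea: data younger than its stale time is served from the client cache as fresh, while older data is still shown but becomes eligible for a background refetch.

```typescript
const minutes = (n: number): number => n * 60_000;

// Stale times in milliseconds, taken from the table above. Names are illustrative.
const STALE_TIME_MS = {
  categories: minutes(60),
  masterItems: minutes(10),
  flyerItems: minutes(5),
  flyers: minutes(2),
  shoppingLists: minutes(1),
  activityLog: 30_000,
} as const;

// Simplified model: data is stale once `staleTimeMs` has elapsed since the fetch.
function isStale(fetchedAtMs: number, nowMs: number, staleTimeMs: number): boolean {
  return nowMs - fetchedAtMs >= staleTimeMs;
}
```

A query hook would then pass the matching value, for example `staleTime: STALE_TIME_MS.flyers`, in its query options.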
### Multi-Layer Cache Architecture

```text
Client Request
↓
[TanStack React Query] ← Client-side cache (staleTime-based)
↓
[Express API]
↓
[CacheService.getOrSet()] ← Server-side Redis cache (TTL-based)
↓
[PostgreSQL Database]
```

## Key Files

- `src/services/cacheService.server.ts` - Centralized cache service
- `src/services/db/flyer.db.ts` - Repository with caching for brands, flyers, flyer items
- `src/services/flyerPersistenceService.server.ts` - Cache invalidation on flyer creation
- `src/routes/admin.routes.ts` - Admin cache management endpoints
- `src/config/queryClient.ts` - Client-side query cache configuration

## Future Enhancements

1. **Recipe caching**: Add caching to expensive recipe queries (by-sale-percentage, etc.)
2. **Cache warming**: Pre-populate cache on startup for frequently accessed static data
3. **Cache metrics**: Add hit/miss rate monitoring for observability
4. **Conditional caching**: Skip cache for authenticated user-specific data
5. **Cache compression**: Compress large cached payloads to reduce Redis memory usage
