Files
flyer-crawler.projectium.com/docs/adr/0009-caching-strategy-for-read-heavy-operations.md
Torben Sorensen 664ad291be
All checks were successful
Deploy to Test Environment / deploy-to-test (push) Successful in 30m3s
integration test fixes - claude for the win? try 3
2026-01-09 03:41:57 -08:00

127 lines
5.2 KiB
Markdown

# ADR-009: Caching Strategy for Read-Heavy Operations
**Date**: 2025-12-12
**Status**: Accepted
## Context
The application has several read-heavy endpoints (e.g., getting flyer items, recipes, brands). As traffic increases, these endpoints will put significant load on the database, even for data that changes infrequently.
## Decision
We will implement a multi-layered caching strategy using an in-memory data store like **Redis**.
1. **Define Cacheable Data**: Identify data suitable for caching (e.g., flyer data, recipe details, brand lists).
2. **Define Invalidation Strategy**: Determine the cache invalidation strategy (e.g., time-to-live (TTL), event-based invalidation on data update).
3. **Implement Cache-Aside Pattern**: The repository layer will be updated to implement a "Cache-Aside" pattern, where methods first check Redis for data before falling back to the database.
## Consequences
**Positive**: Directly addresses application performance and scalability. Reduces database load and improves API response times for common requests.
**Negative**: Introduces Redis as a dependency if not already used. Adds complexity to the data-fetching logic and requires careful management of cache invalidation to prevent stale data.
## Implementation Details
### Cache Service
A centralized cache service (`src/services/cacheService.server.ts`) provides reusable caching functionality:
- **`getOrSet<T>(key, fetcher, options)`**: Cache-aside pattern implementation
- **`get<T>(key)`**: Retrieve cached value
- **`set<T>(key, value, ttl)`**: Store value with TTL
- **`del(key)`**: Delete specific key
- **`invalidatePattern(pattern)`**: Delete keys matching a pattern
All cache operations are fail-safe - cache failures do not break the application.
### TTL Configuration
Different data types use different TTL values based on volatility:
| Data Type | TTL | Rationale |
| ------------------- | --------- | -------------------------------------- |
| Brands/Stores | 1 hour | Rarely changes, safe to cache longer |
| Flyer lists | 5 minutes | Changes when new flyers are added |
| Individual flyers | 10 minutes| Stable once created |
| Flyer items | 10 minutes| Stable once created |
| Statistics | 5 minutes | Can be slightly stale |
| Frequent sales | 15 minutes| Aggregated data, updated periodically |
| Categories | 1 hour | Rarely changes |
### Cache Key Strategy
Cache keys follow a consistent prefix pattern for pattern-based invalidation:
- `cache:brands` - All brands list
- `cache:flyers:{limit}:{offset}` - Paginated flyer lists
- `cache:flyer:{id}` - Individual flyer data
- `cache:flyer-items:{flyerId}` - Items for a specific flyer
- `cache:stats:*` - Statistics data
- `geocode:{address}` - Geocoding results (30-day TTL)
### Cached Endpoints
The following repository methods implement server-side caching:
| Method | Cache Key Pattern | TTL |
| ------ | ----------------- | --- |
| `FlyerRepository.getAllBrands()` | `cache:brands` | 1 hour |
| `FlyerRepository.getFlyers()` | `cache:flyers:{limit}:{offset}` | 5 minutes |
| `FlyerRepository.getFlyerItems()` | `cache:flyer-items:{flyerId}` | 10 minutes |
### Cache Invalidation
**Event-based invalidation** is triggered on write operations:
- **Flyer creation** (`FlyerPersistenceService.saveFlyer`): Invalidates all `cache:flyers*` keys
- **Flyer deletion** (`FlyerRepository.deleteFlyer`): Invalidates specific flyer and flyer items cache, plus flyer lists
**Manual invalidation** via admin endpoints:
- `POST /api/admin/system/clear-cache` - Clears all application cache (flyers, brands, stats)
- `POST /api/admin/system/clear-geocode-cache` - Clears geocoding cache
### Client-Side Caching
TanStack React Query provides client-side caching with configurable stale times:
| Query Type | Stale Time |
| ----------------- | ----------- |
| Categories | 1 hour |
| Master Items | 10 minutes |
| Flyer Items | 5 minutes |
| Flyers | 2 minutes |
| Shopping Lists | 1 minute |
| Activity Log | 30 seconds |
### Multi-Layer Cache Architecture
```text
Client Request
[TanStack React Query] ← Client-side cache (staleTime-based)
[Express API]
[CacheService.getOrSet()] ← Server-side Redis cache (TTL-based)
[PostgreSQL Database]
```
## Key Files
- `src/services/cacheService.server.ts` - Centralized cache service
- `src/services/db/flyer.db.ts` - Repository with caching for brands, flyers, flyer items
- `src/services/flyerPersistenceService.server.ts` - Cache invalidation on flyer creation
- `src/routes/admin.routes.ts` - Admin cache management endpoints
- `src/config/queryClient.ts` - Client-side query cache configuration
## Future Enhancements
1. **Recipe caching**: Add caching to expensive recipe queries (by-sale-percentage, etc.)
2. **Cache warming**: Pre-populate cache on startup for frequently accessed static data
3. **Cache metrics**: Add hit/miss rate monitoring for observability
4. **Conditional caching**: Skip cache for authenticated user-specific data
5. **Cache compression**: Compress large cached payloads to reduce Redis memory usage