# Flyer Crawler - System Architecture Overview

**Version**: 0.12.20
**Last Updated**: 2026-01-28
**Platform**: Linux (Production and Development)

---

## Table of Contents

1. [Executive Summary](#executive-summary)
2. [System Architecture Diagram](#system-architecture-diagram)
3. [Technology Stack](#technology-stack)
4. [System Components](#system-components)
5. [Data Flow](#data-flow)
6. [Architecture Layers](#architecture-layers)
7. [Key Entities](#key-entities)
8. [Authentication Flow](#authentication-flow)
9. [Background Processing](#background-processing)
10. [Deployment Architecture](#deployment-architecture)
11. [Design Principles and ADRs](#design-principles-and-adrs)
12. [Key Files Reference](#key-files-reference)

---

## Executive Summary

**Flyer Crawler** is a grocery deal extraction and analysis platform that uses AI-powered processing to extract deals from grocery store flyer images and PDFs. The system provides users with features including watchlists, price history tracking, shopping lists, deal alerts, and recipe management.

### Core Capabilities | Domain | Description | | ------------------------- | --------------------------------------------------------------------------------------- | | **Deal Extraction** | AI-powered extraction of deals from grocery store flyer images/PDFs using Google Gemini | | **Price Tracking** | Historical price data, trend analysis, and price alerts | | **User Features** | Watchlists, shopping lists, recipes, pantry management, achievements | | **Real-time Updates** | WebSocket-based notifications for price alerts and processing status | | **Background Processing** | Asynchronous job queues for flyer processing, emails, and analytics | --- ## System Architecture Diagram ```text +-----------------------------------------------------------------------------------+ | CLIENT LAYER | +-----------------------------------------------------------------------------------+ | | | +-------------------+ +-------------------+ +-------------------+ | | | Web Browser | | Mobile PWA | | API Clients | | | | (React SPA) | | (React SPA) | | (REST/JSON) | | | +--------+----------+ +--------+----------+ +--------+----------+ | | | | | | +-------------|-------------------------|-------------------------|------------------+ | | | v v v +-----------------------------------------------------------------------------------+ | NGINX REVERSE PROXY | | - SSL/TLS Termination - Rate Limiting - Static Asset Serving | | - Load Balancing - Compression - WebSocket Proxying | | - Flyer Images (/flyer-images/) with 7-day cache | +----------------------------------+------------------------------------------------+ | v +-----------------------------------------------------------------------------------+ | APPLICATION LAYER | +-----------------------------------------------------------------------------------+ | | | +-----------------------------------------------------------------------------+ | | | EXPRESS.JS SERVER (Node.js) | | | | | | | | +-------------------------+ +-------------------------+ 
| | | | | Routes Layer | | Middleware Chain | | | | | | - API Endpoints | | - Authentication | | | | | | - Request Validation | | - Rate Limiting | | | | | | - Response Formatting | | - Logging | | | | | +------------+------------+ | - Error Handling | | | | | | +-------------------------+ | | | | v | | | | +-------------------------+ +-------------------------+ | | | | | Services Layer | | External Services | | | | | | - Business Logic | | - Google Gemini AI | | | | | | - Transaction Coord. | | - Google Maps API | | | | | | - Event Publishing | | - OAuth Providers | | | | | +------------+------------+ | - Email (SMTP) | | | | | | +-------------------------+ | | | | v | | | | +-------------------------+ | | | | | Repository Layer | | | | | | - Database Access | | | | | | - Query Construction | | | | | | - Entity Mapping | | | | | +------------+------------+ | | | | | | | | +---------------|-------------------------------------------------------------+ | | | | +------------------|----------------------------------------------------------------+ | v +-----------------------------------------------------------------------------------+ | DATA LAYER | +-----------------------------------------------------------------------------------+ | | | +---------------------------+ +---------------------------+ | | | PostgreSQL 16 | | Redis 7 | | | | (with PostGIS) | | | | | | | | - Session Cache | | | | - Primary Data Store | | - Query Cache | | | | - Geographic Queries | | - Job Queue Backing | | | | - Full-Text Search | | - Rate Limit Counters | | | | - Stored Functions | | - Real-time Pub/Sub | | | +---------------------------+ +---------------------------+ | | | +-----------------------------------------------------------------------------------+ +-----------------------------------------------------------------------------------+ | BACKGROUND PROCESSING LAYER | +-----------------------------------------------------------------------------------+ | | | 
+---------------------------+ +---------------------------+ | | | PM2 Process | | BullMQ Workers | | | | Manager | | | | | | | | - Flyer Processing | | | | - Process Clustering | | - Receipt Processing | | | | - Auto-restart | | - Email Sending | | | | - Log Management | | - Analytics Reports | | | | - Health Monitoring | | - File Cleanup | | | +---------------------------+ | - Token Cleanup | | | | - Expiry Alerts | | | | - Barcode Detection | | | +---------------------------+ | | | +-----------------------------------------------------------------------------------+ +-----------------------------------------------------------------------------------+ | OBSERVABILITY LAYER | +-----------------------------------------------------------------------------------+ | | | +------------------+ +------------------+ +------------------+ | | | Bugsink/Sentry | | Pino Logger | | Logstash | | | | (Error Track) | | (Structured) | | (Aggregation) | | | +------------------+ +------------------+ +------------------+ | | | +-----------------------------------------------------------------------------------+ ``` --- ## Technology Stack ### Core Technologies | Component | Technology | Version | Purpose | | ---------------------- | ---------- | -------- | -------------------------------- | | **Runtime** | Node.js | 22.x LTS | Server-side JavaScript runtime | | **Language** | TypeScript | 5.9.3 | Type-safe JavaScript superset | | **Web Framework** | Express.js | 5.1.0 | HTTP server and routing | | **Frontend Framework** | React | 19.2.0 | UI component library | | **Build Tool** | Vite | 7.2.4 | Frontend bundling and dev server | ### Data Storage | Component | Technology | Version | Purpose | | --------------------- | ---------------- | ------- | ---------------------------------------------- | | **Primary Database** | PostgreSQL | 16.x | Relational data storage | | **Spatial Extension** | PostGIS | 3.x | Geographic queries and store location features | | **Cache & Queues** | Redis | 
7.x | Caching, session storage, job queue backing |
| **File Storage** | Local Filesystem | - | Uploaded flyers and processed images |

### AI and External Services

| Component | Technology | Purpose |
| --------------- | --------------------------- | --------------------------------------- |
| **AI Provider** | Google Gemini | Flyer data extraction, image analysis |
| **Geocoding** | Google Maps API / Nominatim | Address geocoding and location services |
| **OAuth** | Google, GitHub | Social authentication |
| **Email** | Nodemailer (SMTP) | Transactional emails |

### Background Processing Stack

| Component | Technology | Version | Purpose |
| ------------------- | ---------- | ------- | --------------------------------- |
| **Job Queues** | BullMQ | 5.65.1 | Reliable async job processing |
| **Process Manager** | PM2 | Latest | Process management and clustering |
| **Scheduler** | node-cron | 4.2.1 | Scheduled tasks |

### Frontend Stack

| Component | Technology | Version | Purpose |
| -------------------- | -------------- | ------- | ---------------------------------------- |
| **State Management** | TanStack Query | 5.90.12 | Server state caching and synchronization |
| **Routing** | React Router | 7.9.6 | Client-side routing |
| **Styling** | Tailwind CSS | 4.1.17 | Utility-first CSS framework |
| **Icons** | Lucide React | 0.555.0 | Icon components |
| **Charts** | Recharts | 3.4.1 | Data visualization |

### Observability and Quality

| Component | Technology | Purpose |
| --------------------- | ---------------- | ----------------------------- |
| **Error Tracking** | Sentry / Bugsink | Error monitoring and alerting |
| **Logging** | Pino | Structured JSON logging |
| **Log Aggregation** | Logstash | Centralized log collection |
| **Testing** | Vitest | Unit and integration testing |
| **API Documentation** | Swagger/OpenAPI | Interactive API documentation |

---

## System Components

### Frontend (React/Vite)

The frontend is a single-page application (SPA) built
with React 19 and Vite.

**Key Characteristics**:

- Server state management via TanStack Query
- Neo-Brutalism design system (ADR-012)
- Responsive design for mobile and desktop
- PWA-capable for offline access

**Directory Structure**:

```text
src/
+-- components/     # Reusable UI components
+-- contexts/       # React context providers
+-- features/       # Feature-specific modules (ADR-047)
+-- hooks/          # Custom React hooks
+-- layouts/        # Page layout components
+-- pages/          # Route page components
+-- services/       # API client services
```

### Backend (Express/Node.js)

The backend is a RESTful API server built with Express.js 5.

**Key Characteristics**:

- Layered architecture (Routes -> Services -> Repositories)
- JWT-based authentication with OAuth support
- Request validation via Zod schemas
- Structured logging with Pino
- Standardized error handling (ADR-001)

**API Route Modules** (all versioned under `/api/v1/*`):

| Route | Purpose |
| ------------------------- | ----------------------------------------------- |
| `/api/v1/auth` | Authentication (login, register, OAuth) |
| `/api/v1/health` | Health checks and monitoring |
| `/api/v1/system` | System administration (PM2 status, server info) |
| `/api/v1/users` | User profile management |
| `/api/v1/ai` | AI-powered features and flyer processing |
| `/api/v1/admin` | Administrative functions |
| `/api/v1/budgets` | Budget management and spending analysis |
| `/api/v1/achievements` | Gamification and achievement system |
| `/api/v1/flyers` | Flyer CRUD and processing |
| `/api/v1/recipes` | Recipe management and recommendations |
| `/api/v1/personalization` | Master items and user preferences |
| `/api/v1/price-history` | Price tracking and trend analysis |
| `/api/v1/stats` | Public statistics and analytics |
| `/api/v1/upc` | UPC barcode scanning and product lookup |
| `/api/v1/inventory` | Inventory and expiry tracking |
| `/api/v1/receipts` | Receipt scanning and purchase history |
| `/api/v1/deals` | Best prices and deal discovery |
| `/api/v1/reactions` | Social features (reactions, sharing) |
| `/api/v1/stores` | Store management and location services |
| `/api/v1/categories` | Category browsing and product categorization |

### Database (PostgreSQL/PostGIS)

PostgreSQL serves as the primary data store with PostGIS extension for geographic queries.

**Key Features**:

- UUID primary keys for user data
- BIGINT IDENTITY for auto-incrementing IDs
- PostGIS geography types for store locations
- Stored functions for complex business logic
- Triggers for automated updates (e.g., `item_count` maintenance)

### Cache (Redis)

Redis provides caching and backing for the job queue system.

**Usage Patterns**:

- Query result caching (flyers, prices, stats)
- Rate limiting counters
- BullMQ job queue storage
- Session token storage

### AI (Google Gemini)

Google Gemini powers the AI extraction capabilities.

**Capabilities**:

- Flyer image analysis and data extraction
- Store name and logo detection
- Deal item parsing (name, price, quantity)
- Date range extraction
- Category classification

### Background Workers (BullMQ/PM2)

BullMQ workers handle asynchronous processing tasks. PM2 manages both the API server and worker processes in production and development environments (ADR-014).

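Each queue in the Job Queues table of this section carries a retry setting such as "3 attempts, exponential backoff (5s)". As a rough illustration of what that means in practice, the sketch below computes the wait before each retry, assuming the delay starts at the configured base and doubles after every failed attempt. The `RetryPolicy` type and `retryDelaysMs` helper are hypothetical names introduced for this example; they are not part of the codebase or of BullMQ.

```typescript
// Hypothetical shape mirroring "N attempts, exponential backoff (base)".
interface RetryPolicy {
  attempts: number; // total tries, including the first one
  baseDelayMs: number; // wait before the first retry
}

// Returns the wait (in ms) before each retry. The first attempt runs
// immediately, so a policy with N attempts yields N - 1 delays.
// Assumption: the delay doubles after each failed attempt.
function retryDelaysMs(policy: RetryPolicy): number[] {
  const delays: number[] = [];
  for (let retry = 1; retry < policy.attempts; retry++) {
    delays.push(policy.baseDelayMs * 2 ** (retry - 1));
  }
  return delays;
}

// Values copied from the Job Queues table for two of the queues.
const flyerProcessing: RetryPolicy = { attempts: 3, baseDelayMs: 5_000 };
const emailSending: RetryPolicy = { attempts: 5, baseDelayMs: 10_000 };

console.log(retryDelaysMs(flyerProcessing)); // 5000 and 10000 ms
console.log(retryDelaysMs(emailSending)); // 10000, 20000, 40000 and 80000 ms
```

Under these assumptions a failing flyer job would be abandoned roughly 15 seconds after its first attempt, while a failing email job would keep retrying for about two and a half minutes.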
**Dev Container PM2 Processes**:

| Process | Script | Purpose |
| -------------------------- | ---------------------------------- | -------------------------- |
| `flyer-crawler-api-dev` | `tsx watch server.ts` | API server with hot reload |
| `flyer-crawler-worker-dev` | `tsx watch src/services/worker.ts` | Background job processing |
| `flyer-crawler-vite-dev` | `vite --host` | Frontend dev server |

**Production PM2 Processes**:

| Process | Script | Purpose |
| -------------------------------- | ------------------ | --------------------------- |
| `flyer-crawler-api` | `tsx server.ts` | API server (cluster mode) |
| `flyer-crawler-worker` | `tsx worker.ts` | Background job processing |
| `flyer-crawler-analytics-worker` | `tsx analytics.ts` | Analytics report generation |

**Job Queues**:

| Queue | Purpose | Retry Strategy |
| ---------------------------- | -------------------------------- | ------------------------------------- |
| `flyer-processing` | Process uploaded flyers with AI | 3 attempts, exponential backoff (5s) |
| `receipt-processing` | OCR and parse receipts | 3 attempts, exponential backoff (10s) |
| `email-sending` | Send transactional emails | 5 attempts, exponential backoff (10s) |
| `analytics-reporting` | Generate daily analytics | 2 attempts, exponential backoff (60s) |
| `weekly-analytics-reporting` | Generate weekly reports | 2 attempts, exponential backoff (1h) |
| `file-cleanup` | Remove temporary files | 3 attempts, exponential backoff (30s) |
| `token-cleanup` | Expire old refresh tokens | 2 attempts, exponential backoff (1h) |
| `expiry-alerts` | Send pantry expiry notifications | 2 attempts, exponential backoff (5m) |
| `barcode-detection` | Process barcode scans | 2 attempts, exponential backoff (5s) |

---

## Data Flow

### Flyer Processing Pipeline

```text
+-------------+     +----------------+     +------------------+     +---------------+
|    User     |     |    Express     |     |      BullMQ      |     |  PostgreSQL   |
|   Upload    +---->+     Route      +---->+      Queue       +---->+    Storage    |
+-------------+     +-------+--------+     +--------+---------+     +-------+-------+
                            |                       |                       |
                            v                       v                       v
                    +-------+--------+     +--------+---------+     +-------+-------+
                    |    Validate    |     |      Worker      |     |     Cache     |
                    |    & Store     |     |     Process      |     |  Invalidate   |
                    |   Temp File    |     |                  |     |               |
                    +----------------+     +--------+---------+     +---------------+
                                                    |
                                                    v
                                           +--------+---------+
                                           |      Google      |
                                           |    Gemini AI     |
                                           |    Extraction    |
                                           +--------+---------+
                                                    |
                                                    v
                                           +--------+---------+
                                           |    Transform     |
                                           |    & Validate    |
                                           |       Data       |
                                           +--------+---------+
                                                    |
                                                    v
                                           +--------+---------+
                                           |    Persist to    |
                                           |     Database     |
                                           |  (Transaction)   |
                                           +--------+---------+
                                                    |
                                                    v
                                           +--------+---------+
                                           |    WebSocket     |
                                           |   Notification   |
                                           +------------------+
```

### Detailed Processing Steps

1. **Upload**: User uploads a flyer image via `/api/v1/flyers/upload`
2. **Validation**: Server validates the file type and size, and generates a checksum
3. **Queueing**: Job is added to the `flyer-processing` queue with the file path
4. **Worker Pickup**: A BullMQ worker picks up the job for processing
5. **AI Extraction**: Google Gemini analyzes the image and extracts:
   - Store name
   - Valid date range
   - Store address (if present)
   - Deal items (name, price, quantity, category)
6. **Data Transformation**: Raw AI output is transformed to the database schema
7. **Persistence**: Transactional insert of flyer + items + store
8. **Cache Invalidation**: Redis cache is cleared for affected queries
9. **Notification**: WebSocket message is sent to the user with the results
10. **Cleanup**: Temporary files are scheduled for deletion

---

## Architecture Layers

The application follows a strict layered architecture as defined in ADR-035.

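As a rough sketch of this layering in TypeScript, the example below wires a route-style handler to a service and a repository, with an in-memory `Map` standing in for PostgreSQL. Every name in it (`Flyer`, `FlyerRepository`, `FlyerService`, `handleGetFlyer`, `ApiResponse`) is hypothetical and chosen for illustration; none is taken from the codebase. The only point is the direction of the calls: the handler talks to the service, and only the repository touches storage.

```typescript
// Hypothetical entity shape, for illustration only.
interface Flyer {
  flyerId: number;
  storeName: string;
  itemCount: number;
}

class NotFoundError extends Error {
  constructor(message: string) {
    super(message);
    this.name = "NotFoundError";
    Object.setPrototypeOf(this, NotFoundError.prototype);
  }
}

// Repository layer: the only place that touches storage.
class FlyerRepository {
  private rows = new Map<number, Flyer>([
    [1, { flyerId: 1, storeName: "Example Foods", itemCount: 42 }],
  ]);

  // Throws when the flyer does not exist.
  getFlyerById(id: number): Flyer {
    const flyer = this.rows.get(id);
    if (!flyer) throw new NotFoundError(`Flyer ${id} not found`);
    return flyer;
  }
}

// Services layer: business logic; knows nothing about HTTP.
class FlyerService {
  private readonly flyers: FlyerRepository;

  constructor(flyers: FlyerRepository) {
    this.flyers = flyers;
  }

  getFlyerSummary(id: number): string {
    const flyer = this.flyers.getFlyerById(id);
    return `${flyer.storeName}: ${flyer.itemCount} deals`;
  }
}

// Routes layer: validates input and formats the response.
interface ApiResponse {
  status: number;
  body: { success: boolean; data?: string; error?: string };
}

function handleGetFlyer(service: FlyerService, rawId: string): ApiResponse {
  const id = Number(rawId);
  if (!Number.isInteger(id) || id <= 0) {
    return { status: 400, body: { success: false, error: "Invalid flyer id" } };
  }
  try {
    const data = service.getFlyerSummary(id);
    return { status: 200, body: { success: true, data } };
  } catch (err) {
    if (err instanceof NotFoundError) {
      return { status: 404, body: { success: false, error: err.message } };
    }
    throw err; // unexpected errors go to the shared error handler
  }
}

const flyerService = new FlyerService(new FlyerRepository());
console.log(handleGetFlyer(flyerService, "1").body.data); // Example Foods: 42 deals
console.log(handleGetFlyer(flyerService, "99").status); // 404
```

Because the service receives its repository through the constructor, a test could pass in a stub repository without any database, which is one common motivation for keeping the layers separate.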
```text
+-----------------------------------------------------------------------+
|                             ROUTES LAYER                              |
|  Responsibilities:                                                    |
|  - HTTP request/response handling                                     |
|  - Input validation (via middleware)                                  |
|  - Authentication/authorization checks                                |
|  - Rate limiting                                                      |
|  - Response formatting (sendSuccess, sendPaginated, sendError)        |
+----------------------------------+------------------------------------+
                                   |
                                   v
+-----------------------------------------------------------------------+
|                            SERVICES LAYER                             |
|  Responsibilities:                                                    |
|  - Business logic orchestration                                       |
|  - Transaction coordination (withTransaction)                         |
|  - External API integration                                           |
|  - Cross-repository operations                                        |
|  - Event publishing                                                   |
+----------------------------------+------------------------------------+
                                   |
                                   v
+-----------------------------------------------------------------------+
|                           REPOSITORY LAYER                            |
|  Responsibilities:                                                    |
|  - Direct database access                                             |
|  - Query construction                                                 |
|  - Entity mapping                                                     |
|  - Error translation (handleDbError)                                  |
+-----------------------------------------------------------------------+
```

### Layer Communication Rules

1. **Routes MUST NOT** directly access repositories (except simple CRUD)
2. **Repositories MUST NOT** call other repositories (use services)
3. **Services MAY** call other services
4. 
**Infrastructure services MAY** be called from any layer ### Service Types and Naming Conventions | Type | Suffix | Example | Location | | ------------------- | ------------- | --------------------- | ------------------ | | Business Service | `*Service.ts` | `authService.ts` | `src/services/` | | Server-Only Service | `*.server.ts` | `aiService.server.ts` | `src/services/` | | Database Repository | `*.db.ts` | `user.db.ts` | `src/services/db/` | | Infrastructure | Descriptive | `logger.server.ts` | `src/services/` | ### Repository Method Naming (ADR-034) | Prefix | Behavior | Return Type | | ------- | ----------------------------------- | -------------- | | `get*` | Throws `NotFoundError` if not found | Entity | | `find*` | Returns `null` if not found | Entity or null | | `list*` | Returns empty array if none found | Entity[] | --- ## Key Entities ### Entity Relationship Overview ```text +------------------+ +------------------+ +------------------+ | users | | profiles | | addresses | |------------------| |------------------| |------------------| | user_id (PK) |<-------->| user_id (PK,FK) |--------->| address_id (PK) | | email | | full_name | | address_line_1 | | password_hash | | avatar_url | | city | | refresh_token | | points | | province_state | +--------+---------+ | role | | latitude | | +------------------+ | longitude | | | location (GIS) | | +--------+---------+ | ^ v | +--------+---------+ +------------------+ +--------+---------+ | stores |--------->| store_locations |--------->| | |------------------| |------------------| | | | store_id (PK) | | store_location_id| | | | name | | store_id (FK) | | | | logo_url | | address_id (FK) | | | +--------+---------+ +------------------+ +------------------+ | v +--------+---------+ +------------------+ +------------------+ | flyers |--------->| flyer_items |--------->| master_grocery_ | |------------------| |------------------| | items | | flyer_id (PK) | | flyer_item_id | |------------------| | store_id (FK) | 
| flyer_id (FK) | | master_grocery_ | | file_name | | item | | item_id (PK) | | image_url | | price_display | | name | | valid_from | | price_in_cents | | category_id (FK) | | valid_to | | quantity | | is_allergen | | status | | master_item_id | +------------------+ | item_count | | category_id (FK) | +------------------+ +------------------+ ``` ### Core Entities | Entity | Table | Purpose | | --------------------- | ---------------------- | --------------------------------------------- | | **User** | `users` | Authentication credentials and login tracking | | **Profile** | `profiles` | Public user data, preferences, points | | **Store** | `stores` | Grocery store chains (Safeway, Kroger, etc.) | | **StoreLocation** | `store_locations` | Physical store locations with addresses | | **Address** | `addresses` | Normalized address storage with geocoding | | **Flyer** | `flyers` | Uploaded flyer metadata and status | | **FlyerItem** | `flyer_items` | Individual deals extracted from flyers | | **MasterGroceryItem** | `master_grocery_items` | Canonical grocery item dictionary | | **Category** | `categories` | Item categorization (Produce, Dairy, etc.) 
| ### User Feature Entities | Entity | Table | Purpose | | -------------------- | --------------------- | ------------------------------------ | | **UserWatchedItem** | `user_watched_items` | Items user wants to track prices for | | **UserAlert** | `user_alerts` | Price alert thresholds | | **ShoppingList** | `shopping_lists` | User shopping lists | | **ShoppingListItem** | `shopping_list_items` | Items on shopping lists | | **Recipe** | `recipes` | User recipes with ingredients | | **RecipeIngredient** | `recipe_ingredients` | Recipe ingredient list | | **PantryItem** | `pantry_items` | User pantry inventory | | **Receipt** | `receipts` | Scanned receipt data | | **ReceiptItem** | `receipt_items` | Items parsed from receipts | ### Gamification Entities | Entity | Table | Purpose | | ------------------- | ------------------- | ------------------------------------- | | **Achievement** | `achievements` | Defined achievements | | **UserAchievement** | `user_achievements` | Achievements earned by users | | **ActivityLog** | `activity_log` | User activity for feeds and analytics | --- ## Authentication Flow ### JWT Token Architecture ```text +-------------------+ +-------------------+ +-------------------+ | Login Request | | Server | | Database | | (email/pass) +---->+ Validates +---->+ Verify User | +-------------------+ +--------+----------+ +-------------------+ | v +--------+----------+ | Generate | | JWT Tokens | | - Access (15m) | | - Refresh (7d) | +--------+----------+ | v +-------------------+ +--------+----------+ | Client Storage |<----+ Return Tokens | | - Access: Memory| | - Access: Body | | - Refresh: HTTP | | - Refresh: Cookie| | Only Cookie | +-------------------+ +-------------------+ ``` ### Authentication Methods 1. **Local Authentication**: Email/password with bcrypt hashing 2. **Google OAuth 2.0**: Social login via Google account 3. 
**GitHub OAuth 2.0**: Social login via GitHub account ### Security Features (ADR-016, ADR-048) - **Rate Limiting**: Login attempts rate-limited per IP - **Account Lockout**: 15-minute lockout after 5 failed attempts - **Password Requirements**: Strength validation via zxcvbn - **JWT Rotation**: Access tokens are short-lived, refresh tokens are rotated - **HTTPS Only**: All production traffic encrypted ### Protected Route Flow ```text +-------------------+ +-------------------+ +-------------------+ | API Request | | requireAuth | | JWT Strategy | | + Bearer Token +---->+ Middleware +---->+ Validate | +-------------------+ +--------+----------+ +--------+----------+ | | | +-------------------+ | | v v +--------+-----+----+ | req.user | | populated | +--------+----------+ | v +--------+----------+ | Route Handler | | Executes | +-------------------+ ``` --- ## Background Processing ### Worker Architecture ```text +-------------------+ +-------------------+ +-------------------+ | API Server | | Redis | | Worker Process | | (Queue Producer)| | (Job Storage) | | (Consumer) | +--------+----------+ +--------+----------+ +--------+----------+ | ^ | | Add Job | Poll/Process | +------------------------>+<------------------------+ | | +-------------------------+-------------------------+ | | | v v v +--------+----------+ +--------+----------+ +--------+----------+ | Flyer Worker | | Email Worker | | Analytics | | Concurrency: 1 | | Concurrency: 10 | | Worker | +-------------------+ +-------------------+ | Concurrency: 1 | +-------------------+ ``` ### Job Lifecycle 1. **Queued**: Job added to queue with data payload 2. **Active**: Worker picks up job and begins processing 3. **Completed**: Job finishes successfully 4. **Failed**: Job encounters error, may retry 5. 
**Delayed**: Job waiting for retry backoff ### Retry Strategy Jobs use exponential backoff for retries: ```text Attempt 1: Immediate Attempt 2: Initial delay (e.g., 5 seconds) Attempt 3: 2x delay (e.g., 10 seconds) Attempt 4: 4x delay (e.g., 20 seconds) ... ``` ### Scheduled Jobs (ADR-037) | Schedule | Job | Purpose | | --------------------- | ---------------- | ------------------------------------------ | | Daily 2:00 AM | Analytics Report | Generate daily usage statistics | | Weekly Sunday 3:00 AM | Weekly Analytics | Generate weekly summary reports | | Every 6 hours | Token Cleanup | Remove expired refresh tokens | | Every hour | Expiry Alerts | Check and send pantry expiry notifications | --- ## Deployment Architecture ### Environment Overview ```text +-----------------------------------------------------------------------------------+ | DEVELOPMENT | +-----------------------------------------------------------------------------------+ | | | +-----------------------------------+ +-----------------------------------+ | | | Windows Host Machine | | Linux Dev Container | | | | - VS Code | | (flyer-crawler-dev) | | | | - Podman Desktop +---->+ - Node.js 22 | | | | - Git | | - PM2 (process manager) | | | +-----------------------------------+ | - PostgreSQL 16 | | | | - Redis 7 | | | | - Bugsink (local) | | | | - Logstash (log aggregation) | | | +-----------------------------------+ | +-----------------------------------------------------------------------------------+ +-----------------------------------------------------------------------------------+ | TEST SERVER | +-----------------------------------------------------------------------------------+ | | | +-----------------------------------+ +-----------------------------------+ | | | NGINX Reverse Proxy | | Application Server | | | | flyer-crawler-test.projectium.com | - PM2 Process Manager | | | | - SSL/TLS (Let's Encrypt) +---->+ - Node.js 22 | | | | - Rate Limiting | | - PostgreSQL 16 | | | 
+-----------------------------------+ | - Redis 7 | | | +-----------------------------------+ | +-----------------------------------------------------------------------------------+ +-----------------------------------------------------------------------------------+ | PRODUCTION | +-----------------------------------------------------------------------------------+ | | | +-----------------------------------+ +-----------------------------------+ | | | NGINX Reverse Proxy | | Application Server | | | | flyer-crawler.projectium.com | - PM2 Process Manager | | | | - SSL/TLS (Let's Encrypt) +---->+ - Node.js 22 (Clustered) | | | | - Rate Limiting | | - PostgreSQL 16 | | | | - Gzip Compression | | - Redis 7 | | | +-----------------------------------+ +-----------------------------------+ | | | | +-----------------------------------+ | | | Monitoring | | | | - Bugsink (Error Tracking) | | | | - Logstash (Log Aggregation) | | | +-----------------------------------+ | +-----------------------------------------------------------------------------------+ ``` ### Deployment Pipeline (ADR-017) ```text +------------+ +------------+ +------------+ +------------+ | Push to | | Gitea | | Build & | | Deploy | | main +---->+ Actions +---->+ Test +---->+ to Prod | +------------+ +------------+ +------------+ +------------+ | v +------+------+ | Type | | Check | +------+------+ | v +------+------+ | Unit | | Tests | +------+------+ | v +------+------+ | Build | | Assets | +-------------+ ``` ### Server Paths | Environment | Web Root | Data Storage | Flyer Images | | ----------- | --------------------------------------------- | ----------------------------------------------------- | ---------------------------------------------------------- | | Production | `/var/www/flyer-crawler.projectium.com/` | `/var/www/flyer-crawler.projectium.com/uploads/` | `/var/www/flyer-crawler.projectium.com/flyer-images/` | | Test | `/var/www/flyer-crawler-test.projectium.com/` | 
`/var/www/flyer-crawler-test.projectium.com/uploads/` | `/var/www/flyer-crawler-test.projectium.com/flyer-images/` |
| Development | Container-local | Container-local | `/app/public/flyer-images/` |

Flyer images are served by NGINX as static files at `/flyer-images/` with 7-day browser caching.

---

## Design Principles and ADRs

The system architecture is governed by Architecture Decision Records (ADRs). Key decisions include:

### Core Infrastructure

| ADR | Title | Status |
| ------- | ------------------------------------ | -------- |
| ADR-001 | Standardized Error Handling | Accepted |
| ADR-002 | Standardized Transaction Management | Accepted |
| ADR-007 | Configuration and Secrets Management | Accepted |
| ADR-020 | Health Checks and Probes | Accepted |

### API and Integration

| ADR | Title | Status |
| ------- | ----------------------------- | ---------------- |
| ADR-003 | Standardized Input Validation | Accepted |
| ADR-008 | API Versioning Strategy | Phase 1 Complete |
| ADR-022 | Real-time Notification System | Proposed |
| ADR-028 | API Response Standardization | Implemented |

**Implementation Guide**: [API Versioning Infrastructure](./api-versioning-infrastructure.md) (Phase 2)

### Security

| ADR | Title | Status |
| ------- | ----------------------- | --------------------- |
| ADR-016 | API Security Hardening | Accepted |
| ADR-032 | Rate Limiting Strategy | Accepted |
| ADR-048 | Authentication Strategy | Partially Implemented |

### Architecture Patterns

| ADR | Title | Status |
| ------- | ---------------------------------- | -------- |
| ADR-034 | Repository Pattern Standards | Accepted |
| ADR-035 | Service Layer Architecture | Accepted |
| ADR-036 | Event Bus and Pub/Sub Pattern | Accepted |
| ADR-041 | AI/Gemini Integration Architecture | Accepted |

### Operations

| ADR | Title | Status |
| ------- | ------------------------------- | --------------------- |
| ADR-006 | Background Job Processing | Accepted |
| ADR-014 | Containerization and Deployment | Partially Implemented |
| ADR-037 | Scheduled Jobs and Cron Pattern | Accepted |
| ADR-038 | Graceful Shutdown Pattern | Accepted |

### Observability

| ADR | Title | Status |
| ------- | --------------------------------- | -------- |
| ADR-004 | Structured Logging | Accepted |
| ADR-015 | APM and Error Tracking | Proposed |
| ADR-050 | PostgreSQL Function Observability | Accepted |

**Full ADR Index**: [docs/adr/index.md](../adr/index.md)

---

## Key Files Reference

### Configuration Files

| File | Purpose |
| ------------------------ | ------------------------------------------------------ |
| `server.ts` | Express application setup and middleware configuration |
| `src/config/env.ts` | Environment variable validation (Zod schema) |
| `src/config/passport.ts` | Authentication strategies (Local, JWT, OAuth) |
| `ecosystem.config.cjs` | PM2 process manager configuration |
| `vite.config.ts` | Vite build and dev server configuration |

### Route Files

| File | API Prefix |
| ----------------------------- | ----------------- |
| `src/routes/auth.routes.ts` | `/api/v1/auth` |
| `src/routes/user.routes.ts` | `/api/v1/users` |
| `src/routes/flyer.routes.ts` | `/api/v1/flyers` |
| `src/routes/recipe.routes.ts` | `/api/v1/recipes` |
| `src/routes/deals.routes.ts` | `/api/v1/deals` |
| `src/routes/store.routes.ts` | `/api/v1/stores` |
| `src/routes/admin.routes.ts` | `/api/v1/admin` |
| `src/routes/health.routes.ts` | `/api/v1/health` |

### Service Files

| File | Purpose |
| ----------------------------------------------- | --------------------------------------- |
| `src/services/flyerProcessingService.server.ts` | Flyer processing pipeline orchestration |
| `src/services/flyerAiProcessor.server.ts` | AI extraction for flyers |
| `src/services/aiService.server.ts` | Google Gemini AI integration |
| `src/services/cacheService.server.ts` | Redis caching abstraction |
| `src/services/emailService.server.ts` | Email sending |
| `src/services/queues.server.ts` | BullMQ queue definitions |
| `src/services/queueService.server.ts` | Queue management and scheduling |
| `src/services/workers.server.ts` | BullMQ worker definitions |
| `src/services/websocketService.server.ts` | Real-time WebSocket notifications |
| `src/services/receiptService.server.ts` | Receipt scanning and OCR |
| `src/services/upcService.server.ts` | UPC barcode lookup |
| `src/services/expiryService.server.ts` | Pantry expiry tracking |
| `src/services/geocodingService.server.ts` | Address geocoding |
| `src/services/analyticsService.server.ts` | Analytics and reporting |
| `src/services/monitoringService.server.ts` | Health monitoring |
| `src/services/barcodeService.server.ts` | Barcode detection |
| `src/services/logger.server.ts` | Structured logging (Pino) |
| `src/services/redis.server.ts` | Redis connection management |
| `src/services/sentry.server.ts` | Error tracking (Sentry/Bugsink) |

### Database Files

| File | Purpose |
| --------------------------------------- | -------------------------------------------- |
| `src/services/db/connection.db.ts` | Database pool and transaction management |
| `src/services/db/errors.db.ts` | Database error types |
| `src/services/db/index.db.ts` | Repository exports |
| `src/services/db/user.db.ts` | User repository |
| `src/services/db/flyer.db.ts` | Flyer repository |
| `src/services/db/store.db.ts` | Store repository |
| `src/services/db/storeLocation.db.ts` | Store location repository |
| `src/services/db/recipe.db.ts` | Recipe repository |
| `src/services/db/category.db.ts` | Category repository |
| `src/services/db/personalization.db.ts` | Master items and personalization |
| `src/services/db/shopping.db.ts` | Shopping lists repository |
| `src/services/db/deals.db.ts` | Deals and best prices repository |
| `src/services/db/price.db.ts` | Price history repository |
| `src/services/db/receipt.db.ts` | Receipt repository |
| `src/services/db/upc.db.ts` | UPC scan history repository |
| `src/services/db/expiry.db.ts` | Expiry tracking repository |
| `src/services/db/gamification.db.ts` | Achievements repository |
| `src/services/db/budget.db.ts` | Budget repository |
| `src/services/db/reaction.db.ts` | User reactions repository |
| `src/services/db/notification.db.ts` | Notifications repository |
| `src/services/db/address.db.ts` | Address repository |
| `src/services/db/admin.db.ts` | Admin operations repository |
| `src/services/db/conversion.db.ts` | Unit conversion repository |
| `src/services/db/flyerLocation.db.ts` | Flyer locations repository |
| `sql/master_schema_rollup.sql` | Complete database schema (for test DB setup) |
| `sql/initial_schema.sql` | Fresh installation schema |

### Type Definitions

| File | Purpose |
| ----------------------- | ---------------------------- |
| `src/types.ts` | Core entity type definitions |
| `src/types/job-data.ts` | BullMQ job payload types |

---

## Additional Resources

- **API Documentation**: Available at `/docs/api-docs` in development environments
- **Testing Guide**: [docs/tests/](../tests/)
- **Getting Started**: [docs/getting-started/](../getting-started/)
- **Operations Guide**: [docs/operations/](../operations/)
- **Authentication Details**: [docs/architecture/AUTHENTICATION.md](./AUTHENTICATION.md)
- **Database Schema**: [docs/architecture/DATABASE.md](./DATABASE.md)
- **WebSocket Usage**: [docs/architecture/WEBSOCKET_USAGE.md](./WEBSOCKET_USAGE.md)

---

_This document is maintained as part of the Flyer Crawler project documentation. For updates, contact the development team or submit a pull request._