flyer-crawler.projectium.com/docs/architecture/OVERVIEW.md
Torben Sorensen f10c6c0cd6
Complete ADR-008 Phase 2
2026-01-27 11:06:09 -08:00

Flyer Crawler - System Architecture Overview

Version: 0.12.5 | Last Updated: 2026-01-22 | Platform: Linux (Production and Development)


Table of Contents

  1. Executive Summary
  2. System Architecture Diagram
  3. Technology Stack
  4. System Components
  5. Data Flow
  6. Architecture Layers
  7. Key Entities
  8. Authentication Flow
  9. Background Processing
  10. Deployment Architecture
  11. Design Principles and ADRs
  12. Key Files Reference

Executive Summary

Flyer Crawler is a grocery deal extraction and analysis platform. It uses Google Gemini to extract deals from grocery store flyer images and PDFs, and gives users watchlists, price history tracking, shopping lists, deal alerts, and recipe management.

Core Capabilities

| Domain | Description |
| --- | --- |
| Deal Extraction | AI-powered extraction of deals from grocery store flyer images/PDFs using Google Gemini |
| Price Tracking | Historical price data, trend analysis, and price alerts |
| User Features | Watchlists, shopping lists, recipes, pantry management, achievements |
| Real-time Updates | WebSocket-based notifications for price alerts and processing status |
| Background Processing | Asynchronous job queues for flyer processing, emails, and analytics |

System Architecture Diagram

+-----------------------------------------------------------------------------------+
|                                   CLIENT LAYER                                     |
+-----------------------------------------------------------------------------------+
|                                                                                   |
|    +-------------------+     +-------------------+     +-------------------+       |
|    |   Web Browser     |     |   Mobile PWA      |     |   API Clients     |       |
|    |   (React SPA)     |     |   (React SPA)     |     |   (REST/JSON)     |       |
|    +--------+----------+     +--------+----------+     +--------+----------+       |
|             |                         |                         |                  |
+-------------|-------------------------|-------------------------|------------------+
              |                         |                         |
              v                         v                         v
+-----------------------------------------------------------------------------------+
|                              NGINX REVERSE PROXY                                   |
|    - SSL/TLS Termination    - Rate Limiting    - Static Asset Serving             |
|    - Load Balancing         - Compression      - WebSocket Proxying               |
|    - Flyer Images (/flyer-images/) with 7-day cache                               |
+----------------------------------+------------------------------------------------+
                                   |
                                   v
+-----------------------------------------------------------------------------------+
|                              APPLICATION LAYER                                     |
+-----------------------------------------------------------------------------------+
|                                                                                   |
|  +-----------------------------------------------------------------------------+  |
|  |                         EXPRESS.JS SERVER (Node.js)                         |  |
|  |                                                                             |  |
|  |  +-------------------------+  +-------------------------+                   |  |
|  |  |     Routes Layer        |  |    Middleware Chain     |                   |  |
|  |  |  - API Endpoints        |  |  - Authentication       |                   |  |
|  |  |  - Request Validation   |  |  - Rate Limiting        |                   |  |
|  |  |  - Response Formatting  |  |  - Logging              |                   |  |
|  |  +------------+------------+  |  - Error Handling       |                   |  |
|  |               |               +-------------------------+                   |  |
|  |               v                                                             |  |
|  |  +-------------------------+  +-------------------------+                   |  |
|  |  |    Services Layer       |  |  External Services      |                   |  |
|  |  |  - Business Logic       |  |  - Google Gemini AI     |                   |  |
|  |  |  - Transaction Coord.   |  |  - Google Maps API      |                   |  |
|  |  |  - Event Publishing     |  |  - OAuth Providers      |                   |  |
|  |  +------------+------------+  |  - Email (SMTP)         |                   |  |
|  |               |               +-------------------------+                   |  |
|  |               v                                                             |  |
|  |  +-------------------------+                                                |  |
|  |  |   Repository Layer      |                                                |  |
|  |  |  - Database Access      |                                                |  |
|  |  |  - Query Construction   |                                                |  |
|  |  |  - Entity Mapping       |                                                |  |
|  |  +------------+------------+                                                |  |
|  |               |                                                             |  |
|  +---------------|-------------------------------------------------------------+  |
|                  |                                                                |
+------------------|----------------------------------------------------------------+
                   |
                   v
+-----------------------------------------------------------------------------------+
|                               DATA LAYER                                          |
+-----------------------------------------------------------------------------------+
|                                                                                   |
|  +---------------------------+     +---------------------------+                  |
|  |      PostgreSQL 16        |     |        Redis 7           |                  |
|  |   (with PostGIS)          |     |                          |                  |
|  |                           |     |  - Session Cache         |                  |
|  |  - Primary Data Store     |     |  - Query Cache           |                  |
|  |  - Geographic Queries     |     |  - Job Queue Backing     |                  |
|  |  - Full-Text Search       |     |  - Rate Limit Counters   |                  |
|  |  - Stored Functions       |     |  - Real-time Pub/Sub     |                  |
|  +---------------------------+     +---------------------------+                  |
|                                                                                   |
+-----------------------------------------------------------------------------------+

+-----------------------------------------------------------------------------------+
|                           BACKGROUND PROCESSING LAYER                             |
+-----------------------------------------------------------------------------------+
|                                                                                   |
|  +---------------------------+     +---------------------------+                  |
|  |      PM2 Process          |     |     BullMQ Workers        |                  |
|  |      Manager              |     |                           |                  |
|  |                           |     |  - Flyer Processing       |                  |
|  |  - Process Clustering     |     |  - Receipt Processing     |                  |
|  |  - Auto-restart           |     |  - Email Sending          |                  |
|  |  - Log Management         |     |  - Analytics Reports      |                  |
|  |  - Health Monitoring      |     |  - File Cleanup           |                  |
|  +---------------------------+     |  - Token Cleanup          |                  |
|                                    |  - Expiry Alerts          |                  |
|                                    |  - Barcode Detection      |                  |
|                                    +---------------------------+                  |
|                                                                                   |
+-----------------------------------------------------------------------------------+

+-----------------------------------------------------------------------------------+
|                            OBSERVABILITY LAYER                                    |
+-----------------------------------------------------------------------------------+
|                                                                                   |
|  +------------------+  +------------------+  +------------------+                  |
|  |   Bugsink/Sentry |  |   Pino Logger    |  |   Logstash       |                  |
|  |   (Error Track)  |  |   (Structured)   |  |   (Aggregation)  |                  |
|  +------------------+  +------------------+  +------------------+                  |
|                                                                                   |
+-----------------------------------------------------------------------------------+

Technology Stack

Core Technologies

| Component | Technology | Version | Purpose |
| --- | --- | --- | --- |
| Runtime | Node.js | 22.x LTS | Server-side JavaScript runtime |
| Language | TypeScript | 5.9.x | Type-safe JavaScript superset |
| Web Framework | Express.js | 5.1.x | HTTP server and routing |
| Frontend Framework | React | 19.2.x | UI component library |
| Build Tool | Vite | 7.2.x | Frontend bundling and dev server |

Data Storage

| Component | Technology | Version | Purpose |
| --- | --- | --- | --- |
| Primary Database | PostgreSQL | 16.x | Relational data storage |
| Spatial Extension | PostGIS | 3.x | Geographic queries and store location features |
| Cache & Queues | Redis | 7.x | Caching, session storage, job queue backing |
| File Storage | Local Filesystem | - | Uploaded flyers and processed images |

AI and External Services

| Component | Technology | Purpose |
| --- | --- | --- |
| AI Provider | Google Gemini | Flyer data extraction, image analysis |
| Geocoding | Google Maps API / Nominatim | Address geocoding and location services |
| OAuth | Google, GitHub | Social authentication |
| Email | Nodemailer (SMTP) | Transactional emails |

Background Processing

| Component | Technology | Version | Purpose |
| --- | --- | --- | --- |
| Job Queues | BullMQ | 5.65.x | Reliable async job processing |
| Process Manager | PM2 | Latest | Process management and clustering |
| Scheduler | node-cron | 4.2.x | Scheduled tasks |

Frontend Stack

| Component | Technology | Version | Purpose |
| --- | --- | --- | --- |
| State Management | TanStack Query | 5.90.x | Server state caching and synchronization |
| Routing | React Router | 7.9.x | Client-side routing |
| Styling | Tailwind CSS | 4.1.x | Utility-first CSS framework |
| Icons | Lucide React | 0.555.x | Icon components |
| Charts | Recharts | 3.4.x | Data visualization |

Observability and Quality

| Component | Technology | Purpose |
| --- | --- | --- |
| Error Tracking | Sentry / Bugsink | Error monitoring and alerting |
| Logging | Pino | Structured JSON logging |
| Log Aggregation | Logstash | Centralized log collection |
| Testing | Vitest | Unit and integration testing |
| API Documentation | Swagger/OpenAPI | Interactive API documentation |

System Components

Frontend (React/Vite)

The frontend is a single-page application (SPA) built with React 19 and Vite.

Key Characteristics:

  • Server state management via TanStack Query
  • Neo-Brutalism design system (ADR-012)
  • Responsive design for mobile and desktop
  • PWA-capable for offline access

Directory Structure:

src/
+-- components/      # Reusable UI components
+-- contexts/        # React context providers
+-- features/        # Feature-specific modules (ADR-047)
+-- hooks/           # Custom React hooks
+-- layouts/         # Page layout components
+-- pages/           # Route page components
+-- services/        # API client services

Backend (Express/Node.js)

The backend is a RESTful API server built with Express.js 5.

Key Characteristics:

  • Layered architecture (Routes -> Services -> Repositories)
  • JWT-based authentication with OAuth support
  • Request validation via Zod schemas
  • Structured logging with Pino
  • Standardized error handling (ADR-001)

API Route Modules:

| Route | Purpose |
| --- | --- |
| /api/auth | Authentication (login, register, OAuth) |
| /api/users | User profile management |
| /api/flyers | Flyer CRUD and processing |
| /api/recipes | Recipe management |
| /api/deals | Best prices and deal discovery |
| /api/stores | Store management |
| /api/admin | Administrative functions |
| /api/health | Health checks and monitoring |

Database (PostgreSQL/PostGIS)

PostgreSQL serves as the primary data store, with the PostGIS extension handling geographic queries.

Key Features:

  • UUID primary keys for user data
  • BIGINT IDENTITY for auto-incrementing IDs
  • PostGIS geography types for store locations
  • Stored functions for complex business logic
  • Triggers for automated updates (e.g., item_count maintenance)

Cache (Redis)

Redis provides caching and backs the BullMQ job queue system.

Usage Patterns:

  • Query result caching (flyers, prices, stats)
  • Rate limiting counters
  • BullMQ job queue storage
  • Session token storage

AI (Google Gemini)

Google Gemini powers the AI extraction capabilities.

Capabilities:

  • Flyer image analysis and data extraction
  • Store name and logo detection
  • Deal item parsing (name, price, quantity)
  • Date range extraction
  • Category classification

Background Workers (BullMQ/PM2)

BullMQ workers handle asynchronous processing tasks. PM2 manages both the API server and worker processes in production and development environments (ADR-014).

Dev Container PM2 Processes:

| Process | Script | Purpose |
| --- | --- | --- |
| flyer-crawler-api-dev | tsx watch server.ts | API server with hot reload |
| flyer-crawler-worker-dev | tsx watch src/services/worker.ts | Background job processing |
| flyer-crawler-vite-dev | vite --host | Frontend dev server |

Production PM2 Processes:

| Process | Script | Purpose |
| --- | --- | --- |
| flyer-crawler-api | tsx server.ts | API server (cluster mode) |
| flyer-crawler-worker | tsx worker.ts | Background job processing |
| flyer-crawler-analytics-worker | tsx analytics.ts | Analytics report generation |

Job Queues:

| Queue | Purpose | Retry Strategy |
| --- | --- | --- |
| flyer-processing | Process uploaded flyers with AI | 3 attempts, exponential backoff (5s) |
| receipt-processing | OCR and parse receipts | 3 attempts, exponential backoff (10s) |
| email-sending | Send transactional emails | 5 attempts, exponential backoff (10s) |
| analytics-reporting | Generate daily analytics | 2 attempts, exponential backoff (60s) |
| weekly-analytics-reporting | Generate weekly reports | 2 attempts, exponential backoff (1h) |
| file-cleanup | Remove temporary files | 3 attempts, exponential backoff (30s) |
| token-cleanup | Expire old refresh tokens | 2 attempts, exponential backoff (1h) |
| expiry-alerts | Send pantry expiry notifications | 2 attempts, exponential backoff (5m) |
| barcode-detection | Process barcode scans | 2 attempts, exponential backoff (5s) |
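
The retry settings for the queues listed above could be expressed as a typed configuration map. The sketch below uses plain objects whose shape mirrors BullMQ's `attempts` and `backoff` job options. The queue names and values come from the list above, while `RetryOptions`, `exponential`, and `queueRetryOptions` are hypothetical names, and how the project actually wires these values into its queues is an assumption.

```typescript
// Shape mirrors BullMQ job options: `attempts` plus an exponential `backoff`.
interface RetryOptions {
  attempts: number;
  backoff: { type: "exponential"; delay: number }; // delay in milliseconds
}

const SECOND = 1_000;
const MINUTE = 60 * SECOND;
const HOUR = 60 * MINUTE;

// Small helper so each queue entry reads like a row of the documented list.
const exponential = (attempts: number, delay: number): RetryOptions => ({
  attempts,
  backoff: { type: "exponential", delay },
});

// Values copied from the Job Queues list in this document.
const queueRetryOptions: Record<string, RetryOptions> = {
  "flyer-processing": exponential(3, 5 * SECOND),
  "receipt-processing": exponential(3, 10 * SECOND),
  "email-sending": exponential(5, 10 * SECOND),
  "analytics-reporting": exponential(2, 60 * SECOND),
  "weekly-analytics-reporting": exponential(2, 1 * HOUR),
  "file-cleanup": exponential(3, 30 * SECOND),
  "token-cleanup": exponential(2, 1 * HOUR),
  "expiry-alerts": exponential(2, 5 * MINUTE),
  "barcode-detection": exponential(2, 5 * SECOND),
};
```

With BullMQ, an object of this shape would typically be supplied as the queue's `defaultJobOptions` or as the options argument when adding a job.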

Data Flow

Flyer Processing Pipeline

+-------------+     +----------------+     +------------------+     +---------------+
|   User      |     |   Express      |     |   BullMQ         |     |   PostgreSQL  |
|   Upload    +---->+   Route        +---->+   Queue          +---->+   Storage     |
+-------------+     +-------+--------+     +--------+---------+     +-------+-------+
                            |                       |                       |
                            v                       v                       v
                    +-------+--------+     +--------+---------+     +-------+-------+
                    |   Validate     |     |   Worker         |     |   Cache       |
                    |   & Store      |     |   Process        |     |   Invalidate  |
                    |   Temp File    |     |                  |     |               |
                    +----------------+     +--------+---------+     +---------------+
                                                   |
                                                   v
                                          +--------+---------+
                                          |   Google         |
                                          |   Gemini AI      |
                                          |   Extraction     |
                                          +--------+---------+
                                                   |
                                                   v
                                          +--------+---------+
                                          |   Transform      |
                                          |   & Validate     |
                                          |   Data           |
                                          +--------+---------+
                                                   |
                                                   v
                                          +--------+---------+
                                          |   Persist to     |
                                          |   Database       |
                                          |   (Transaction)  |
                                          +--------+---------+
                                                   |
                                                   v
                                          +--------+---------+
                                          |   WebSocket      |
                                          |   Notification   |
                                          +------------------+

Detailed Processing Steps

  1. Upload: User uploads flyer image via /api/flyers/upload
  2. Validation: Server validates the file type and size, then generates a checksum
  3. Queueing: Job added to flyer-processing queue with file path
  4. Worker Pickup: BullMQ worker picks up job for processing
  5. AI Extraction: Google Gemini analyzes image and extracts:
    • Store name
    • Valid date range
    • Store address (if present)
    • Deal items (name, price, quantity, category)
  6. Data Transformation: Raw AI output transformed to database schema
  7. Persistence: Transactional insert of flyer + items + store
  8. Cache Invalidation: Redis cache cleared for affected queries
  9. Notification: WebSocket message sent to user with results
  10. Cleanup: Temporary files scheduled for deletion
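
Steps 2 and 3 above (validation, checksum, queueing) can be sketched as follows. The allowed MIME types, the 10 MiB size limit, the use of SHA-256, and all type and function names are assumptions made for illustration, since this document does not state them. An in-memory array stands in for the flyer-processing queue.

```typescript
import { createHash } from "node:crypto";

// Assumed limits; the document does not state the real ones.
const ALLOWED_MIME_TYPES = new Set(["image/jpeg", "image/png", "application/pdf"]);
const MAX_UPLOAD_BYTES = 10 * 1024 * 1024;

interface UploadedFile {
  path: string;
  mimeType: string;
  bytes: Uint8Array;
}

interface FlyerJob {
  filePath: string;
  checksum: string;
}

// Step 2: validate type and size, then compute a checksum (SHA-256 assumed).
function validateUpload(file: UploadedFile): string {
  if (!ALLOWED_MIME_TYPES.has(file.mimeType)) {
    throw new Error(`Unsupported file type: ${file.mimeType}`);
  }
  if (file.bytes.length === 0 || file.bytes.length > MAX_UPLOAD_BYTES) {
    throw new Error("File is empty or exceeds the size limit");
  }
  return createHash("sha256").update(file.bytes).digest("hex");
}

// Step 3: enqueue a job that carries the file path.
// A plain array stands in for the flyer-processing queue.
const flyerProcessingQueue: FlyerJob[] = [];

function enqueueFlyer(file: UploadedFile): FlyerJob {
  const job: FlyerJob = { filePath: file.path, checksum: validateUpload(file) };
  flyerProcessingQueue.push(job);
  return job;
}
```

Computing the checksum before queueing means a duplicate upload could be detected without spending an AI extraction call, although whether the project uses the checksum this way is not stated here.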

Architecture Layers

The application follows a strict layered architecture as defined in ADR-035.

+-----------------------------------------------------------------------+
|                          ROUTES LAYER                                  |
|  Responsibilities:                                                     |
|  - HTTP request/response handling                                      |
|  - Input validation (via middleware)                                   |
|  - Authentication/authorization checks                                 |
|  - Rate limiting                                                       |
|  - Response formatting (sendSuccess, sendPaginated, sendError)         |
+----------------------------------+------------------------------------+
                                   |
                                   v
+-----------------------------------------------------------------------+
|                         SERVICES LAYER                                 |
|  Responsibilities:                                                     |
|  - Business logic orchestration                                        |
|  - Transaction coordination (withTransaction)                          |
|  - External API integration                                            |
|  - Cross-repository operations                                         |
|  - Event publishing                                                    |
+----------------------------------+------------------------------------+
                                   |
                                   v
+-----------------------------------------------------------------------+
|                        REPOSITORY LAYER                                |
|  Responsibilities:                                                     |
|  - Direct database access                                              |
|  - Query construction                                                  |
|  - Entity mapping                                                      |
|  - Error translation (handleDbError)                                   |
+-----------------------------------------------------------------------+

Layer Communication Rules

  1. Routes MUST NOT directly access repositories (except simple CRUD)
  2. Repositories MUST NOT call other repositories (use services)
  3. Services MAY call other services
  4. Infrastructure services MAY be called from any layer
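
A compact sketch of these rules, with one function per layer, may help. It is not taken from the codebase: an in-memory `Map` stands in for PostgreSQL, the route is a plain function rather than an Express handler, and `flyerRepo`, `flyerService`, `postFlyerRoute`, and the response envelope are hypothetical.

```typescript
interface Flyer {
  flyerId: number;
  storeId: number;
  itemCount: number;
}

// Repository layer: data access only (a Map stands in for PostgreSQL).
const flyerRows = new Map<number, Flyer>();
const flyerRepo = {
  findFlyerById: (id: number): Flyer | null => flyerRows.get(id) ?? null,
  insertFlyer: (flyer: Flyer): void => {
    flyerRows.set(flyer.flyerId, flyer);
  },
};

// Services layer: business rules; calls repositories and never sees HTTP objects.
const flyerService = {
  createFlyer(flyer: Flyer): Flyer {
    if (flyerRepo.findFlyerById(flyer.flyerId)) {
      throw new Error("Flyer already exists");
    }
    flyerRepo.insertFlyer(flyer);
    return flyer;
  },
};

// Routes layer: turns a request into a service call and formats the response.
interface ApiResponse {
  status: number;
  body: { success: boolean; data?: unknown; error?: string };
}

function postFlyerRoute(requestBody: Flyer): ApiResponse {
  try {
    const created = flyerService.createFlyer(requestBody);
    return { status: 201, body: { success: true, data: created } };
  } catch (err) {
    return { status: 409, body: { success: false, error: (err as Error).message } };
  }
}
```

The route never touches `flyerRepo` directly, so the duplicate check lives in one place and can be reused by a background worker that calls the same service.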

Service Types and Naming Conventions

| Type | Suffix | Example | Location |
| --- | --- | --- | --- |
| Business Service | `*Service.ts` | `authService.ts` | `src/services/` |
| Server-Only Service | `*.server.ts` | `aiService.server.ts` | `src/services/` |
| Database Repository | `*.db.ts` | `user.db.ts` | `src/services/db/` |
| Infrastructure | Descriptive | `logger.server.ts` | `src/services/` |

Repository Method Naming (ADR-034)

| Prefix | Behavior | Return Type |
| --- | --- | --- |
| `get*` | Throws NotFoundError if not found | Entity |
| `find*` | Returns null if not found | Entity or null |
| `list*` | Returns empty array if none found | Entity[] |
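
The three prefixes can be illustrated with a small in-memory repository. This is a sketch of the naming convention only: the `User` shape, the sample data, and the function names are hypothetical, and the `NotFoundError` class here is a local stand-in for whatever the project defines.

```typescript
// Local stand-in for the project's NotFoundError.
class NotFoundError extends Error {
  constructor(message: string) {
    super(message);
    this.name = "NotFoundError";
  }
}

interface User {
  userId: string;
  email: string;
}

// Sample rows standing in for the users table.
const users: User[] = [{ userId: "u1", email: "a@example.com" }];

// find*: returns null when no row matches.
function findUserById(userId: string): User | null {
  return users.find((u) => u.userId === userId) ?? null;
}

// get*: throws NotFoundError when no row matches.
function getUserById(userId: string): User {
  const user = findUserById(userId);
  if (!user) {
    throw new NotFoundError(`User ${userId} not found`);
  }
  return user;
}

// list*: returns an empty array when nothing matches.
function listUsersByEmailDomain(domain: string): User[] {
  return users.filter((u) => u.email.endsWith(`@${domain}`));
}
```

Callers pick the prefix that matches their intent: `get*` when absence is an error, `find*` when absence is a normal outcome to branch on.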

Key Entities

Entity Relationship Overview

+------------------+          +------------------+          +------------------+
|      users       |          |     profiles     |          |    addresses     |
|------------------|          |------------------|          |------------------|
| user_id (PK)     |<-------->| user_id (PK,FK)  |--------->| address_id (PK)  |
| email            |          | full_name        |          | address_line_1   |
| password_hash    |          | avatar_url       |          | city             |
| refresh_token    |          | points           |          | province_state   |
+--------+---------+          | role             |          | latitude         |
         |                    +------------------+          | longitude        |
         |                                                  | location (GIS)   |
         |                                                  +--------+---------+
         |                                                           ^
         v                                                           |
+--------+---------+          +------------------+          +--------+---------+
|      stores      |--------->| store_locations  |--------->|                  |
|------------------|          |------------------|          |                  |
| store_id (PK)    |          | store_location_id|          |                  |
| name             |          | store_id (FK)    |          |                  |
| logo_url         |          | address_id (FK)  |          |                  |
+--------+---------+          +------------------+          +------------------+
         |
         v
+--------+---------+          +------------------+          +------------------+
|      flyers      |--------->|   flyer_items    |--------->| master_grocery_  |
|------------------|          |------------------|          |     items        |
| flyer_id (PK)    |          | flyer_item_id    |          |------------------|
| store_id (FK)    |          | flyer_id (FK)    |          | master_grocery_  |
| file_name        |          | item             |          |   item_id (PK)   |
| image_url        |          | price_display    |          | name             |
| valid_from       |          | price_in_cents   |          | category_id (FK) |
| valid_to         |          | quantity         |          | is_allergen      |
| status           |          | master_item_id   |          +------------------+
| item_count       |          | category_id (FK) |
+------------------+          +------------------+

Core Entities

| Entity | Table | Purpose |
| --- | --- | --- |
| User | users | Authentication credentials and login tracking |
| Profile | profiles | Public user data, preferences, points |
| Store | stores | Grocery store chains (Safeway, Kroger, etc.) |
| StoreLocation | store_locations | Physical store locations with addresses |
| Address | addresses | Normalized address storage with geocoding |
| Flyer | flyers | Uploaded flyer metadata and status |
| FlyerItem | flyer_items | Individual deals extracted from flyers |
| MasterGroceryItem | master_grocery_items | Canonical grocery item dictionary |
| Category | categories | Item categorization (Produce, Dairy, etc.) |

User Feature Entities

| Entity | Table | Purpose |
| --- | --- | --- |
| UserWatchedItem | user_watched_items | Items user wants to track prices for |
| UserAlert | user_alerts | Price alert thresholds |
| ShoppingList | shopping_lists | User shopping lists |
| ShoppingListItem | shopping_list_items | Items on shopping lists |
| Recipe | recipes | User recipes with ingredients |
| RecipeIngredient | recipe_ingredients | Recipe ingredient list |
| PantryItem | pantry_items | User pantry inventory |
| Receipt | receipts | Scanned receipt data |
| ReceiptItem | receipt_items | Items parsed from receipts |

Gamification Entities

| Entity | Table | Purpose |
| --- | --- | --- |
| Achievement | achievements | Defined achievements |
| UserAchievement | user_achievements | Achievements earned by users |
| ActivityLog | activity_log | User activity for feeds and analytics |

Authentication Flow

JWT Token Architecture

+-------------------+     +-------------------+     +-------------------+
|   Login Request   |     |   Server          |     |   Database        |
|   (email/pass)    +---->+   Validates       +---->+   Verify User     |
+-------------------+     +--------+----------+     +-------------------+
                                   |
                                   v
                          +--------+----------+
                          |   Generate        |
                          |   JWT Tokens      |
                          |   - Access (15m)  |
                          |   - Refresh (7d)  |
                          +--------+----------+
                                   |
                                   v
+-------------------+     +--------+----------+
|   Client Storage  |<----+   Return Tokens   |
|   - Access: Memory|     |   - Access: Body  |
|   - Refresh: HTTP |     |  - Refresh: Cookie|
|     Only Cookie   |     +-------------------+
+-------------------+

Authentication Methods

  1. Local Authentication: Email/password with bcrypt hashing
  2. Google OAuth 2.0: Social login via Google account
  3. GitHub OAuth 2.0: Social login via GitHub account

Security Features (ADR-016, ADR-048)

  • Rate Limiting: Login attempts rate-limited per IP
  • Account Lockout: 15-minute lockout after 5 failed attempts
  • Password Requirements: Strength validation via zxcvbn
  • JWT Rotation: Access tokens are short-lived; refresh tokens are rotated
  • HTTPS Only: All production traffic encrypted

Protected Route Flow

+-------------------+     +-------------------+     +-------------------+
|   API Request     |     |   requireAuth     |     |   JWT Strategy    |
|   + Bearer Token  +---->+   Middleware      +---->+   Validate        |
+-------------------+     +--------+----------+     +--------+----------+
                                   |                         |
                                   |     +-------------------+
                                   |     |
                                   v     v
                          +--------+-----+----+
                          |   req.user        |
                          |   populated       |
                          +--------+----------+
                                   |
                                   v
                          +--------+----------+
                          |   Route Handler   |
                          |   Executes        |
                          +-------------------+
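
The protected route flow can be sketched as a guard that reads the Bearer token, validates it, and populates `req.user`. To stay self-contained, this sketch uses a simplified request type instead of Express's `(req, res, next)` signature and takes the token verifier as a parameter. The `AuthUser` fields and the result shape are assumptions, and the project's actual `requireAuth` middleware may differ.

```typescript
interface AuthUser {
  userId: string;
  role: string;
}

// Simplified stand-in for an Express request.
interface AuthRequest {
  headers: Record<string, string | undefined>;
  user?: AuthUser;
}

// Stand-in for the JWT strategy: returns the user for a valid token, else null.
type VerifyToken = (token: string) => AuthUser | null;

type AuthResult = { ok: true } | { ok: false; status: 401 };

function requireAuth(verify: VerifyToken) {
  return (req: AuthRequest): AuthResult => {
    // Expect a header of the form "Authorization: Bearer <token>".
    const header = req.headers["authorization"] ?? "";
    const [scheme, token] = header.split(" ");
    const user = scheme === "Bearer" && token ? verify(token) : null;
    if (!user) {
      return { ok: false, status: 401 };
    }
    req.user = user; // populated for the route handler, as in the diagram
    return { ok: true };
  };
}
```

In an Express application the same logic would call `next()` on success and send a 401 response on failure.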

Background Processing

Worker Architecture

+-------------------+     +-------------------+     +-------------------+
|   API Server      |     |   Redis           |     |   Worker Process  |
|   (Queue Producer)|     |   (Job Storage)   |     |   (Consumer)      |
+--------+----------+     +--------+----------+     +--------+----------+
         |                         ^                         |
         |    Add Job              |    Poll/Process         |
         +------------------------>+<------------------------+
                                   |
                                   |
         +-------------------------+-------------------------+
         |                         |                         |
         v                         v                         v
+--------+----------+     +--------+----------+     +--------+----------+
|   Flyer Worker    |     |   Email Worker    |     |   Analytics       |
|   Concurrency: 1  |     |   Concurrency: 10 |     |   Worker          |
+-------------------+     +-------------------+     |   Concurrency: 1  |
                                                   +-------------------+

Job Lifecycle

  1. Queued: Job added to queue with data payload
  2. Active: Worker picks up job and begins processing
  3. Completed: Job finishes successfully
  4. Failed: Job encounters error, may retry
  5. Delayed: Job waiting for retry backoff
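
The lifecycle above can be modeled as a small state machine. The transition table below follows the five states as this document describes them and is an interpretation, not BullMQ's internal state model; `JobState`, `transitions`, and the helper functions are hypothetical names.

```typescript
type JobState = "queued" | "active" | "completed" | "failed" | "delayed";

// Allowed transitions, read from the lifecycle list in this document.
const transitions: Record<JobState, JobState[]> = {
  queued: ["active"],
  active: ["completed", "failed"],
  failed: ["delayed"], // only taken when retry attempts remain
  delayed: ["active"], // retried once the backoff has elapsed
  completed: [],
};

function canTransition(from: JobState, to: JobState): boolean {
  return transitions[from].includes(to);
}

// A failed job is retried only while it has attempts left.
function shouldRetry(attemptsMade: number, maxAttempts: number): boolean {
  return attemptsMade < maxAttempts;
}
```

For a queue configured with 3 attempts, a job whose first and second attempts fail moves through Delayed each time, and a third failure leaves it in Failed.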

Retry Strategy

Jobs use exponential backoff for retries:

Attempt 1: Immediate
Attempt 2: Initial delay (e.g., 5 seconds)
Attempt 3: 2x delay (e.g., 10 seconds)
Attempt 4: 4x delay (e.g., 20 seconds)
...
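
The schedule above can be computed with a short function. This is a sketch of the doubling rule exactly as listed here, with the first attempt immediate and each later delay twice the previous one; the function name and its parameters are hypothetical.

```typescript
// Delay in milliseconds before a given attempt (1-based).
// Attempt 1 runs immediately, attempt 2 waits the initial delay,
// and every later attempt waits twice as long as the one before it.
function delayBeforeAttempt(attempt: number, initialDelayMs: number): number {
  if (!Number.isInteger(attempt) || attempt < 1) {
    throw new RangeError("attempt must be a positive integer");
  }
  return attempt === 1 ? 0 : initialDelayMs * 2 ** (attempt - 2);
}

// Delays for a queue with a 5-second initial delay.
const flyerDelays = [1, 2, 3, 4].map((attempt) => delayBeforeAttempt(attempt, 5_000));
// flyerDelays is [0, 5000, 10000, 20000]
```

Because the delay doubles each time, queues with long initial delays (such as the 1-hour cleanup queues) are configured with only 2 attempts.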

Scheduled Jobs (ADR-037)

| Schedule | Job | Purpose |
| --- | --- | --- |
| Daily 2:00 AM | Analytics Report | Generate daily usage statistics |
| Weekly Sunday 3:00 AM | Weekly Analytics | Generate weekly summary reports |
| Every 6 hours | Token Cleanup | Remove expired refresh tokens |
| Every hour | Expiry Alerts | Check and send pantry expiry notifications |
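
These schedules map naturally onto five-field cron expressions of the kind node-cron accepts. The expressions below are a sketch: the job keys are hypothetical, the document does not say at which minute the 6-hourly and hourly jobs fire (minute 0 is assumed), and times are assumed to be in the server's local time zone.

```typescript
// Five-field cron format: minute hour day-of-month month day-of-week.
const jobSchedules: Record<string, string> = {
  "analytics-report": "0 2 * * *",  // daily at 02:00
  "weekly-analytics": "0 3 * * 0",  // Sundays at 03:00
  "token-cleanup": "0 */6 * * *",   // every 6 hours, on the hour (assumed)
  "expiry-alerts": "0 * * * *",     // every hour, on the hour (assumed)
};

// Basic sanity check: every expression has exactly five fields.
function hasFiveFields(expression: string): boolean {
  return expression.trim().split(/\s+/).length === 5;
}

const allSchedulesValid = Object.values(jobSchedules).every(hasFiveFields);
```

Each expression would be passed to the scheduler together with a callback that adds the corresponding job to its queue.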

Deployment Architecture

Environment Overview

+-----------------------------------------------------------------------------------+
|                              DEVELOPMENT                                           |
+-----------------------------------------------------------------------------------+
|                                                                                   |
|  +-----------------------------------+     +-----------------------------------+  |
|  |   Windows Host Machine            |     |   Linux Dev Container            |  |
|  |   - VS Code                       |     |   (flyer-crawler-dev)            |  |
|  |   - Podman Desktop                +---->+   - Node.js 22                   |  |
|  |   - Git                           |     |   - PM2 (process manager)        |  |
|  +-----------------------------------+     |   - PostgreSQL 16                |  |
|                                            |   - Redis 7                      |  |
|                                            |   - Bugsink (local)              |  |
|                                            |   - Logstash (log aggregation)   |  |
|                                            +-----------------------------------+  |
+-----------------------------------------------------------------------------------+

+-----------------------------------------------------------------------------------+
|                              TEST SERVER                                           |
+-----------------------------------------------------------------------------------+
|                                                                                   |
|  +-----------------------------------+     +-----------------------------------+  |
|  |   NGINX Reverse Proxy             |     |   Application Server              |  |
|  |  flyer-crawler-test.projectium.com|     |   - PM2 Process Manager          |  |
|  |   - SSL/TLS (Let's Encrypt)       +---->+   - Node.js 22                   |  |
|  |   - Rate Limiting                 |     |   - PostgreSQL 16                |  |
|  +-----------------------------------+     |   - Redis 7                      |  |
|                                            +-----------------------------------+  |
+-----------------------------------------------------------------------------------+

+-----------------------------------------------------------------------------------+
|                              PRODUCTION                                            |
+-----------------------------------------------------------------------------------+
|                                                                                   |
|  +-----------------------------------+     +-----------------------------------+  |
|  |   NGINX Reverse Proxy             |     |   Application Server              |  |
|  |   flyer-crawler.projectium.com    |     |   - PM2 Process Manager          |  |
|  |   - SSL/TLS (Let's Encrypt)       +---->+   - Node.js 22 (Clustered)       |  |
|  |   - Rate Limiting                 |     |   - PostgreSQL 16                |  |
|  |   - Gzip Compression              |     |   - Redis 7                      |  |
|  +-----------------------------------+     +-----------------------------------+  |
|                                                                                   |
|                                            +-----------------------------------+  |
|                                            |   Monitoring                      |  |
|                                            |   - Bugsink (Error Tracking)     |  |
|                                            |   - Logstash (Log Aggregation)   |  |
|                                            +-----------------------------------+  |
+-----------------------------------------------------------------------------------+

Deployment Pipeline (ADR-017)

+------------+     +------------+     +------------+     +------------+
|   Push to  |     |   Gitea    |     |   Build &  |     |   Deploy   |
|   main     +---->+   Actions  +---->+   Test     +---->+   to Prod  |
+------------+     +------------+     +------------+     +------------+
                                             |
                                             v
                                      +------+------+
                                      |   Type      |
                                      |   Check     |
                                      +------+------+
                                             |
                                             v
                                      +------+------+
                                      |   Unit      |
                                      |   Tests     |
                                      +------+------+
                                             |
                                             v
                                      +------+------+
                                      |   Build     |
                                      |   Assets    |
                                      +-------------+
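
The gating implied by the diagram can be sketched in a few lines. This is only an illustration, assuming that deployment runs solely after the type check, unit tests, and asset build have all succeeded; the stage names and the `runPipeline` helper are hypothetical and are not taken from the project's Gitea Actions workflow.

```typescript
// Hypothetical stage names mirroring the diagram above.
type StageName = "type-check" | "unit-tests" | "build-assets" | "deploy";

interface Stage {
  name: StageName;
  // Returns true on success. A real pipeline would run a shell command here.
  run: () => boolean;
}

interface PipelineResult {
  completed: StageName[];
  failedAt: StageName | null;
}

// Runs stages in order and stops at the first failure, so a later stage
// (for example "deploy") never runs after an earlier stage has failed.
function runPipeline(stages: Stage[]): PipelineResult {
  const completed: StageName[] = [];
  for (const stage of stages) {
    if (!stage.run()) {
      return { completed, failedAt: stage.name };
    }
    completed.push(stage.name);
  }
  return { completed, failedAt: null };
}

// Simulate a run in which the unit tests fail.
const result = runPipeline([
  { name: "type-check", run: () => true },
  { name: "unit-tests", run: () => false },
  { name: "build-assets", run: () => true },
  { name: "deploy", run: () => true },
]);
console.log(result);
```

In this simulated run only the type check completes, and neither the asset build nor the deploy stage is attempted.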

Server Paths

| Environment | Web Root | Data Storage | Flyer Images |
| --- | --- | --- | --- |
| Production | /var/www/flyer-crawler.projectium.com/ | /var/www/flyer-crawler.projectium.com/uploads/ | /var/www/flyer-crawler.projectium.com/flyer-images/ |
| Test | /var/www/flyer-crawler-test.projectium.com/ | /var/www/flyer-crawler-test.projectium.com/uploads/ | /var/www/flyer-crawler-test.projectium.com/flyer-images/ |
| Development | Container-local | Container-local | /app/public/flyer-images/ |

Flyer images are served by NGINX as static files at /flyer-images/ with 7-day browser caching.
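
As a rough illustration of how the paths and caching window above relate, the following sketch maps an environment to its flyer image directory and expresses the 7-day window in seconds. The `flyerImageLocation` helper and its return shape are hypothetical; the directories are copied from the table, and the `max-age` value assumes the 7-day browser cache is expressed as a `Cache-Control` lifetime.

```typescript
type Environment = "production" | "test" | "development";

// Directories copied from the Server Paths table.
const FLYER_IMAGE_DIRS: Record<Environment, string> = {
  production: "/var/www/flyer-crawler.projectium.com/flyer-images/",
  test: "/var/www/flyer-crawler-test.projectium.com/flyer-images/",
  development: "/app/public/flyer-images/",
};

// 7 days expressed in seconds (7 * 24 * 60 * 60 = 604800).
const SEVEN_DAYS_IN_SECONDS = 7 * 24 * 60 * 60;

interface FlyerImageLocation {
  diskPath: string; // where the file lives on the server
  publicUrl: string; // where NGINX exposes it
  cacheControl: string; // assumed header value for a 7-day browser cache
}

// Hypothetical helper: fileName is assumed to be a plain file name
// with no directory separators.
function flyerImageLocation(env: Environment, fileName: string): FlyerImageLocation {
  return {
    diskPath: FLYER_IMAGE_DIRS[env] + fileName,
    publicUrl: "/flyer-images/" + fileName,
    cacheControl: `max-age=${SEVEN_DAYS_IN_SECONDS}`,
  };
}

const example = flyerImageLocation("production", "weekly-flyer.jpg");
console.log(example);
```

The public URL is the same in every environment, while the on-disk directory differs per environment.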


Design Principles and ADRs

The system architecture is governed by Architecture Decision Records (ADRs). Key decisions include:

Core Infrastructure

| ADR | Title | Status |
| --- | --- | --- |
| ADR-001 | Standardized Error Handling | Accepted |
| ADR-002 | Standardized Transaction Management | Accepted |
| ADR-007 | Configuration and Secrets Management | Accepted |
| ADR-020 | Health Checks and Probes | Accepted |

API and Integration

| ADR | Title | Status |
| --- | --- | --- |
| ADR-003 | Standardized Input Validation | Accepted |
| ADR-008 | API Versioning Strategy | Phase 1 Complete |
| ADR-022 | Real-time Notification System | Proposed |
| ADR-028 | API Response Standardization | Implemented |

Implementation Guide: API Versioning Infrastructure (Phase 2)

Security

| ADR | Title | Status |
| --- | --- | --- |
| ADR-016 | API Security Hardening | Accepted |
| ADR-032 | Rate Limiting Strategy | Accepted |
| ADR-048 | Authentication Strategy | Partially Implemented |

Architecture Patterns

| ADR | Title | Status |
| --- | --- | --- |
| ADR-034 | Repository Pattern Standards | Accepted |
| ADR-035 | Service Layer Architecture | Accepted |
| ADR-036 | Event Bus and Pub/Sub Pattern | Accepted |
| ADR-041 | AI/Gemini Integration Architecture | Accepted |

Operations

| ADR | Title | Status |
| --- | --- | --- |
| ADR-006 | Background Job Processing | Accepted |
| ADR-014 | Containerization and Deployment | Partially Implemented |
| ADR-037 | Scheduled Jobs and Cron Pattern | Accepted |
| ADR-038 | Graceful Shutdown Pattern | Accepted |

Observability

| ADR | Title | Status |
| --- | --- | --- |
| ADR-004 | Structured Logging | Accepted |
| ADR-015 | APM and Error Tracking | Proposed |
| ADR-050 | PostgreSQL Function Observability | Accepted |

Full ADR Index: docs/adr/index.md


Key Files Reference

Configuration Files

| File | Purpose |
| --- | --- |
| server.ts | Express application setup and middleware configuration |
| src/config/env.ts | Environment variable validation (Zod schema) |
| src/config/passport.ts | Authentication strategies (Local, JWT, OAuth) |
| ecosystem.config.cjs | PM2 process manager configuration |
| vite.config.ts | Vite build and dev server configuration |

Route Files

| File | API Prefix |
| --- | --- |
| src/routes/auth.routes.ts | /api/auth |
| src/routes/user.routes.ts | /api/users |
| src/routes/flyer.routes.ts | /api/flyers |
| src/routes/recipe.routes.ts | /api/recipes |
| src/routes/deals.routes.ts | /api/deals |
| src/routes/store.routes.ts | /api/stores |
| src/routes/admin.routes.ts | /api/admin |
| src/routes/health.routes.ts | /api/health |
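
When tracing a request back to its source file, the prefix table above can be treated as a lookup. The sketch below is a standalone illustration, not the project's router wiring; the `routeFileFor` helper is hypothetical, and it matches whole path segments so that a path such as `/api/authx` is not attributed to `/api/auth`.

```typescript
// Prefix-to-file pairs copied from the Route Files table.
const ROUTE_FILES: ReadonlyArray<{ prefix: string; file: string }> = [
  { prefix: "/api/auth", file: "src/routes/auth.routes.ts" },
  { prefix: "/api/users", file: "src/routes/user.routes.ts" },
  { prefix: "/api/flyers", file: "src/routes/flyer.routes.ts" },
  { prefix: "/api/recipes", file: "src/routes/recipe.routes.ts" },
  { prefix: "/api/deals", file: "src/routes/deals.routes.ts" },
  { prefix: "/api/stores", file: "src/routes/store.routes.ts" },
  { prefix: "/api/admin", file: "src/routes/admin.routes.ts" },
  { prefix: "/api/health", file: "src/routes/health.routes.ts" },
];

// Hypothetical helper: returns the route file whose prefix owns the given
// request path, or null when no prefix matches. The path is expected
// without a query string.
function routeFileFor(path: string): string | null {
  const match = ROUTE_FILES.find(
    ({ prefix }) => path === prefix || path.startsWith(prefix + "/"),
  );
  return match ? match.file : null;
}

console.log(routeFileFor("/api/flyers/123"));
console.log(routeFileFor("/api/unknown"));
```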

Service Files

| File | Purpose |
| --- | --- |
| src/services/flyerProcessingService.server.ts | Flyer processing pipeline orchestration |
| src/services/aiService.server.ts | Google Gemini AI integration |
| src/services/cacheService.server.ts | Redis caching abstraction |
| src/services/emailService.server.ts | Email sending |
| src/services/queues.server.ts | BullMQ queue definitions |
| src/services/workers.server.ts | BullMQ worker definitions |

Database Files

| File | Purpose |
| --- | --- |
| src/services/db/connection.db.ts | Database pool and transaction management |
| src/services/db/errors.db.ts | Database error types |
| src/services/db/user.db.ts | User repository |
| src/services/db/flyer.db.ts | Flyer repository |
| sql/master_schema_rollup.sql | Complete database schema (for test DB setup) |
| sql/initial_schema.sql | Fresh installation schema |

Type Definitions

| File | Purpose |
| --- | --- |
| src/types.ts | Core entity type definitions |
| src/types/job-data.ts | BullMQ job payload types |

Additional Resources


This document is maintained as part of the Flyer Crawler project documentation. For updates, contact the development team or submit a pull request.