Database Architecture
Version: 0.12.20
Last Updated: 2026-01-28
Flyer Crawler uses PostgreSQL 16 with PostGIS for geographic data, pg_trgm for fuzzy text search, and uuid-ossp for UUID generation. The database contains 65 tables organized into 15 logical domains.
Table of Contents
- Schema Overview
- Database Setup
- Schema Reference
- Related Documentation
Schema Overview
The database is organized into the following domains:
Core Infrastructure (6 tables)
| Table | Purpose | Primary Key |
| --- | --- | --- |
| users | Authentication credentials and login data | user_id (UUID) |
| profiles | Public user data, preferences, points | user_id (UUID) |
| addresses | Normalized address storage with geocoding | address_id |
| activity_log | User activity audit trail | activity_log_id |
| password_reset_tokens | Temporary tokens for password reset | token_id |
| schema_info | Schema deployment metadata | environment |
Stores and Locations (4 tables)
| Table | Purpose | Primary Key |
| --- | --- | --- |
| stores | Grocery store chains (Safeway, Kroger) | store_id |
| store_locations | Physical store locations with addresses | store_location_id |
| favorite_stores | User store favorites | user_id, store_id |
| store_receipt_patterns | Receipt text patterns for store ID | pattern_id |
Flyers and Items (7 tables)
| Table | Purpose | Primary Key |
| --- | --- | --- |
| flyers | Uploaded flyer metadata and status | flyer_id |
| flyer_items | Individual deals extracted from flyers | flyer_item_id |
| flyer_locations | Flyer-to-location associations | flyer_location_id |
| categories | Item categorization (Produce, Dairy) | category_id |
| master_grocery_items | Canonical grocery item dictionary | master_grocery_item_id |
| master_item_aliases | Alternative names for master items | alias_id |
| unmatched_flyer_items | Items pending master item matching | unmatched_item_id |
Products and Brands (2 tables)
| Table | Purpose | Primary Key |
| --- | --- | --- |
| brands | Brand names (Coca-Cola, Kraft) | brand_id |
| products | Specific products (master item + brand + size) | product_id |
Price Tracking (3 tables)
| Table | Purpose | Primary Key |
| --- | --- | --- |
| item_price_history | Historical prices for master items | price_history_id |
| user_submitted_prices | User-contributed price reports | submission_id |
| suggested_corrections | Suggested edits to flyer items | correction_id |
User Features (8 tables)
| Table | Purpose | Primary Key |
| --- | --- | --- |
| user_watched_items | Items user wants to track prices for | user_watched_item_id |
| user_alerts | Price alert thresholds | alert_id |
| notifications | User notifications | notification_id |
| user_item_aliases | User-defined item name aliases | alias_id |
| user_follows | User-to-user follow relationships | follower_id, following_id |
| user_reactions | Reactions to content (likes, etc.) | reaction_id |
| budgets | User-defined spending budgets | budget_id |
| search_queries | Search history for analytics | query_id |
Shopping Lists (5 tables)
| Table | Purpose | Primary Key |
| --- | --- | --- |
| shopping_lists | User shopping lists | shopping_list_id |
| shopping_list_items | Items on shopping lists | shopping_list_item_id |
| shared_shopping_lists | Shopping list sharing | shared_shopping_list_id |
| shopping_trips | Completed shopping trips | trip_id |
| shopping_trip_items | Items purchased on trips | trip_item_id |
Recipes (11 tables)
| Table | Purpose | Primary Key |
| --- | --- | --- |
| recipes | User recipes with metadata | recipe_id |
| recipe_ingredients | Recipe ingredient list | recipe_ingredient_id |
| recipe_ingredient_substitutions | Ingredient alternatives | substitution_id |
| tags | Recipe tags (vegan, quick, etc.) | tag_id |
| recipe_tags | Recipe-to-tag associations | recipe_id, tag_id |
| appliances | Kitchen appliances | appliance_id |
| recipe_appliances | Appliances needed for recipes | recipe_id, appliance_id |
| recipe_ratings | User ratings for recipes | rating_id |
| recipe_comments | User comments on recipes | comment_id |
| favorite_recipes | User recipe favorites | user_id, recipe_id |
| recipe_collections | User recipe collections | collection_id |
Meal Planning (3 tables)
| Table | Purpose | Primary Key |
| --- | --- | --- |
| menu_plans | Weekly/monthly meal plans | menu_plan_id |
| shared_menu_plans | Menu plan sharing | share_id |
| planned_meals | Individual meals in a plan | planned_meal_id |
Pantry and Inventory (5 tables)
| Table | Purpose | Primary Key |
| --- | --- | --- |
| pantry_items | User pantry inventory | pantry_item_id |
| pantry_locations | Storage locations (fridge, freezer) | location_id |
| expiry_date_ranges | Reference shelf life data | expiry_range_id |
| expiry_alerts | User expiry notification preferences | expiry_alert_id |
| expiry_alert_log | Sent expiry notifications | alert_log_id |
Receipts (3 tables)
| Table | Purpose | Primary Key |
| --- | --- | --- |
| receipts | Scanned receipt metadata | receipt_id |
| receipt_items | Items parsed from receipts | receipt_item_id |
| receipt_processing_log | OCR/AI processing audit trail | log_id |
UPC Scanning (2 tables)
| Table | Purpose | Primary Key |
| --- | --- | --- |
| upc_scan_history | User barcode scan history | scan_id |
| upc_external_lookups | External UPC API response cache | lookup_id |
Gamification (2 tables)
| Table | Purpose | Primary Key |
| --- | --- | --- |
| achievements | Defined achievements | achievement_id |
| user_achievements | Achievements earned by users | user_id, achievement_id |
User Preferences (3 tables)
| Table | Purpose | Primary Key |
| --- | --- | --- |
| dietary_restrictions | Defined dietary restrictions | restriction_id |
| user_dietary_restrictions | User dietary preferences | user_id, restriction_id |
| user_appliances | Appliances user owns | user_id, appliance_id |
Reference Data (1 table)
| Table | Purpose | Primary Key |
| --- | --- | --- |
| unit_conversions | Unit conversion factors | conversion_id |
Database Setup
Required Extensions
| Extension | Purpose |
| --- | --- |
| postgis | Geographic/spatial data for store locations |
| pg_trgm | Trigram matching for fuzzy text search |
| uuid-ossp | UUID generation for primary keys |
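As a minimal sketch, the three extensions can be enabled with `CREATE EXTENSION IF NOT EXISTS`. The helper name `extensions_sql` is hypothetical, and whether the schema file already runs these statements is not confirmed here. The function only prints the SQL so it can be reviewed before being piped to a psql session whose role is allowed to create extensions.

```shell
# Hypothetical helper: print the statements that enable the required extensions.
extensions_sql() {
  cat <<'SQL'
CREATE EXTENSION IF NOT EXISTS postgis;
CREATE EXTENSION IF NOT EXISTS pg_trgm;
-- The hyphen in the name makes the double quotes mandatory.
CREATE EXTENSION IF NOT EXISTS "uuid-ossp";
SQL
}

extensions_sql
```

For example, `extensions_sql | sudo -u postgres psql -d flyer-crawler-prod` would apply them to the production database, assuming a local superuser session is available.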
Database Users
This project uses environment-specific database users to isolate production and test environments:
| User | Database | Purpose |
| --- | --- | --- |
| flyer_crawler_prod | flyer-crawler-prod | Production |
| flyer_crawler_test | flyer-crawler-test | Testing |
Production Database Setup
Step 1: Install PostgreSQL
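The exact commands depend on the operating system. The following is a minimal sketch, assuming a Debian or Ubuntu host whose package sources provide PostgreSQL 16 (for example through the PostgreSQL APT repository). The function name `install_postgres` is hypothetical, and the package names are assumptions to verify for your distribution.

```shell
# Assumption: Debian/Ubuntu with PostgreSQL 16 packages available via apt.
# Defined as a function so that nothing is installed until it is called.
install_postgres() {
  sudo apt-get update
  # postgresql-16-postgis-3 is assumed to provide PostGIS; on Debian-style
  # packaging the contrib modules (pg_trgm, uuid-ossp) ship with the server.
  sudo apt-get install -y postgresql-16 postgresql-16-postgis-3
}
```

Calling `install_postgres` then performs the installation.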
Step 2: Create Database and User
Switch to the postgres system user:
Run the following SQL commands (replace 'a_very_strong_password' with a secure password):
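A minimal sketch of those commands, assuming the role and database names from the Database Users table. The helper name `provision_sql` is hypothetical; it prints the SQL rather than executing it, so the statements can be checked first.

```shell
# Hypothetical helper: print the SQL that creates a login role and a database it owns.
# Arguments: database name, role name, password.
# The password is interpolated verbatim, so it must not contain single quotes.
provision_sql() {
  local db="$1" role="$2" password="$3"
  cat <<SQL
CREATE ROLE ${role} WITH LOGIN PASSWORD '${password}';
-- The database name contains hyphens, so it must be double-quoted.
CREATE DATABASE "${db}" OWNER ${role};
SQL
}

provision_sql flyer-crawler-prod flyer_crawler_prod 'a_very_strong_password'
```

Piping the output into a superuser session, for example `provision_sql flyer-crawler-prod flyer_crawler_prod 'a_very_strong_password' | sudo -u postgres psql`, covers both switching to the postgres system user and running the SQL.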
Step 3: Apply the Schema
Navigate to your project directory and run:
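A minimal sketch of that command, assuming psql is on the PATH and the schema file sits at sql/master_schema_rollup.sql relative to the project root. The wrapper name `apply_schema` is hypothetical, and using DB_HOST and DB_PORT as environment variables is an assumption. It would be called as `apply_schema flyer_crawler_prod flyer-crawler-prod`.

```shell
# Hypothetical wrapper around psql; run it from the project root.
# Arguments: role name, database name.
apply_schema() {
  # ON_ERROR_STOP makes psql abort at the first failing statement
  # instead of continuing with a half-applied schema.
  psql --host "${DB_HOST:-localhost}" --port "${DB_PORT:-5432}" \
       --username "$1" --dbname "$2" \
       --set ON_ERROR_STOP=1 \
       --file sql/master_schema_rollup.sql
}
```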
This creates all tables, functions, triggers, and seeds essential data (categories, master items).
Step 4: Seed the Admin Account
Set the required environment variables and run the seed script:
Test Database Setup
The test database is used by CI/CD pipelines and local test runs.
Step 1: Create the Test Database
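A minimal sketch, assuming the test role and database names from the Database Users table and a placeholder password. Making the test role the owner of the database is an assumption, and `test_db_sql` is a hypothetical helper name.

```shell
# Assumed names from the Database Users table; the password is a placeholder.
TEST_DB="flyer-crawler-test"
TEST_ROLE="flyer_crawler_test"
TEST_PASSWORD="another_strong_password"

# Print the SQL for review; pipe it to a superuser psql session to apply it.
test_db_sql() {
  printf "CREATE ROLE %s WITH LOGIN PASSWORD '%s';\n" "$TEST_ROLE" "$TEST_PASSWORD"
  printf 'CREATE DATABASE "%s" OWNER %s;\n' "$TEST_DB" "$TEST_ROLE"
}

test_db_sql
```

The output can be applied with `test_db_sql | sudo -u postgres psql`, assuming a local superuser session is available.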
Step 2: Configure CI/CD Secrets
Ensure these secrets are set in your Gitea repository settings:
Shared:
| Secret | Description |
| --- | --- |
| DB_HOST | Database hostname (e.g., localhost) |
| DB_PORT | Database port (e.g., 5432) |
Production-specific:
| Secret | Description |
| --- | --- |
| DB_USER_PROD | Production database user (flyer_crawler_prod) |
| DB_PASSWORD_PROD | Production database password |
| DB_DATABASE_PROD | Production database name (flyer-crawler-prod) |
Test-specific:
| Secret | Description |
| --- | --- |
| DB_USER_TEST | Test database user (flyer_crawler_test) |
| DB_PASSWORD_TEST | Test database password |
| DB_DATABASE_TEST | Test database name (flyer-crawler-test) |
How the Test Pipeline Works
The CI pipeline uses a permanent test database that gets reset on each test run:
- Setup: The vitest global setup script connects to flyer-crawler-test
- Schema Reset: Executes sql/drop_tables.sql (DROP SCHEMA public CASCADE)
- Schema Application: Runs sql/master_schema_rollup.sql to build a fresh schema
- Test Execution: Tests run against the clean database
This approach is faster than creating/destroying databases and doesn't require sudo access.
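The reset itself is performed by the vitest global setup script. The same sequence can be sketched in shell for a manual reset, assuming the CI secret names listed earlier are exported as environment variables and that PGPASSWORD is used to supply the password. The function names `run_test_sql` and `reset_test_db` are hypothetical.

```shell
# Run one SQL file against the test database, stopping at the first error.
run_test_sql() {
  PGPASSWORD="$DB_PASSWORD_TEST" psql \
    --host "${DB_HOST:-localhost}" --port "${DB_PORT:-5432}" \
    --username "$DB_USER_TEST" --dbname "$DB_DATABASE_TEST" \
    --set ON_ERROR_STOP=1 --file "$1"
}

# Hypothetical manual equivalent of the reset: drop the schema first,
# then rebuild it only if the drop succeeded.
reset_test_db() {
  run_test_sql sql/drop_tables.sql &&
    run_test_sql sql/master_schema_rollup.sql
}
```

Running `reset_test_db` from the project root leaves the test database in the state the tests expect. Neither step needs sudo because both run as the test role over a normal connection.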
Connecting to Production Database
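A minimal sketch, assuming psql is installed on the machine you connect from and that the production role and database use the names listed under Database Users. The wrapper name `connect_prod` is hypothetical, and reading the connection settings from environment variables is an assumption.

```shell
# Hypothetical wrapper: open a psql session on the production database.
# psql prompts for the password unless PGPASSWORD or ~/.pgpass supplies it.
# Extra arguments (for example -c "SELECT 1") are passed through to psql.
connect_prod() {
  psql --host "${DB_HOST:-localhost}" --port "${DB_PORT:-5432}" \
       --username "${DB_USER_PROD:-flyer_crawler_prod}" \
       --dbname "${DB_DATABASE_PROD:-flyer-crawler-prod}" "$@"
}
```

Calling `connect_prod` with no arguments opens an interactive session.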
Checking PostGIS Version
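One way to check is the `PostGIS_Full_Version()` SQL function. A minimal sketch, assuming the production connection details described in this document; the helper name `postgis_version` is hypothetical and would be run as `postgis_version`.

```shell
# Hypothetical helper: print the installed PostGIS version string.
# --tuples-only and --no-align strip the column header and padding.
postgis_version() {
  psql --host "${DB_HOST:-localhost}" --port "${DB_PORT:-5432}" \
       --username "${DB_USER_PROD:-flyer_crawler_prod}" \
       --dbname "${DB_DATABASE_PROD:-flyer-crawler-prod}" \
       --tuples-only --no-align \
       --command "SELECT PostGIS_Full_Version();"
}
```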
Example output:
Schema Files
| File | Purpose |
| --- | --- |
| sql/master_schema_rollup.sql | Complete schema with all tables, functions, and seed data |
| sql/drop_tables.sql | Drops entire schema (used by test runner) |
| sql/schema.sql.txt | Legacy schema file (reference only) |
Backup and Restore
Create a Backup
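A minimal sketch using pg_dump, assuming the production connection details described in this document. The helper name `backup_prod`, the file naming scheme, and the choice of the custom archive format are all assumptions.

```shell
# Hypothetical helper: dump the production database to a timestamped archive
# in the current directory and print the archive's file name on success.
backup_prod() {
  local outfile
  outfile="flyer-crawler-prod_$(date +%Y%m%d_%H%M%S).dump"
  # The custom format is compressed and is restored with pg_restore.
  pg_dump --host "${DB_HOST:-localhost}" --port "${DB_PORT:-5432}" \
          --username "${DB_USER_PROD:-flyer_crawler_prod}" \
          --dbname "${DB_DATABASE_PROD:-flyer-crawler-prod}" \
          --format=custom --file "$outfile" &&
    echo "$outfile"
}
```

Running `backup_prod` writes the archive and prints its name, which is convenient when the call is part of a larger script.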
Restore from Backup
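A minimal sketch using pg_restore, assuming the backup is a custom-format pg_dump archive and that the target database already exists. A plain SQL dump would be replayed with psql instead. The helper name `restore_db` and its argument order are hypothetical.

```shell
# Hypothetical helper: restore a custom-format archive into an existing database.
# Arguments: archive path, target database, role name.
restore_db() {
  local archive="${1:?archive path required}"
  local db="${2:?target database required}"
  local role="${3:?role name required}"
  # --clean --if-exists drops existing objects before recreating them;
  # --no-owner assigns restored objects to the connecting role.
  pg_restore --host "${DB_HOST:-localhost}" --port "${DB_PORT:-5432}" \
             --username "$role" --dbname "$db" \
             --clean --if-exists --no-owner "$archive"
}
```

For example, `restore_db backup.dump flyer-crawler-test flyer_crawler_test` would load an archive into the test database. Because `--clean` drops existing objects, double-check the target before pointing it at production.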
Related Documentation