torbo/flyer-crawler.projectium.com

Torben Sorensen 45ac4fccf5

comprehensive documentation review + test fixes

2026-01-28 16:35:38 -08:00

Database Architecture

Version: 0.12.20
Last Updated: 2026-01-28

Flyer Crawler uses PostgreSQL 16 with PostGIS for geographic data, pg_trgm for fuzzy text search, and uuid-ossp for UUID generation. The database contains 65 tables organized into logical domains.

Contents:

- Schema Overview
- Database Setup
- Schema Reference
- Related Documentation

Schema Overview

The database is organized into the following domains:

Core Infrastructure (6 tables)

Table	Purpose	Primary Key
`users`	Authentication credentials and login data	`user_id` (UUID)
`profiles`	Public user data, preferences, points	`user_id` (UUID)
`addresses`	Normalized address storage with geocoding	`address_id`
`activity_log`	User activity audit trail	`activity_log_id`
`password_reset_tokens`	Temporary tokens for password reset	`token_id`
`schema_info`	Schema deployment metadata	`environment`

Stores and Locations (4 tables)

Table	Purpose	Primary Key
`stores`	Grocery store chains (Safeway, Kroger)	`store_id`
`store_locations`	Physical store locations with addresses	`store_location_id`
`favorite_stores`	User store favorites	`user_id, store_id`
`store_receipt_patterns`	Receipt text patterns used to identify the store	`pattern_id`

Flyers and Items (7 tables)

Table	Purpose	Primary Key
`flyers`	Uploaded flyer metadata and status	`flyer_id`
`flyer_items`	Individual deals extracted from flyers	`flyer_item_id`
`flyer_locations`	Flyer-to-location associations	`flyer_location_id`
`categories`	Item categorization (Produce, Dairy)	`category_id`
`master_grocery_items`	Canonical grocery item dictionary	`master_grocery_item_id`
`master_item_aliases`	Alternative names for master items	`alias_id`
`unmatched_flyer_items`	Items pending master item matching	`unmatched_item_id`

Products and Brands (2 tables)

Table	Purpose	Primary Key
`brands`	Brand names (Coca-Cola, Kraft)	`brand_id`
`products`	Specific products (master item + brand + size)	`product_id`

Price Tracking (3 tables)

Table	Purpose	Primary Key
`item_price_history`	Historical prices for master items	`price_history_id`
`user_submitted_prices`	User-contributed price reports	`submission_id`
`suggested_corrections`	Suggested edits to flyer items	`correction_id`

User Features (8 tables)

Table	Purpose	Primary Key
`user_watched_items`	Items user wants to track prices for	`user_watched_item_id`
`user_alerts`	Price alert thresholds	`alert_id`
`notifications`	User notifications	`notification_id`
`user_item_aliases`	User-defined item name aliases	`alias_id`
`user_follows`	User-to-user follow relationships	`follower_id, following_id`
`user_reactions`	Reactions to content (likes, etc.)	`reaction_id`
`budgets`	User-defined spending budgets	`budget_id`
`search_queries`	Search history for analytics	`query_id`

Shopping Lists (5 tables)

Table	Purpose	Primary Key
`shopping_lists`	User shopping lists	`shopping_list_id`
`shopping_list_items`	Items on shopping lists	`shopping_list_item_id`
`shared_shopping_lists`	Shopping list sharing	`shared_shopping_list_id`
`shopping_trips`	Completed shopping trips	`trip_id`
`shopping_trip_items`	Items purchased on trips	`trip_item_id`

Recipes (11 tables)

Table	Purpose	Primary Key
`recipes`	User recipes with metadata	`recipe_id`
`recipe_ingredients`	Recipe ingredient list	`recipe_ingredient_id`
`recipe_ingredient_substitutions`	Ingredient alternatives	`substitution_id`
`tags`	Recipe tags (vegan, quick, etc.)	`tag_id`
`recipe_tags`	Recipe-to-tag associations	`recipe_id, tag_id`
`appliances`	Kitchen appliances	`appliance_id`
`recipe_appliances`	Appliances needed for recipes	`recipe_id, appliance_id`
`recipe_ratings`	User ratings for recipes	`rating_id`
`recipe_comments`	User comments on recipes	`comment_id`
`favorite_recipes`	User recipe favorites	`user_id, recipe_id`
`recipe_collections`	User recipe collections	`collection_id`

Meal Planning (3 tables)

Table	Purpose	Primary Key
`menu_plans`	Weekly/monthly meal plans	`menu_plan_id`
`shared_menu_plans`	Menu plan sharing	`share_id`
`planned_meals`	Individual meals in a plan	`planned_meal_id`

Pantry and Inventory (5 tables)

Table	Purpose	Primary Key
`pantry_items`	User pantry inventory	`pantry_item_id`
`pantry_locations`	Storage locations (fridge, freezer)	`location_id`
`expiry_date_ranges`	Reference shelf life data	`expiry_range_id`
`expiry_alerts`	User expiry notification preferences	`expiry_alert_id`
`expiry_alert_log`	Sent expiry notifications	`alert_log_id`

Receipts (3 tables)

Table	Purpose	Primary Key
`receipts`	Scanned receipt metadata	`receipt_id`
`receipt_items`	Items parsed from receipts	`receipt_item_id`
`receipt_processing_log`	OCR/AI processing audit trail	`log_id`

UPC Scanning (2 tables)

Table	Purpose	Primary Key
`upc_scan_history`	User barcode scan history	`scan_id`
`upc_external_lookups`	External UPC API response cache	`lookup_id`

Gamification (2 tables)

Table	Purpose	Primary Key
`achievements`	Defined achievements	`achievement_id`
`user_achievements`	Achievements earned by users	`user_id, achievement_id`

User Preferences (3 tables)

Table	Purpose	Primary Key
`dietary_restrictions`	Defined dietary restrictions	`restriction_id`
`user_dietary_restrictions`	User dietary preferences	`user_id, restriction_id`
`user_appliances`	Appliances user owns	`user_id, appliance_id`

Reference Data (1 table)

Table	Purpose	Primary Key
`unit_conversions`	Unit conversion factors	`conversion_id`
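
As a quick cross-check of the stated total of 65 tables, the rows listed in each domain table above can be tallied. The sketch below is illustrative only: the `tablesPerDomain` map is hand-transcribed from the rows in this section rather than read from the database, and the variable names are assumptions.

```typescript
// Number of table rows listed under each domain in the Schema Overview
// (hand-transcribed from this document, not queried from a live database).
const tablesPerDomain: Record<string, number> = {
  "Core Infrastructure": 6,
  "Stores and Locations": 4,
  "Flyers and Items": 7,
  "Products and Brands": 2,
  "Price Tracking": 3,
  "User Features": 8,
  "Shopping Lists": 5,
  "Recipes": 11,
  "Meal Planning": 3,
  "Pantry and Inventory": 5,
  "Receipts": 3,
  "UPC Scanning": 2,
  "Gamification": 2,
  "User Preferences": 3,
  "Reference Data": 1,
};

// Sum the per-domain counts to compare against the documented total.
const totalTables = Object.values(tablesPerDomain).reduce((sum, n) => sum + n, 0);

console.log(totalTables); // 65
```

A count taken from a live database may differ slightly, since extensions such as PostGIS can add their own tables to the schema they are installed in.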

Database Setup

Required Extensions

Extension	Purpose
`postgis`	Geographic/spatial data for store locations
`pg_trgm`	Trigram matching for fuzzy text search
`uuid-ossp`	UUID generation for primary keys

Database Users

This project uses environment-specific database users to isolate production and test environments:

User	Database	Purpose
`flyer_crawler_prod`	`flyer-crawler-prod`	Production
`flyer_crawler_test`	`flyer-crawler-test`	Testing

Production Database Setup

Step 1: Install PostgreSQL

sudo apt update
sudo apt install postgresql postgresql-contrib

Step 2: Create Database and User

Open a psql session as the postgres system user:

sudo -u postgres psql

Run the following SQL commands (replace 'a_very_strong_password' with a secure password):

-- Create the production role
CREATE ROLE flyer_crawler_prod WITH LOGIN PASSWORD 'a_very_strong_password';

-- Create the production database
CREATE DATABASE "flyer-crawler-prod" WITH OWNER = flyer_crawler_prod;

-- Connect to the new database
\c "flyer-crawler-prod"

-- Grant schema privileges
ALTER SCHEMA public OWNER TO flyer_crawler_prod;
GRANT CREATE, USAGE ON SCHEMA public TO flyer_crawler_prod;

-- Install required extensions (must be done as superuser)
CREATE EXTENSION IF NOT EXISTS postgis;
CREATE EXTENSION IF NOT EXISTS pg_trgm;
CREATE EXTENSION IF NOT EXISTS "uuid-ossp";

-- Exit
\q
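
The database name is double-quoted in the SQL above because it contains hyphens, which PostgreSQL does not allow in unquoted identifiers. If these statements were ever generated from application code, the quoting could be sketched as follows. `quoteIdent` and `createDatabaseSql` are hypothetical helpers written for this illustration; they are not part of the project.

```typescript
// Quote a PostgreSQL identifier: wrap it in double quotes and double any
// embedded double quote. Unlike the server's quote_ident(), this sketch
// always quotes, even when the name would be valid unquoted.
function quoteIdent(name: string): string {
  return `"${name.replace(/"/g, '""')}"`;
}

// Build the CREATE DATABASE statement from Step 2 for a given database/owner.
function createDatabaseSql(database: string, owner: string): string {
  return `CREATE DATABASE ${quoteIdent(database)} WITH OWNER = ${quoteIdent(owner)};`;
}

console.log(createDatabaseSql("flyer-crawler-prod", "flyer_crawler_prod"));
// CREATE DATABASE "flyer-crawler-prod" WITH OWNER = "flyer_crawler_prod";
```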

Step 3: Apply the Schema

Navigate to your project directory and run:

psql -h localhost -U flyer_crawler_prod -d "flyer-crawler-prod" -f sql/master_schema_rollup.sql

This creates all tables, functions, and triggers, and seeds essential data (categories, master items).

Step 4: Seed the Admin Account

Set the required environment variables and run the seed script:

export DB_USER=flyer_crawler_prod
export DB_PASSWORD=your_password
export DB_NAME="flyer-crawler-prod"
export DB_HOST=localhost

npx tsx src/db/seed_admin_account.ts
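
A missing variable is easy to overlook before running the seed script. The following is a minimal sketch of a pre-flight check, assuming the four variables exported above are the only ones required; whether `seed_admin_account.ts` performs its own validation is not stated here, and `missingDbVars` is a hypothetical helper.

```typescript
// Variables exported in Step 4 before running the seed script.
const REQUIRED_DB_VARS = ["DB_USER", "DB_PASSWORD", "DB_NAME", "DB_HOST"] as const;

type Env = Record<string, string | undefined>;

// Return the names of required variables that are unset or empty,
// in the order they are listed above.
function missingDbVars(env: Env): string[] {
  return REQUIRED_DB_VARS.filter((name) => !env[name]);
}

// Example with an incomplete environment: DB_PASSWORD is absent.
const missing = missingDbVars({
  DB_USER: "flyer_crawler_prod",
  DB_NAME: "flyer-crawler-prod",
  DB_HOST: "localhost",
});
console.log(missing); // [ 'DB_PASSWORD' ]
```

In a real script the function would typically be called with `process.env`, exiting early if the returned list is not empty.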

Test Database Setup

The test database is used by CI/CD pipelines and local test runs.

Step 1: Create the Test Database

sudo -u postgres psql

-- Create the test role
CREATE ROLE flyer_crawler_test WITH LOGIN PASSWORD 'a_very_strong_password';

-- Create the test database
CREATE DATABASE "flyer-crawler-test" WITH OWNER = flyer_crawler_test;

-- Connect to the test database
\c "flyer-crawler-test"

-- Grant schema privileges (required for test runner to reset schema)
ALTER SCHEMA public OWNER TO flyer_crawler_test;
GRANT CREATE, USAGE ON SCHEMA public TO flyer_crawler_test;

-- Install required extensions
CREATE EXTENSION IF NOT EXISTS postgis;
CREATE EXTENSION IF NOT EXISTS pg_trgm;
CREATE EXTENSION IF NOT EXISTS "uuid-ossp";

-- Exit
\q

Step 2: Configure CI/CD Secrets

Ensure these secrets are set in your Gitea repository settings:

Shared:

Secret	Description
`DB_HOST`	Database hostname (e.g., `localhost`)
`DB_PORT`	Database port (e.g., `5432`)

Production-specific:

Secret	Description
`DB_USER_PROD`	Production database user (`flyer_crawler_prod`)
`DB_PASSWORD_PROD`	Production database password
`DB_DATABASE_PROD`	Production database name (`flyer-crawler-prod`)

Test-specific:

Secret	Description
`DB_USER_TEST`	Test database user (`flyer_crawler_test`)
`DB_PASSWORD_TEST`	Test database password
`DB_DATABASE_TEST`	Test database name (`flyer-crawler-test`)
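
The naming scheme above (shared `DB_HOST`/`DB_PORT` plus `_PROD`/`_TEST` suffixed secrets) lends itself to a small lookup. The sketch below shows one way a pipeline step might resolve a connection config for a given environment. The `resolveDbConfig` function and the `DbConfig` shape are assumptions for illustration, not the project's actual code.

```typescript
type Environment = "prod" | "test";
type Secrets = Record<string, string | undefined>;

interface DbConfig {
  host: string;
  port: number;
  user: string;
  password: string;
  database: string;
}

// Fetch a secret or fail loudly with the name of the missing one.
function requireSecret(secrets: Secrets, name: string): string {
  const value = secrets[name];
  if (!value) throw new Error(`Missing CI secret: ${name}`);
  return value;
}

// Combine the shared secrets with the environment-specific ones.
function resolveDbConfig(env: Environment, secrets: Secrets): DbConfig {
  const suffix = env === "prod" ? "PROD" : "TEST";
  const port = Number(requireSecret(secrets, "DB_PORT"));
  if (!Number.isInteger(port)) throw new Error("DB_PORT must be an integer");
  return {
    host: requireSecret(secrets, "DB_HOST"),
    port,
    user: requireSecret(secrets, `DB_USER_${suffix}`),
    password: requireSecret(secrets, `DB_PASSWORD_${suffix}`),
    database: requireSecret(secrets, `DB_DATABASE_${suffix}`),
  };
}
```

Failing on the first missing secret, rather than falling back to a default, keeps a misconfigured test job from silently pointing at the wrong database.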

How the Test Pipeline Works

The CI pipeline uses a permanent test database that gets reset on each test run:

1. Setup: The vitest global setup script connects to `flyer-crawler-test`.
2. Schema Reset: Executes `sql/drop_tables.sql` (`DROP SCHEMA public CASCADE`).
3. Schema Application: Runs `sql/master_schema_rollup.sql` to build a fresh schema.
4. Test Execution: Tests run against the clean database.

This approach is faster than creating/destroying databases and doesn't require sudo access.
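
The reset sequence described above can be sketched as an ordered plan. Because the reset is destructive, this illustration also refuses to run against any database other than the test one. That guard is an assumption added here, and `planSchemaReset` is a hypothetical helper rather than the project's actual global setup code.

```typescript
// The only database the reset is meant to touch, per the steps above.
const TEST_DATABASE = "flyer-crawler-test";

// SQL files to execute, in order: drop everything, then rebuild the schema.
const RESET_SCRIPTS = ["sql/drop_tables.sql", "sql/master_schema_rollup.sql"] as const;

// Return the scripts to run for a reset, or throw if the target is not
// the test database (a safety guard assumed for this sketch).
function planSchemaReset(database: string): readonly string[] {
  if (database !== TEST_DATABASE) {
    throw new Error(`Refusing to reset "${database}": only ${TEST_DATABASE} may be reset`);
  }
  return RESET_SCRIPTS;
}

console.log(planSchemaReset("flyer-crawler-test").join(" -> "));
// sql/drop_tables.sql -> sql/master_schema_rollup.sql
```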

Connecting to Production Database

psql -h localhost -U flyer_crawler_prod -d "flyer-crawler-prod" -W
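
The flags in the command above correspond to the parts of a PostgreSQL connection URI, which psql also accepts in place of individual flags. The sketch below builds such a URI from its parts; `toConnectionUri` is a hypothetical helper, and the port `5432` in the example is taken from the `DB_PORT` example in the secrets table rather than from any confirmed production setting.

```typescript
interface ConnectionParts {
  user: string;
  password: string;
  host: string;
  port: number;
  database: string;
}

// Assemble postgresql://user:password@host:port/database, percent-encoding
// the parts that may contain reserved characters (notably the password).
function toConnectionUri(p: ConnectionParts): string {
  const auth = `${encodeURIComponent(p.user)}:${encodeURIComponent(p.password)}`;
  return `postgresql://${auth}@${p.host}:${p.port}/${encodeURIComponent(p.database)}`;
}

console.log(
  toConnectionUri({
    user: "flyer_crawler_prod",
    password: "p@ss/word", // placeholder, not a real credential
    host: "localhost",
    port: 5432,
    database: "flyer-crawler-prod",
  }),
);
// postgresql://flyer_crawler_prod:p%40ss%2Fword@localhost:5432/flyer-crawler-prod
```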

Checking PostgreSQL and PostGIS Versions

SELECT version();
SELECT PostGIS_Full_Version();

Example output (this sample is from a PostgreSQL 14 host; version numbers will differ by installation):

PostgreSQL 14.19 (Ubuntu 14.19-0ubuntu0.22.04.1)
POSTGIS="3.2.0 c3e3cc0" GEOS="3.10.2-CAPI-1.16.0" PROJ="8.2.1"
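
The sample output above reports PostgreSQL 14.19, while the introduction names PostgreSQL 16. A deployment check could flag that kind of mismatch by parsing the `SELECT version()` text. This is a minimal sketch: `parsePostgresMajor` is a hypothetical helper, and the expected major version of 16 is taken from the introduction of this document.

```typescript
// Extract the major version number from the text returned by SELECT version().
// Returns null if the text does not look like a PostgreSQL version string.
function parsePostgresMajor(versionText: string): number | null {
  const match = /PostgreSQL (\d+)/.exec(versionText);
  return match ? Number(match[1]) : null;
}

// Major version named in this document's introduction.
const DOCUMENTED_MAJOR = 16;

const sample = "PostgreSQL 14.19 (Ubuntu 14.19-0ubuntu0.22.04.1)";
const major = parsePostgresMajor(sample);

console.log(major, major === DOCUMENTED_MAJOR); // 14 false
```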

Schema Files

File	Purpose
`sql/master_schema_rollup.sql`	Complete schema with all tables, functions, and seed data
`sql/drop_tables.sql`	Drops entire schema (used by test runner)
`sql/schema.sql.txt`	Legacy schema file (reference only)

Backup and Restore

Create a Backup

pg_dump -h localhost -U flyer_crawler_prod -d "flyer-crawler-prod" -F c -f backup.dump

Restore from Backup

pg_restore -h localhost -U flyer_crawler_prod -d "flyer-crawler-prod" -c backup.dump
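
For scheduled backups it can help to give each dump a timestamped name instead of overwriting `backup.dump`. The sketch below builds such a name and the matching `pg_dump` argument list using the same flags as the command above. `backupFileName` and `pgDumpArgs` are hypothetical helpers, and the file naming pattern is an assumption.

```typescript
// Build a name such as backup-20260128T163538Z.dump from a UTC timestamp.
function backupFileName(now: Date): string {
  const stamp = now
    .toISOString() // e.g. 2026-01-28T16:35:38.000Z
    .replace(/[-:]/g, "") // drop date and time separators
    .replace(/\.\d{3}Z$/, "Z"); // drop milliseconds
  return `backup-${stamp}.dump`;
}

// Argument list for pg_dump in custom format (-F c); host is optional.
function pgDumpArgs(user: string, database: string, outFile: string, host?: string): string[] {
  const hostArgs = host ? ["-h", host] : [];
  return [...hostArgs, "-U", user, "-d", database, "-F", "c", "-f", outFile];
}

const file = backupFileName(new Date(Date.UTC(2026, 0, 28, 16, 35, 38)));
console.log(file); // backup-20260128T163538Z.dump
console.log(pgDumpArgs("flyer_crawler_prod", "flyer-crawler-prod", file).join(" "));
// -U flyer_crawler_prod -d flyer-crawler-prod -F c -f backup-20260128T163538Z.dump
```

Passing the arguments as an array (for example to Node's `child_process.spawn`) avoids shell quoting of the hyphenated database name.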

Related Documentation

Installation Guide - Local development setup
Deployment Guide - Production deployment
