# Database Architecture

**Version**: 0.12.20
**Last Updated**: 2026-01-28

Flyer Crawler uses PostgreSQL 16 with PostGIS for geographic data, pg_trgm for fuzzy text search, and uuid-ossp for UUID generation. The database contains 65 tables organized into logical domains.

## Table of Contents

1. [Schema Overview](#schema-overview)
2. [Database Setup](#database-setup)
3. [Schema Files](#schema-files)
4. [Related Documentation](#related-documentation)

---

## Schema Overview

The database is organized into the following domains:

### Core Infrastructure (6 tables)

| Table                   | Purpose                                   | Primary Key       |
| ----------------------- | ----------------------------------------- | ----------------- |
| `users`                 | Authentication credentials and login data | `user_id` (UUID)  |
| `profiles`              | Public user data, preferences, points     | `user_id` (UUID)  |
| `addresses`             | Normalized address storage with geocoding | `address_id`      |
| `activity_log`          | User activity audit trail                 | `activity_log_id` |
| `password_reset_tokens` | Temporary tokens for password reset       | `token_id`        |
| `schema_info`           | Schema deployment metadata                | `environment`     |

### Stores and Locations (4 tables)

| Table                    | Purpose                                 | Primary Key         |
| ------------------------ | --------------------------------------- | ------------------- |
| `stores`                 | Grocery store chains (Safeway, Kroger)  | `store_id`          |
| `store_locations`        | Physical store locations with addresses | `store_location_id` |
| `favorite_stores`        | User store favorites                    | `user_id, store_id` |
| `store_receipt_patterns` | Receipt text patterns for store ID      | `pattern_id`        |

### Flyers and Items (7 tables)

| Table                   | Purpose                                | Primary Key              |
| ----------------------- | -------------------------------------- | ------------------------ |
| `flyers`                | Uploaded flyer metadata and status     | `flyer_id`               |
| `flyer_items`           | Individual deals extracted from flyers | `flyer_item_id`          |
| `flyer_locations`       | Flyer-to-location associations         | `flyer_location_id`      |
| `categories`            | Item categorization (Produce, Dairy)   | `category_id`            |
| `master_grocery_items`  | Canonical grocery item dictionary      | `master_grocery_item_id` |
| `master_item_aliases`   | Alternative names for master items     | `alias_id`               |
| `unmatched_flyer_items` | Items pending master item matching     | `unmatched_item_id`      |

### Products and Brands (2 tables)

| Table      | Purpose                                        | Primary Key  |
| ---------- | ---------------------------------------------- | ------------ |
| `brands`   | Brand names (Coca-Cola, Kraft)                 | `brand_id`   |
| `products` | Specific products (master item + brand + size) | `product_id` |

### Price Tracking (3 tables)

| Table                   | Purpose                            | Primary Key        |
| ----------------------- | ---------------------------------- | ------------------ |
| `item_price_history`    | Historical prices for master items | `price_history_id` |
| `user_submitted_prices` | User-contributed price reports     | `submission_id`    |
| `suggested_corrections` | Suggested edits to flyer items     | `correction_id`    |

### User Features (8 tables)

| Table                | Purpose                              | Primary Key                 |
| -------------------- | ------------------------------------ | --------------------------- |
| `user_watched_items` | Items user wants to track prices for | `user_watched_item_id`      |
| `user_alerts`        | Price alert thresholds               | `alert_id`                  |
| `notifications`      | User notifications                   | `notification_id`           |
| `user_item_aliases`  | User-defined item name aliases       | `alias_id`                  |
| `user_follows`       | User-to-user follow relationships    | `follower_id, following_id` |
| `user_reactions`     | Reactions to content (likes, etc.)   | `reaction_id`               |
| `budgets`            | User-defined spending budgets        | `budget_id`                 |
| `search_queries`     | Search history for analytics         | `query_id`                  |

### Shopping Lists (5 tables)

| Table                   | Purpose                  | Primary Key               |
| ----------------------- | ------------------------ | ------------------------- |
| `shopping_lists`        | User shopping lists      | `shopping_list_id`        |
| `shopping_list_items`   | Items on shopping lists  | `shopping_list_item_id`   |
| `shared_shopping_lists` | Shopping list sharing    | `shared_shopping_list_id` |
| `shopping_trips`        | Completed shopping trips | `trip_id`                 |
| `shopping_trip_items`   | Items purchased on trips | `trip_item_id`            |

### Recipes (11 tables)

| Table                             | Purpose                          | Primary Key               |
| --------------------------------- | -------------------------------- | ------------------------- |
| `recipes`                         | User recipes with metadata       | `recipe_id`               |
| `recipe_ingredients`              | Recipe ingredient list           | `recipe_ingredient_id`    |
| `recipe_ingredient_substitutions` | Ingredient alternatives          | `substitution_id`         |
| `tags`                            | Recipe tags (vegan, quick, etc.) | `tag_id`                  |
| `recipe_tags`                     | Recipe-to-tag associations       | `recipe_id, tag_id`       |
| `appliances`                      | Kitchen appliances               | `appliance_id`            |
| `recipe_appliances`               | Appliances needed for recipes    | `recipe_id, appliance_id` |
| `recipe_ratings`                  | User ratings for recipes         | `rating_id`               |
| `recipe_comments`                 | User comments on recipes         | `comment_id`              |
| `favorite_recipes`                | User recipe favorites            | `user_id, recipe_id`      |
| `recipe_collections`              | User recipe collections          | `collection_id`           |

### Meal Planning (3 tables)

| Table               | Purpose                    | Primary Key       |
| ------------------- | -------------------------- | ----------------- |
| `menu_plans`        | Weekly/monthly meal plans  | `menu_plan_id`    |
| `shared_menu_plans` | Menu plan sharing          | `share_id`        |
| `planned_meals`     | Individual meals in a plan | `planned_meal_id` |

### Pantry and Inventory (5 tables)

| Table                | Purpose                              | Primary Key       |
| -------------------- | ------------------------------------ | ----------------- |
| `pantry_items`       | User pantry inventory                | `pantry_item_id`  |
| `pantry_locations`   | Storage locations (fridge, freezer)  | `location_id`     |
| `expiry_date_ranges` | Reference shelf life data            | `expiry_range_id` |
| `expiry_alerts`      | User expiry notification preferences | `expiry_alert_id` |
| `expiry_alert_log`   | Sent expiry notifications            | `alert_log_id`    |

### Receipts (3 tables)

| Table                    | Purpose                       | Primary Key       |
| ------------------------ | ----------------------------- | ----------------- |
| `receipts`               | Scanned receipt metadata      | `receipt_id`      |
| `receipt_items`          | Items parsed from receipts    | `receipt_item_id` |
| `receipt_processing_log` | OCR/AI processing audit trail | `log_id`          |

### UPC Scanning (2 tables)

| Table                  | Purpose                         | Primary Key |
| ---------------------- | ------------------------------- | ----------- |
| `upc_scan_history`     | User barcode scan history       | `scan_id`   |
| `upc_external_lookups` | External UPC API response cache | `lookup_id` |

### Gamification (2 tables)

| Table               | Purpose                      | Primary Key               |
| ------------------- | ---------------------------- | ------------------------- |
| `achievements`      | Defined achievements         | `achievement_id`          |
| `user_achievements` | Achievements earned by users | `user_id, achievement_id` |

### User Preferences (3 tables)

| Table                       | Purpose                      | Primary Key               |
| --------------------------- | ---------------------------- | ------------------------- |
| `dietary_restrictions`      | Defined dietary restrictions | `restriction_id`          |
| `user_dietary_restrictions` | User dietary preferences     | `user_id, restriction_id` |
| `user_appliances`           | Appliances user owns         | `user_id, appliance_id`   |

### Reference Data (1 table)

| Table              | Purpose                 | Primary Key     |
| ------------------ | ----------------------- | --------------- |
| `unit_conversions` | Unit conversion factors | `conversion_id` |

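As a quick cross-check on the 65-table figure, the number of rows listed under each domain above can be summed. The snippet below is only arithmetic on this document's own tables, not a query against a live database, and the function name `domain_table_total` is made up for illustration:

```bash
# Rows listed per domain, in document order: core, stores, flyers, products,
# prices, user features, shopping lists, recipes, meal planning, pantry,
# receipts, UPC, gamification, user preferences, reference data.
domain_table_total() {
  total=0
  for count in 6 4 7 2 3 8 5 11 3 5 3 2 2 3 1; do
    total=$((total + count))
  done
  printf '%s\n' "$total"
}

domain_table_total   # prints 65
```
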

---

## Database Setup

### Required Extensions

| Extension   | Purpose                                     |
| ----------- | ------------------------------------------- |
| `postgis`   | Geographic/spatial data for store locations |
| `pg_trgm`   | Trigram matching for fuzzy text search      |
| `uuid-ossp` | UUID generation for primary keys            |

---

### Database Users

This project uses **environment-specific database users** to isolate production and test environments:

| User                 | Database             | Purpose    |
| -------------------- | -------------------- | ---------- |
| `flyer_crawler_prod` | `flyer-crawler-prod` | Production |
| `flyer_crawler_test` | `flyer-crawler-test` | Testing    |

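Scripts that target both environments can keep this pairing in one place so that a role is never combined with the other environment's database. A minimal sketch, assuming a hypothetical helper named `db_target_for_env` (it is not part of the project) and the short environment names `prod` and `test`:

```bash
# Hypothetical helper: map an environment name to the role and database
# from the table above. Prints "<user> <database>" on success and returns
# a non-zero status for any other environment name.
db_target_for_env() {
  case "$1" in
    prod) printf '%s %s\n' flyer_crawler_prod flyer-crawler-prod ;;
    test) printf '%s %s\n' flyer_crawler_test flyer-crawler-test ;;
    *)
      printf 'unknown environment: %s\n' "$1" >&2
      return 1
      ;;
  esac
}
```

A caller could split the result with `read -r db_user db_name` before passing the values to `psql -U "$db_user" -d "$db_name"`.
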
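
---

## Production Database Setup

### Step 1: Install PostgreSQL

```bash
sudo apt update
sudo apt install postgresql postgresql-contrib
```

PostGIS is packaged separately from `postgresql-contrib`; install the PostGIS package that matches your PostgreSQL version before running `CREATE EXTENSION postgis` in Step 2.
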
### Step 2: Create Database and User

Open a `psql` session as the `postgres` system user:

```bash
sudo -u postgres psql
```

Run the following SQL commands (replace `'a_very_strong_password'` with a secure password):

```sql
-- Create the production role
CREATE ROLE flyer_crawler_prod WITH LOGIN PASSWORD 'a_very_strong_password';

-- Create the production database
CREATE DATABASE "flyer-crawler-prod" WITH OWNER = flyer_crawler_prod;

-- Connect to the new database
\c "flyer-crawler-prod"

-- Grant schema privileges
ALTER SCHEMA public OWNER TO flyer_crawler_prod;
GRANT CREATE, USAGE ON SCHEMA public TO flyer_crawler_prod;

-- Install required extensions (must be done as superuser)
CREATE EXTENSION IF NOT EXISTS postgis;
CREATE EXTENSION IF NOT EXISTS pg_trgm;
CREATE EXTENSION IF NOT EXISTS "uuid-ossp";

-- Exit
\q
```

### Step 3: Apply the Schema

Navigate to your project directory and run:

```bash
psql -U flyer_crawler_prod -d "flyer-crawler-prod" -f sql/master_schema_rollup.sql
```

This creates all tables, functions, and triggers, and seeds essential data (categories, master items).

### Step 4: Seed the Admin Account

Set the required environment variables and run the seed script:

```bash
export DB_USER=flyer_crawler_prod
export DB_PASSWORD=your_password
export DB_NAME="flyer-crawler-prod"
export DB_HOST=localhost

npx tsx src/db/seed_admin_account.ts
```

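Because the seed script reads its connection settings from the environment, it can help to confirm that all four variables are set before starting it. The following is a sketch only: `missing_db_env` is a hypothetical helper, not something shipped with the project, and it checks just the four variable names shown in the step above.

```bash
# Hypothetical pre-flight check: print the names of any of the four
# variables from the step above that are unset or empty (space-separated).
# Prints an empty line when everything is set.
missing_db_env() {
  missing=""
  for name in DB_USER DB_PASSWORD DB_NAME DB_HOST; do
    # Indirect lookup of the variable whose name is stored in $name.
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      missing="${missing:+$missing }$name"
    fi
  done
  printf '%s\n' "$missing"
}

# Usage sketch (not executed here):
#   [ -z "$(missing_db_env)" ] && npx tsx src/db/seed_admin_account.ts
```
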

---

## Test Database Setup

The test database is used by CI/CD pipelines and local test runs.

### Step 1: Create the Test Database

```bash
sudo -u postgres psql
```

```sql
-- Create the test role
CREATE ROLE flyer_crawler_test WITH LOGIN PASSWORD 'a_very_strong_password';

-- Create the test database
CREATE DATABASE "flyer-crawler-test" WITH OWNER = flyer_crawler_test;

-- Connect to the test database
\c "flyer-crawler-test"

-- Grant schema privileges (required for test runner to reset schema)
ALTER SCHEMA public OWNER TO flyer_crawler_test;
GRANT CREATE, USAGE ON SCHEMA public TO flyer_crawler_test;

-- Install required extensions
CREATE EXTENSION IF NOT EXISTS postgis;
CREATE EXTENSION IF NOT EXISTS pg_trgm;
CREATE EXTENSION IF NOT EXISTS "uuid-ossp";

-- Exit
\q
```

### Step 2: Configure CI/CD Secrets

Ensure these secrets are set in your Gitea repository settings:

**Shared:**

| Secret    | Description                           |
| --------- | ------------------------------------- |
| `DB_HOST` | Database hostname (e.g., `localhost`) |
| `DB_PORT` | Database port (e.g., `5432`)          |

**Production-specific:**

| Secret             | Description                                     |
| ------------------ | ----------------------------------------------- |
| `DB_USER_PROD`     | Production database user (`flyer_crawler_prod`) |
| `DB_PASSWORD_PROD` | Production database password                    |
| `DB_DATABASE_PROD` | Production database name (`flyer-crawler-prod`) |

**Test-specific:**

| Secret             | Description                               |
| ------------------ | ----------------------------------------- |
| `DB_USER_TEST`     | Test database user (`flyer_crawler_test`) |
| `DB_PASSWORD_TEST` | Test database password                    |
| `DB_DATABASE_TEST` | Test database name (`flyer-crawler-test`) |

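The naming convention in these tables is two shared secrets plus three secrets carrying an environment suffix. A small sketch of that convention, assuming a hypothetical helper `db_secret_names` that is not part of the project or of Gitea:

```bash
# Hypothetical helper: list, one per line, the secret names a job for the
# given environment ("prod" or "test") needs according to the tables above.
db_secret_names() {
  case "$1" in
    prod) suffix=PROD ;;
    test) suffix=TEST ;;
    *)
      printf 'unknown environment: %s\n' "$1" >&2
      return 1
      ;;
  esac
  printf '%s\n' DB_HOST DB_PORT \
    "DB_USER_$suffix" "DB_PASSWORD_$suffix" "DB_DATABASE_$suffix"
}
```

The output can serve as a checklist when adding secrets for a new repository or when reviewing a workflow file.
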

---

## How the Test Pipeline Works

The CI pipeline uses a permanent test database that gets reset on each test run:

1. **Setup**: The vitest global setup script connects to `flyer-crawler-test`
2. **Schema Reset**: Executes `sql/drop_tables.sql` (`DROP SCHEMA public CASCADE`)
3. **Schema Application**: Runs `sql/master_schema_rollup.sql` to build a fresh schema
4. **Test Execution**: Tests run against the clean database

This approach is faster than creating/destroying databases and doesn't require sudo access.

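In the pipeline these steps run inside the vitest global setup script. To reproduce the reset by hand, the sequence might look like the sketch below. The function `test_reset_commands` and its refusal to handle any database whose name does not end in `-test` are assumptions added for illustration, not project behavior; it prints the `psql` commands in order rather than executing them.

```bash
# Hypothetical helper: print the psql commands that mirror steps 2 and 3
# (drop the schema, then rebuild it) for the given test database.
# Guard (an assumption, not project behavior): only names ending in
# "-test" are accepted, so a production name is rejected.
test_reset_commands() {
  db="$1"
  case "$db" in
    *-test) ;;
    *)
      printf 'refusing to reset non-test database: %s\n' "$db" >&2
      return 1
      ;;
  esac
  # ON_ERROR_STOP makes psql abort at the first failing statement.
  for file in sql/drop_tables.sql sql/master_schema_rollup.sql; do
    printf 'psql -v ON_ERROR_STOP=1 -U flyer_crawler_test -d "%s" -f %s\n' \
      "$db" "$file"
  done
}
```

Printing the commands first makes it possible to review them before piping the output to `sh`.
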

---

## Connecting to Production Database

```bash
psql -h localhost -U flyer_crawler_prod -d "flyer-crawler-prod" -W
```

---

## Checking PostGIS Version

```sql
SELECT version();
SELECT PostGIS_Full_Version();
```

Example output (version numbers depend on the installation):

```text
PostgreSQL 14.19 (Ubuntu 14.19-0ubuntu0.22.04.1)
POSTGIS="3.2.0 c3e3cc0" GEOS="3.10.2-CAPI-1.16.0" PROJ="8.2.1"
```

---

## Schema Files

| File                           | Purpose                                                   |
| ------------------------------ | --------------------------------------------------------- |
| `sql/master_schema_rollup.sql` | Complete schema with all tables, functions, and seed data |
| `sql/drop_tables.sql`          | Drops entire schema (used by test runner)                 |
| `sql/schema.sql.txt`           | Legacy schema file (reference only)                       |

---

## Backup and Restore

### Create a Backup

```bash
pg_dump -U flyer_crawler_prod -d "flyer-crawler-prod" -F c -f backup.dump
```

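The command above always writes to `backup.dump`, so a second run overwrites the first. One way to keep several backups is to put the database name and a UTC timestamp in the file name. This is a sketch under that assumption; `backup_path` and the file-name pattern are hypothetical, not a project convention.

```bash
# Hypothetical helper: build a dump file name from the database name and a
# UTC timestamp. $1 = database name, $2 = optional timestamp (defaults to
# the current time), which keeps the function predictable in scripts.
backup_path() {
  stamp="${2:-$(date -u +%Y%m%dT%H%M%SZ)}"
  printf '%s_%s.dump\n' "$1" "$stamp"
}

# Usage sketch (not executed here):
#   pg_dump -U flyer_crawler_prod -d "flyer-crawler-prod" -F c \
#     -f "$(backup_path flyer-crawler-prod)"
```
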
### Restore from Backup

```bash
pg_restore -U flyer_crawler_prod -d "flyer-crawler-prod" -c backup.dump
```

The `-c` flag drops the existing database objects before recreating them from the dump.

---

## Related Documentation

- [Installation Guide](INSTALL.md) - Local development setup
- [Deployment Guide](DEPLOYMENT.md) - Production deployment