Files
flyer-crawler.projectium.com/DATABASE.md

5.9 KiB

Database Setup

Flyer Crawler uses PostgreSQL with several extensions for full-text search, geographic data, and UUID generation.


Required Extensions

Extension Purpose
postgis Geographic/spatial data for store locations
pg_trgm Trigram matching for fuzzy text search
uuid-ossp UUID generation for primary keys

Database Users

This project uses environment-specific database users to isolate production and test environments:

User Database Purpose
flyer_crawler_prod flyer-crawler-prod Production
flyer_crawler_test flyer-crawler-test Testing

Production Database Setup

Step 1: Install PostgreSQL

sudo apt update
sudo apt install postgresql postgresql-contrib

Step 2: Create Database and User

Switch to the postgres system user:

sudo -u postgres psql

Run the following SQL commands (replace 'a_very_strong_password' with a secure password):

-- Create the production role
CREATE ROLE flyer_crawler_prod WITH LOGIN PASSWORD 'a_very_strong_password';

-- Create the production database
CREATE DATABASE "flyer-crawler-prod" WITH OWNER = flyer_crawler_prod;

-- Connect to the new database
\c "flyer-crawler-prod"

-- Grant schema privileges
ALTER SCHEMA public OWNER TO flyer_crawler_prod;
GRANT CREATE, USAGE ON SCHEMA public TO flyer_crawler_prod;

-- Install required extensions (must be done as superuser)
CREATE EXTENSION IF NOT EXISTS postgis;
CREATE EXTENSION IF NOT EXISTS pg_trgm;
CREATE EXTENSION IF NOT EXISTS "uuid-ossp";

-- Exit
\q

Step 3: Apply the Schema

Navigate to your project directory and run:

psql -U flyer_crawler_prod -d "flyer-crawler-prod" -f sql/master_schema_rollup.sql

This creates all tables, functions, triggers, and seeds essential data (categories, master items).

Step 4: Seed the Admin Account

Set the required environment variables and run the seed script:

export DB_USER=flyer_crawler_prod
export DB_PASSWORD=your_password
export DB_NAME="flyer-crawler-prod"
export DB_HOST=localhost

npx tsx src/db/seed_admin_account.ts

Test Database Setup

The test database is used by CI/CD pipelines and local test runs.

Step 1: Create the Test Database

sudo -u postgres psql
-- Create the test role
CREATE ROLE flyer_crawler_test WITH LOGIN PASSWORD 'a_very_strong_password';

-- Create the test database
CREATE DATABASE "flyer-crawler-test" WITH OWNER = flyer_crawler_test;

-- Connect to the test database
\c "flyer-crawler-test"

-- Grant schema privileges (required for test runner to reset schema)
ALTER SCHEMA public OWNER TO flyer_crawler_test;
GRANT CREATE, USAGE ON SCHEMA public TO flyer_crawler_test;

-- Install required extensions
CREATE EXTENSION IF NOT EXISTS postgis;
CREATE EXTENSION IF NOT EXISTS pg_trgm;
CREATE EXTENSION IF NOT EXISTS "uuid-ossp";

-- Exit
\q

Step 2: Configure CI/CD Secrets

Ensure these secrets are set in your Gitea repository settings:

Shared:

Secret Description
DB_HOST Database hostname (e.g., localhost)
DB_PORT Database port (e.g., 5432)

Production-specific:

Secret Description
DB_USER_PROD Production database user (flyer_crawler_prod)
DB_PASSWORD_PROD Production database password
DB_DATABASE_PROD Production database name (flyer-crawler-prod)

Test-specific:

Secret Description
DB_USER_TEST Test database user (flyer_crawler_test)
DB_PASSWORD_TEST Test database password
DB_DATABASE_TEST Test database name (flyer-crawler-test)

How the Test Pipeline Works

The CI pipeline uses a permanent test database that gets reset on each test run:

  1. Setup: The vitest global setup script connects to flyer-crawler-test
  2. Schema Reset: Executes sql/drop_tables.sql (DROP SCHEMA public CASCADE)
  3. Schema Application: Runs sql/master_schema_rollup.sql to build a fresh schema
  4. Test Execution: Tests run against the clean database

This approach is faster than creating/destroying databases and doesn't require sudo access.


Connecting to Production Database

psql -h localhost -U flyer_crawler_prod -d "flyer-crawler-prod" -W

Checking PostGIS Version

SELECT version();
SELECT PostGIS_Full_Version();

Example output:

PostgreSQL 14.19 (Ubuntu 14.19-0ubuntu0.22.04.1)
POSTGIS="3.2.0 c3e3cc0" GEOS="3.10.2-CAPI-1.16.0" PROJ="8.2.1"

Schema Files

File Purpose
sql/master_schema_rollup.sql Complete schema with all tables, functions, and seed data
sql/drop_tables.sql Drops entire schema (used by test runner)
sql/schema.sql.txt Legacy schema file (reference only)

Backup and Restore

Create a Backup

pg_dump -U flyer_crawler_prod -d "flyer-crawler-prod" -F c -f backup.dump

Restore from Backup

pg_restore -U flyer_crawler_prod -d "flyer-crawler-prod" -c backup.dump