Files

224 lines
5.9 KiB
Markdown

# Database Setup
Flyer Crawler uses PostgreSQL with several extensions for full-text search, geographic data, and UUID generation.
---
## Required Extensions
| Extension | Purpose |
| ----------- | ------------------------------------------- |
| `postgis` | Geographic/spatial data for store locations |
| `pg_trgm` | Trigram matching for fuzzy text search |
| `uuid-ossp` | UUID generation for primary keys |
---
## Database Users
This project uses **environment-specific database users** to isolate production and test environments:
| User | Database | Purpose |
| -------------------- | -------------------- | ---------- |
| `flyer_crawler_prod` | `flyer-crawler-prod` | Production |
| `flyer_crawler_test` | `flyer-crawler-test` | Testing |
---
## Production Database Setup
### Step 1: Install PostgreSQL
```bash
sudo apt update
sudo apt install postgresql postgresql-contrib
```
### Step 2: Create Database and User
Switch to the postgres system user:
```bash
sudo -u postgres psql
```
Run the following SQL commands (replace `'a_very_strong_password'` with a secure password):
```sql
-- Create the production role
CREATE ROLE flyer_crawler_prod WITH LOGIN PASSWORD 'a_very_strong_password';
-- Create the production database
CREATE DATABASE "flyer-crawler-prod" WITH OWNER = flyer_crawler_prod;
-- Connect to the new database
\c "flyer-crawler-prod"
-- Grant schema privileges
ALTER SCHEMA public OWNER TO flyer_crawler_prod;
GRANT CREATE, USAGE ON SCHEMA public TO flyer_crawler_prod;
-- Install required extensions (must be done as superuser)
CREATE EXTENSION IF NOT EXISTS postgis;
CREATE EXTENSION IF NOT EXISTS pg_trgm;
CREATE EXTENSION IF NOT EXISTS "uuid-ossp";
-- Exit
\q
```
### Step 3: Apply the Schema
Navigate to your project directory and run:
```bash
psql -U flyer_crawler_prod -d "flyer-crawler-prod" -f sql/master_schema_rollup.sql
```
This creates all tables, functions, triggers, and seeds essential data (categories, master items).
### Step 4: Seed the Admin Account
Set the required environment variables and run the seed script:
```bash
export DB_USER=flyer_crawler_prod
export DB_PASSWORD=your_password
export DB_NAME="flyer-crawler-prod"
export DB_HOST=localhost
npx tsx src/db/seed_admin_account.ts
```
---
## Test Database Setup
The test database is used by CI/CD pipelines and local test runs.
### Step 1: Create the Test Database
```bash
sudo -u postgres psql
```
```sql
-- Create the test role
CREATE ROLE flyer_crawler_test WITH LOGIN PASSWORD 'a_very_strong_password';
-- Create the test database
CREATE DATABASE "flyer-crawler-test" WITH OWNER = flyer_crawler_test;
-- Connect to the test database
\c "flyer-crawler-test"
-- Grant schema privileges (required for test runner to reset schema)
ALTER SCHEMA public OWNER TO flyer_crawler_test;
GRANT CREATE, USAGE ON SCHEMA public TO flyer_crawler_test;
-- Install required extensions
CREATE EXTENSION IF NOT EXISTS postgis;
CREATE EXTENSION IF NOT EXISTS pg_trgm;
CREATE EXTENSION IF NOT EXISTS "uuid-ossp";
-- Exit
\q
```
### Step 2: Configure CI/CD Secrets
Ensure these secrets are set in your Gitea repository settings:
**Shared:**
| Secret | Description |
| --------- | ------------------------------------- |
| `DB_HOST` | Database hostname (e.g., `localhost`) |
| `DB_PORT` | Database port (e.g., `5432`) |
**Production-specific:**
| Secret | Description |
| ------------------ | ----------------------------------------------- |
| `DB_USER_PROD` | Production database user (`flyer_crawler_prod`) |
| `DB_PASSWORD_PROD` | Production database password |
| `DB_DATABASE_PROD` | Production database name (`flyer-crawler-prod`) |
**Test-specific:**
| Secret | Description |
| ------------------ | ----------------------------------------- |
| `DB_USER_TEST` | Test database user (`flyer_crawler_test`) |
| `DB_PASSWORD_TEST` | Test database password |
| `DB_DATABASE_TEST` | Test database name (`flyer-crawler-test`) |
---
## How the Test Pipeline Works
The CI pipeline uses a permanent test database that gets reset on each test run:
1. **Setup**: The vitest global setup script connects to `flyer-crawler-test`
2. **Schema Reset**: Executes `sql/drop_tables.sql` (`DROP SCHEMA public CASCADE`)
3. **Schema Application**: Runs `sql/master_schema_rollup.sql` to build a fresh schema
4. **Test Execution**: Tests run against the clean database
This approach is faster than creating/destroying databases and doesn't require sudo access.
---
## Connecting to Production Database
```bash
psql -h localhost -U flyer_crawler_prod -d "flyer-crawler-prod" -W
```
---
## Checking PostGIS Version
```sql
SELECT version();
SELECT PostGIS_Full_Version();
```
Example output:
```text
PostgreSQL 14.19 (Ubuntu 14.19-0ubuntu0.22.04.1)
POSTGIS="3.2.0 c3e3cc0" GEOS="3.10.2-CAPI-1.16.0" PROJ="8.2.1"
```
---
## Schema Files
| File | Purpose |
| ------------------------------ | --------------------------------------------------------- |
| `sql/master_schema_rollup.sql` | Complete schema with all tables, functions, and seed data |
| `sql/drop_tables.sql` | Drops entire schema (used by test runner) |
| `sql/schema.sql.txt` | Legacy schema file (reference only) |
---
## Backup and Restore
### Create a Backup
```bash
pg_dump -U flyer_crawler_prod -d "flyer-crawler-prod" -F c -f backup.dump
```
### Restore from Backup
```bash
pg_restore -U flyer_crawler_prod -d "flyer-crawler-prod" -c backup.dump
```
---
## Related Documentation
- [Installation Guide](INSTALL.md) - Local development setup
- [Deployment Guide](DEPLOYMENT.md) - Production deployment