# Flyer URL Configuration
## Overview
Flyer image and icon URLs are environment-specific to ensure they point to the correct server for each deployment. Images are served as static files by NGINX from the `/flyer-images/` path with 7-day browser caching enabled.
## Environment-Specific URLs
| Environment | Base URL | Example |
| ------------- | ------------------------------------------- | -------------------------------------------------------------------------- |
| Dev Container | `https://127.0.0.1` | `https://127.0.0.1/flyer-images/safeway-flyer.jpg` |
| Test | `https://flyer-crawler-test.projectium.com` | `https://flyer-crawler-test.projectium.com/flyer-images/safeway-flyer.jpg` |
| Production | `https://flyer-crawler.projectium.com` | `https://flyer-crawler.projectium.com/flyer-images/safeway-flyer.jpg` |
**Note:** The dev container accepts connections to **both** `https://localhost/` and `https://127.0.0.1/` thanks to the SSL certificate and NGINX configuration. See [SSL Certificate Configuration](#ssl-certificate-configuration-dev-container) below.
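The URL pattern in the table above (base URL plus `/flyer-images/` plus filename) can be sketched as a small helper. This is a minimal illustration, not code from the repository: `buildFlyerImageUrl` and `FLYER_IMAGE_PATH` are hypothetical names.

```typescript
// Hypothetical constant: the static path NGINX serves flyer images from.
const FLYER_IMAGE_PATH = '/flyer-images/';

// Hypothetical helper: joins an environment base URL with an image filename.
function buildFlyerImageUrl(baseUrl: string, filename: string): string {
  // Strip trailing slashes so "https://localhost/" and "https://localhost" behave the same.
  const trimmedBase = baseUrl.replace(/\/+$/, '');
  // Encode the filename so spaces or other special characters still yield a valid URL.
  return `${trimmedBase}${FLYER_IMAGE_PATH}${encodeURIComponent(filename)}`;
}
```

For example, `buildFlyerImageUrl('https://flyer-crawler.projectium.com', 'safeway-flyer.jpg')` yields the production example URL from the table.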
## SSL Certificate Configuration (Dev Container)
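The dev container uses locally trusted certificates generated by [mkcert](https://github.com/FiloSottile/mkcert) to enable HTTPS locally. mkcert signs them with its own development CA rather than a public one, so browsers show warnings until that CA is installed. This configuration solves a common mixed-origin SSL issue.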
### The Problem
When users access the site via `https://localhost/` but image URLs in the database use `https://127.0.0.1/...`, browsers treat these as different origins. This causes `ERR_CERT_AUTHORITY_INVALID` errors when loading images, even though both hostnames point to the same server.
### The Solution
1. **Certificate Generation** (`Dockerfile.dev`):
   ```bash
   mkcert localhost 127.0.0.1 ::1
   ```
   This creates a certificate with Subject Alternative Names (SANs) for all three hostnames.
2. **NGINX Configuration** (`docker/nginx/dev.conf`):
   ```nginx
   server_name localhost 127.0.0.1;
   ```
   NGINX accepts requests to both hostnames using the same SSL certificate.
### How It Works
| Component | Configuration |
| -------------------- | ---------------------------------------------------- |
| SSL Certificate SANs | `localhost`, `127.0.0.1`, `::1` |
| NGINX `server_name` | `localhost 127.0.0.1` |
| Seed Script URLs | Uses `https://127.0.0.1` (works with DB constraints) |
| User Access | Either `https://localhost/` or `https://127.0.0.1/` |
### Why This Matters
- **Database Constraints**: The `flyers` table has CHECK constraints requiring URLs to start with `http://` or `https://`. Relative URLs are not allowed.
- **Consistent Behavior**: Users can access the site using either hostname without SSL warnings.
- **Same Certificate**: Both hostnames are covered by the same certificate, which eliminates the cross-hostname certificate errors described above.
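The database constraint above can be mirrored in application code so that invalid URLs are caught before an insert is attempted. The following is a minimal sketch, assuming the constraint is a simple case-sensitive prefix check; `isAbsoluteHttpUrl` is a hypothetical name, and the actual CHECK expression in the schema may differ.

```typescript
// Hypothetical pre-insert check mirroring the CHECK constraint described above:
// the URL must start with "http://" or "https://".
function isAbsoluteHttpUrl(url: string): boolean {
  return /^https?:\/\//.test(url);
}
```

A relative path such as `/flyer-images/safeway-flyer.jpg` fails this check, which is why the seed script and tests need a full base URL.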
### Verifying the Configuration
```bash
# Check certificate SANs
podman exec flyer-crawler-dev openssl x509 -in /app/certs/localhost.crt -text -noout | grep -A1 "Subject Alternative Name"
# Expected output:
# X509v3 Subject Alternative Name:
# DNS:localhost, IP Address:127.0.0.1, IP Address:0:0:0:0:0:0:0:1
# Test both hostnames respond
curl -k https://localhost/health
curl -k https://127.0.0.1/health
```
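If the SAN check needs to be automated, the `openssl` output shown above can be parsed in a script. This is a sketch under the assumption that the output has the two-line shape shown in the expected output; `parseSubjectAltNames` and `missingSans` are hypothetical helpers, not part of the project.

```typescript
// Hypothetical helper: extracts SAN entries from `openssl x509 -text` output.
function parseSubjectAltNames(opensslText: string): string[] {
  const lines = opensslText.split('\n');
  for (let i = 0; i < lines.length; i++) {
    const line = lines[i] ?? '';
    if (line.indexOf('Subject Alternative Name') === -1) continue;
    // The SAN entries are printed on the line after the header.
    const sanLine = lines[i + 1] ?? '';
    return sanLine
      .split(',')
      .map((entry) => entry.trim())
      .filter((entry) => entry.length > 0)
      // Drop the "DNS:" or "IP Address:" type prefix, keeping the value.
      .map((entry) => entry.replace(/^(DNS|IP Address):/, ''));
  }
  return [];
}

// Hypothetical helper: lists required names that the certificate does not cover.
function missingSans(required: string[], found: string[]): string[] {
  return required.filter((name) => found.indexOf(name) === -1);
}
```

An empty result from `missingSans(['localhost', '127.0.0.1'], parsed)` indicates that the certificate covers both hostnames used by the dev container.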
### Troubleshooting SSL Issues
If you encounter `ERR_CERT_AUTHORITY_INVALID`:
1. **Check the NGINX configuration is valid**: `podman exec flyer-crawler-dev nginx -t`
2. **Verify certificate exists**: `podman exec flyer-crawler-dev ls -la /app/certs/`
3. **Ensure both hostnames are in server_name**: Check `/etc/nginx/sites-available/default`
4. **Rebuild container if needed**: The certificate is generated at build time
### Permanent Fix: Install CA Certificate (Recommended)
To permanently eliminate SSL certificate warnings, install the mkcert CA certificate on your system. This is optional but provides a better development experience.
The CA certificate is located at `certs/mkcert-ca.crt` in the project root. See [`certs/README.md`](../certs/README.md) for platform-specific installation instructions (Windows, macOS, Linux, Firefox).
After installation:
- Your browser will trust all mkcert certificates without warnings
- Both `https://localhost/` and `https://127.0.0.1/` will work without SSL errors
- Flyer images will load without `ERR_CERT_AUTHORITY_INVALID` errors
See also: [Debugging Guide - SSL Issues](development/DEBUGGING.md#ssl-certificate-issues)
## NGINX Static File Serving
All environments serve flyer images as static files with browser caching:
```nginx
# Serve flyer images from static storage (7-day cache)
location /flyer-images/ {
    alias /path/to/flyer-images/;
    expires 7d;
    add_header Cache-Control "public, immutable";
}
```
### Directory Paths by Environment
| Environment | NGINX Alias Path |
| ------------- | ---------------------------------------------------------- |
| Dev Container | `/app/public/flyer-images/` |
| Test | `/var/www/flyer-crawler-test.projectium.com/flyer-images/` |
| Production | `/var/www/flyer-crawler.projectium.com/flyer-images/` |
## Configuration
### Environment Variable
Set `FLYER_BASE_URL` in your environment configuration:
```bash
# Dev container (.env)
FLYER_BASE_URL=https://localhost
# Test environment
FLYER_BASE_URL=https://flyer-crawler-test.projectium.com
# Production
FLYER_BASE_URL=https://flyer-crawler.projectium.com
```
### Seed Script
The seed script ([src/db/seed.ts](../src/db/seed.ts)) selects the base URL in the following order of precedence:
1. `FLYER_BASE_URL` environment variable (if set)
2. `NODE_ENV` value:
   - `production` → `https://flyer-crawler.projectium.com`
   - `test` → `https://flyer-crawler-test.projectium.com`
   - Default → `https://localhost`
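The precedence above can be sketched as a pure function. This is an illustration only, not the code in `src/db/seed.ts`: `SeedEnv` and `resolveSeedBaseUrl` are hypothetical names, and the environment is passed in as an argument so the logic can be exercised without touching `process.env`.

```typescript
// Hypothetical shape for the two environment variables the precedence rules read.
interface SeedEnv {
  FLYER_BASE_URL?: string | undefined;
  NODE_ENV?: string | undefined;
}

// Hypothetical resolver following the documented order of precedence.
function resolveSeedBaseUrl(env: SeedEnv): string {
  // 1. An explicit FLYER_BASE_URL wins (an empty string counts as unset here).
  if (env.FLYER_BASE_URL) {
    return env.FLYER_BASE_URL;
  }
  // 2. Otherwise fall back on NODE_ENV.
  switch (env.NODE_ENV) {
    case 'production':
      return 'https://flyer-crawler.projectium.com';
    case 'test':
      return 'https://flyer-crawler-test.projectium.com';
    default:
      return 'https://localhost';
  }
}
```

Treating an empty `FLYER_BASE_URL` as unset is a design assumption of this sketch; it keeps a blank entry in `.env` from producing URLs with no host.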
The seed script also copies test images from `src/tests/assets/` to `public/flyer-images/`:
- `test-flyer-image.jpg` - Sample flyer image
- `test-flyer-icon.png` - Sample 64x64 icon
## Updating Existing Data
If you need to update existing flyer URLs in the database, use the provided SQL script (`sql/update_flyer_urls.sql`) or run the update directly, as shown for each environment below:
### Dev Container
```bash
# Connect to the dev database (opens an interactive psql prompt)
podman exec -it flyer-crawler-postgres psql -U postgres -d flyer_crawler_dev

-- At the psql prompt, run the update (dev container uses HTTPS with mkcert certificates)
UPDATE flyers
SET
  image_url = REPLACE(image_url, 'example.com', 'localhost'),
  icon_url = REPLACE(icon_url, 'example.com', 'localhost')
WHERE
  image_url LIKE '%example.com%'
  OR icon_url LIKE '%example.com%';

-- Verify
SELECT flyer_id, image_url, icon_url FROM flyers;
```
### Test Environment
```bash
# Via SSH
ssh root@projectium.com "psql -U flyer_crawler_test -d flyer-crawler-test -c \"
UPDATE flyers
SET
image_url = REPLACE(image_url, 'example.com', 'flyer-crawler-test.projectium.com'),
icon_url = REPLACE(icon_url, 'example.com', 'flyer-crawler-test.projectium.com')
WHERE
image_url LIKE '%example.com%'
OR icon_url LIKE '%example.com%';
\""
```
### Production
```bash
# Via SSH
ssh root@projectium.com "psql -U flyer_crawler_prod -d flyer-crawler-prod -c \"
UPDATE flyers
SET
image_url = REPLACE(image_url, 'example.com', 'flyer-crawler.projectium.com'),
icon_url = REPLACE(icon_url, 'example.com', 'flyer-crawler.projectium.com')
WHERE
image_url LIKE '%example.com%'
OR icon_url LIKE '%example.com%';
\""
```
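Before running one of the updates above against a shared database, it can help to preview the effect on a sample of rows. The sketch below mimics the `REPLACE` and `LIKE '%...%'` logic in plain TypeScript. `FlyerUrls` and `previewHostRewrite` are hypothetical names, and the numeric `flyer_id` type is an assumption.

```typescript
// Hypothetical row shape; the column names come from the queries above,
// the numeric id type is an assumption.
interface FlyerUrls {
  flyer_id: number;
  image_url: string;
  icon_url: string;
}

// Replaces every occurrence of `search`, like SQL REPLACE (case-sensitive).
function replaceAllOccurrences(text: string, search: string, replacement: string): string {
  return text.split(search).join(replacement);
}

// Returns only the rows the UPDATE would touch, with the rewritten URLs.
function previewHostRewrite(rows: FlyerUrls[], fromHost: string, toHost: string): FlyerUrls[] {
  return rows
    // Same condition as the WHERE clause: either column contains the old host.
    .filter((row) => row.image_url.indexOf(fromHost) !== -1 || row.icon_url.indexOf(fromHost) !== -1)
    .map((row) => ({
      flyer_id: row.flyer_id,
      image_url: replaceAllOccurrences(row.image_url, fromHost, toHost),
      icon_url: replaceAllOccurrences(row.icon_url, fromHost, toHost),
    }));
}
```

Note that, like the SQL, this replaces the substring wherever it appears in the URL, not only in the host portion.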
## Test Data Updates
### Test Helper Function
A helper function `getFlyerBaseUrl()` is available in [src/tests/utils/testHelpers.ts](../src/tests/utils/testHelpers.ts) that automatically detects the correct base URL for tests:
```typescript
export const getFlyerBaseUrl = (): string => {
  if (process.env.FLYER_BASE_URL) {
    return process.env.FLYER_BASE_URL;
  }

  // Check if we're in dev container (DB_HOST=postgres is typical indicator)
  // Use 'localhost' instead of '127.0.0.1' to match the hostname users access
  // This avoids SSL certificate mixed-origin issues in browsers
  if (process.env.DB_HOST === 'postgres' || process.env.DB_HOST === '127.0.0.1') {
    return 'https://localhost';
  }

  if (process.env.NODE_ENV === 'production') {
    return 'https://flyer-crawler.projectium.com';
  }

  if (process.env.NODE_ENV === 'test') {
    return 'https://flyer-crawler-test.projectium.com';
  }

  // Default for unit tests
  return 'https://example.com';
};
```
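A mock flyer factory might consume the helper's result as follows. This is a sketch, not the factory used in the test suite: `MockFlyer` and `createMockFlyer` are hypothetical, the base URL is passed in as a parameter to keep the example self-contained, and the numeric `flyer_id` is an assumption.

```typescript
// Hypothetical minimal flyer shape; real flyer objects likely carry more fields.
interface MockFlyer {
  flyer_id: number;
  image_url: string;
  icon_url: string;
}

// Hypothetical factory: builds absolute image URLs from the given base URL,
// using the sample filenames the seed script copies into public/flyer-images/.
function createMockFlyer(baseUrl: string, overrides: Partial<MockFlyer> = {}): MockFlyer {
  return {
    flyer_id: 1,
    image_url: `${baseUrl}/flyer-images/test-flyer-image.jpg`,
    icon_url: `${baseUrl}/flyer-images/test-flyer-icon.png`,
    // Caller-supplied fields take precedence over the defaults above.
    ...overrides,
  };
}
```

In a test, the call would typically look like `createMockFlyer(getFlyerBaseUrl())`, so the generated URLs follow whichever environment the test runs in.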
### Updated Test Files
The following files now use environment-aware URL generation, either through `getFlyerBaseUrl()` or, for the seed script, through `FLYER_BASE_URL` directly:
- [src/db/seed.ts](../src/db/seed.ts) - Main seed script (uses `FLYER_BASE_URL`)
- [src/tests/utils/testHelpers.ts](../src/tests/utils/testHelpers.ts) - `getFlyerBaseUrl()` helper function
- [src/hooks/useDataExtraction.test.ts](../src/hooks/useDataExtraction.test.ts) - Mock flyer factory
- [src/schemas/flyer.schemas.test.ts](../src/schemas/flyer.schemas.test.ts) - Schema validation tests
- [src/services/flyerProcessingService.server.test.ts](../src/services/flyerProcessingService.server.test.ts) - Processing service tests
- [src/tests/integration/flyer-processing.integration.test.ts](../src/tests/integration/flyer-processing.integration.test.ts) - Integration tests
This approach ensures tests work correctly in all environments (dev container, CI/CD, local development, test, production).
## Files Changed
| File | Change |
| --------------------------- | ------------------------------------------------------------------------------------------------- |
| `src/db/seed.ts` | Added `FLYER_BASE_URL` environment variable support, copies test images to `public/flyer-images/` |
| `docker/nginx/dev.conf` | Added `/flyer-images/` location block for static file serving |
| `.env.example` | Added `FLYER_BASE_URL` variable |
| `sql/update_flyer_urls.sql` | SQL script for updating existing data |
| Test files | Updated mock data to use `https://localhost` |
## Summary
- Seed script now uses environment-specific HTTPS URLs
- Seed script copies test images from `src/tests/assets/` to `public/flyer-images/`
- NGINX serves `/flyer-images/` as static files with 7-day cache
- Test files updated to use `https://localhost` rather than `https://127.0.0.1`, which avoids SSL mixed-origin issues
- SQL script provided for updating existing data
- Documentation updated for each environment