# Database Schema Relationship Analysis

## Executive Summary

This document analyzes the database schema to identify table relationships and JOINs that are missing or improperly implemented in the codebase. The analysis was triggered by the discovery that `WatchedItemDeal` was using a `store_name` string instead of a proper `store` object with nested locations.

## Key Findings

### ✅ CORRECTLY IMPLEMENTED

#### 1. Store → Store Locations → Addresses (3-table normalization)

**Schema:**

```sql
stores (store_id) → store_locations (store_location_id) → addresses (address_id)
```

**Implementation:**

- [src/services/db/storeLocation.db.ts](src/services/db/storeLocation.db.ts) properly JOINs all three tables
- [src/types.ts](src/types.ts) defines `StoreWithLocations` interface with nested address objects
- Recent fixes corrected `WatchedItemDeal` to use a `store` object instead of a `store_name` string

**Queries:**

```sql
-- From storeLocation.db.ts
FROM public.stores s
LEFT JOIN public.store_locations sl ON s.store_id = sl.store_id
LEFT JOIN public.addresses a ON sl.address_id = a.address_id
```
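
To illustrate how the flat rows from a three-table LEFT JOIN can become nested objects, here is a minimal sketch. The row and result shapes (`StoreJoinRow`, `NestedStore`, the `city` column) are assumptions made for the example; they are not copied from `storeLocation.db.ts` or from the real `StoreWithLocations` interface.

```typescript
// Hypothetical flat row returned by the stores → store_locations → addresses JOIN.
// Only the *_id column names come from the schema above; the rest are assumed.
interface StoreJoinRow {
  store_id: number;
  store_name: string;
  store_location_id: number | null; // null when a store has no locations (LEFT JOIN)
  address_id: number | null;
  city: string | null;
}

interface NestedLocation {
  store_location_id: number;
  address: { address_id: number; city: string | null } | null;
}

interface NestedStore {
  store_id: number;
  name: string;
  locations: NestedLocation[];
}

// Collapse one-row-per-location output into one object per store.
function nestStoreRows(rows: StoreJoinRow[]): NestedStore[] {
  const byId = new Map<number, NestedStore>();
  for (const row of rows) {
    let store = byId.get(row.store_id);
    if (!store) {
      store = { store_id: row.store_id, name: row.store_name, locations: [] };
      byId.set(row.store_id, store);
    }
    // A store without locations still yields one row, with null location columns.
    if (row.store_location_id !== null) {
      store.locations.push({
        store_location_id: row.store_location_id,
        address:
          row.address_id !== null
            ? { address_id: row.address_id, city: row.city }
            : null,
      });
    }
  }
  return Array.from(byId.values());
}
```

Grouping in application code is only one option; the same nesting can also be produced in SQL with aggregate functions.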

#### 2. Shopping Trips → Shopping Trip Items

**Schema:**

```sql
shopping_trips (shopping_trip_id) → shopping_trip_items (shopping_trip_item_id) → master_grocery_items
```

**Implementation:**

- [src/services/db/shopping.db.ts:513-518](src/services/db/shopping.db.ts#L513-L518) properly JOINs shopping_trips → shopping_trip_items → master_grocery_items
- Uses `json_agg` to nest the items array within the trip object
- [src/types.ts:639-647](src/types.ts#L639-L647) `ShoppingTrip` interface includes nested `items: ShoppingTripItem[]`

**Queries:**

```sql
FROM public.shopping_trips st
LEFT JOIN public.shopping_trip_items sti ON st.shopping_trip_id = sti.shopping_trip_id
LEFT JOIN public.master_grocery_items mgi ON sti.master_item_id = mgi.master_grocery_item_id
```
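
As a rough illustration of the `json_agg` nesting mentioned above, the sketch below builds a parameterized query that returns one row per trip with an `items` array. It is not the query from `shopping.db.ts`: the selected columns (in particular `mgi.name`) and the helper name `buildTripQuery` are assumptions. The `FILTER` clause together with `COALESCE` keeps a trip without items from producing `[null]`.

```typescript
// Sketch only: selected columns such as mgi.name are assumed, not taken from the schema.
const TRIP_WITH_ITEMS_SQL = `
  SELECT
    st.shopping_trip_id,
    COALESCE(
      json_agg(
        json_build_object(
          'shopping_trip_item_id', sti.shopping_trip_item_id,
          'master_item_id', sti.master_item_id,
          'name', mgi.name
        )
      ) FILTER (WHERE sti.shopping_trip_item_id IS NOT NULL),
      '[]'::json
    ) AS items
  FROM public.shopping_trips st
  LEFT JOIN public.shopping_trip_items sti ON st.shopping_trip_id = sti.shopping_trip_id
  LEFT JOIN public.master_grocery_items mgi ON sti.master_item_id = mgi.master_grocery_item_id
  WHERE st.shopping_trip_id = $1
  GROUP BY st.shopping_trip_id
`;

interface ParameterizedQuery {
  text: string;
  values: number[];
}

// Hypothetical helper: pairs the SQL text with its single positional parameter.
function buildTripQuery(shoppingTripId: number): ParameterizedQuery {
  return { text: TRIP_WITH_ITEMS_SQL, values: [shoppingTripId] };
}
```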

#### 3. Receipts → Receipt Items

**Schema:**

```sql
receipts (receipt_id) → receipt_items (receipt_item_id)
```

**Implementation:**

- [src/types.ts:649-662](src/types.ts#L649-L662) `Receipt` interface includes optional `items?: ReceiptItem[]`
- Receipt items are fetched separately via repository methods
- Proper foreign key relationship maintained

---

### ❌ MISSING / INCORRECT IMPLEMENTATIONS

#### 1. **CRITICAL: Flyers → Flyer Locations → Store Locations (Many-to-Many)**

**Schema:**

```sql
CREATE TABLE IF NOT EXISTS public.flyer_locations (
  flyer_id BIGINT NOT NULL REFERENCES public.flyers(flyer_id) ON DELETE CASCADE,
  store_location_id BIGINT NOT NULL REFERENCES public.store_locations(store_location_id) ON DELETE CASCADE,
  PRIMARY KEY (flyer_id, store_location_id),
  ...
);
-- COMMENT: 'A linking table associating a single flyer with multiple store locations where its deals are valid.'
```

**Problem:**

- The schema defines a **many-to-many relationship**: a flyer can be valid at multiple store locations
- Current implementation in [src/services/db/flyer.db.ts](src/services/db/flyer.db.ts) **IGNORES** the `flyer_locations` table entirely
- Queries JOIN `flyers` directly to `stores` via the `store_id` foreign key
- This means a flyer can only be associated with ONE store, not with multiple locations

**Current (Incorrect) Queries:**

```sql
-- From flyer.db.ts:315-362
FROM public.flyers f
JOIN public.stores s ON f.store_id = s.store_id -- ❌ Wrong - ignores flyer_locations
```

**Expected (Correct) Queries:**

```sql
-- Should be:
FROM public.flyers f
JOIN public.flyer_locations fl ON f.flyer_id = fl.flyer_id
JOIN public.store_locations sl ON fl.store_location_id = sl.store_location_id
JOIN public.stores s ON sl.store_id = s.store_id
JOIN public.addresses a ON sl.address_id = a.address_id
```

**TypeScript Type Issues:**

- [src/types.ts](src/types.ts) `Flyer` interface has a `store` object, but it should have a `locations: StoreLocation[]` array
- Current structure assumes one store per flyer, not multiple locations
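
One possible shape for a flyer that carries a `locations` array is sketched below, together with a small helper that groups those locations by store for display. The names `FlyerLocation`, `FlyerWithLocations`, and `locationsByStore`, and every field other than the `*_id` keys, are hypothetical; the real `Flyer` and `StoreLocation` definitions in `src/types.ts` may differ.

```typescript
// Hypothetical types: only flyer_id, store_location_id and store_id mirror schema columns.
interface FlyerLocation {
  store_location_id: number;
  store_id: number;
  store_name: string;
  city: string | null;
}

interface FlyerWithLocations {
  flyer_id: number;
  locations: FlyerLocation[]; // one entry per flyer_locations row
}

// Group a flyer's locations by store, e.g. to render "Walmart: Toronto, Vancouver".
function locationsByStore(flyer: FlyerWithLocations): Map<number, FlyerLocation[]> {
  const grouped = new Map<number, FlyerLocation[]>();
  for (const location of flyer.locations) {
    const bucket = grouped.get(location.store_id) ?? [];
    bucket.push(location);
    grouped.set(location.store_id, bucket);
  }
  return grouped;
}
```

With this shape, a flyer valid at several locations of the same chain still resolves to a single store entry, so code that expects one store per flyer has a clear migration path.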

**Files Affected:**

- [src/services/db/flyer.db.ts](src/services/db/flyer.db.ts) - All flyer queries
- [src/types.ts](src/types.ts) - `Flyer` interface definition
- Any component displaying flyer locations

---

#### 2. **User Submitted Prices → Store Locations (MIGRATED)**

**Status**: ✅ **FIXED** - Migration created

**Schema:**

```sql
CREATE TABLE IF NOT EXISTS public.user_submitted_prices (
  ...
  store_id BIGINT NOT NULL REFERENCES public.stores(store_id) ON DELETE CASCADE,
  ...
);
```

**Solution Implemented:**

- Created migration [sql/migrations/005_add_store_location_to_user_submitted_prices.sql](sql/migrations/005_add_store_location_to_user_submitted_prices.sql)
- Added `store_location_id` column to table (NOT NULL after migration)
- Migrated existing data: linked each price to first location of its store
- Updated TypeScript interface [src/types.ts:270-282](src/types.ts#L270-L282) to include both fields
- Kept `store_id` for backward compatibility during transition
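
The backfill rule "first location of its store" can be pictured with the sketch below. It is an illustration, not the content of migration 005: it assumes "first" means the lowest `store_location_id` per store, and the `price_id` field and function names are hypothetical.

```typescript
interface StoreLocationRow {
  store_location_id: number;
  store_id: number;
}

// Hypothetical legacy row; the real primary key column name may differ.
interface LegacyPriceRow {
  price_id: number;
  store_id: number;
}

interface BackfillPlan {
  updates: Array<{ price_id: number; store_location_id: number }>;
  unmatched: number[]; // price_ids whose store has no location at all
}

// Assumption: the "first" location of a store is the one with the lowest store_location_id.
function firstLocationByStore(locations: StoreLocationRow[]): Map<number, number> {
  const first = new Map<number, number>();
  for (const loc of locations) {
    const known = first.get(loc.store_id);
    if (known === undefined || loc.store_location_id < known) {
      first.set(loc.store_id, loc.store_location_id);
    }
  }
  return first;
}

// Decide which store_location_id each legacy price row would receive.
function planBackfill(prices: LegacyPriceRow[], locations: StoreLocationRow[]): BackfillPlan {
  const first = firstLocationByStore(locations);
  const plan: BackfillPlan = { updates: [], unmatched: [] };
  for (const price of prices) {
    const locationId = first.get(price.store_id);
    if (locationId === undefined) {
      plan.unmatched.push(price.price_id);
    } else {
      plan.updates.push({ price_id: price.price_id, store_location_id: locationId });
    }
  }
  return plan;
}
```

Any rows that end up in `unmatched` would still hold NULL and would have to be resolved before a NOT NULL constraint on `store_location_id` could be applied.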

**Benefits:**

- Prices are now specific to individual store locations
- "Walmart Toronto" and "Walmart Vancouver" prices are tracked separately
- Improves geographic specificity for price comparisons
- Enables proximity-based price recommendations

**Next Steps:**

- Application code needs to be updated to use `store_location_id` when creating new prices
- Once all code is migrated, the legacy `store_id` column can be dropped
- The user-submitted prices feature is not yet implemented in the UI

---

#### 3. **Receipts → Store Locations (MIGRATED)**

**Status**: ✅ **FIXED** - Migration created

**Schema:**

```sql
CREATE TABLE IF NOT EXISTS public.receipts (
  ...
  store_id BIGINT REFERENCES public.stores(store_id) ON DELETE CASCADE,
  store_location_id BIGINT REFERENCES public.store_locations(store_location_id) ON DELETE SET NULL,
  ...
);
```

**Solution Implemented:**

- Created migration [sql/migrations/006_add_store_location_to_receipts.sql](sql/migrations/006_add_store_location_to_receipts.sql)
- Added `store_location_id` column to table (nullable, since a receipt may not have a matched store)
- Migrated existing data: linked each receipt to first location of its store
- Updated TypeScript interface [src/types.ts:661-675](src/types.ts#L661-L675) to include both fields
- Kept `store_id` for backward compatibility during transition

**Benefits:**

- Receipts can now be tied to specific store locations
- "Loblaws Queen St" and "Loblaws Bloor St" are tracked separately
- Enables location-specific shopping pattern analysis
- Improves receipt matching accuracy with address data

**Next Steps:**

- Receipt scanning code needs to determine the specific `store_location_id` from OCR text
- May require address parsing/matching logic in receipt processing
- Once all code is migrated, the legacy `store_id` column can be dropped
- OCR confidence and pattern matching should prefer location-specific data
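
The address-matching step could start from something as simple as the sketch below: normalize both the OCR text and each candidate address, then accept a location only when exactly one candidate appears in the receipt text. This is a deliberately naive illustration, not the project's receipt pipeline; `LocationCandidate`, `address_line`, and both function names are hypothetical, and real OCR output would likely need fuzzier matching.

```typescript
// Hypothetical candidate shape: one known location of the store detected on the receipt.
interface LocationCandidate {
  store_location_id: number;
  address_line: string;
}

// Lowercase and collapse punctuation/whitespace so "Queen St. W" matches "queen st w".
function normalizeAddress(text: string): string {
  return text.toLowerCase().replace(/[^a-z0-9]+/g, " ").trim();
}

// Return the single matching location, or null when the match is missing or ambiguous.
function matchLocation(ocrText: string, candidates: LocationCandidate[]): number | null {
  const haystack = ` ${normalizeAddress(ocrText)} `;
  const hits = candidates.filter((candidate) => {
    const needle = normalizeAddress(candidate.address_line);
    // Pad with spaces so only whole-token sequences match.
    return needle.length > 0 && haystack.includes(` ${needle} `);
  });
  return hits.length === 1 ? hits[0].store_location_id : null;
}
```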

---

#### 4. Item Price History → Store Locations (Already Correct!)

**Schema:**

```sql
CREATE TABLE IF NOT EXISTS public.item_price_history (
  ...
  store_location_id BIGINT REFERENCES public.store_locations(store_location_id) ON DELETE CASCADE,
  ...
);
```

**Status:**

- ✅ **CORRECTLY IMPLEMENTED** - This table already uses `store_location_id`
- Properly tracks price history per location
- Good example of how other tables should be structured

---

## Summary Table

| Table                 | Foreign Key                 | Should Use                            | Status          | Priority |
| --------------------- | --------------------------- | ------------------------------------- | --------------- | -------- |
| **flyer_locations**   | flyer_id, store_location_id | Many-to-many link                     | ✅ **FIXED**    | ✅ Done  |
| flyers                | store_id                    | ~~store_id~~ Now uses flyer_locations | ✅ **FIXED**    | ✅ Done  |
| user_submitted_prices | store_id                    | store_location_id                     | ✅ **MIGRATED** | ✅ Done  |
| receipts              | store_id                    | store_location_id                     | ✅ **MIGRATED** | ✅ Done  |
| item_price_history    | store_location_id           | ✅ Already correct                    | ✅ Correct      | ✅ Good  |
| shopping_trips        | (no store ref)              | N/A                                   | ✅ Correct      | ✅ Good  |
| store_locations       | store_id, address_id        | ✅ Already correct                    | ✅ Correct      | ✅ Good  |

---

## Impact Assessment

### Critical (Must Fix)

1. **Flyer Locations Many-to-Many**
   - **Impact:** Flyers can't be associated with multiple store locations
   - **User Impact:** Users can't see which specific store locations have deals
   - **Business Logic:** Breaks core assumption that one flyer can be valid at multiple stores
   - **Fix Complexity:** High - requires schema migration, type changes, query rewrites

### Medium (Should Consider)

2. **User Submitted Prices & Receipts**
   - **Impact:** Loss of location-specific data
   - **User Impact:** Can't distinguish between different locations of same store chain
   - **Business Logic:** Reduces accuracy of proximity-based recommendations
   - **Fix Complexity:** Medium - requires migration and query updates

---

## Recommended Actions

### Phase 1: Fix Flyer Locations (Critical)

1. Create migration to properly use `flyer_locations` table
2. Update `Flyer` TypeScript interface to support multiple locations
3. Rewrite all flyer queries in [src/services/db/flyer.db.ts](src/services/db/flyer.db.ts)
4. Update flyer creation/update endpoints to manage `flyer_locations` entries
5. Update frontend components to display multiple locations per flyer
6. Update tests to use new structure
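
For step 4, managing `flyer_locations` entries on update mostly comes down to comparing the location IDs already linked to a flyer with the IDs the request asks for. A minimal sketch of that comparison follows; the function name `diffFlyerLocations` is hypothetical, and how the resulting IDs are turned into INSERT and DELETE statements is left to the repository layer.

```typescript
interface FlyerLocationDiff {
  toInsert: number[]; // store_location_ids that need a new flyer_locations row
  toDelete: number[]; // store_location_ids whose flyer_locations row should be removed
}

// Compare currently linked location IDs with the desired set for one flyer.
function diffFlyerLocations(current: number[], desired: number[]): FlyerLocationDiff {
  const currentSet = new Set(current);
  const desiredSet = new Set(desired); // also drops duplicate IDs in the request
  return {
    toInsert: Array.from(desiredSet).filter((id) => !currentSet.has(id)),
    toDelete: Array.from(currentSet).filter((id) => !desiredSet.has(id)),
  };
}
```

Because the table's primary key is `(flyer_id, store_location_id)`, inserting only the missing pairs avoids duplicate-key errors, and running the inserts and deletes in one transaction keeps the flyer from being briefly left with no locations.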

### Phase 2: Consider Store Location Specificity (Optional)

1. Evaluate if location-specific receipts and prices provide value
2. If yes, create migrations to change `store_id` → `store_location_id`
3. Update repository queries
4. Update TypeScript interfaces
5. Update tests

---

## Related Documents

- [ADR-013: Store Address Normalization](../docs/adr/0013-store-address-normalization.md)
- [STORE_ADDRESS_IMPLEMENTATION_PLAN.md](../STORE_ADDRESS_IMPLEMENTATION_PLAN.md)
- [TESTING.md](../docs/TESTING.md)

---

## Analysis Methodology

This analysis was conducted by:

1. Extracting all foreign key relationships from [sql/master_schema_rollup.sql](sql/master_schema_rollup.sql)
2. Comparing schema relationships against TypeScript interfaces in [src/types.ts](src/types.ts)
3. Auditing database queries in [src/services/db/](src/services/db/) for proper JOIN usage
4. Identifying gaps where schema relationships exist but aren't used in queries

Commands used:

```bash
# Extract all foreign keys
podman exec -it flyer-crawler-dev bash -c "grep -n 'REFERENCES' sql/master_schema_rollup.sql"

# Check specific table structures
podman exec -it flyer-crawler-dev bash -c "grep -A 15 'CREATE TABLE.*table_name' sql/master_schema_rollup.sql"

# Verify query patterns
podman exec -it flyer-crawler-dev bash -c "grep -n 'JOIN.*table_name' src/services/db/*.ts"
```
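
The first step of the methodology (extracting foreign keys) could also be scripted rather than grepped. The sketch below is one way to do that under simplifying assumptions: each column definition sits on its own line and uses an inline `REFERENCES table(column)` clause, as in the schema excerpts quoted in this document. The `ForeignKey` type and `extractForeignKeys` function are hypothetical and were not part of the original analysis.

```typescript
interface ForeignKey {
  table: string;
  column: string;
  refTable: string;
  refColumn: string;
}

// Line-based scan: remembers the most recent CREATE TABLE and records inline REFERENCES clauses.
// Assumes one column definition per line; table-level FOREIGN KEY constraints are not handled.
function extractForeignKeys(schemaSql: string): ForeignKey[] {
  const result: ForeignKey[] = [];
  let table: string | null = null;
  for (const line of schemaSql.split("\n")) {
    const create = /CREATE TABLE(?: IF NOT EXISTS)?\s+([\w.]+)/i.exec(line);
    if (create) {
      table = create[1];
      continue;
    }
    const ref = /^\s*(\w+)\s+.*\bREFERENCES\s+([\w.]+)\s*\((\w+)\)/i.exec(line);
    if (ref && table) {
      result.push({ table, column: ref[1], refTable: ref[2], refColumn: ref[3] });
    }
  }
  return result;
}
```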

---

**Last Updated:** 2026-01-19
**Analyzed By:** Claude Code (via user request after discovering store_name → store bug)