27 KiB
Research: Migration Plan for Category ID References
Date: 2026-01-19 Status: Planning Related: ADR-023: Database Normalization and Referential Integrity
Executive Summary
The API currently accepts category names (strings) instead of category IDs (numbers) when creating watched items. This violates database normalization principles (3NF) and creates a brittle API. This document outlines a comprehensive, phased migration plan to fix this issue across the entire codebase.
Estimated Effort: 2-3 weeks (including testing and deployment) Breaking Changes: Yes (API contract changes) Recommended Version Bump: Major (0.11.14 → 1.0.0 or 0.12.0)
Current State Analysis
Files Using String-Based Category References
| File | Line | Current Type | Usage |
|---|---|---|---|
src/routes/user.routes.ts |
76 | category: string |
API validation schema |
src/services/db/personalization.db.ts |
175 | categoryName: string |
Service method parameter |
src/services/apiClient.ts |
436 | category: string |
API client function |
src/hooks/mutations/useAddWatchedItemMutation.ts |
9 | category?: string |
Frontend mutation hook |
src/types/upc.ts |
85, 107, 201 | category: string | null |
UPC types |
src/types/ai.ts |
21 | category_name: z.string() |
AI extraction schema |
src/services/flyerDataTransformer.ts |
40 | category_name: string |
Flyer processing |
src/services/db/expiry.db.ts |
51 | category_name: string | null |
Expiry tracking |
src/services/db/upc.db.ts |
36, 347 | category: string | null |
UPC database |
Database Schema (Current - Correct)
-- Categories table (normalized)
CREATE TABLE categories (
category_id BIGINT PRIMARY KEY GENERATED ALWAYS AS IDENTITY,
name TEXT NOT NULL UNIQUE,
created_at TIMESTAMPTZ DEFAULT now() NOT NULL,
updated_at TIMESTAMPTZ DEFAULT now() NOT NULL
);
-- Master items reference category by ID (correct)
CREATE TABLE master_grocery_items (
master_grocery_item_id BIGINT PRIMARY KEY GENERATED ALWAYS AS IDENTITY,
name TEXT NOT NULL,
category_id BIGINT REFERENCES categories(category_id), -- ✅ Proper FK
-- ...
);
Current API Behavior
// API Request (current - problematic)
POST /api/users/watched-items
{
"itemName": "Milk",
"category": "Dairy & Eggs" // ❌ String
}
// Service performs string lookup
const categoryRes = await client.query(
'SELECT category_id FROM categories WHERE name = $1',
[categoryName] // ❌ String matching on every request
);
// Then uses ID internally
INSERT INTO master_grocery_items (name, category_id) VALUES ($1, $2)
Migration Plan
Phase 1: Add Category Discovery (Non-Breaking)
Duration: 2-3 days Risk: Low Breaking: No
1.1. Create Categories API Route
File: src/routes/category.routes.ts (new file)
// src/routes/category.routes.ts
import { Router } from 'express';
import { CategoryDbService } from '../services/db/category.db';
import { asyncHandler } from '../middleware/asyncHandler';
import { getLogger } from '../services/logger.server';
const router = Router();
const logger = getLogger();
/**
* @swagger
* /api/categories:
* get:
* summary: List all available grocery categories
* tags: [Categories]
* responses:
* 200:
* description: List of categories
* content:
* application/json:
* schema:
* type: object
* properties:
* success:
* type: boolean
* data:
* type: array
* items:
* type: object
* properties:
* category_id:
* type: integer
* name:
* type: string
*/
router.get(
'/',
asyncHandler(async (req, res) => {
const categories = await CategoryDbService.getAllCategories(logger);
res.json({
success: true,
data: categories,
});
}),
);
/**
* @swagger
* /api/categories/{id}:
* get:
* summary: Get a specific category by ID
* tags: [Categories]
* parameters:
* - in: path
* name: id
* required: true
* schema:
* type: integer
* responses:
* 200:
* description: Category details
* 404:
* description: Category not found
*/
router.get(
'/:id',
asyncHandler(async (req, res) => {
const categoryId = parseInt(req.params.id);
const category = await CategoryDbService.getCategoryById(categoryId, logger);
if (!category) {
return res.status(404).json({
success: false,
error: 'Category not found',
});
}
res.json({
success: true,
data: category,
});
}),
);
/**
* @swagger
* /api/categories/lookup:
* get:
* summary: Lookup category by name (for migration support)
* tags: [Categories]
* parameters:
* - in: query
* name: name
* required: true
* schema:
* type: string
* responses:
* 200:
* description: Category found
* 404:
* description: Category not found
*/
router.get(
'/lookup',
asyncHandler(async (req, res) => {
const name = req.query.name as string;
if (!name) {
return res.status(400).json({
success: false,
error: 'Query parameter "name" is required',
});
}
const category = await CategoryDbService.getCategoryByName(name, logger);
if (!category) {
return res.status(404).json({
success: false,
error: `Category '${name}' not found`,
});
}
res.json({
success: true,
data: category,
});
}),
);
export default router;
1.2. Create Category Database Service
File: src/services/db/category.db.ts (new file)
// src/services/db/category.db.ts
import { Logger } from 'pino';
import { getPool } from './connection.db';
import { handleDbError } from '../errorHandler';
export interface Category {
category_id: number;
name: string;
created_at: Date;
updated_at: Date;
}
export class CategoryDbService {
/**
* Get all categories ordered by name
*/
static async getAllCategories(logger: Logger): Promise<Category[]> {
const pool = getPool();
try {
const result = await pool.query<Category>(
`SELECT category_id, name, created_at, updated_at
FROM public.categories
ORDER BY name ASC`,
);
return result.rows;
} catch (error) {
handleDbError(error, logger, 'Error fetching all categories');
throw error;
}
}
/**
* Get category by ID
*/
static async getCategoryById(categoryId: number, logger: Logger): Promise<Category | null> {
const pool = getPool();
try {
const result = await pool.query<Category>(
`SELECT category_id, name, created_at, updated_at
FROM public.categories
WHERE category_id = $1`,
[categoryId],
);
return result.rows[0] || null;
} catch (error) {
handleDbError(error, logger, 'Error fetching category by ID', { categoryId });
throw error;
}
}
/**
* Get category by name (case-insensitive)
*/
static async getCategoryByName(name: string, logger: Logger): Promise<Category | null> {
const pool = getPool();
try {
const result = await pool.query<Category>(
`SELECT category_id, name, created_at, updated_at
FROM public.categories
WHERE LOWER(name) = LOWER($1)`,
[name],
);
return result.rows[0] || null;
} catch (error) {
handleDbError(error, logger, 'Error fetching category by name', { name });
throw error;
}
}
}
1.3. Register Category Routes
File: server.ts
// Add to server.ts imports
import categoryRoutes from './src/routes/category.routes';
// Add to route registrations
app.use('/api/categories', categoryRoutes);
1.4. Testing Phase 1
Create: src/tests/integration/category.routes.test.ts
// src/tests/integration/category.routes.test.ts
import { describe, it, expect, beforeAll } from 'vitest';
import supertest from 'supertest';
import { app } from '../../server';
describe('Category API Routes', () => {
let request: supertest.SuperTest<supertest.Test>;
beforeAll(() => {
request = supertest(app);
});
describe('GET /api/categories', () => {
it('should return list of all categories', async () => {
const response = await request.get('/api/categories');
expect(response.status).toBe(200);
expect(response.body.success).toBe(true);
expect(Array.isArray(response.body.data)).toBe(true);
expect(response.body.data.length).toBeGreaterThan(0);
// Verify category structure
const firstCategory = response.body.data[0];
expect(firstCategory).toHaveProperty('category_id');
expect(firstCategory).toHaveProperty('name');
expect(typeof firstCategory.category_id).toBe('number');
expect(typeof firstCategory.name).toBe('string');
});
it('should return categories in alphabetical order', async () => {
const response = await request.get('/api/categories');
const categories = response.body.data;
for (let i = 1; i < categories.length; i++) {
expect(categories[i].name >= categories[i - 1].name).toBe(true);
}
});
});
describe('GET /api/categories/:id', () => {
it('should return specific category by ID', async () => {
// First get all categories to find a valid ID
const listResponse = await request.get('/api/categories');
const firstCategory = listResponse.body.data[0];
const response = await request.get(`/api/categories/${firstCategory.category_id}`);
expect(response.status).toBe(200);
expect(response.body.success).toBe(true);
expect(response.body.data.category_id).toBe(firstCategory.category_id);
expect(response.body.data.name).toBe(firstCategory.name);
});
it('should return 404 for non-existent category ID', async () => {
const response = await request.get('/api/categories/999999');
expect(response.status).toBe(404);
expect(response.body.success).toBe(false);
});
});
describe('GET /api/categories/lookup', () => {
it('should find category by exact name', async () => {
const response = await request.get('/api/categories/lookup?name=Dairy & Eggs');
expect(response.status).toBe(200);
expect(response.body.success).toBe(true);
expect(response.body.data.name).toBe('Dairy & Eggs');
});
it('should find category by case-insensitive name', async () => {
const response = await request.get('/api/categories/lookup?name=dairy & eggs');
expect(response.status).toBe(200);
expect(response.body.success).toBe(true);
expect(response.body.data.name).toBe('Dairy & Eggs');
});
it('should return 404 for non-existent category name', async () => {
const response = await request.get('/api/categories/lookup?name=NonExistent');
expect(response.status).toBe(404);
expect(response.body.success).toBe(false);
});
it('should return 400 if name parameter is missing', async () => {
const response = await request.get('/api/categories/lookup');
expect(response.status).toBe(400);
expect(response.body.success).toBe(false);
});
});
});
Update E2E test: src/tests/e2e/deals-journey.e2e.test.ts
// Add category fetch at start of test
const categoriesResponse = await authedFetch('/categories', {
method: 'GET',
});
const categories = await categoriesResponse.json();
const dairyCategory = categories.data.find((c) => c.name === 'Dairy & Eggs');
const bakeryCategory = categories.data.find((c) => c.name === 'Bakery & Bread');
// Still use string for now (backward compatible)
const watchItem1Response = await authedFetch('/users/watched-items', {
method: 'POST',
token: authToken,
body: JSON.stringify({
itemName: 'E2E Milk 2%',
category: 'Dairy & Eggs', // Still works
}),
});
1.5. Documentation
Update docs/api/README.md or Swagger docs with new /api/categories endpoint.
Phase 2: Support Both Formats (Non-Breaking)
Duration: 3-5 days Risk: Medium Breaking: No (backward compatible)
2.1. Update API Validation Schema
File: src/routes/user.routes.ts
// Update schema to accept both formats
const addWatchedItemSchema = z.object({
body: z
.object({
itemName: requiredString("Field 'itemName' is required."),
// Accept either category (string, deprecated) or category_id (number, preferred)
category: z.string().trim().optional(),
category_id: z.number().int().positive().optional(),
})
.refine((data) => data.category || data.category_id, {
message: "Either 'category' or 'category_id' must be provided",
path: ['category'],
}),
});
2.2. Update Service Layer
File: src/services/db/personalization.db.ts
/**
* Add a master grocery item to a user's watchlist.
*
* @param userId - UUID of the user
* @param itemName - Name of the grocery item
* @param categoryIdOrName - Either category ID (number) or name (string, deprecated)
* @param logger - Pino logger instance
* @returns A promise that resolves to the MasterGroceryItem
*
* @deprecated Passing category name as string is deprecated. Use category ID instead.
*/
async addWatchedItem(
userId: string,
itemName: string,
categoryIdOrName: number | string,
logger: Logger,
): Promise<MasterGroceryItem> {
try {
return await withTransaction(async (client) => {
let categoryId: number;
// Handle both ID and name (with deprecation warning)
if (typeof categoryIdOrName === 'number') {
categoryId = categoryIdOrName;
} else {
// Legacy string-based lookup (deprecated)
logger.warn(
{ categoryName: categoryIdOrName, userId, itemName },
'DEPRECATED: Using category name instead of ID. Please update to use category_id.'
);
const categoryRes = await client.query<{ category_id: number }>(
'SELECT category_id FROM public.categories WHERE name = $1',
[categoryIdOrName],
);
categoryId = categoryRes.rows[0]?.category_id;
if (!categoryId) {
throw new Error(`Category '${categoryIdOrName}' not found.`);
}
}
// Validate category ID exists
const categoryCheck = await client.query(
'SELECT 1 FROM public.categories WHERE category_id = $1',
[categoryId]
);
if (categoryCheck.rows.length === 0) {
throw new Error(`Category ID ${categoryId} does not exist.`);
}
// Find or create master item
let masterItem: MasterGroceryItem;
const masterItemRes = await client.query<MasterGroceryItem>(
'SELECT * FROM public.master_grocery_items WHERE name = $1',
[itemName],
);
if (masterItemRes.rows.length > 0) {
masterItem = masterItemRes.rows[0];
} else {
const newMasterItemRes = await client.query<MasterGroceryItem>(
'INSERT INTO public.master_grocery_items (name, category_id) VALUES ($1, $2) RETURNING *',
[itemName, categoryId],
);
masterItem = newMasterItemRes.rows[0];
}
// Add to user's watchlist
await client.query(
'INSERT INTO public.user_watched_items (user_id, master_item_id) VALUES ($1, $2) ON CONFLICT (user_id, master_item_id) DO NOTHING',
[userId, masterItem.master_grocery_item_id],
);
return masterItem;
});
} catch (error) {
handleDbError(
error,
logger,
'Transaction error in addWatchedItem',
{ userId, itemName, categoryIdOrName },
);
throw error;
}
}
2.3. Update Route Handler
File: src/routes/user.routes.ts (route handler)
router.post(
'/watched-items',
validateRequest(addWatchedItemSchema),
asyncHandler(async (req: RequestWithJwt, res: Response) => {
const { itemName, category, category_id } = req.body;
const userId = req.jwt?.userId;
// Prefer category_id, fallback to category (deprecated)
const categoryIdOrName = category_id ?? category;
if (category && !category_id) {
// Log deprecation warning
req.log.warn(
{ category, userId },
'Client using deprecated "category" field. Should use "category_id".',
);
}
const item = await PersonalizationDbService.addWatchedItem(
userId,
itemName,
categoryIdOrName,
req.log,
);
res.status(201).json({
success: true,
data: item,
});
}),
);
2.4. Update Frontend (Gradual)
File: src/services/apiClient.ts
/**
* Add item to watched list
*
* @param itemName - Name of the item
* @param categoryIdOrName - Category ID (preferred) or name (deprecated)
*/
export const addWatchedItem = async (
itemName: string,
categoryIdOrName: number | string,
): Promise<Response> => {
const body: { itemName: string; category_id?: number; category?: string } = {
itemName,
};
if (typeof categoryIdOrName === 'number') {
body.category_id = categoryIdOrName;
} else {
body.category = categoryIdOrName;
}
return authenticatedFetch('/users/watched-items', {
method: 'POST',
body: JSON.stringify(body),
});
};
2.5. Update E2E Tests (Support Both)
File: src/tests/e2e/deals-journey.e2e.test.ts
// Fetch categories first
const categoriesResponse = await fetch(`${API_BASE_URL}/categories`);
const categoriesData = await categoriesResponse.json();
const dairyCategory = categoriesData.data.find((c) => c.name === 'Dairy & Eggs');
// Use category_id (new format)
const watchItem1Response = await authedFetch('/users/watched-items', {
method: 'POST',
token: authToken,
body: JSON.stringify({
itemName: 'E2E Milk 2%',
category_id: dairyCategory.category_id, // ✅ Use ID
}),
});
2.6. Testing Phase 2
Create integration tests that verify both formats work:
describe('POST /api/users/watched-items (dual format)', () => {
it('should accept category_id (new format)', async () => {
const response = await authenticatedRequest.post('/api/users/watched-items').send({
itemName: 'Test Item',
category_id: 3,
});
expect(response.status).toBe(201);
});
it('should accept category name (deprecated format)', async () => {
const response = await authenticatedRequest.post('/api/users/watched-items').send({
itemName: 'Test Item',
category: 'Dairy & Eggs',
});
expect(response.status).toBe(201);
// Should log deprecation warning
});
it('should prefer category_id when both provided', async () => {
const response = await authenticatedRequest.post('/api/users/watched-items').send({
itemName: 'Test Item',
category_id: 3,
category: 'Wrong Name', // Should be ignored
});
expect(response.status).toBe(201);
});
});
Phase 3: Remove String Support (Breaking Change)
Duration: 2-3 days Risk: High Breaking: Yes (requires version bump)
3.1. Version Bump
Update package.json:
{
"version": "1.0.0" // or "0.12.0" depending on semver strategy
}
3.2. Remove Deprecated Code
File: src/routes/user.routes.ts
// Remove deprecated category field
const addWatchedItemSchema = z.object({
body: z.object({
itemName: requiredString("Field 'itemName' is required."),
category_id: z.number().int().positive("Field 'category_id' must be a positive integer"),
}),
});
File: src/services/db/personalization.db.ts
// Remove string overload
async addWatchedItem(
userId: string,
itemName: string,
categoryId: number, // Only accept number
logger: Logger,
): Promise<MasterGroceryItem> {
try {
return await withTransaction(async (client) => {
// Validate category exists
const categoryCheck = await client.query(
'SELECT 1 FROM public.categories WHERE category_id = $1',
[categoryId]
);
if (categoryCheck.rows.length === 0) {
throw new Error(`Category ID ${categoryId} does not exist.`);
}
// ... rest of implementation (no string lookup)
});
} catch (error) {
handleDbError(error, logger, 'Transaction error in addWatchedItem', {
userId,
itemName,
categoryId,
});
throw error;
}
}
3.3. Update All Client Code
File: src/services/apiClient.ts
// Remove string support
export const addWatchedItem = async (
itemName: string,
categoryId: number, // Only number
): Promise<Response> => {
return authenticatedFetch('/users/watched-items', {
method: 'POST',
body: JSON.stringify({
itemName,
category_id: categoryId,
}),
});
};
File: src/hooks/mutations/useAddWatchedItemMutation.ts
interface AddWatchedItemParams {
itemName: string;
category_id: number; // Only number
}
3.4. Update All Tests
Update all E2E and integration tests to use category_id:
// All tests must fetch category IDs
const categoriesResponse = await fetch(`${API_BASE_URL}/categories`);
const categories = await categoriesResponse.json();
const watchItem1Response = await authedFetch('/users/watched-items', {
method: 'POST',
token: authToken,
body: JSON.stringify({
itemName: 'E2E Milk 2%',
category_id: categories.data.find((c) => c.name === 'Dairy & Eggs').category_id,
}),
});
3.5. Update API Documentation
- Update Swagger/OpenAPI specs
- Update README and API docs
- Add migration guide for API consumers
3.6. Deployment
- Deploy Phase 3 to test environment
- Run full E2E test suite
- Verify all tests pass
- Deploy to production with major version bump
- Communicate breaking change to stakeholders
Similar Issues to Address
1. UPC/Barcode System
Current:
// src/types/upc.ts
export interface UpcLookupResult {
category: string | null; // ❌ String
}
Decision Needed:
- Is this user-provided free-form text? → Keep as string
- Is this from a predefined list? → Change to category_id
Recommendation: If categories are predefined, change to category_id: number | null
2. AI Extraction
Current:
// src/types/ai.ts
export const FlyerItemSchema = z.object({
category_name: z.string().nullish(),
});
Approach:
- AI extracts category name (string)
- Service layer maps to category_id using fuzzy matching
- Store category_id in database
Implementation:
// New: AI Category Mapper Service
export class AiCategoryMapper {
static async mapCategoryName(
categoryName: string | null,
logger: Logger,
): Promise<number | null> {
if (!categoryName) return null;
// Try exact match first
let category = await CategoryDbService.getCategoryByName(categoryName, logger);
// If no match, try fuzzy matching or default to "Other/Miscellaneous"
if (!category) {
logger.warn({ categoryName }, 'AI extracted unknown category, using default');
category = await CategoryDbService.getCategoryByName('Other/Miscellaneous', logger);
}
return category?.category_id || null;
}
}
3. Flyer Data Transformer
Current:
// src/services/flyerDataTransformer.ts
category_name: String(item.category_name ?? '').trim() || 'Other/Miscellaneous',
Should map to category_id:
const categoryId = await AiCategoryMapper.mapCategoryName(item.category_name, logger);
Testing Strategy
Unit Tests
CategoryDbService- all CRUD operationsAiCategoryMapper- fuzzy matching logic- Validation schemas
Integration Tests
- Category routes (all endpoints)
- Watched items with category_id
- Backward compatibility (Phase 2 only)
E2E Tests
- Full user journey with category selection
- All 11 E2E tests updated to use category_id
Manual Testing Checklist
- Frontend category dropdown populated from API
- Creating watched item with category_id
- Viewing watched items shows category name
- AI extraction maps categories correctly
- Error handling for invalid category_id
Rollback Plan
Phase 1 Rollback
Risk: Very low Steps:
- Remove category routes from
server.ts - Delete
src/routes/category.routes.ts - Delete
src/services/db/category.db.ts - Redeploy previous version
Phase 2 Rollback
Risk: Low (backward compatible) Steps:
- Revert changes to validation schemas
- Revert service layer to string-only
- Clients using category_id will break, but old clients still work
Phase 3 Rollback
Risk: High (breaking change) Steps:
- Revert to Phase 2 code (dual support)
- Redeploy with version downgrade
- Communicate rollback to API consumers
Communication Plan
Before Phase 1
- Announce new
/api/categoriesendpoint - Encourage clients to start fetching categories
Before Phase 2
- Announce deprecation of
categoryfield - Recommend migration to
category_id - Provide timeline for Phase 3
Before Phase 3
- Final warning: 2 weeks before deployment
- Publish migration guide
- Offer support for client updates
After Phase 3
- Announce breaking change deployed
- Monitor error rates and Bugsink for issues
- Provide rollback plan if needed
Success Criteria
- All E2E tests pass with category_id
- No performance regression
- API response times improved (no string lookups)
- Zero Bugsink errors related to category issues
- Frontend displays categories correctly
- All documentation updated
Timeline
| Phase | Duration | Start | End |
|---|---|---|---|
| Phase 1: Discovery Endpoint | 2-3 days | Week 1 | Week 1 |
| Phase 1: Testing | 1 day | Week 1 | Week 1 |
| Phase 2: Dual Support | 3-5 days | Week 2 | Week 2 |
| Phase 2: Testing | 2 days | Week 2 | Week 2 |
| Phase 3: Remove Deprecated | 2-3 days | Week 3 | Week 3 |
| Phase 3: Testing & Deploy | 2 days | Week 3 | Week 3 |
| Total | 12-16 days |
Related Documents
- ADR-023: Database Normalization and Referential Integrity
- TESTING.md
- API Documentation (Swagger)
Notes
- Consider adding category icons/colors in future
- May want to allow user-defined categories later
- Internationalization of category names needed eventually
- Category ordering/sorting preferences per user