# AI Usage Subagent Guide

The **ai-usage** subagent specializes in LLM APIs (Gemini, Claude), prompt engineering, and AI-powered features in the Flyer Crawler project.

## Quick Reference

| Aspect             | Details                                                                             |
| ------------------ | ----------------------------------------------------------------------------------- |
| **Primary Use**    | Gemini API integration, prompt engineering, AI extraction                           |
| **Key Files**      | `src/services/aiService.server.ts`, `src/services/flyerProcessingService.server.ts` |
| **Key ADRs**       | ADR-041 (AI Integration), ADR-046 (Image Processing)                                |
| **API Key Env**    | `VITE_GOOGLE_GENAI_API_KEY` (prod), `VITE_GOOGLE_GENAI_API_KEY_TEST` (test)         |
| **Error Handling** | Rate limits (429), JSON parse errors, timeout handling                              |
| **Delegate To**    | `coder` (implementation), `testwriter` (tests), `integrations-specialist`           |

## When to Use

Use the **ai-usage** subagent when you need to:

- Integrate with the Gemini API for flyer extraction
- Debug AI extraction failures
- Optimize prompts for better accuracy
- Handle rate limiting and API errors
- Implement new AI-powered features
- Fine-tune extraction schemas

## What ai-usage Knows

The ai-usage subagent understands:

- Google Generative AI (Gemini) API
- Flyer extraction prompts and schemas
- Error handling for AI services
- Rate limiting strategies
- Token optimization
- AI service architecture (ADR-041)

## AI Architecture Overview

```
┌─────────────────┐    ┌─────────────────┐    ┌─────────────────┐
│  Flyer Upload   │───►│   AI Service    │───►│   Gemini API    │
│                 │    │                 │    │                 │
└─────────────────┘    │ - Preprocessing │    │ - Vision model  │
                       │ - Prompt build  │    │ - JSON output   │
                       │ - Response parse│    │                 │
                       └─────────────────┘    └─────────────────┘
                                │
                                ▼
                       ┌─────────────────┐
                       │ Validation &    │
                       │ Normalization   │
                       └─────────────────┘
```

## Key Files

| File                                            | Purpose                              |
| ----------------------------------------------- | ------------------------------------ |
| `src/services/aiService.server.ts`              | Gemini API integration               |
| `src/services/flyerProcessingService.server.ts` | Flyer extraction pipeline            |
| `src/schemas/flyer.schemas.ts`                  | Zod schemas for AI output validation |
| `src/types/ai.types.ts`                         | TypeScript types for AI responses    |

## Example Requests

### Debugging Extraction Failures

```
"Use ai-usage to debug why flyer extractions are failing for multi-page PDFs.
The error logs show 'Invalid JSON response' but only for certain stores."
```

### Optimizing Prompts

```
"Use ai-usage to improve the item extraction prompt. Currently it's missing
unit prices when items show 'X for $Y' pricing (e.g., '3 for $5')."
```

### Handling Rate Limits

```
"Use ai-usage to implement exponential backoff for Gemini API rate limits.
We're seeing 429 errors during high-volume uploads."
```

### Adding New AI Features

```
"Use ai-usage to add a feature that uses Gemini to categorize extracted items
into grocery categories (produce, dairy, meat, etc.)."
```

## Extraction Pipeline

### 1. Image Preprocessing

```typescript
// Convert PDF to images, resize large images
const processedImages = await imageProcessor.prepareForAI(uploadedFile);
```

### 2. Prompt Construction

The extraction prompt includes:

- System instructions for the AI model
- Expected output schema (JSON)
- Examples of correct extraction
- Handling instructions for edge cases

### 3. API Call

```typescript
const response = await aiService.extractFlyerData(processedImages, storeContext, extractionOptions);
```

### 4. Response Validation

```typescript
// Validate against Zod schema
const validatedItems = flyerItemsSchema.parse(response.items);
```

### 5. Normalization

```typescript
// Normalize prices, units, quantities
const normalizedItems = normalizeExtractedItems(validatedItems);
```

## Common Issues and Solutions

### Issue: Inconsistent Price Extraction

**Symptoms**: Same item priced differently on different extractions.

**Solution**: Improve prompt with explicit price format examples:

```
"Price formats to recognize:
- $X.XX (regular price)
- X for $Y.YY (multi-buy)
- $X.XX/lb or $X.XX/kg (unit price)
- $X.XX each (per item)
- SAVE $X.XX (discount amount, not item price)"
```

### Issue: Missing Items from Dense Flyers

**Symptoms**: Flyers with many items on one page have missing extractions.

**Solution**:

1. Split page into quadrants for separate extraction
2. Increase token limit for response
3. Use structured grid-based prompting

### Issue: Rate Limit Errors (429)

**Symptoms**: `429 Too Many Requests` errors during bulk uploads.

**Solution**: Implement request queuing:

```typescript
// Add to job queue instead of direct call
await flyerQueue.add(
  'extract',
  {
    flyerId,
    images,
  },
  {
    attempts: 3,
    backoff: {
      type: 'exponential',
      delay: 2000,
    },
  },
);
```

### Issue: Hallucinated Items

**Symptoms**: Items extracted that don't exist in the flyer.

**Solution**:

1. Add confidence scoring to extraction
2. Request bounding box coordinates for verification
3. Add post-extraction validation against image

## Prompt Engineering Best Practices

### 1. Be Specific About Output Format

```
Output MUST be valid JSON matching this schema:
{
  "items": [
    {
      "name": "string (product name as shown)",
      "brand": "string or null",
      "price": number (in dollars),
      "unit": "string (each, lb, kg, etc.)",
      "quantity": number (default 1)
    }
  ]
}
```

### 2. Provide Examples

```
Example extractions:
- "Chicken Breast $4.99/lb" -> {"name": "Chicken Breast", "price": 4.99, "unit": "lb"}
- "Coca-Cola 12pk $5.99" -> {"name": "Coca-Cola", "quantity": 12, "price": 5.99, "unit": "each"}
```

### 3. Handle Edge Cases Explicitly

```
Special cases:
- If "LIMIT X" shown, add to notes, don't affect price
- If "SAVE $X" shown without base price, mark price as null
- If item is "FREE with purchase", set price to 0
```

### 4. Request Structured Thinking

```
For each item:
1. Identify the product name and brand
2. Find the associated price
3. Determine if price is per-unit or total
4. Extract any quantity information
```

## Monitoring AI Performance

### Metrics to Track

| Metric                  | Description                             | Target          |
| ----------------------- | --------------------------------------- | --------------- |
| Extraction success rate | % of flyers processed without error     | >95%            |
| Items per flyer         | Average items extracted                 | Varies by store |
| Price accuracy          | Match rate vs manual verification       | >98%            |
| Response time           | Time from upload to extraction complete | <30s            |

### Logging

```typescript
log.info(
  {
    flyerId,
    itemCount: extractedItems.length,
    processingTime: duration,
    modelVersion: response.model,
    tokenUsage: response.usage,
  },
  'Flyer extraction completed',
);
```

## Environment Configuration

| Variable                         | Purpose                     |
| -------------------------------- | --------------------------- |
| `VITE_GOOGLE_GENAI_API_KEY`      | Gemini API key (production) |
| `VITE_GOOGLE_GENAI_API_KEY_TEST` | Gemini API key (test)       |

**Note**: Use separate API keys for production and test to avoid rate limit conflicts and enable separate billing tracking.
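
One way to keep the production and test keys from being mixed up is to resolve the key through a single helper that fails fast when the expected variable is missing. The sketch below is illustrative only: `resolveGeminiApiKey`, the `KeyMode` type, and the idea of passing the environment in as a plain record are assumptions, not code from the project. Only the two variable names come from the table above.

```typescript
// Hypothetical helper, not taken from the project source.
type Env = Record<string, string | undefined>;
type KeyMode = 'production' | 'test';

// Variable names come from the Environment Configuration table;
// selecting between them by mode is an assumption of this sketch.
const KEY_VARS: Record<KeyMode, string> = {
  production: 'VITE_GOOGLE_GENAI_API_KEY',
  test: 'VITE_GOOGLE_GENAI_API_KEY_TEST',
};

function resolveGeminiApiKey(env: Env, mode: KeyMode): string {
  const varName = KEY_VARS[mode];
  const key = env[varName]?.trim();
  if (!key) {
    // Fail fast instead of silently falling back to the other environment's key,
    // which would mix rate limits and billing across environments.
    throw new Error(`Missing Gemini API key: set ${varName}`);
  }
  return key;
}
```

In a Node context this could be called as `resolveGeminiApiKey(process.env, 'test')` from test setup. Taking the environment as a parameter keeps the helper easy to unit test without touching real credentials.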
## Testing AI Features

### Unit Tests

Mock the Gemini API response:

```typescript
vi.mock('@google/generative-ai', () => ({
  GoogleGenerativeAI: vi.fn().mockImplementation(() => ({
    getGenerativeModel: () => ({
      generateContent: vi.fn().mockResolvedValue({
        response: {
          text: () => JSON.stringify({ items: mockItems }),
        },
      }),
    }),
  })),
}));
```

### Integration Tests

Use recorded responses for deterministic testing:

```typescript
import { readFile } from 'node:fs/promises';

// Load a real API response that was previously saved to fixtures
const fixtureResponse = JSON.parse(await readFile('fixtures/gemini-response.json', 'utf-8'));
```

## Related Documentation

- [OVERVIEW.md](./OVERVIEW.md) - Subagent system overview
- [CODER-GUIDE.md](./CODER-GUIDE.md) - For implementing AI features
- [TESTER-GUIDE.md](./TESTER-GUIDE.md) - Testing AI features
- [INTEGRATIONS-GUIDE.md](./INTEGRATIONS-GUIDE.md) - External API patterns
- [../adr/0041-ai-gemini-integration-architecture.md](../adr/0041-ai-gemini-integration-architecture.md) - AI integration ADR
- [../adr/0046-image-processing-pipeline.md](../adr/0046-image-processing-pipeline.md) - Image processing
- [../getting-started/ENVIRONMENT.md](../getting-started/ENVIRONMENT.md) - Environment configuration