flyer-crawler.projectium.com/docs/subagents/AI-USAGE-GUIDE.md
Torben Sorensen 45ac4fccf5
comprehensive documentation review + test fixes
2026-01-28 16:35:38 -08:00


# AI Usage Subagent Guide
The **ai-usage** subagent specializes in LLM APIs (Gemini, Claude), prompt engineering, and AI-powered features in the Flyer Crawler project.
## Quick Reference
| Aspect | Details |
| ------------------ | ----------------------------------------------------------------------------------- |
| **Primary Use** | Gemini API integration, prompt engineering, AI extraction |
| **Key Files** | `src/services/aiService.server.ts`, `src/services/flyerProcessingService.server.ts` |
| **Key ADRs** | ADR-041 (AI Integration), ADR-046 (Image Processing) |
| **API Key Env** | `VITE_GOOGLE_GENAI_API_KEY` (prod), `VITE_GOOGLE_GENAI_API_KEY_TEST` (test) |
| **Error Handling** | Rate limits (429), JSON parse errors, timeout handling |
| **Delegate To** | `coder` (implementation), `testwriter` (tests), `integrations-specialist` |
## When to Use
Use the **ai-usage** subagent when you need to:
- Integrate with the Gemini API for flyer extraction
- Debug AI extraction failures
- Optimize prompts for better accuracy
- Handle rate limiting and API errors
- Implement new AI-powered features
- Fine-tune extraction schemas
## What ai-usage Knows
The ai-usage subagent understands:
- Google Generative AI (Gemini) API
- Flyer extraction prompts and schemas
- Error handling for AI services
- Rate limiting strategies
- Token optimization
- AI service architecture (ADR-041)
## AI Architecture Overview
```
┌─────────────────┐    ┌─────────────────┐    ┌─────────────────┐
│  Flyer Upload   │───►│   AI Service    │───►│   Gemini API    │
│                 │    │                 │    │                 │
└─────────────────┘    │ - Preprocessing │    │ - Vision model  │
                       │ - Prompt build  │    │ - JSON output   │
                       │ - Response parse│    │                 │
                       └─────────────────┘    └─────────────────┘
                       ┌─────────────────┐
                       │ Validation &    │
                       │ Normalization   │
                       └─────────────────┘
```
## Key Files
| File | Purpose |
| ----------------------------------------------- | ------------------------------------ |
| `src/services/aiService.server.ts` | Gemini API integration |
| `src/services/flyerProcessingService.server.ts` | Flyer extraction pipeline |
| `src/schemas/flyer.schemas.ts` | Zod schemas for AI output validation |
| `src/types/ai.types.ts` | TypeScript types for AI responses |
## Example Requests
### Debugging Extraction Failures
```
"Use ai-usage to debug why flyer extractions are failing for
multi-page PDFs. The error logs show 'Invalid JSON response'
but only for certain stores."
```
### Optimizing Prompts
```
"Use ai-usage to improve the item extraction prompt. Currently
it's missing unit prices when items show 'X for $Y' pricing
(e.g., '3 for $5')."
```
### Handling Rate Limits
```
"Use ai-usage to implement exponential backoff for Gemini API
rate limits. We're seeing 429 errors during high-volume uploads."
```
### Adding New AI Features
```
"Use ai-usage to add a feature that uses Gemini to categorize
extracted items into grocery categories (produce, dairy, meat, etc.)."
```
## Extraction Pipeline
### 1. Image Preprocessing
```typescript
// Convert PDF to images, resize large images
const processedImages = await imageProcessor.prepareForAI(uploadedFile);
```
### 2. Prompt Construction
The extraction prompt includes:
- System instructions for the AI model
- Expected output schema (JSON)
- Examples of correct extraction
- Handling instructions for edge cases
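The four parts listed above could be assembled along these lines. This is a minimal sketch, not the project's actual prompt builder: `PromptParts` and `buildExtractionPrompt` are hypothetical names, and the real code in `aiService.server.ts` may structure the prompt differently.

```typescript
// Hypothetical shape for the four prompt parts; not taken from the project.
interface PromptParts {
  systemInstructions: string;
  outputSchema: string;
  examples: string[];
  edgeCaseRules: string[];
}

// Joins the parts into one prompt string, skipping empty optional sections.
function buildExtractionPrompt(parts: PromptParts): string {
  const sections: string[] = [
    parts.systemInstructions.trim(),
    `Output MUST be valid JSON matching this schema:\n${parts.outputSchema.trim()}`,
  ];
  if (parts.examples.length > 0) {
    const lines = parts.examples.map((example) => `- ${example}`).join('\n');
    sections.push(`Example extractions:\n${lines}`);
  }
  if (parts.edgeCaseRules.length > 0) {
    const lines = parts.edgeCaseRules.map((rule) => `- ${rule}`).join('\n');
    sections.push(`Special cases:\n${lines}`);
  }
  // A blank line between sections keeps the prompt readable in logs.
  return sections.join('\n\n');
}

const prompt = buildExtractionPrompt({
  systemInstructions: 'You extract grocery items from store flyer images.',
  outputSchema: '{ "items": [{ "name": "string", "price": "number" }] }',
  examples: ['"Chicken Breast $4.99/lb" -> {"name": "Chicken Breast", "price": 4.99, "unit": "lb"}'],
  edgeCaseRules: ['If item is "FREE with purchase", set price to 0'],
});
```

Keeping each part as separate data makes it easier to change one section (for example, adding an edge-case rule) without touching the rest of the prompt.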
### 3. API Call
```typescript
const response = await aiService.extractFlyerData(processedImages, storeContext, extractionOptions);
```
### 4. Response Validation
```typescript
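// Validate against the Zod schema. parse() throws a ZodError on mismatch;
// safeParse() returns a result object instead of throwing.
const validatedItems = flyerItemsSchema.parse(response.items);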
```
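Schema validation assumes the model's text has already been parsed as JSON. Since JSON parse errors are one of the listed failure modes, a defensive parsing step before validation may help. The sketch below is an assumption about how that could look, not the project's code: `parseModelJson` and `stripCodeFence` are hypothetical helpers, and the fence-stripping covers only the case where the model wraps its JSON in a Markdown code fence.

```typescript
type ParseResult =
  | { ok: true; value: unknown }
  | { ok: false; error: string };

// Three backticks, built at runtime so this example nests cleanly in Markdown.
const FENCE = '`'.repeat(3);

// Removes a surrounding Markdown code fence (with optional language tag), if present.
function stripCodeFence(text: string): string {
  const trimmed = text.trim();
  if (!trimmed.startsWith(FENCE) || !trimmed.endsWith(FENCE)) return trimmed;
  const firstNewline = trimmed.indexOf('\n');
  if (firstNewline === -1) return trimmed;
  return trimmed.slice(firstNewline + 1, trimmed.length - FENCE.length).trim();
}

// Parses model output without throwing, so callers can log and retry.
function parseModelJson(raw: string): ParseResult {
  try {
    return { ok: true, value: JSON.parse(stripCodeFence(raw)) };
  } catch (err) {
    const message = err instanceof Error ? err.message : String(err);
    return { ok: false, error: `Invalid JSON response: ${message}` };
  }
}
```

Returning a result object rather than throwing lets the caller decide whether to retry the request, log the raw text, or fail the job.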
### 5. Normalization
```typescript
// Normalize prices, units, quantities
const normalizedItems = normalizeExtractedItems(validatedItems);
```
## Common Issues and Solutions
### Issue: Inconsistent Price Extraction
**Symptoms**: The same item is priced differently across extractions.

**Solution**: Add explicit price format examples to the prompt:
```
"Price formats to recognize:
- $X.XX (regular price)
- X for $Y.YY (multi-buy)
- $X.XX/lb or $X.XX/kg (unit price)
- $X.XX each (per item)
- SAVE $X.XX (discount amount, not item price)"
```
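The same formats can also be checked in code during normalization, which gives a deterministic fallback when the model returns the raw price text. The parser below is a sketch under that assumption: `parsePriceText` and the `ParsedPrice` shape are hypothetical and are not taken from `normalizeExtractedItems`.

```typescript
// Hypothetical result type covering the price formats listed above.
type ParsedPrice =
  | { kind: 'regular'; price: number }
  | { kind: 'multi-buy'; quantity: number; total: number; unitPrice: number }
  | { kind: 'per-unit'; price: number; unit: string }
  | { kind: 'discount'; amount: number };

const AMOUNT = String.raw`\$(\d+(?:\.\d{1,2})?)`;

function parsePriceText(text: string): ParsedPrice | null {
  const t = text.trim();

  // "SAVE $X.XX" is a discount amount, not an item price.
  let m = t.match(new RegExp(`^save\\s+${AMOUNT}$`, 'i'));
  if (m) return { kind: 'discount', amount: Number(m[1]) };

  // "X for $Y.YY" multi-buy: derive the unit price, rounded to whole cents.
  m = t.match(new RegExp(`^(\\d+)\\s+for\\s+${AMOUNT}$`, 'i'));
  if (m) {
    const quantity = Number(m[1]);
    const total = Number(m[2]);
    if (quantity === 0) return null;
    const unitPrice = Math.round((total / quantity) * 100) / 100;
    return { kind: 'multi-buy', quantity, total, unitPrice };
  }

  // "$X.XX/lb" or "$X.XX/kg" unit price.
  m = t.match(new RegExp(`^${AMOUNT}\\s*/\\s*(lb|kg)$`, 'i'));
  if (m) return { kind: 'per-unit', price: Number(m[1]), unit: m[2].toLowerCase() };

  // "$X.XX each" per-item price.
  m = t.match(new RegExp(`^${AMOUNT}\\s+each$`, 'i'));
  if (m) return { kind: 'per-unit', price: Number(m[1]), unit: 'each' };

  // Plain "$X.XX" regular price.
  m = t.match(new RegExp(`^${AMOUNT}$`));
  if (m) return { kind: 'regular', price: Number(m[1]) };

  return null; // Unrecognized format: leave it for manual review.
}
```

For example, `parsePriceText('3 for $5')` yields a multi-buy result with a unit price of 1.67, while `parsePriceText('SAVE $2.00')` is classified as a discount rather than a price.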
### Issue: Missing Items from Dense Flyers
**Symptoms**: Flyers with many items on one page have missing extractions.

**Solution**:
1. Split page into quadrants for separate extraction
2. Increase token limit for response
3. Use structured grid-based prompting
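For the first option, the crop regions can be computed before any image library is involved. The sketch below is illustrative only: `splitIntoQuadrants` is a hypothetical helper, and the 10% overlap is an assumed default chosen so that an item sitting on a seam is less likely to be cut in half.

```typescript
// Pixel rectangle for one crop; field names are illustrative.
interface Region {
  left: number;
  top: number;
  width: number;
  height: number;
}

// Splits a page into four overlapping quadrants.
// `overlap` is the fraction of each dimension shared across the seam.
function splitIntoQuadrants(width: number, height: number, overlap = 0.1): Region[] {
  const padX = Math.round((width * overlap) / 2);
  const padY = Math.round((height * overlap) / 2);
  const midX = Math.floor(width / 2);
  const midY = Math.floor(height / 2);

  const columns = [
    { left: 0, width: midX + padX },
    { left: midX - padX, width: width - midX + padX },
  ];
  const rows = [
    { top: 0, height: midY + padY },
    { top: midY - padY, height: height - midY + padY },
  ];

  // Order: top-left, top-right, bottom-left, bottom-right.
  const regions: Region[] = [];
  for (const row of rows) {
    for (const column of columns) {
      regions.push({ left: column.left, top: row.top, width: column.width, height: row.height });
    }
  }
  return regions;
}
```

Because the quadrants overlap, an item near a seam may be extracted twice, so results from the four calls would need deduplication before they are merged.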
### Issue: Rate Limit Errors (429)
**Symptoms**: `429 Too Many Requests` errors during bulk uploads.

**Solution**: Route extractions through the job queue so that failed attempts are retried with exponential backoff:
```typescript
// Add to job queue instead of direct call
await flyerQueue.add(
  'extract',
  {
    flyerId,
    images,
  },
  {
    attempts: 3,
    backoff: {
      type: 'exponential',
      delay: 2000,
    },
  },
);
```
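For calls that bypass the queue, the same idea can be written as a small retry helper. This is a sketch under stated assumptions: `backoffDelayMs`, `isRateLimitError`, and `withRetry` are hypothetical names, the doubling formula is a generic one (the queue library's own formula may differ), and the check assumes the thrown error exposes an HTTP `status` field.

```typescript
// Delay before the next attempt: base * 2^(attempt - 1), capped at maxMs.
function backoffDelayMs(attempt: number, baseMs = 2000, maxMs = 60000): number {
  return Math.min(maxMs, baseMs * 2 ** (attempt - 1));
}

// Assumption: the SDK or HTTP client attaches a numeric `status` to its errors.
function isRateLimitError(err: unknown): boolean {
  return typeof err === 'object' && err !== null && (err as { status?: unknown }).status === 429;
}

// Retries `fn` on rate-limit errors only; any other error is rethrown at once.
async function withRetry<T>(
  fn: () => Promise<T>,
  maxAttempts = 3,
  sleep: (ms: number) => Promise<void> = (ms) => new Promise((resolve) => setTimeout(resolve, ms)),
): Promise<T> {
  for (let attempt = 1; ; attempt++) {
    try {
      return await fn();
    } catch (err) {
      if (!isRateLimitError(err) || attempt >= maxAttempts) throw err;
      await sleep(backoffDelayMs(attempt));
    }
  }
}
```

With the defaults, the waits after the first and second failed attempts are 2000 ms and 4000 ms. The injectable `sleep` parameter lets tests run without real delays.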
### Issue: Hallucinated Items
**Symptoms**: Items extracted that don't exist in the flyer.

**Solution**:
1. Add confidence scoring to extraction
2. Request bounding box coordinates for verification
3. Add post-extraction validation against image
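The first two steps could feed a simple filter that separates trusted items from ones needing review. The sketch below assumes the prompt has been extended to return a `confidence` score and a normalized bounding box per item; neither field is part of the schema shown in this guide, and `partitionByTrust` and the 0.7 threshold are hypothetical.

```typescript
// Hypothetical fields: coordinates are fractions of the page (0 to 1).
interface BoundingBox {
  x: number;
  y: number;
  width: number;
  height: number;
}

interface ScoredItem {
  name: string;
  confidence: number; // 0 to 1, as reported by the model
  box: BoundingBox | null;
}

// A box is plausible only if it has area and lies fully on the page.
function isBoxOnPage(box: BoundingBox): boolean {
  return (
    box.width > 0 &&
    box.height > 0 &&
    box.x >= 0 &&
    box.y >= 0 &&
    box.x + box.width <= 1 &&
    box.y + box.height <= 1
  );
}

// Splits items into accepted ones and ones that need manual review.
function partitionByTrust(items: ScoredItem[], minConfidence = 0.7) {
  const accepted: ScoredItem[] = [];
  const needsReview: ScoredItem[] = [];
  for (const item of items) {
    const trusted =
      item.confidence >= minConfidence && item.box !== null && isBoxOnPage(item.box);
    (trusted ? accepted : needsReview).push(item);
  }
  return { accepted, needsReview };
}
```

Model-reported confidence is not a calibrated probability, so the threshold would need tuning against manually verified flyers.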
## Prompt Engineering Best Practices
### 1. Be Specific About Output Format
```
Output MUST be valid JSON matching this schema:
{
  "items": [
    {
      "name": "string (product name as shown)",
      "brand": "string or null",
      "price": number (in dollars),
      "unit": "string (each, lb, kg, etc.)",
      "quantity": number (default 1)
    }
  ]
}
```
### 2. Provide Examples
```
Example extractions:
- "Chicken Breast $4.99/lb" -> {"name": "Chicken Breast", "price": 4.99, "unit": "lb"}
- "Coca-Cola 12pk $5.99" -> {"name": "Coca-Cola", "quantity": 12, "price": 5.99, "unit": "each"}
```
### 3. Handle Edge Cases Explicitly
```
Special cases:
- If "LIMIT X" is shown, add it to notes; it does not affect the price
- If "SAVE $X" is shown without a base price, set price to null
- If the item is "FREE with purchase", set price to 0
```
### 4. Request Structured Thinking
```
For each item:
1. Identify the product name and brand
2. Find the associated price
3. Determine if price is per-unit or total
4. Extract any quantity information
```
## Monitoring AI Performance
### Metrics to Track
| Metric | Description | Target |
| ----------------------- | --------------------------------------- | --------------- |
| Extraction success rate | % of flyers processed without error | >95% |
| Items per flyer | Average items extracted | Varies by store |
| Price accuracy | Match rate vs manual verification | >98% |
| Response time | Time from upload to extraction complete | <30s |
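The first and last rows of the table can be computed from per-flyer run records. The sketch below is one possible reading, not an existing utility: `ExtractionRun` and `summarizeRuns` are hypothetical, and it interprets the response-time target as an average, which the table does not specify.

```typescript
// Hypothetical record of one extraction attempt.
interface ExtractionRun {
  succeeded: boolean;
  durationMs: number;
}

interface RunSummary {
  successRate: number; // fraction between 0 and 1
  averageDurationMs: number;
  meetsTargets: boolean;
}

function summarizeRuns(runs: ExtractionRun[]): RunSummary {
  if (runs.length === 0) {
    return { successRate: 0, averageDurationMs: 0, meetsTargets: false };
  }
  const successes = runs.filter((run) => run.succeeded).length;
  const totalDuration = runs.reduce((sum, run) => sum + run.durationMs, 0);
  const successRate = successes / runs.length;
  const averageDurationMs = totalDuration / runs.length;
  return {
    successRate,
    averageDurationMs,
    // Targets from the table: success rate above 95%, response time under 30s.
    meetsTargets: successRate > 0.95 && averageDurationMs < 30000,
  };
}
```

An average hides slow outliers, so a percentile (for example p95) may be a better fit if the target is meant to hold for most individual uploads.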
### Logging
```typescript
log.info(
  {
    flyerId,
    itemCount: extractedItems.length,
    processingTime: duration,
    modelVersion: response.model,
    tokenUsage: response.usage,
  },
  'Flyer extraction completed',
);
```
## Environment Configuration
| Variable | Purpose |
| -------------------------------- | --------------------------- |
| `VITE_GOOGLE_GENAI_API_KEY` | Gemini API key (production) |
| `VITE_GOOGLE_GENAI_API_KEY_TEST` | Gemini API key (test) |

**Note**: Use separate API keys for production and test to avoid rate limit conflicts and enable separate billing tracking.
## Testing AI Features
### Unit Tests
Mock the Gemini API response:
```typescript
vi.mock('@google/generative-ai', () => ({
  GoogleGenerativeAI: vi.fn().mockImplementation(() => ({
    getGenerativeModel: () => ({
      generateContent: vi.fn().mockResolvedValue({
        response: {
          text: () => JSON.stringify({ items: mockItems }),
        },
      }),
    }),
  })),
}));
```
### Integration Tests
Use recorded responses for deterministic testing:
```typescript
import * as fs from 'node:fs/promises';

// Load a previously recorded API response from the fixtures directory
const fixtureResponse = JSON.parse(await fs.readFile('fixtures/gemini-response.json', 'utf-8'));
```
## Related Documentation
- [OVERVIEW.md](./OVERVIEW.md) - Subagent system overview
- [CODER-GUIDE.md](./CODER-GUIDE.md) - For implementing AI features
- [TESTER-GUIDE.md](./TESTER-GUIDE.md) - Testing AI features
- [INTEGRATIONS-GUIDE.md](./INTEGRATIONS-GUIDE.md) - External API patterns
- [../adr/0041-ai-gemini-integration-architecture.md](../adr/0041-ai-gemini-integration-architecture.md) - AI integration ADR
- [../adr/0046-image-processing-pipeline.md](../adr/0046-image-processing-pipeline.md) - Image processing
- [../getting-started/ENVIRONMENT.md](../getting-started/ENVIRONMENT.md) - Environment configuration