# AI Usage Subagent Guide

The **ai-usage** subagent specializes in LLM APIs (Gemini, Claude), prompt engineering, and AI-powered features in the Flyer Crawler project.

## Quick Reference

| Aspect             | Details                                                                             |
| ------------------ | ----------------------------------------------------------------------------------- |
| **Primary Use**    | Gemini API integration, prompt engineering, AI extraction                           |
| **Key Files**      | `src/services/aiService.server.ts`, `src/services/flyerProcessingService.server.ts` |
| **Key ADRs**       | ADR-041 (AI Integration), ADR-046 (Image Processing)                                |
| **API Key Env**    | `VITE_GOOGLE_GENAI_API_KEY` (prod), `VITE_GOOGLE_GENAI_API_KEY_TEST` (test)         |
| **Error Handling** | Rate limits (429), JSON parse errors, timeout handling                              |
| **Delegate To**    | `coder` (implementation), `testwriter` (tests), `integrations-specialist`           |

## When to Use

Use the **ai-usage** subagent when you need to:

- Integrate with the Gemini API for flyer extraction
- Debug AI extraction failures
- Optimize prompts for better accuracy
- Handle rate limiting and API errors
- Implement new AI-powered features
- Fine-tune extraction schemas

## What ai-usage Knows

The ai-usage subagent understands:

- Google Generative AI (Gemini) API
- Flyer extraction prompts and schemas
- Error handling for AI services
- Rate limiting strategies
- Token optimization
- AI service architecture (ADR-041)

## AI Architecture Overview

```
┌─────────────────┐    ┌─────────────────┐    ┌─────────────────┐
│  Flyer Upload   │───►│   AI Service    │───►│   Gemini API    │
│                 │    │                 │    │                 │
└─────────────────┘    │ - Preprocessing │    │ - Vision model  │
                       │ - Prompt build  │    │ - JSON output   │
                       │ - Response parse│    │                 │
                       └─────────────────┘    └─────────────────┘
                                │
                                ▼
                       ┌─────────────────┐
                       │  Validation &   │
                       │  Normalization  │
                       └─────────────────┘
```

## Key Files

| File                                           | Purpose                              |
| ---------------------------------------------- | ------------------------------------ |
| `src/services/aiService.server.ts`             | Gemini API integration               |
| `src/services/flyerProcessingService.server.ts` | Flyer extraction pipeline            |
| `src/schemas/flyer.schemas.ts`                 | Zod schemas for AI output validation |
| `src/types/ai.types.ts`                        | TypeScript types for AI responses    |

## Example Requests

### Debugging Extraction Failures

```
"Use ai-usage to debug why flyer extractions are failing for
multi-page PDFs. The error logs show 'Invalid JSON response'
but only for certain stores."
```

### Optimizing Prompts

```
"Use ai-usage to improve the item extraction prompt. Currently
it's missing unit prices when items show 'X for $Y' pricing
(e.g., '3 for $5')."
```

### Handling Rate Limits

```
"Use ai-usage to implement exponential backoff for Gemini API
rate limits. We're seeing 429 errors during high-volume uploads."
```

### Adding New AI Features

```
"Use ai-usage to add a feature that uses Gemini to categorize
extracted items into grocery categories (produce, dairy, meat, etc.)."
```

## Extraction Pipeline

### 1. Image Preprocessing

```typescript
// Convert PDF to images, resize large images
const processedImages = await imageProcessor.prepareForAI(uploadedFile);
```

### 2. Prompt Construction

The extraction prompt includes:

- System instructions for the AI model
- Expected output schema (JSON)
- Examples of correct extraction
- Handling instructions for edge cases

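The four parts listed above can be assembled by a small builder function. The following is a minimal sketch, not the project's actual prompt code: `PromptParts` and `buildExtractionPrompt` are hypothetical names, and the section labels are illustrative.

```typescript
// Hypothetical container for the four prompt parts described above.
interface PromptParts {
  systemInstructions: string;
  outputSchema: object;
  examples: string[];
  edgeCaseRules: string[];
}

// Joins the parts into one prompt string, skipping empty optional sections.
function buildExtractionPrompt(parts: PromptParts): string {
  const sections: string[] = [
    parts.systemInstructions.trim(),
    'Output MUST be valid JSON matching this schema:\n' +
      JSON.stringify(parts.outputSchema, null, 2),
  ];
  if (parts.examples.length > 0) {
    sections.push('Example extractions:\n' + parts.examples.map((e) => `- ${e}`).join('\n'));
  }
  if (parts.edgeCaseRules.length > 0) {
    sections.push('Special cases:\n' + parts.edgeCaseRules.map((r) => `- ${r}`).join('\n'));
  }
  // Blank lines between sections keep the prompt readable in logs.
  return sections.join('\n\n');
}

const extractionPrompt = buildExtractionPrompt({
  systemInstructions: 'Extract every priced item from the flyer image.',
  outputSchema: { items: [{ name: 'string', price: 'number', unit: 'string' }] },
  examples: ['"Chicken Breast $4.99/lb" -> {"name": "Chicken Breast", "price": 4.99, "unit": "lb"}'],
  edgeCaseRules: ['If "SAVE $X" shown without base price, mark price as null'],
});
```

Keeping each part as separate data makes it easier to change one section (for example, adding a new edge case) without touching the others.
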
### 3. API Call

```typescript
const response = await aiService.extractFlyerData(processedImages, storeContext, extractionOptions);
```

### 4. Response Validation

```typescript
// Validate against Zod schema
const validatedItems = flyerItemsSchema.parse(response.items);
```

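Before schema validation can run, the raw model text has to parse as JSON. One plausible cause of `Invalid JSON response` errors (an assumption, not a diagnosis of this project) is model output wrapped in a Markdown code fence. The sketch below strips such a wrapper and returns a result object instead of throwing. `parseModelJson` is a hypothetical helper, and its hand-written shape check is only a stand-in for `flyerItemsSchema`.

```typescript
type ParseResult =
  | { ok: true; items: unknown[] }
  | { ok: false; reason: string };

// Three backticks, built indirectly so this snippet stays fence-safe.
const FENCE = '`'.repeat(3);
const FENCED_BLOCK = new RegExp('^' + FENCE + '(?:json)?\\s*([\\s\\S]*?)\\s*' + FENCE + '$');

// Strips an optional fenced wrapper, parses the JSON, and checks for an "items" array.
function parseModelJson(rawText: string): ParseResult {
  const trimmed = rawText.trim();
  const fenced = trimmed.match(FENCED_BLOCK);
  const jsonText = fenced ? fenced[1] : trimmed;
  let parsed: unknown;
  try {
    parsed = JSON.parse(jsonText);
  } catch (err) {
    return { ok: false, reason: `Invalid JSON response: ${(err as Error).message}` };
  }
  if (typeof parsed !== 'object' || parsed === null) {
    return { ok: false, reason: 'Response is not a JSON object' };
  }
  const items = (parsed as { items?: unknown }).items;
  if (!Array.isArray(items)) {
    return { ok: false, reason: 'Response has no "items" array' };
  }
  return { ok: true, items };
}
```

Returning a `reason` string rather than throwing lets the caller log the failure alongside the store and flyer ID, which helps when failures affect only certain stores.
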
### 5. Normalization

```typescript
// Normalize prices, units, quantities
const normalizedItems = normalizeExtractedItems(validatedItems);
```

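The body of `normalizeExtractedItems` is not shown in this guide. As a rough illustration of what per-item normalization could look like, the sketch below rounds prices to cents, maps unit spellings to a canonical form, and defaults the quantity to 1. `ExtractedItem`, `UNIT_ALIASES`, and `normalizeItem` are hypothetical; the project's real rules may differ.

```typescript
interface ExtractedItem {
  name: string;
  price: number | null;
  unit: string;
  quantity?: number;
}

// Assumed alias table for illustration only.
const UNIT_ALIASES: Record<string, string> = {
  lbs: 'lb',
  pound: 'lb',
  pounds: 'lb',
  kgs: 'kg',
  kilogram: 'kg',
  ea: 'each',
  'ea.': 'each',
};

function normalizeItem(item: ExtractedItem): Required<ExtractedItem> {
  const unitKey = item.unit.trim().toLowerCase();
  return {
    // Collapse repeated whitespace left over from OCR-style output.
    name: item.name.trim().replace(/\s+/g, ' '),
    // Round to whole cents; a null price stays null.
    price: item.price === null ? null : Math.round(item.price * 100) / 100,
    // Fall back to the lowercased unit when no alias is known.
    unit: UNIT_ALIASES[unitKey] ?? unitKey,
    quantity: item.quantity && item.quantity > 0 ? item.quantity : 1,
  };
}

const normalizedExample = normalizeItem({ name: '  Chicken   Breast ', price: 4.989999, unit: 'LBS' });
```
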
## Common Issues and Solutions

### Issue: Inconsistent Price Extraction

**Symptoms**: The same item receives different prices across repeated extractions.

**Solution**: Add explicit price format examples to the prompt:

```
"Price formats to recognize:
- $X.XX (regular price)
- X for $Y.YY (multi-buy)
- $X.XX/lb or $X.XX/kg (unit price)
- $X.XX each (per item)
- SAVE $X.XX (discount amount, not item price)"
```

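Multi-buy prices such as `3 for $5` can also be checked deterministically after extraction, as a complement to the prompt change. The sketch below is a hypothetical post-processing helper (`parseMultiBuy` is not an existing project function) that derives the unit price from the raw price text.

```typescript
interface MultiBuyPrice {
  quantity: number; // X in "X for $Y"
  totalPrice: number; // Y in "X for $Y"
  unitPrice: number; // Y / X, rounded to cents
}

// Returns null when the text is not a multi-buy offer such as "3 for $5" or "2/$7.00".
function parseMultiBuy(priceText: string): MultiBuyPrice | null {
  const match = priceText.match(/(\d+)\s*(?:for|\/)\s*\$\s*(\d+(?:\.\d{1,2})?)/i);
  if (!match) return null;
  const quantity = Number(match[1]);
  const totalPrice = Number(match[2]);
  if (quantity <= 0) return null;
  const unitPrice = Math.round((totalPrice / quantity) * 100) / 100;
  return { quantity, totalPrice, unitPrice };
}

const threeForFive = parseMultiBuy('3 for $5');
```

For `3 for $5` this yields a unit price of 1.67. Text such as `$4.99/lb` or `SAVE $1.50` does not match and returns `null`, so those formats are left to other handling.
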
### Issue: Missing Items from Dense Flyers

**Symptoms**: Items go missing from the extraction when a single flyer page contains many items.

**Solution**:

1. Split the page into quadrants and extract each one separately
2. Increase the token limit for the response
3. Use structured, grid-based prompting

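The quadrant split in step 1 comes down to computing four crop rectangles. The sketch below shows one way to do that with an optional overlap, so an item that straddles the midline is less likely to be cut off in every tile. `Region` and `splitIntoQuadrants` are hypothetical names; the `{ left, top, width, height }` shape was chosen because common image libraries accept crop regions in that form, which is an assumption about the project's tooling.

```typescript
interface Region {
  left: number;
  top: number;
  width: number;
  height: number;
}

// Splits a page into a 2x2 grid. `overlap` (in pixels) extends each tile past the midline.
function splitIntoQuadrants(pageWidth: number, pageHeight: number, overlap = 0): Region[] {
  const halfW = Math.ceil(pageWidth / 2);
  const halfH = Math.ceil(pageHeight / 2);
  const regions: Region[] = [];
  for (const row of [0, 1]) {
    for (const col of [0, 1]) {
      const left = col === 0 ? 0 : Math.max(0, halfW - overlap);
      const top = row === 0 ? 0 : Math.max(0, halfH - overlap);
      const right = col === 0 ? Math.min(pageWidth, halfW + overlap) : pageWidth;
      const bottom = row === 0 ? Math.min(pageHeight, halfH + overlap) : pageHeight;
      regions.push({ left, top, width: right - left, height: bottom - top });
    }
  }
  return regions;
}

const quadrants = splitIntoQuadrants(1000, 800, 50);
```

With overlap, the same item may be extracted from two tiles, so results need de-duplication (for example, by name and price) before they are merged.
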
### Issue: Rate Limit Errors (429)

**Symptoms**: `429 Too Many Requests` errors during bulk uploads.

**Solution**: Implement request queuing:

```typescript
// Add to job queue instead of direct call
await flyerQueue.add(
  'extract',
  {
    flyerId,
    images,
  },
  {
    attempts: 3,
    backoff: {
      type: 'exponential',
      delay: 2000,
    },
  },
);
```

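For calls that do not go through the queue, the same retry policy can be sketched as a plain wrapper. This is an illustration only: `withRateLimitRetry` and `HttpLikeError` are hypothetical, the defaults simply mirror the `attempts: 3` and `delay: 2000` values above, and it assumes the thrown error exposes an HTTP `status` field, which may not match what the Gemini SDK throws.

```typescript
// Hypothetical error shape: assumes the thrown error carries an HTTP status code.
interface HttpLikeError {
  status?: number;
}

const sleep = (ms: number) => new Promise<void>((resolve) => setTimeout(resolve, ms));

// Delay before the retry that follows failed attempt number `attempt` (1-based):
// base, 2x base, 4x base, ...
function backoffDelay(attempt: number, baseDelayMs: number): number {
  return baseDelayMs * 2 ** (attempt - 1);
}

// Retries `fn` only on 429 errors, making at most `maxAttempts` calls in total.
async function withRateLimitRetry<T>(
  fn: () => Promise<T>,
  maxAttempts = 3,
  baseDelayMs = 2000,
): Promise<T> {
  for (let attempt = 1; ; attempt++) {
    try {
      return await fn();
    } catch (err) {
      const statusCode = (err as HttpLikeError).status;
      // Non-429 errors and the final failed attempt are rethrown unchanged.
      if (statusCode !== 429 || attempt >= maxAttempts) throw err;
      await sleep(backoffDelay(attempt, baseDelayMs));
    }
  }
}
```

Adding random jitter to the delay is a common refinement when many workers retry at the same time.
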
### Issue: Hallucinated Items

**Symptoms**: The extraction contains items that do not appear in the flyer.

**Solution**:

1. Add confidence scoring to extraction
2. Request bounding box coordinates for verification
3. Validate extracted items against the image after extraction

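Steps 1 and 2 can feed a simple triage pass once the model returns a confidence score and a bounding box per item. The sketch below assumes both fields exist in the response, which the current schema in this guide does not show. `ScoredItem`, `triageItems`, and the 0.7 threshold are hypothetical. Model-reported confidence is only a heuristic, so flagged items go to review rather than being discarded.

```typescript
interface BoundingBox {
  x: number;
  y: number;
  width: number;
  height: number;
}

interface ScoredItem {
  name: string;
  confidence: number; // assumed 0..1 score returned by the model
  boundingBox?: BoundingBox; // assumed pixel coordinates on the source image
}

// True when the box has positive area and lies fully within the image.
function isInsideImage(box: BoundingBox, imageWidth: number, imageHeight: number): boolean {
  return (
    box.width > 0 &&
    box.height > 0 &&
    box.x >= 0 &&
    box.y >= 0 &&
    box.x + box.width <= imageWidth &&
    box.y + box.height <= imageHeight
  );
}

// Splits items into accepted and flagged-for-review lists.
function triageItems(items: ScoredItem[], imageWidth: number, imageHeight: number, minConfidence = 0.7) {
  const accepted: ScoredItem[] = [];
  const needsReview: ScoredItem[] = [];
  for (const item of items) {
    const boxOk =
      item.boundingBox !== undefined && isInsideImage(item.boundingBox, imageWidth, imageHeight);
    if (item.confidence >= minConfidence && boxOk) {
      accepted.push(item);
    } else {
      needsReview.push(item);
    }
  }
  return { accepted, needsReview };
}
```
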
## Prompt Engineering Best Practices

### 1. Be Specific About Output Format

```
Output MUST be valid JSON matching this schema:
{
  "items": [
    {
      "name": "string (product name as shown)",
      "brand": "string or null",
      "price": number (in dollars),
      "unit": "string (each, lb, kg, etc.)",
      "quantity": number (default 1)
    }
  ]
}
```

### 2. Provide Examples

```
Example extractions:
- "Chicken Breast $4.99/lb" -> {"name": "Chicken Breast", "price": 4.99, "unit": "lb"}
- "Coca-Cola 12pk $5.99" -> {"name": "Coca-Cola", "quantity": 12, "price": 5.99, "unit": "each"}
```

### 3. Handle Edge Cases Explicitly

```
Special cases:
- If "LIMIT X" shown, add to notes, don't affect price
- If "SAVE $X" shown without base price, mark price as null
- If item is "FREE with purchase", set price to 0
```

### 4. Request Structured Thinking

```
For each item:
1. Identify the product name and brand
2. Find the associated price
3. Determine if price is per-unit or total
4. Extract any quantity information
```

## Monitoring AI Performance

### Metrics to Track

| Metric                  | Description                             | Target          |
| ----------------------- | --------------------------------------- | --------------- |
| Extraction success rate | % of flyers processed without error     | >95%            |
| Items per flyer         | Average items extracted                 | Varies by store |
| Price accuracy          | Match rate vs manual verification       | >98%            |
| Response time           | Time from upload to extraction complete | <30s            |

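Three of these metrics can be computed directly from per-flyer records. The sketch below is illustrative: `ExtractionRecord` and `summarizeExtractions` are hypothetical, and price accuracy is left out because it depends on manually verified data that this guide does not describe.

```typescript
interface ExtractionRecord {
  succeeded: boolean;
  itemCount: number;
  durationMs: number; // time from upload to extraction complete
}

function summarizeExtractions(records: ExtractionRecord[]) {
  const total = records.length;
  const successes = records.filter((r) => r.succeeded);
  // Percentage of flyers processed without error.
  const successRate = total === 0 ? 0 : (successes.length / total) * 100;
  // Average item count over successful extractions only.
  const avgItems =
    successes.length === 0
      ? 0
      : successes.reduce((sum, r) => sum + r.itemCount, 0) / successes.length;
  // Flyers that missed the <30s response time target.
  const slowCount = records.filter((r) => r.durationMs >= 30000).length;
  return { successRate, avgItems, slowCount, meetsSuccessTarget: successRate > 95 };
}

const metricsSummary = summarizeExtractions([
  { succeeded: true, itemCount: 10, durationMs: 5000 },
  { succeeded: true, itemCount: 20, durationMs: 12000 },
  { succeeded: true, itemCount: 30, durationMs: 31000 },
  { succeeded: false, itemCount: 0, durationMs: 2000 },
]);
```
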
### Logging

```typescript
log.info(
  {
    flyerId,
    itemCount: extractedItems.length,
    processingTime: duration,
    modelVersion: response.model,
    tokenUsage: response.usage,
  },
  'Flyer extraction completed',
);
```

## Environment Configuration

| Variable                         | Purpose                     |
| -------------------------------- | --------------------------- |
| `VITE_GOOGLE_GENAI_API_KEY`      | Gemini API key (production) |
| `VITE_GOOGLE_GENAI_API_KEY_TEST` | Gemini API key (test)       |

**Note**: Use separate API keys for production and test to avoid rate limit conflicts and enable separate billing tracking.

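Selecting between the two keys can be kept in one small function that fails fast when the expected variable is missing. This is a sketch under assumptions: `resolveGeminiApiKey` is a hypothetical helper, and how the project decides that it is running in the test environment (the `isTest` flag here) is not covered by this guide.

```typescript
type EnvVars = Record<string, string | undefined>;

// Picks the Gemini key for the given environment and throws when it is missing or blank.
function resolveGeminiApiKey(env: EnvVars, isTest: boolean): string {
  const variableName = isTest ? 'VITE_GOOGLE_GENAI_API_KEY_TEST' : 'VITE_GOOGLE_GENAI_API_KEY';
  const key = env[variableName]?.trim();
  if (!key) {
    throw new Error(`Missing required environment variable: ${variableName}`);
  }
  return key;
}

const exampleEnv: EnvVars = {
  VITE_GOOGLE_GENAI_API_KEY: 'prod-key',
  VITE_GOOGLE_GENAI_API_KEY_TEST: 'test-key',
};
const testKey = resolveGeminiApiKey(exampleEnv, true);
```

Passing the environment in as a parameter, rather than reading it inside the function, keeps the helper easy to unit test.
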
## Testing AI Features

### Unit Tests

Mock the Gemini API response:

```typescript
import { vi } from 'vitest';

// Replace the SDK so no real API call is made; `mockItems` is the fixture array
// the test expects back and must be defined before `text()` is called.
vi.mock('@google/generative-ai', () => ({
  GoogleGenerativeAI: vi.fn().mockImplementation(() => ({
    getGenerativeModel: () => ({
      generateContent: vi.fn().mockResolvedValue({
        response: {
          text: () => JSON.stringify({ items: mockItems }),
        },
      }),
    }),
  })),
}));
```

### Integration Tests

Use recorded responses for deterministic testing:

```typescript
import fs from 'node:fs/promises';

// Load a real API response that was previously saved to fixtures
const fixtureResponse = JSON.parse(await fs.readFile('fixtures/gemini-response.json', 'utf8'));
```

## Related Documentation

- [OVERVIEW.md](./OVERVIEW.md) - Subagent system overview
- [CODER-GUIDE.md](./CODER-GUIDE.md) - For implementing AI features
- [TESTER-GUIDE.md](./TESTER-GUIDE.md) - Testing AI features
- [INTEGRATIONS-GUIDE.md](./INTEGRATIONS-GUIDE.md) - External API patterns
- [../adr/0041-ai-gemini-integration-architecture.md](../adr/0041-ai-gemini-integration-architecture.md) - AI integration ADR
- [../adr/0046-image-processing-pipeline.md](../adr/0046-image-processing-pipeline.md) - Image processing
- [../getting-started/ENVIRONMENT.md](../getting-started/ENVIRONMENT.md) - Environment configuration