active
Feature request: image drop → auto-extract → save to correct destination
USE CASES:
- Screenshot of a movie poster → saved to Movies to Watch list
- Photo of a flyer/event → saved to Events or Calendar
- Screenshot of a receipt → logged as expense
- Photo of a note/whiteboard → saved to Notes
- Screenshot of anything → smart routing based on content
FLOW:
1. User drops an image or screenshot into chat
2. OCR + vision model extract text and context
3. LLM classifies: is this a note? list item? expense? event?
4. Shows preview: "This looks like a movie — want to add to Watch list?"
5. User confirms → saved to correct destination
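
A minimal sketch of steps 3 to 5, assuming extraction (step 2) has already produced text and a short context string. Everything named here (Extraction, HINTS, classify, build_preview, handle_image) is hypothetical, and the keyword matching is only a stand-in for the LLM classifier so the example runs without a model.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Extraction:
    text: str     # text pulled from the image (OCR or vision model)
    context: str  # short description of what the image shows


# Hypothetical keyword hints standing in for the LLM classifier (step 3).
HINTS = {
    "expense": ("total", "subtotal", "receipt"),
    "event": ("doors open", "rsvp", "tickets"),
    "list item": ("movie", "poster", "in theaters"),
}

# Hypothetical preview wording shown to the user (step 4).
PROMPTS = {
    "expense": "This looks like a receipt. Log it as an expense?",
    "event": "This looks like an event. Add it to the calendar?",
    "list item": "This looks like a movie. Add it to the Watch list?",
    "note": "This looks like a note. Save it to Notes?",
}


def classify(extraction: Extraction) -> str:
    """Return the first category whose keywords appear, else 'note'."""
    haystack = f"{extraction.text} {extraction.context}".lower()
    for category, keywords in HINTS.items():
        if any(keyword in haystack for keyword in keywords):
            return category
    return "note"


def build_preview(category: str) -> str:
    return PROMPTS[category]


def handle_image(
    extraction: Extraction, confirm: Callable[[str], bool]
) -> Optional[str]:
    """Classify, show the preview, and return the category only if confirmed."""
    category = classify(extraction)
    if confirm(build_preview(category)):
        return category  # step 5: caller saves to the matching destination
    return None          # user declined, nothing is written
```

The confirm callback is where the chat UI would ask the user; returning None on decline keeps the "nothing is saved without confirmation" rule in one place.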
TECH:
- Vision model (GPT-4o vision or equivalent) for image understanding
- OCR fallback for text-only screenshots
- Smart routing rules: receipt → expense, poster → list, flyer → calendar event
- Every write gets a preview first; the same verified_state rules apply
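
The routing rules and the preview-before-write rule could be sketched like this. The tool names come from this note, but their signatures are not specified here, so the payload shape, the save_note default, and the PendingWrite class are assumptions. The verified_state rules themselves are not modeled; this only shows the confirm gate.

```python
from dataclasses import dataclass

# Routing rules as listed above: content kind -> tool name.
ROUTES = {
    "receipt": "log_expense",
    "poster": "list_add",
    "flyer": "calendar_create",
}
# Assumed default: anything unrecognized is saved as a note.
DEFAULT_TOOL = "save_note"


def route(kind: str) -> str:
    """Pick the destination tool for a classified content kind."""
    return ROUTES.get(kind, DEFAULT_TOOL)


@dataclass
class PendingWrite:
    """A write that is held until the user confirms the preview."""

    tool: str
    payload: dict
    confirmed: bool = False

    def preview(self) -> str:
        return f"About to call {self.tool} with {self.payload}. Confirm?"

    def commit(self, tools: dict) -> object:
        # Refuse to write until the preview has been confirmed.
        if not self.confirmed:
            raise PermissionError("write blocked: preview not confirmed")
        return tools[self.tool](**self.payload)
```

Here tools is a plain dict of name to callable, so the real save_note / list_add / log_expense / calendar_create functions can be plugged in later and replaced with fakes in tests.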
NEXT STEPS:
- Enable image/file uploads in chat interface
- Build vision pipeline with classification layer
- Connect to existing save_note / list_add / log_expense / calendar_create tools
- Test with receipts first (highest value), then movies/events
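
For the receipts-first test pass, a first fixture might look like the sketch below. It assumes OCR returns plain text with one receipt line per row; parse_receipt_total and the sample receipt are hypothetical, and the pattern deliberately ignores thousands separators and multi-currency cases.

```python
import re
from decimal import Decimal
from typing import Optional

# Matches a line that starts with "total" (not "subtotal") and ends in an amount.
TOTAL_RE = re.compile(
    r"^\s*total\b[^\d\n]*(\d+(?:[.,]\d{2})?)\s*$",
    re.IGNORECASE | re.MULTILINE,
)


def parse_receipt_total(ocr_text: str) -> Optional[Decimal]:
    """Return the amount on the last 'total' line, or None if there is none."""
    matches = TOTAL_RE.findall(ocr_text)
    if not matches:
        return None
    # Accept a decimal comma as well as a decimal point.
    return Decimal(matches[-1].replace(",", "."))


# Made-up sample text, standing in for OCR output of a receipt photo.
SAMPLE = """CORNER CAFE
Latte 4.50
Subtotal 4.50
Tax 0.36
TOTAL $4.86
"""
```

A None result is a signal to fall back to asking the user for the amount in the preview rather than guessing.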
project · System · feature · vision · image-to-text · ocr · screenshots · notes · lists