I built this prototype to explore how AI could help with personal memory recall using visual data. Think of it as a "digital memory" system that captures images throughout your day and lets you search through them using natural language.
This is a proof-of-concept for a wearable-like memory assistant that:
- Captures images periodically via webcam (simulating smart glasses)
- Detects objects using YOLOv8 to understand what's in each scene
- Stores memories with timestamps and detailed descriptions
- Lets you search using natural language like "When did I last see my keys?"
- Provides a web interface to browse and query your visual memories
I was curious about how we could use AI to augment human memory. The idea came from thinking about how often I misplace things or forget where I was at a certain time. What if I could just ask my computer "When did I last see my laptop?" and get an instant answer with a photo?
The system uses a hybrid approach:
- Local processing for object detection (YOLO) - keeps things fast and private
- Cloud AI (ChatGPT) for understanding natural language queries
- SQLite database to store everything locally
- Streamlit web app for easy interaction
Ask questions like:
- "When did I last see my keys?"
- "Show me when I was working at my desk"
- "Find memories from today"
- "When was I in the kitchen?"
- Automatically captures images every 5 minutes during active hours
- Uses YOLO to detect objects and understand scenes
- Stores everything with timestamps and metadata
- Browse all your captured memories
- See statistics about your daily patterns
- Track what objects you interact with most
I chose these technologies for a good balance of performance and ease of use:
- Python - Easy to prototype and has great AI libraries
- YOLOv8 - Fast, accurate object detection
- ChatGPT API - Natural language understanding
- SQLite - Simple, local database
- Streamlit - Quick web interface
- OpenCV - Image processing
- Python 3.8+
- Webcam
- OpenAI API key (for natural language queries)
# Clone the repo
git clone https://github.com/yourusername/MemoryAssistant.git
cd MemoryAssistant
# Run the setup script
python setup.py
# Add your OpenAI API key to .env file
# Edit .env and add: OPENAI_API_KEY=your_key_here
# Start the app
streamlit run app.py# Install dependencies
pip install -r requirements.txt
python -m spacy download en_core_web_sm
# Create .env file
cp env_example.txt .env
# Edit .env and add your OpenAI API key
# Run tests
python test_setup.py
# Start the app
streamlit run app.py
- **Natural Language Queries**: Ask questions like "When did I last see my keys?" or "Show me when I was working at my desk"
- **Memory Storage**: Structured storage with timestamps and metadata
- **Interactive Demo**: Web-based interface for querying and browsing memories
## Tech Stack
- **Vision**: YOLOv8 + OpenCV
- **NLP**: ChatGPT API + spaCy
- **Frontend**: Streamlit
- **Backend**: FastAPI
- **Database**: SQLite + FAISS (vector search)
- **Storage**: Local file system
## Quick Start
1. **Install Dependencies**:
```bash
pip install -r requirements.txt
python -m spacy download en_core_web_sm-
Set up Environment Variables:
cp .env.example .env # Add your OpenAI API key to .env -
Run the Application:
streamlit run app.py
MemoryAssistant/
βββ app.py # Main Streamlit application
βββ capture/ # Image capture system
β βββ camera.py # Webcam integration
β βββ scheduler.py # Periodic capture logic
βββ vision/ # Computer vision processing
β βββ detector.py # YOLO object detection
β βββ processor.py # Scene understanding
βββ memory/ # Memory storage and retrieval
β βββ database.py # SQLite database operations
β βββ storage.py # File storage management
β βββ query.py # Natural language query processing
βββ api/ # ChatGPT API integration
β βββ openai_client.py # OpenAI API wrapper
βββ ui/ # User interface components
β βββ components.py # Streamlit UI components
βββ config/ # Configuration files
β βββ settings.py # Application settings
βββ data/ # Data storage
βββ images/ # Captured images
βββ database/ # SQLite database
βββ embeddings/ # Vector embeddings
>>>>>>> 97c227aa902e1220da5301ba17c5e8a868905970
<<<<<<< HEAD
- Start the app:
streamlit run app.py - Capture some images: Click "Capture Image Now" in the sidebar
- Search your memories: Try queries like "When did I last see my phone?"
- Browse recent captures: Check the "Recent Captures" tab
- View statistics: See your memory patterns in the "Statistics" tab
You can run a demo without needing an API key:
python demo.pyThis shows the system working with sample data.
Edit the .env file to customize:
# Capture settings
CAPTURE_INTERVAL=300 # Capture every 5 minutes
CAPTURE_ACTIVE_HOURS_START=8 # Start at 8 AM
CAPTURE_ACTIVE_HOURS_END=22 # Stop at 10 PM
# AI settings
OPENAI_MODEL=gpt-3.5-turbo # Use GPT-3.5 for faster responses
CONFIDENCE_THRESHOLD=0.5 # Object detection confidenceBuilding this taught me a lot about:
- Hybrid AI approaches - Local processing for speed, cloud for intelligence
- Privacy considerations - Keeping sensitive data local while using cloud AI
- User experience - Natural language makes AI much more accessible
- Rapid prototyping - Streamlit is amazing for quick demos
If I continue this project, I'd like to add:
- Auto-capture scheduling based on activity
- Cloud storage for backup and sharing
- Mobile app companion
- Voice queries for hands-free use
- Activity prediction to anticipate what you might be looking for
- Make sure your webcam is connected and accessible
- On macOS, you'll need to grant camera permissions
- Try different camera index in
.envfile
- Verify your API key is correct
- Check your OpenAI account has credits
- Ensure you're using a supported model
- Use
yolov8n.pt(nano) model for faster processing - Reduce capture frequency in
.env - Close other applications using the webcam
This is a personal project, but I'm open to ideas and improvements! Feel free to:
- Report bugs
- Suggest new features
- Submit pull requests
MIT License - feel free to use this code for your own projects.
- Start Capture: The system will automatically begin capturing images every 5 minutes
- Query Memories: Use natural language to search through your visual memories
- Browse Gallery: View all captured images with timestamps and descriptions
- Project setup and dependencies
- Image capture system
- Object detection pipeline
- Memory storage system
- Natural language query processing
- Web interface
- Demo and testing
MIT License
97c227aa902e1220da5301ba17c5e8a868905970