A semantic search engine for CSV data using sentence transformers and FAISS for fast similarity search.
- Semantic Search: Uses sentence transformers to understand the meaning behind your queries
- Fast Search: FAISS indexing for fast vector similarity search
- CLI Interface: Command-line tool for quick searches
- Web Interface: Streamlit web app with interactive search, similarity scores, and downloadable results
- Easy Setup: Simple installation and setup process
First, create a virtual environment (recommended):
# Create virtual environment
python -m venv venv
# Activate it
# On Windows:
venv\Scripts\activate
# On Mac/Linux:
source venv/bin/activate
Install the required packages:
pip install -r requirements.txt
Place your CSV file in the data/ folder as workflows.csv. Make sure it has at least these columns:
- workflow: Name or identifier for each item
- description: Text description to search through
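Before building the index, it can help to confirm that the CSV header contains the required columns. The following is a minimal sketch using only the standard library; the helper name `missing_columns` is hypothetical and not part of this project.

```python
import csv

# Column names the README asks for.
REQUIRED_COLUMNS = {"workflow", "description"}


def missing_columns(csv_path):
    """Return the required column names absent from the CSV header, sorted."""
    # utf-8-sig also handles files saved with a byte-order mark (common from Excel).
    with open(csv_path, newline="", encoding="utf-8-sig") as f:
        header = next(csv.reader(f), [])
    present = {name.strip() for name in header}
    return sorted(REQUIRED_COLUMNS - present)
```

For example, `missing_columns("data/workflows.csv")` returns an empty list when both columns are present, assuming it is called from the project root.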
Navigate to the src/ folder and run:
cd src
python build_index.py
This will:
- Load the CSV data
- Generate semantic embeddings for all descriptions
- Build a FAISS index for fast searching
- Save everything to the embeddings/ folder
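The steps above might look roughly like the sketch below. It is not the actual contents of build_index.py: the model name `all-MiniLM-L6-v2`, the exact-search index type `IndexFlatIP`, and the output file names `index.faiss` and `names.txt` are all assumptions chosen for illustration.

```python
import csv
from pathlib import Path


def load_rows(csv_path):
    """Read workflow names and descriptions from the CSV."""
    with open(csv_path, newline="", encoding="utf-8-sig") as f:
        rows = list(csv.DictReader(f))
    names = [row["workflow"] for row in rows]
    # Treat missing descriptions as empty strings so encoding does not fail.
    descriptions = [row["description"] or "" for row in rows]
    return names, descriptions


def build_index(csv_path, out_dir, model_name="all-MiniLM-L6-v2"):
    """Embed every description and save a FAISS index plus the workflow names."""
    # Third-party imports are local so load_rows works without them installed.
    import faiss
    import numpy as np
    from sentence_transformers import SentenceTransformer

    names, descriptions = load_rows(csv_path)
    model = SentenceTransformer(model_name)  # assumed model, not confirmed by the project
    # Unit-length vectors make the inner product equal to cosine similarity.
    embeddings = model.encode(
        descriptions, convert_to_numpy=True, normalize_embeddings=True
    )
    embeddings = np.asarray(embeddings, dtype="float32")  # FAISS expects float32

    index = faiss.IndexFlatIP(embeddings.shape[1])  # exact inner-product search
    index.add(embeddings)

    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    faiss.write_index(index, str(out / "index.faiss"))  # hypothetical file name
    (out / "names.txt").write_text("\n".join(names), encoding="utf-8")  # hypothetical
    return index
```

A flat index compares the query against every stored vector, which is usually fine for a single CSV file; approximate index types only start to pay off on much larger datasets.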
For quick searches from the terminal:
cd src
python search.py
Then enter your search queries interactively. Examples:
- "email automation"
- "data scraping"
- "notifications"
- "file backup"
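Under the hood, answering a query like the ones above typically means embedding the query text and asking the index for its nearest neighbours. The sketch below shows that step under stated assumptions: `search` is a hypothetical helper, `model` is a loaded `SentenceTransformer`, `index` is a FAISS index built from normalized embeddings, and `names` lists the workflow names in the same order as the indexed rows.

```python
def search(query, model, index, names, k=5):
    """Return up to k (workflow name, similarity score) pairs for a query."""
    # Encode the query the same way the descriptions were encoded.
    query_vec = model.encode(
        [query], convert_to_numpy=True, normalize_embeddings=True
    )
    # FAISS returns two arrays of shape (1, k): scores and row ids.
    scores, ids = index.search(query_vec, k)
    results = []
    for score, row_id in zip(scores[0], ids[0]):
        if row_id == -1:  # FAISS pads with -1 when fewer than k rows exist
            continue
        results.append((names[int(row_id)], float(score)))
    return results
```

With an inner-product index over unit-length vectors, each score is a cosine similarity, so higher values mean a closer semantic match.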
For a web interface with more features:
cd src
streamlit run app.py
This will open a web browser with:
- Interactive search box
- Similarity scores
- Example queries
- Downloadable results
- Dataset statistics
This project is open source and available under the MIT License.
Feel free to submit issues, feature requests, or pull requests to improve this search engine!


