TubeSage is an LLM-based app that understands YouTube videos and answers questions about them.
- YouTube Video Transcription Tool: Converts video content into text.
- Text Embedding: Uses the Ollama server with Llama3.1 to embed the video transcription.
- Vector Storage: Stores the embedded vectors in the Chroma vector database.
- Q&A System: Utilizes LangChain with a custom prompt and RAG injection from the Chroma DB to answer questions about the video.
- Web Interface: Provides a Streamlit chat UI.
- Deployment: Dockerizes all components (TubeSage API, Ollama server for the LLM, and Streamlit UI) and runs them with Docker Compose. Use the provided Dockerfiles for deployment.
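Conceptually, the pipeline behind these features can be sketched end to end. The snippet below is a self-contained illustration, not TubeSage's actual code: the toy bag-of-words `embed` function stands in for the Llama3.1 embeddings served by Ollama, and `ToyVectorStore` stands in for Chroma; the real app wires these stages together through LangChain.

```python
import math
from collections import Counter

def embed(text):
    """Toy bag-of-words embedding (stand-in for Llama3.1 embeddings via Ollama)."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two sparse bag-of-words vectors."""
    dot = sum(count * b[token] for token, count in a.items() if token in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

class ToyVectorStore:
    """In-memory stand-in for the Chroma vector database."""
    def __init__(self):
        self._docs = []  # list of (embedding, chunk) pairs

    def add(self, chunks):
        self._docs.extend((embed(chunk), chunk) for chunk in chunks)

    def retrieve(self, query, k=2):
        query_vec = embed(query)
        ranked = sorted(self._docs, key=lambda doc: cosine(query_vec, doc[0]), reverse=True)
        return [chunk for _, chunk in ranked[:k]]

# Transcribe -> embed -> store -> retrieve -> prompt (RAG injection)
store = ToyVectorStore()
store.add([
    "the video explains docker compose deployment",
    "the speaker demos a streamlit chat ui",
])
context = store.retrieve("what ui does the demo use", k=1)[0]
prompt = f"Answer using this transcript excerpt:\n{context}\n\nQuestion: what ui does the demo use"
```

The retrieved chunk with the highest similarity to the question is injected into the prompt, which is what grounds the LLM's answer in the video's transcript.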
To get started with TubeSage, you will need:

- NVIDIA GPU: Required for LLM inference with Ollama.
- Docker & Docker Compose: Ensure both Docker and Docker Compose are installed on your system.
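As a quick sanity check, you can verify that the required binaries are on your PATH before bringing the stack up. This helper is illustrative, not part of TubeSage; note that `nvidia-smi` is only present on machines with the NVIDIA driver installed.

```python
import shutil

def check_prereqs(tools=("docker", "nvidia-smi")):
    """Map each required tool name to whether it is found on PATH."""
    return {tool: shutil.which(tool) is not None for tool in tools}

if __name__ == "__main__":
    for tool, found in check_prereqs().items():
        print(f"{tool}: {'found' if found else 'MISSING'}")
```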
1. Clone the Repository

   ```bash
   git clone https://github.com/guscl/tubesage.git
   cd tubesage
   ```
2. Install Dependencies

   Ensure the NVIDIA Container Toolkit is installed so containers can access the GPU for CUDA support (see the NVIDIA Installation Guide).
3. Run Docker Compose

   Start all services:

   ```bash
   docker compose up
   ```
4. Access the Application

   Open your web browser and navigate to http://localhost:8501 to interact with the TubeSage interface.
5. Transcribe a Video

   Enter a YouTube video URL.
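Transcripts typically come back as many short timed segments, which are merged into larger chunks before embedding. The sketch below assumes youtube-transcript-api-style segments (dicts with `text`, `start`, and `duration` keys); TubeSage's actual transcription wrapper may differ.

```python
def chunk_transcript(segments, max_chars=200):
    """Merge short transcript segments into chunks of at most max_chars characters."""
    chunks, current = [], ""
    for seg in segments:
        text = seg["text"].strip()
        if current and len(current) + len(text) + 1 > max_chars:
            chunks.append(current)
            current = text
        else:
            current = f"{current} {text}".strip()
    if current:
        chunks.append(current)
    return chunks

segments = [
    {"text": "welcome to the video", "start": 0.0, "duration": 2.1},
    {"text": "today we cover docker", "start": 2.1, "duration": 2.4},
]
chunks = chunk_transcript(segments, max_chars=25)
```

Larger chunks give the embedding model more context per vector, at the cost of coarser retrieval granularity.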
6. Ask Questions

   Type questions in the textbox at the bottom of the page. TubeSage will answer based on the transcribed and embedded text.
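Under the hood, answering combines the retrieved transcript chunks with your question through a custom prompt. A minimal sketch of that assembly follows; the template wording here is illustrative, not TubeSage's actual prompt.

```python
# Illustrative RAG prompt template; TubeSage's real prompt wording may differ.
TEMPLATE = (
    "You are answering questions about a YouTube video.\n"
    "Use only the transcript excerpts below; say so if they are insufficient.\n\n"
    "Excerpts:\n{context}\n\nQuestion: {question}\nAnswer:"
)

def build_prompt(chunks, question):
    """Join retrieved transcript chunks and inject them into the template."""
    return TEMPLATE.format(context="\n---\n".join(chunks), question=question)

prompt = build_prompt(["the demo uses streamlit"], "What UI does the demo use?")
```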
Feel free to submit issues or pull requests. Contributions are welcome!
For questions, please contact correia.gustavol@gmail.com or open an issue on the GitHub repository.
- LangChain: The LangChain API is not very well organized; future experiments with LlamaIndex are planned.
- YouTube Transcription API: The transcription library wraps an undocumented YouTube API, which may become unreliable.
- Ollama: Ollama does not yet feel production-ready; future experimentation with vLLM is planned.
