🔹 Summary
Implement a complete retrieval-augmented generation (RAG) pipeline that lets the backend load text data (e.g., instructions from Excel/Word uploads), convert it into embeddings, store them in a vector store, retrieve the most relevant chunks, and pass them to a LangChain chain that generates the final AI response.
This grounds the AI's answers in user-uploaded data, making them more accurate and context-aware.
👤 User Story
As a developer,
I want the AI system to use context from stored documents,
So that AI responses are accurate and aligned with the user’s uploaded instructions.
📌 Requirements / Scope
- Vector Store Setup
Use one of the following (decide in this issue or create a sub-task):
MemoryVectorStore (temporary, dev use)
Faiss, Chroma, Pinecone, or Qdrant (recommended for production)
Create a module:
src/rag/vectorStore.js
Responsibilities:
Initialize vector store
Provide addDocuments(chunks)
Provide getRetriever() for LangChain
Save & load vector store (if supported by backend DB)
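The responsibilities above could be sketched roughly as follows. This is a minimal in-memory illustration, not LangChain's API: `createVectorStore`, `cosineSimilarity`, `serialize`, `restore`, and the injected `embed` function are hypothetical names, and the retriever shape (`retrieve(query)` returning `{ text, score }` objects) is an assumption. In the real module, the chosen LangChain vector store and its embeddings class would take their place.

```javascript
// Hypothetical sketch of src/rag/vectorStore.js with an in-memory backend.
// `embed` (text -> Promise<number[]>) is injected so the store stays provider-agnostic.

function cosineSimilarity(a, b) {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) return 0; // avoid dividing by zero
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

function createVectorStore(embed) {
  const entries = []; // each entry: { text, vector }

  return {
    // Embed each chunk and keep it alongside its vector.
    async addDocuments(chunks) {
      for (const text of chunks) {
        entries.push({ text, vector: await embed(text) });
      }
    },

    // Return a retriever that yields the k most similar chunks for a query.
    getRetriever(k = 5) {
      return {
        async retrieve(query) {
          const queryVector = await embed(query);
          return entries
            .map((e) => ({ text: e.text, score: cosineSimilarity(queryVector, e.vector) }))
            .sort((a, b) => b.score - a.score)
            .slice(0, k);
        },
      };
    },

    // Save and load the store contents as a JSON string.
    serialize() {
      return JSON.stringify(entries);
    },
    restore(json) {
      entries.length = 0;
      entries.push(...JSON.parse(json));
    },
  };
}
```

Injecting `embed` keeps the sketch testable without network access; swapping the backend later should only change this one module.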
- Document Loader + Chunking
Load text from database (instructions saved from Excel/Word upload).
Break long text into chunks using:
RecursiveCharacterTextSplitter
File:
src/rag/documentLoader.js
Required functions:
loadInstructionsFromDB()
chunkDocuments(text)
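To illustrate what the chunking step does, here is a simplified stand-in for recursive character splitting. It is not the `RecursiveCharacterTextSplitter` implementation: it has no chunk overlap, and the separator list and the default `chunkSize` of 500 are assumptions. The real module would call the LangChain splitter instead.

```javascript
// Simplified recursive character splitting (no overlap), for illustration only.
// Coarse separators are tried first; finer ones are used only for oversized parts.
const SEPARATORS = ["\n\n", "\n", " ", ""];

function chunkDocuments(text, chunkSize = 500, separators = SEPARATORS) {
  if (text.length <= chunkSize) return text.length > 0 ? [text] : [];

  const [separator, ...finer] = separators;

  // Last resort (empty or missing separator): cut at fixed character offsets.
  if (separator === "" || separator === undefined) {
    const pieces = [];
    for (let i = 0; i < text.length; i += chunkSize) {
      pieces.push(text.slice(i, i + chunkSize));
    }
    return pieces;
  }

  const chunks = [];
  let current = "";
  for (const part of text.split(separator)) {
    if (part.length > chunkSize) {
      // This part alone is too long: flush what we have, then split it further.
      if (current) {
        chunks.push(current);
        current = "";
      }
      chunks.push(...chunkDocuments(part, chunkSize, finer));
      continue;
    }
    // Greedily merge neighbouring parts while they still fit in one chunk.
    const candidate = current ? current + separator + part : part;
    if (candidate.length <= chunkSize) {
      current = candidate;
    } else {
      chunks.push(current);
      current = part;
    }
  }
  if (current) chunks.push(current);
  return chunks;
}
```

Trying paragraph breaks before line breaks and spaces tends to keep related sentences in the same chunk, which is the main reason to prefer a recursive splitter over fixed-size slicing.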
- Embeddings
Use @langchain/openai embeddings:
import { OpenAIEmbeddings } from "@langchain/openai";
const embeddings = new OpenAIEmbeddings({ model: "text-embedding-3-small" });
Embed each chunk and store in vector store.
- Retriever Setup
Convert vector store into retriever:
const retriever = vectorStore.asRetriever({
  k: 5, // return the 5 most relevant chunks
});
- RAG Chain
Create final RAG pipeline using RunnableSequence:
Steps:
Retrieve relevant context
Format prompt
Call the LLM
Return the response together with the context used
File:
src/rag/ragChain.js
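The four steps can be sketched with plain async functions. This deliberately uses ordinary function composition rather than `RunnableSequence`, so it runs without LangChain installed. `createRagChain`, `formatPrompt`, the `retriever.retrieve(query)` shape (resolving to objects with a `text` field), the `llm(prompt)` callable, and the prompt wording are all assumptions made for illustration.

```javascript
// Hypothetical sketch of src/rag/ragChain.js as plain async steps.
// `retriever` and `llm` are injected; their shapes are assumptions.

function formatPrompt(query, contextChunks) {
  // Number the chunks so the model (and a debugging human) can refer to them.
  const context = contextChunks.map((c, i) => `[${i + 1}] ${c}`).join("\n");
  return [
    "Answer the question using only the context below.",
    "",
    "Context:",
    context,
    "",
    `Question: ${query}`,
  ].join("\n");
}

function createRagChain({ retriever, llm }) {
  return async function invoke(query) {
    // 1. Retrieve relevant context.
    const hits = await retriever.retrieve(query);
    const context = hits.map((h) => h.text);
    // 2. Format the prompt.
    const prompt = formatPrompt(query, context);
    // 3. Call the LLM.
    const answer = await llm(prompt);
    // 4. Return the response together with the context used.
    return { answer, context };
  };
}
```

Returning `context` next to `answer` makes it straightforward to expose the retrieved chunks later, for example behind a debug flag.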
- API Endpoint
Create backend endpoint for querying RAG:
POST /api/ai/rag
Payload:
{
  "query": "How should the AI behave when a user uploads a file?"
}
Backend:
Load retriever
Get context
Pass to RAG chain
Send answer
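The backend steps for `POST /api/ai/rag` might look like the framework-agnostic handler below. `handleRagRequest`, the `{ status, json }` return shape, the status codes, and the error messages are assumptions, and `ragChain` is assumed to be an async function that takes the query and resolves to an object with an `answer` field. A route in the project's web framework would call this and write the result to the response.

```javascript
// Hypothetical handler logic for POST /api/ai/rag.
// `body` is the parsed JSON payload; `ragChain(query)` is an injected async function.
async function handleRagRequest(body, ragChain) {
  // Validate the payload before touching the retriever or the LLM.
  const query = body && typeof body.query === "string" ? body.query.trim() : "";
  if (!query) {
    return { status: 400, json: { error: "`query` must be a non-empty string" } };
  }

  try {
    // Retrieval, prompt formatting, and the LLM call all happen inside the chain.
    const { answer } = await ragChain(query);
    return { status: 200, json: { answer } };
  } catch (err) {
    // Covers failures such as missing DB data or missing embeddings.
    return { status: 500, json: { error: "RAG pipeline failed" } };
  }
}
```

Keeping the logic separate from the route definition lets it be tested with a stubbed chain and no running server.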
🔍 Expected Flow
User uploads Excel/Word → text is extracted → saved in DB.
RAG ingestion job runs:
Loads text
Splits into chunks
Embeds chunks
Saves to vector store
When user queries:
System retrieves top relevant chunks
Sends context to LLM
Returns final generated answer
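The ingestion half of this flow could be wired together as below. `runIngestion` and its return value are hypothetical; the three dependencies are passed in so the sketch makes no assumption about the database or the vector store backend beyond the function names already listed in this issue.

```javascript
// Hypothetical ingestion job: load text, split it, embed and save the chunks.
// Dependencies are injected: loadInstructionsFromDB(), chunkDocuments(text),
// and a vectorStore exposing addDocuments(chunks).
async function runIngestion({ loadInstructionsFromDB, chunkDocuments, vectorStore }) {
  // 1. Load the text saved from the Excel/Word upload.
  const text = await loadInstructionsFromDB();
  if (!text || text.trim() === "") {
    // Fail loudly instead of building an empty index.
    throw new Error("No instruction text found in the database");
  }

  // 2. Split the text into chunks.
  const chunks = chunkDocuments(text);

  // 3 and 4. Embedding and saving both happen inside the vector store.
  await vectorStore.addDocuments(chunks);

  return { chunkCount: chunks.length };
}
```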
✅ Acceptance Criteria
Document loader correctly retrieves and splits text
Embeddings are generated without errors
Vector store stores and retrieves relevant chunks
RAG chain returns meaningful answers using context
API endpoint returns AI answers with applied instruction context
Errors are handled for missing DB data or missing embeddings
Code is modular and lives in src/rag/* structure
💡 Optional Enhancements
Add a cron job to refresh embeddings when instructions are updated
Cache retriever for performance
Add context visibility in API response (debug mode)
Store vector DB in Redis or file system
Enable versioning of the embedding set