Implement RAG (Retrieval-Augmented Generation) for AI System #21

Description

@Sc-Designs

🔹 Summary

Implement a complete RAG pipeline in the backend: load text data (e.g., instructions extracted from Excel/Word uploads), split it into chunks, convert the chunks into embeddings, store them in a vector store, retrieve the most relevant chunks for a query, and pass them to a LangChain chain that generates the final AI response.

Grounding responses in retrieved chunks lets the AI answer from user-uploaded data instead of relying only on the model's general knowledge.

👤 User Story

As a developer,
I want the AI system to use context from stored documents,
So that the AI responses become accurate and aligned with the user’s uploaded instructions.

📌 Requirements / Scope

  1. Vector Store Setup

Use one of the following (decide in this issue or in a sub-task):

MemoryVectorStore (temporary, dev use)

Faiss, Chroma, Pinecone, or Qdrant (recommended for production)

Create a module:
src/rag/vectorStore.js

Responsibilities:

Initialize vector store

Provide addDocuments(chunks)

Provide getRetriever() for LangChain

Save & load vector store (if supported by backend DB)
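A minimal sketch of how src/rag/vectorStore.js could expose these responsibilities. The factory name `createVectorStore`, the `metadata` argument, and the `chunkIndex` field are hypothetical; the only assumption about the backing store is that it offers `addDocuments(docs)` and `asRetriever(options)`, as LangChain vector stores do. Save and load are left out because they depend on the backend chosen above.

```javascript
// Sketch of src/rag/vectorStore.js. The backing store is injected so that
// MemoryVectorStore, Faiss, Chroma, etc. can be swapped without touching callers.
// Assumption: `store` exposes addDocuments(docs) and asRetriever(options).
function createVectorStore(store) {
  if (!store) throw new Error("A vector store instance is required");

  return {
    // Wrap plain text chunks as { pageContent, metadata } documents and add them.
    async addDocuments(chunks, metadata = {}) {
      const docs = chunks.map((text, index) => ({
        pageContent: text,
        metadata: { ...metadata, chunkIndex: index },
      }));
      await store.addDocuments(docs);
      return docs.length;
    },

    // Expose a retriever that returns the top-k chunks for a query.
    getRetriever(k = 5) {
      return store.asRetriever({ k });
    },
  };
}
```

Injecting the store keeps the choice between a dev store and a production store out of the calling code.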

  2. Document Loader + Chunking

Load text from database (instructions saved from Excel/Word upload).

Break long text into chunks using:

RecursiveCharacterTextSplitter

File:
src/rag/documentLoader.js

Required functions:

loadInstructionsFromDB()

chunkDocuments(text)
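To illustrate the chunking idea, here is a simplified, dependency-free stand-in for `chunkDocuments(text)`. It is not the LangChain `RecursiveCharacterTextSplitter` itself: it applies no chunk overlap, and the `chunkSize` default and separator list are assumptions. `loadInstructionsFromDB()` is not sketched because it depends on the database layer.

```javascript
// Simplified stand-in for recursive character splitting.
// Coarse separators are tried first; finer ones are used only for pieces
// that are still longer than chunkSize. No chunk overlap is applied.
function chunkDocuments(text, chunkSize = 500, separators = ["\n\n", "\n", " ", ""]) {
  const clean = (parts) => parts.map((p) => p.trim()).filter((p) => p.length > 0);

  if (typeof text !== "string" || text.trim() === "") return [];
  if (text.length <= chunkSize) return clean([text]);

  const [separator, ...finer] = separators;
  // With the empty separator (or none left), fall back to fixed-size slices.
  if (separator === undefined || separator === "") {
    const slices = [];
    for (let i = 0; i < text.length; i += chunkSize) {
      slices.push(text.slice(i, i + chunkSize));
    }
    return clean(slices);
  }

  const chunks = [];
  let current = "";
  for (const piece of text.split(separator)) {
    const candidate = current ? current + separator + piece : piece;
    if (candidate.length <= chunkSize) {
      current = candidate; // the piece still fits in the current chunk
      continue;
    }
    if (current) chunks.push(current);
    current = "";
    if (piece.length <= chunkSize) {
      current = piece;
    } else {
      // The piece alone is too long: split it with the next, finer separator.
      chunks.push(...chunkDocuments(piece, chunkSize, finer));
    }
  }
  if (current) chunks.push(current);
  return clean(chunks);
}
```

Splitting on paragraph breaks before falling back to lines, words, and characters tends to keep related sentences in the same chunk.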

  3. Embeddings

Use @langchain/openai embeddings:

new OpenAIEmbeddings({ model: "text-embedding-3-small" })

Embed each chunk and store it in the vector store.
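The embedding step could be sketched as below. The helper name `embedChunks` and the `batchSize` default are hypothetical; the only assumption about `embeddings` is that it exposes `embedDocuments(texts)` resolving to one vector per input text, as LangChain embedding classes do. LangChain vector stores built with an embeddings instance generally call it internally when documents are added, so an explicit step like this may only be needed for a custom store.

```javascript
// Embed chunks in batches and pair each chunk with its vector.
// Assumption: embeddings.embedDocuments(texts) => Promise<number[][]>.
async function embedChunks(embeddings, chunks, batchSize = 100) {
  const records = [];
  for (let start = 0; start < chunks.length; start += batchSize) {
    const batch = chunks.slice(start, start + batchSize);
    const vectors = await embeddings.embedDocuments(batch);
    // Guard against a partial response so texts and vectors stay aligned.
    if (vectors.length !== batch.length) {
      throw new Error(`Expected ${batch.length} embeddings, received ${vectors.length}`);
    }
    batch.forEach((text, i) => records.push({ text, vector: vectors[i] }));
  }
  return records;
}
```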

  4. Retriever Setup

Convert the vector store into a retriever:

const retriever = vectorStore.asRetriever({
  k: 5,  // top relevant chunks
});
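
For intuition about what `k: 5` selects, the following sketch ranks stored chunks by cosine similarity to a query vector and keeps the best k. It is an illustration only: `cosineSimilarity`, `topK`, and the `{ text, vector }` record shape are hypothetical, and a real vector store may use a different metric or an approximate index.

```javascript
// Cosine similarity between two equal-length numeric vectors.
function cosineSimilarity(a, b) {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) return 0; // avoid division by zero
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Score every record against the query and keep the k highest-scoring ones.
function topK(queryVector, records, k = 5) {
  return records
    .map((record) => ({ ...record, score: cosineSimilarity(queryVector, record.vector) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, k);
}
```
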
  5. RAG Chain

Create final RAG pipeline using RunnableSequence:

Steps:

Retrieve relevant context

Format prompt

Call the LLM

Return response with used context

File:
src/rag/ragChain.js
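
The four steps above can be sketched without LangChain to show the data flow. This is a plain async stand-in, not a `RunnableSequence`; `createRagChain`, `formatPrompt`, and the prompt wording are hypothetical. It assumes `retriever.invoke(query)` resolves to documents with a `pageContent` field and `llm.invoke(prompt)` resolves to a string or to a message object with a `content` field.

```javascript
// Build a prompt that lists the retrieved chunks before the question.
function formatPrompt(query, docs) {
  const context = docs.map((doc, i) => `[${i + 1}] ${doc.pageContent}`).join("\n\n");
  return [
    "Answer the question using only the context below.",
    "If the context does not contain the answer, say so.",
    "",
    "Context:",
    context,
    "",
    `Question: ${query}`,
  ].join("\n");
}

// Retrieve -> format prompt -> call LLM -> return answer with the used context.
function createRagChain({ retriever, llm }) {
  return {
    async invoke(query) {
      const docs = await retriever.invoke(query);
      const prompt = formatPrompt(query, docs);
      const result = await llm.invoke(prompt);
      const answer = typeof result === "string" ? result : result.content;
      return { answer, context: docs.map((doc) => doc.pageContent) };
    },
  };
}
```

Returning the context alongside the answer makes it possible to check which chunks influenced a response.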

  6. API Endpoint

Create backend endpoint for querying RAG:

POST /api/ai/rag

Payload:

{
  "query": "How should the AI behave when a user uploads a file?"
}

Backend:

Load retriever

Get context

Pass to RAG chain

Send answer
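
A framework-agnostic sketch of the request handling for POST /api/ai/rag follows. The function name `handleRagRequest`, the error messages, and the `{ answer, context }` response shape are assumptions; `ragChain` is assumed to expose `invoke(query)` resolving to `{ answer, context }`.

```javascript
// Takes the parsed JSON body and a RAG chain; returns the HTTP status and
// JSON body to send, so the logic can be tested without a web framework.
async function handleRagRequest(body, ragChain) {
  const query = body && typeof body.query === "string" ? body.query.trim() : "";
  if (!query) {
    return { status: 400, body: { error: "Field 'query' must be a non-empty string" } };
  }
  try {
    const { answer, context } = await ragChain.invoke(query);
    return { status: 200, body: { answer, context } };
  } catch (err) {
    // Do not leak internal error details to the client.
    return { status: 500, body: { error: "RAG query failed" } };
  }
}

// Possible wiring in an Express router (assumes express.json() is enabled):
//   router.post("/api/ai/rag", async (req, res) => {
//     const result = await handleRagRequest(req.body, ragChain);
//     res.status(result.status).json(result.body);
//   });
```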

🔍 Expected Flow

User uploads Excel/Word → text is extracted → saved in DB.

RAG ingestion job runs:

Loads text

Splits into chunks

Embeds chunks

Saves to vector store

When user queries:

System retrieves top relevant chunks

Sends context to LLM

Returns final generated answer
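
The ingestion job in this flow could be orchestrated roughly as follows. The name `runIngestion` and its dependency-injected shape are hypothetical, and `loadInstructionsFromDB()` is assumed to resolve to a single string. The explicit errors cover the missing-data cases rather than silently storing nothing.

```javascript
// Load text -> split into chunks -> hand chunks to the vector store module.
// Dependencies are injected so each step can be replaced or tested on its own.
async function runIngestion({ loadInstructionsFromDB, chunkDocuments, vectorStore }) {
  const text = await loadInstructionsFromDB();
  if (typeof text !== "string" || text.trim() === "") {
    throw new Error("No instruction text found in the database");
  }

  const chunks = chunkDocuments(text);
  if (chunks.length === 0) {
    throw new Error("Chunking produced no chunks");
  }

  // Embedding is assumed to happen inside the vector store's addDocuments.
  await vectorStore.addDocuments(chunks);
  return { chunkCount: chunks.length };
}
```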

✅ Acceptance Criteria

Document loader correctly retrieves and splits text

Embeddings are generated without errors

Vector store stores and retrieves relevant chunks

RAG chain returns meaningful answers using context

API endpoint returns AI answers with applied instruction context

Errors are handled for missing DB data or missing embeddings

Code is modular and lives in src/rag/* structure

💡 Optional Enhancements

Add cron job to refresh embeddings when instructions update

Cache retriever for performance

Add context visibility in API response (debug mode)

Store vector DB in Redis or file system

Enable versioning of embeddings set

Labels: enhancement (new feature or request), help wanted (extra attention is needed), question (further information is requested), wocs