Variable RAG DB setup for query, query v2, and streaming query#908
Variable RAG DB setup for query, query v2, and streaming query#908tisnik merged 1 commit intolightspeed-core:mainfrom
Conversation
Signed-off-by: Lucas <lyoon@redhat.com>
WalkthroughThe changes extend the system to support selective vector store querying in RAG operations. A new optional Changes
Estimated code review effort🎯 2 (Simple) | ⏱️ ~10 minutes
Pre-merge checks and finishing touches✅ Passed checks (3 passed)
✨ Finishing touches
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
|
Hi @JslYoon. Thanks for your PR. I'm waiting for a github.com member to verify that this patch is reasonable to test. If it is, they should reply with Once the patch is verified, the new status will be reflected by the I understand the commands that are listed here. DetailsInstructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. |
There was a problem hiding this comment.
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
src/app/endpoints/streaming_query.py (1)
1091-1104: Handle empty list edge case consistently across all endpoints.This endpoint has the same empty list ambiguity as
query.pyandquery_v2.py. For a unified user experience, all three query endpoints should handlevector_store_idsidentically:
None→ fetch all available stores[]→ ??? (currently treated as "use no stores", but should this fetch all instead?)["id1", "id2"]→ use specified storesRecommend normalizing the behavior across all endpoints (query.py, query_v2.py, and streaming_query.py) by either:
- Treating empty list as "fetch all" (normalize at model validation level), or
- Treating empty list as "disable RAG" and documenting it clearly
- # Use specified vector stores or fetch all available ones - if query_request.vector_store_ids: - vector_db_ids = query_request.vector_store_ids - else: - vector_db_ids = [ - vector_store.id - for vector_store in (await client.vector_stores.list()).data - ] + # Use specified vector stores or fetch all available ones + if query_request.vector_store_ids is not None and len(query_request.vector_store_ids) > 0: + vector_db_ids = query_request.vector_store_ids + else: + # None or empty list means fetch all available stores + vector_db_ids = [ + vector_store.id + for vector_store in (await client.vector_stores.list()).data + ]
🧹 Nitpick comments (1)
src/models/requests.py (1)
163-168: Clarify empty list vs None semantics and consider validation.The field uses truthiness checks in endpoints (
if query_request.vector_store_ids:), so an empty list[]would be treated as "use these specific stores" (resulting in no RAG tools), whileNonemeans "fetch all available stores". This distinction is subtle and not documented.Consider:
- Adding a
@field_validatorto normalize empty lists toNoneif they should mean "use all stores"- Validating that list items are non-empty strings
- Documenting the empty list behavior in the field description
@field_validator("vector_store_ids") @classmethod def validate_vector_store_ids( cls, value: Optional[list[str]] ) -> Optional[list[str]]: """ Validate and normalize vector store IDs. Parameters: value: List of vector store IDs or None Returns: Normalized list or None Raises: ValueError: If list contains empty strings """ if value is None: return value # Normalize empty list to None (fetch all stores) if len(value) == 0: return None # Validate no empty strings if any(not id.strip() for id in value): raise ValueError("vector_store_ids cannot contain empty strings") return valueAs per coding guidelines, Pydantic models should use
@field_validatorfor custom validation.
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (4)
src/app/endpoints/query.py(1 hunks)src/app/endpoints/query_v2.py(1 hunks)src/app/endpoints/streaming_query.py(1 hunks)src/models/requests.py(3 hunks)
🧰 Additional context used
📓 Path-based instructions (4)
src/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
src/**/*.py: Use absolute imports for internal modules in LCS project (e.g.,from auth import get_auth_dependency)
All modules must start with descriptive docstrings explaining their purpose
Uselogger = logging.getLogger(__name__)pattern for module logging
All functions must include complete type annotations for parameters and return types, using modern syntax (str | int) andOptional[Type]orType | None
All functions must have docstrings with brief descriptions following Google Python docstring conventions
Function names must use snake_case with descriptive, action-oriented names (get_, validate_, check_)
Avoid in-place parameter modification anti-patterns; return new data structures instead of modifying input parameters
Useasync deffor I/O operations and external API calls
All classes must include descriptive docstrings explaining their purpose following Google Python docstring conventions
Class names must use PascalCase with descriptive names and standard suffixes:Configurationfor config classes,Error/Exceptionfor exceptions,Resolverfor strategy patterns,Interfacefor abstract base classes
Abstract classes must use ABC with@abstractmethoddecorators
Include complete type annotations for all class attributes in Python classes
Useimport loggingand module logger pattern with standard log levels: debug, info, warning, error
Files:
src/models/requests.pysrc/app/endpoints/streaming_query.pysrc/app/endpoints/query_v2.pysrc/app/endpoints/query.py
src/models/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
src/models/**/*.py: Use@field_validatorand@model_validatorfor custom validation in Pydantic models
Pydantic configuration classes must extendConfigurationBase; data models must extendBaseModel
Files:
src/models/requests.py
src/app/endpoints/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
Use FastAPI
HTTPExceptionwith appropriate status codes for API endpoint error handling
Files:
src/app/endpoints/streaming_query.pysrc/app/endpoints/query_v2.pysrc/app/endpoints/query.py
src/**/{client,app/endpoints/**}.py
📄 CodeRabbit inference engine (CLAUDE.md)
Handle
APIConnectionErrorfrom Llama Stack in integration code
Files:
src/app/endpoints/streaming_query.pysrc/app/endpoints/query_v2.pysrc/app/endpoints/query.py
🔇 Additional comments (1)
src/app/endpoints/query.py (1)
731-744: [Rewritten review comment]
[Classification tag]
| # Get vector stores for RAG tools - use specified ones or fetch all | ||
| if query_request.vector_store_ids: | ||
| vector_store_ids = query_request.vector_store_ids | ||
| else: | ||
| vector_store_ids = [ | ||
| vector_store.id for vector_store in (await client.vector_stores.list()).data | ||
| ] |
There was a problem hiding this comment.
Handle empty list edge case consistently.
The same empty list ambiguity exists here as in query.py: if query_request.vector_store_ids: evaluates to False for both None and [], but only None is falsy. An empty list would skip the fetch-all-stores branch, potentially disabling RAG unintentionally.
For consistency across endpoints and clarity, consider explicit handling:
- # Get vector stores for RAG tools - use specified ones or fetch all
- if query_request.vector_store_ids:
- vector_store_ids = query_request.vector_store_ids
- else:
- vector_store_ids = [
- vector_store.id for vector_store in (await client.vector_stores.list()).data
- ]
+ # Get vector stores for RAG tools - use specified ones or fetch all
+ if query_request.vector_store_ids is not None and len(query_request.vector_store_ids) > 0:
+ vector_store_ids = query_request.vector_store_ids
+ else:
+ # None or empty list means fetch all available stores
+ vector_store_ids = [
+ vector_store.id for vector_store in (await client.vector_stores.list()).data
+ ]📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| # Get vector stores for RAG tools - use specified ones or fetch all | |
| if query_request.vector_store_ids: | |
| vector_store_ids = query_request.vector_store_ids | |
| else: | |
| vector_store_ids = [ | |
| vector_store.id for vector_store in (await client.vector_stores.list()).data | |
| ] | |
| # Get vector stores for RAG tools - use specified ones or fetch all | |
| if query_request.vector_store_ids is not None and len(query_request.vector_store_ids) > 0: | |
| vector_store_ids = query_request.vector_store_ids | |
| else: | |
| # None or empty list means fetch all available stores | |
| vector_store_ids = [ | |
| vector_store.id for vector_store in (await client.vector_stores.list()).data | |
| ] |
🤖 Prompt for AI Agents
In src/app/endpoints/query_v2.py around lines 746 to 752, the current condition
treats both None and empty list as falsy so an empty list unintentionally
triggers fetching all stores; change the check to explicitly test for None
(e.g., if query_request.vector_store_ids is not None) so that a provided empty
list is preserved (disables RAG) while only None causes the code to fetch all
vector stores; update assignment accordingly and ensure any downstream code
expects an empty list vs populated list.
|
/ok-to-test |
Description
Currently, the RAG implementation lacks granularity in retrieval. To improve response accuracy and support multi-tenant or multi-context use cases, we need the ability to dynamically select which knowledge base (Vector DB/Index) to query.
Added a new request param vector_store_ids for:
query, query_v2, and streaming_query endpoints
Type of change
Tools used to create PR
Identify any AI code assistants used in this PR (for transparency and review context)
Related Tickets & Documents
Checklist before requesting a review
Testing
Summary by CodeRabbit
✏️ Tip: You can customize this high-level summary in your review settings.