
LCORE-302: User Data Collection#178

Closed
jrobertboos wants to merge 129 commits into lightspeed-core:main from jrobertboos:lcore-302

Conversation


@jrobertboos jrobertboos commented Jul 3, 2025

Description

This pull request implements support for periodically delivering user chat transcripts and feedback to an ingress server.

Type of change

  • Refactor
  • New feature
  • Bug fix
  • CVE fix
  • Optimization
  • Documentation Update
  • Configuration Update
  • Bump-up service version
  • Bump-up dependent library
  • Bump-up library or tool used for development (does not change the final image)
  • CI configuration change
  • Konflux configuration change
  • Unit tests improvement
  • Integration tests improvement
  • End to end tests improvement

Related Tickets & Documents

Checklist before requesting a review

  • I have performed a self-review of my code.
  • PR has passed all pre-merge test jobs.
  • If it is a core feature, I have added thorough tests.

Testing

  • Please provide detailed steps to perform tests related to this code change.
  • How were the fix/results from this change verified? Please provide relevant screenshots or results.

Summary by CodeRabbit

  • New Features

    • Introduced a configurable data collector service that periodically collects and securely sends user feedback and transcripts.
    • Added command-line option and Makefile target to run the data collector service independently.
    • Added authentication modules supporting no-op, token-based, and Kubernetes RBAC authentication for API endpoints.
    • Implemented session-persistent agent caching for query and streaming query endpoints to enhance performance.
    • Enhanced system prompt customization with configuration options and validation.
    • Added CORS middleware to support cross-origin requests.
    • Extended API with an authorized endpoint providing user identity information.
    • Added GitHub Actions workflows for multi-architecture container image building and publishing.
    • Updated documentation with detailed usage, data collector service description, system prompt customization, and publishing instructions.
  • Bug Fixes

    • Improved configuration loading validation with clear error responses.
    • Fixed streaming query metadata extraction and referenced document reporting.
  • Tests

    • Added comprehensive unit tests covering configuration validation, data collector service, authentication modules, endpoint handlers, agent caching, and utility functions.
    • Updated existing tests to reflect new authentication, client management, and endpoint behaviors.
  • Chores

    • Updated development dependencies for improved type checking support.
    • Refactored client initialization to singleton holders enforcing explicit loading.
    • Replaced deprecated client API calls with updated interfaces.
    • Cleaned up and enhanced Makefile targets for linting, packaging, and uploading.
    • Replaced deprecated Docker CMD with ENTRYPOINT for container runtime.

jrobertboos and others added 2 commits July 3, 2025 09:48
- Introduced DataCollectorConfiguration model to manage data collection settings.
- Updated lightspeed-stack.yaml to include data collector configuration options.
- Integrated data collector service startup and shutdown in main application flow.
…core#181)

* LCORE-311: build dev image from main with tag dev-latest

* LCORE-311: build dev image from main with tag dev-latest
            raise ValueError("Ingress server URL is not configured")

        headers = {
            "Content-Type": "application/vnd.redhat.lightspeed-stack.periodic+tar",

I think the service part, lightspeed-stack, needs to be empty. That part is what distinguishes which Lightspeed product the data comes from, e.g. application/vnd.redhat.openshift-lightspeed.periodic+tar or application/vnd.redhat.openstack-lightspeed.periodic+tar.
The default can be there, but there won't be any listeners for it on the RH side, so it seems a bit pointless. Leaving this blank enforces the need to have the service defined for ingress before trying to send the data.
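The MIME-type scheme discussed here can be sketched as a small helper. Nothing below is the PR's actual code: the function name and the rule that an empty service name is rejected are assumptions drawn from this comment.

```python
# Hypothetical helper illustrating the content-type scheme discussed above.
# The rejection of an empty service name is an assumption, not the PR's code.
def build_content_type(service_name: str) -> str:
    """Build the periodic-upload content type for a given Lightspeed service."""
    if not service_name:
        # No hard-coded default: an unknown service has no listener on the
        # ingress side, so sending data for it would be pointless.
        raise ValueError("ingress content service name must be configured")
    return f"application/vnd.redhat.{service_name}.periodic+tar"
```

For example, build_content_type("openshift-lightspeed") yields application/vnd.redhat.openshift-lightspeed.periodic+tar.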

tisnik and others added 4 commits July 14, 2025 09:56
…lint-issues

LCORE-303: fixed Pylint issues + new config tests
Switch Containerfile CMD with ENTRYPOINT
- Introduced `ingress_content_service_name` to the DataCollectorConfiguration model to specify the service name for data collection.
- Updated `lightspeed-stack.yaml` to include the new configuration option.
- Enhanced error handling to ensure `ingress_content_service_name` is required when the data collector is enabled.
- Modified the DataCollectorService to use the new service name in the content type header during data transmission.
- Updated unit tests to validate the new configuration and error scenarios.

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 0

🧹 Nitpick comments (2)
README.md (1)

119-161: Comprehensive documentation with minor formatting issues.

The documentation provides excellent coverage of the data collector service features, configuration options, and usage instructions. However, there are minor markdown formatting inconsistencies.

Apply this diff to fix the markdown list formatting:

-### Features
-
-- **Periodic Collection**: Runs at configurable intervals
-- **Data Packaging**: Packages feedback and transcript files into compressed tar.gz archives
-- **Secure Transmission**: Sends data to a configured ingress server with optional authentication
-- **File Cleanup**: Optionally removes local files after successful transmission
-- **Error Handling**: Includes retry logic and comprehensive error handling
+### Features
+
+* **Periodic Collection**: Runs at configurable intervals
+* **Data Packaging**: Packages feedback and transcript files into compressed tar.gz archives
+* **Secure Transmission**: Sends data to a configured ingress server with optional authentication
+* **File Cleanup**: Optionally removes local files after successful transmission
+* **Error Handling**: Includes retry logic and comprehensive error handling
src/services/data_collector.py (1)

40-41: Simplify collection interval check.

The null check for collection_interval may be unnecessary if the configuration validation ensures it's always set when the data collector is enabled.

Consider simplifying to:

-                if collector_config.collection_interval is not None:
-                    time.sleep(collector_config.collection_interval)
+                time.sleep(collector_config.collection_interval)

This assumes the configuration validation ensures collection_interval is always set when enabled. If not, keep the current implementation.

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 751d103 and 98fed29.

📒 Files selected for processing (7)
  • README.md (2 hunks)
  • lightspeed-stack.yaml (1 hunks)
  • src/models/config.py (2 hunks)
  • src/services/data_collector.py (1 hunks)
  • tests/unit/models/test_config.py (5 hunks)
  • tests/unit/runners/test_data_collector_runner.py (1 hunks)
  • tests/unit/services/test_data_collector.py (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (5)
  • lightspeed-stack.yaml
  • src/models/config.py
  • tests/unit/runners/test_data_collector_runner.py
  • tests/unit/models/test_config.py
  • tests/unit/services/test_data_collector.py
🧰 Additional context used
🪛 markdownlint-cli2 (0.17.2)
README.md

125-129: Unordered list style
Expected: asterisk; Actual: dash

(MD004, ul-style)

🔇 Additional comments (14)
README.md (1)

68-68: LGTM - Make target addition is consistent with implementation.

The new run-data-collector make target properly supports the data collector service feature.

src/services/data_collector.py (13)

1-16: LGTM - Well-structured imports and module setup.

The imports are properly organized and the module-level setup is clean.


18-24: LGTM - Clear class definition and documentation.

The class is well-documented with a clear docstring explaining its purpose.


25-50: Robust main loop with proper error handling.

The main loop correctly handles the periodic collection cycle with proper error handling for KeyboardInterrupt and operational errors. The retry logic with a 5-minute backoff is appropriate.
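The loop described here can be sketched roughly as follows. The function, its parameters, and the max_cycles bound are illustrative (the actual service loops indefinitely); only the 5-minute backoff is taken from the review text.

```python
import time

RETRY_INTERVAL = 300  # 5-minute backoff after a failed cycle (from the review)

def run_collector(perform_collection, collection_interval: float,
                  max_cycles: int, retry_interval: float = RETRY_INTERVAL) -> int:
    """Run up to max_cycles collection cycles; return the number of failures.

    Sleeps collection_interval between successful cycles and backs off
    retry_interval after an operational error.
    """
    failures = 0
    for _ in range(max_cycles):
        try:
            perform_collection()
            time.sleep(collection_interval)
        except KeyboardInterrupt:
            break  # operator requested shutdown
        except OSError:
            failures += 1
            time.sleep(retry_interval)  # back off before the next attempt
    return failures
```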


51-93: Well-structured collection orchestration.

The _perform_collection method properly coordinates the collection process with good logging and error handling. The exception handling now includes tarfile.TarError as suggested in previous reviews.


94-106: LGTM - Proper feedback file collection logic.

The method correctly handles disabled feedback collection and missing directories, with appropriate glob pattern matching.


107-120: LGTM - Comprehensive transcript file collection.

The method properly handles recursive file collection with rglob for the transcript directory structure.


121-142: Well-implemented tarball creation and transmission.

The method properly coordinates tarball creation, transmission, and cleanup with appropriate error handling in the finally block.


143-169: Robust tarball creation with proper error handling.

The tarball creation method correctly uses the base directory for relative paths (addressing previous review comments) and handles individual file errors gracefully without failing the entire operation.
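A minimal sketch of that approach, assuming a create_tarball helper (the real method name and signature may differ): archive names are stored relative to a base directory, and per-file errors are skipped so one bad file does not fail the whole archive.

```python
import tarfile
from pathlib import Path

# Illustrative sketch only; not the PR's actual method.
def create_tarball(files: list[Path], base_dir: Path, output: Path) -> int:
    """Archive files under base_dir into output; return the count added."""
    added = 0
    with tarfile.open(output, "w:gz") as tar:
        for path in files:
            try:
                # Relative arcname preserves the feedback/transcript layout.
                tar.add(path, arcname=str(path.relative_to(base_dir)))
            except OSError:
                continue  # skip files that vanished or are unreadable
            added += 1
    return added
```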


170-212: Secure transmission implementation.

The method properly handles HTTP transmission with appropriate headers, authentication, and error handling. The content-type header format appears to address previous review feedback.


180-182: Verify content-type header format.

Based on previous review comments, the content-type header format should match the expected ingress server format. The current implementation uses the configurable service name.

Please verify that the content-type header format application/vnd.redhat.{service_name}.periodic+tar matches the expected format for your ingress server, especially considering the previous feedback about service-specific content types.


213-221: LGTM - Proper file cleanup implementation.

The cleanup method handles individual file removal errors gracefully without failing the entire cleanup operation.


222-251: Thorough empty directory cleanup.

The method properly handles the nested directory structure cleanup for transcripts, removing empty conversation and user directories while handling errors gracefully.
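That cleanup can be sketched as a bottom-up sweep; remove_empty_dirs is a hypothetical stand-in for the PR's method, and keeping the transcript root intact is an assumption.

```python
import os

# Illustrative bottom-up sweep: removes empty subdirectories (e.g. emptied
# conversation and user directories) below root, tolerating races where a
# directory becomes non-empty between the check and the removal.
def remove_empty_dirs(root: str) -> int:
    """Return the number of empty directories removed below root."""
    removed = 0
    for dirpath, _dirnames, _filenames in os.walk(root, topdown=False):
        if dirpath == root:
            continue  # never remove the transcript root itself
        if not os.listdir(dirpath):
            try:
                os.rmdir(dirpath)
            except OSError:
                continue  # directory no longer empty or not removable
            removed += 1
    return removed
```

Because the walk is bottom-up, a user directory that only contained an emptied conversation directory is itself removed in the same pass.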


252-259: LGTM - Proper temporary file cleanup.

The tarball cleanup method properly handles temporary file removal with appropriate error handling.

manstis and others added 11 commits July 14, 2025 13:30
LCORE-369: Align service URLs with road-core
- Introduced DataCollectorConfiguration model to manage data collection settings.
- Updated lightspeed-stack.yaml to include data collector configuration options.
- Integrated data collector service startup and shutdown in main application flow.
- Introduced a new target in the Makefile to run the data collector service.
- Enhanced `lightspeed_stack.py` to support starting the data collector via command-line argument.
- Refactored `data_collector.py` to simplify the service's run method and remove unnecessary async handling.
- Cleaned up the main application startup process by removing the data collector's startup and shutdown events from `main.py`.
- Added `types-requests` to development dependencies for type checking.
- Modified `lightspeed_stack.py` to pass the data collector configuration to the `start_data_collector` function.
- Updated `DataCollectorConfiguration` model to enforce required fields when data archival is enabled.
- Refactored logging statements in `DataCollectorService` for improved clarity and consistency.
- Added unit tests for `UserDataCollection` to validate data collector configuration scenarios.
- Included `types-requests` in the development dependencies for type checking.
- Updated the lock file to reflect the new package and its version.
- Added relevant wheel and source information for `types-requests`.
- Updated exception handling in tests to use specific exception types (OSError, requests.RequestException, tarfile.TarError).
- Added new test for handling missing ingress server URL in tarball sending.
- Improved test coverage for the _perform_collection method to ensure proper exception catching.
- Renamed validation method in DataCollectorConfiguration for clarity.
- Updated error messages to reflect data collector context.
- Improved exception handling in start_data_collector function to log errors.
- Modified DataCollectorService to conditionally send feedback and transcript files based on configuration.
- Updated unit tests to cover new exception handling and configuration scenarios.
- Introduced new constants for data collector settings, including collection interval, connection timeout, and retry interval.
- Updated DataCollectorConfiguration to use these constants for default values and enforced positive integer types.
- Refactored error handling in tests to align with the new configuration model.
- Enhanced unit tests to validate default values and ensure proper configuration scenarios.
- Introduced `ingress_content_service_name` to the DataCollectorConfiguration model to specify the service name for data collection.
- Updated `lightspeed-stack.yaml` to include the new configuration option.
- Enhanced error handling to ensure `ingress_content_service_name` is required when the data collector is enabled.
- Modified the DataCollectorService to use the new service name in the content type header during data transmission.
- Updated unit tests to validate the new configuration and error scenarios.

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 15

🔭 Outside diff range comments (4)
tests/e2e/features/steps/common_http.py (3)

135-147: Fix misleading assertion message.

The assertion message "Response does not contain any payload" is misleading since this check is for the expected schema in context.text, not the response payload.

-    assert context.text is not None, "Response does not contain any payload"
+    assert context.text is not None, "Schema needs to be specified"

160-173: Fix misleading assertion message.

Similar to the previous function, the assertion message is misleading as it checks for expected JSON in context.text, not the response payload.

-    assert context.text is not None, "Response does not contain any payload"
+    assert context.text is not None, "Expected body needs to be specified"

177-194: Fix misleading assertion message and improve function safety.

The assertion message is misleading. Additionally, the function modifies dictionaries without checking if the field exists first.

-    assert context.text is not None, "Response does not contain any payload"
+    assert context.text is not None, "Expected body needs to be specified"

The use of .pop(field, None) is already safe and handles missing fields correctly.

tests/unit/auth/test_k8s.py (1)

237-245: Potential test isolation issue.

Directly modifying the class attribute K8sClientSingleton._cluster_id could affect other tests if they run in parallel or if teardown is not proper.

Consider using a fixture or setup/teardown to ensure test isolation:

 def test_get_cluster_id_outside_of_cluster(mocker):
     """Test get_cluster_id function when running outside of cluster."""
     mocker.patch("auth.k8s.RUNNING_IN_CLUSTER", False)
     mocker.patch("auth.k8s.K8sClientSingleton.__new__")
+    # Mock the class attribute instead of modifying it directly
+    mocker.patch.object(K8sClientSingleton, '_cluster_id', None)
 
-    # ensure cluster_id is None to trigger the condition
-    K8sClientSingleton._cluster_id = None
     assert K8sClientSingleton.get_cluster_id() == CLUSTER_ID_LOCAL
🧹 Nitpick comments (27)
.github/workflows/build_and_push_release.yaml (1)

43-52: Make the diagnostic steps optional to shorten CI time

The image/manifest inspection is handy while tuning the workflow, but it adds ~10-15 s per run and produces noisy logs.
Consider wrapping them with an environment flag:

- name: Check images
  if: ${{ env.DEBUG == 'true' }}
  run: |
    buildah images | grep '${{ env.IMAGE_NAME }}'
    ...
tests/e2e/features/steps/common.py (1)

10-10: Consider more meaningful assertions for test steps.

While the `assert context is not None` checks are better than `pass`, they provide minimal validation. Consider implementing actual service status checks or default state validation to make these test steps more meaningful.

For example:

 @given("the service is started locally")
 def service_is_started_locally(context: Context) -> None:
     """Check the service status."""
-    assert context is not None
+    assert context is not None
+    # TODO: Add actual service health check
+    # Example: response = requests.get("http://localhost:8000/health")
+    # assert response.status_code == 200

Also applies to: 16-16

src/utils/types.py (2)

10-19: Consider thread safety for the Singleton implementation.

The current implementation uses a mutable class variable _instances which could lead to race conditions in multi-threaded environments. Consider using a lock for thread safety.

+import threading
+
 class Singleton(type):
     """Metaclass for Singleton support."""
 
-    _instances = {}  # type: ignore
+    _instances = {}  # type: ignore
+    _lock = threading.Lock()
 
     def __call__(cls, *args, **kwargs):  # type: ignore
         """Ensure a single instance is created."""
-        if cls not in cls._instances:
-            cls._instances[cls] = super(Singleton, cls).__call__(*args, **kwargs)
+        if cls not in cls._instances:
+            with cls._lock:
+                if cls not in cls._instances:
+                    cls._instances[cls] = super(Singleton, cls).__call__(*args, **kwargs)
         return cls._instances[cls]

32-37: Consider using more robust model ID validation.

The current implementation uses a simple string prefix check. Consider adding validation for empty/None model_id and potentially more robust matching.

 @staticmethod
 def get_parser(model_id: str) -> Optional[ToolParser]:
     """Get the applicable ToolParser for the model."""
-    if model_id and model_id.lower().startswith("granite"):
+    if model_id and model_id.strip().lower().startswith("granite"):
         return GraniteToolParser()
     return None
src/auth/interface.py (1)

8-13: LGTM: Clean abstract interface design.

The interface is well-designed with clear method signature. Consider adding more specific documentation about what each string in the returned tuple represents.

 @abstractmethod
 async def __call__(self, request: Request) -> tuple[str, str, str]:
-    """Validate FastAPI Requests for authentication and authorization."""
+    """Validate FastAPI Requests for authentication and authorization.
+    
+    Returns:
+        tuple[str, str, str]: A tuple containing (user_id, username, user_token)
+    """
tests/unit/auth/test_utils.py (1)

14-22: Consider using pytest.raises for cleaner exception testing.

The current try/except approach works but pytest.raises provides cleaner and more idiomatic exception testing.

+import pytest
+
-def test_extract_user_token_no_header():
-    """Test extracting user token when no Authorization header is present."""
-    headers = {}
-    try:
-        extract_user_token(headers)
-    except HTTPException as exc:
-        assert exc.status_code == 400
-        assert exc.detail == "No Authorization header found"
+def test_extract_user_token_no_header():
+    """Test extracting user token when no Authorization header is present."""
+    headers = {}
+    with pytest.raises(HTTPException) as exc_info:
+        extract_user_token(headers)
+    assert exc_info.value.status_code == 400
+    assert exc_info.value.detail == "No Authorization header found"

-def test_extract_user_token_invalid_format():
-    """Test extracting user token with invalid Authorization header format."""
-    headers = {"Authorization": "InvalidFormat"}
-    try:
-        extract_user_token(headers)
-    except HTTPException as exc:
-        assert exc.status_code == 400
-        assert exc.detail == "No token found in Authorization header"
+def test_extract_user_token_invalid_format():
+    """Test extracting user token with invalid Authorization header format."""
+    headers = {"Authorization": "InvalidFormat"}
+    with pytest.raises(HTTPException) as exc_info:
+        extract_user_token(headers)
+    assert exc_info.value.status_code == 400
+    assert exc_info.value.detail == "No token found in Authorization header"

Also applies to: 24-31

src/auth/utils.py (1)

10-12: Fix parameter name in docstring.

The docstring refers to header but the parameter is named headers.

     Args:
-        header: The authorization header containing the token.
+        headers: The request headers containing the authorization header.
tests/unit/utils/test_types.py (2)

9-9: Fix typo in method name.

There's a typo in the method name: "is_is_not_granite" should be "is_not_granite".

-    def test_get_tool_parser_when_model_is_is_not_granite(self):
+    def test_get_tool_parser_when_model_is_not_granite(self):

15-19: Remove redundant test case.

This test case is redundant with the first test case (test_get_tool_parser_when_model_is_not_granite). Both test the same scenario: when the model ID is not a granite model, the parser should return None.

-    def test_get_tool_parser_when_model_id_does_not_start_with_granite(self):
-        """Test that the tool_parser is None when model_id does not start with granite."""
-        assert (
-            GraniteToolParser.get_parser("a-fine-trained-granite-model") is None
-        ), "tool_parser should be None"
-
src/app/endpoints/config.py (1)

52-52: Fix typo in error message.

There's a typo in the error message: "is no loaded" should be "is not loaded".

-            "response": "Configuration is no loaded",
+            "response": "Configuration is not loaded",
src/app/endpoints/authorized.py (1)

14-14: Consider dependency injection for auth_dependency.

Initializing auth_dependency at module level could be problematic if configuration changes during runtime. Consider using FastAPI's dependency injection system instead.

-auth_dependency = get_auth_dependency()
+# Remove module-level initialization

@router.post("/authorized", responses=authorized_responses)
-async def authorized_endpoint_handler(_request: Request) -> AuthorizedResponse:
+async def authorized_endpoint_handler(
+    _request: Request, 
+    auth_dependency: AuthInterface = Depends(get_auth_dependency)
+) -> AuthorizedResponse:
tests/unit/auth/test_auth.py (1)

12-12: Test the virtual_path parameter.

The get_auth_dependency function accepts a virtual_path parameter, but none of the tests verify this parameter is properly passed to the dependencies.

Add a test to verify the virtual_path parameter:

+def test_get_auth_dependency_with_virtual_path():
+    """Test getting authentication dependency with custom virtual path."""
+    configuration.authentication_configuration.module = AUTH_MOD_NOOP
+    custom_path = "/custom/path"
+    auth_dependency = get_auth_dependency(virtual_path=custom_path)
+    assert auth_dependency.virtual_path == custom_path
src/auth/__init__.py (1)

14-16: Consider adding virtual_path validation.

The function accepts a virtual_path parameter but doesn't validate it. Invalid paths could cause issues downstream in the authentication dependencies.

Add basic validation for the virtual_path parameter:

def get_auth_dependency(
    virtual_path: str = constants.DEFAULT_VIRTUAL_PATH,
) -> AuthInterface:
    """Select the configured authentication dependency interface."""
+    if not virtual_path or not virtual_path.startswith("/"):
+        raise ValueError(f"Invalid virtual_path: {virtual_path}")
tests/unit/app/endpoints/test_authorized.py (1)

34-34: Fix misleading docstring.

The docstring mentions "custom user ID" but the test is actually testing unauthorized access.

-def test_authorized_unauthorized(mocker):
-    """Test the authorized endpoint handler with a custom user ID."""
+def test_authorized_unauthorized(mocker):
+    """Test the authorized endpoint handler with unauthorized access."""
Makefile (1)

74-74: Consider splitting the verify target to address static analysis warning

The static analysis tool warns about the verify target body exceeding the recommended length. Consider splitting into logical groups:

verify-style: ## Run style-related linters
	$(MAKE) black
	$(MAKE) ruff
	$(MAKE) docstyle

verify-analysis: ## Run analysis-related linters  
	$(MAKE) pylint
	$(MAKE) pyright
	$(MAKE) check-types

verify: ## Run all linters
	$(MAKE) verify-style
	$(MAKE) verify-analysis
tests/unit/auth/test_noop_with_token.py (1)

1-96: Consider adding edge case tests

The test coverage is good but could be enhanced with additional edge cases:

Consider adding tests for:

  • Empty bearer token (e.g., "Bearer ")
  • Multiple authorization headers
  • Case sensitivity of "Bearer" prefix
  • Malformed query string parameters

Would you like me to generate these additional test cases?

src/client.py (1)

35-38: Address TODO: Implement custom exception

The TODO comments indicate that a custom exception should be used instead of ValueError. This would improve error handling and make the code more maintainable.

Would you like me to create a custom exception class for configuration errors and update both locations to use it?

Also applies to: 72-74

README.md (3)

17-29: Fix markdown list indentation

The table of contents has inconsistent indentation. According to markdown best practices, nested list items should be indented by 2 spaces, not 4.

 * [Configuration](#configuration)
-    * [Llama Stack as separate server](#llama-stack-as-separate-server)
-    * [Llama Stack as client library](#llama-stack-as-client-library)
-    * [System prompt](#system-prompt)
+  * [Llama Stack as separate server](#llama-stack-as-separate-server)
+  * [Llama Stack as client library](#llama-stack-as-client-library)
+  * [System prompt](#system-prompt)
 * [Usage](#usage)
-    * [Make targets](#make-targets)
-    * [Running Linux container image](#running-linux-container-image)
+  * [Make targets](#make-targets)
+  * [Running Linux container image](#running-linux-container-image)
 * [Endpoints](#endpoints)
-    * [Readiness Endpoint](#readiness-endpoint)
-    * [Liveness Endpoint](#liveness-endpoint)
+  * [Readiness Endpoint](#readiness-endpoint)
+  * [Liveness Endpoint](#liveness-endpoint)
 * [Publish the service as Python package on PyPI](#publish-the-service-as-python-package-on-pypi)
-    * [Generate distribution archives to be uploaded into Python registry](#generate-distribution-archives-to-be-uploaded-into-python-registry)
-    * [Upload distribution archives into selected Python registry](#upload-distribution-archives-into-selected-python-registry)
-    * [Packages on PyPI and Test PyPI](#packages-on-pypi-and-test-pypi)
+  * [Generate distribution archives to be uploaded into Python registry](#generate-distribution-archives-to-be-uploaded-into-python-registry)
+  * [Upload distribution archives into selected Python registry](#upload-distribution-archives-into-selected-python-registry)
+  * [Packages on PyPI and Test PyPI](#packages-on-pypi-and-test-pypi)

266-275: Add language specifier to code blocks

The code blocks showing the .pypirc configuration should have a language specifier for proper syntax highlighting.

-```
+```ini
 [testpypi]
   username = __token__
   password = pypi-{your-API-token}

318-322: Maintain consistent list style

The feature list uses dashes while the rest of the document uses asterisks for unordered lists.

-### Features
-
-- **Periodic Collection**: Runs at configurable intervals
-- **Data Packaging**: Packages feedback and transcript files into compressed tar.gz archives
-- **Secure Transmission**: Sends data to a configured ingress server with optional authentication
-- **File Cleanup**: Optionally removes local files after successful transmission
-- **Error Handling**: Includes retry logic and comprehensive error handling
+### Features
+
+* **Periodic Collection**: Runs at configurable intervals
+* **Data Packaging**: Packages feedback and transcript files into compressed tar.gz archives
+* **Secure Transmission**: Sends data to a configured ingress server with optional authentication
+* **File Cleanup**: Optionally removes local files after successful transmission
+* **Error Handling**: Includes retry logic and comprehensive error handling
src/utils/mcp_headers.py (2)

15-26: Fix typo in docstring

There's a grammatical error in the docstring.

-    """Get the mcp headers dependency to passed to mcp servers.
+    """Get the mcp headers dependency to pass to mcp servers.

46-52: Enhance type validation for nested dictionary values

While the function checks if mcp_headers is a dictionary, it doesn't validate that the values are also dictionaries with string keys and values as expected by the return type.

         if not isinstance(mcp_headers, dict):
             logger.error(
                 "MCP headers wrong type supplied (mcp headers must be a dictionary), "
                 "but type %s was supplied",
                 type(mcp_headers),
             )
             mcp_headers = {}
+        else:
+            # Validate nested structure
+            for key, value in list(mcp_headers.items()):
+                if not isinstance(value, dict) or not all(
+                    isinstance(k, str) and isinstance(v, str) 
+                    for k, v in value.items()
+                ):
+                    logger.error(
+                        "Invalid MCP header structure for key '%s': values must be dict[str, str]",
+                        key
+                    )
+                    del mcp_headers[key]
tests/unit/auth/test_k8s.py (1)

174-186: Consider a more descriptive test name.

The test verifies that the singleton returns correct API client instances, not specifically about loading kubeconfig.

-def test_auth_dependency_config(mocker):
-    """Test the auth dependency can load kubeconfig file."""
+def test_singleton_returns_correct_api_clients(mocker):
+    """Test that K8sClientSingleton returns correct API client instances."""
src/auth/k8s.py (1)

44-95: Consider thread-safe singleton implementation.

The current singleton implementation could have race conditions in multi-threaded environments. Multiple threads could pass the if cls._instance is None check before the first one completes initialization.

Consider using a lock for thread safety:

import threading

class K8sClientSingleton:
    _instance = None
    _lock = threading.Lock()
    
    def __new__(cls: type[Self]) -> Self:
        if cls._instance is None:
            with cls._lock:
                # Double-check pattern
                if cls._instance is None:
                    cls._instance = super().__new__(cls)
                    # ... rest of initialization
        return cls._instance
src/app/endpoints/query.py (2)

40-41: Consider adding cache monitoring and eviction metrics.

The agent cache configuration looks reasonable with a 1-hour TTL and 1000 max size. However, in production environments, it would be beneficial to monitor cache hit rates, evictions, and memory usage to ensure optimal performance.

Consider implementing metrics collection for:

  • Cache hit/miss rates
  • Number of evictions
  • Memory usage
  • Average agent lifetime
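A lightweight way to collect the first two of these, sketched below under the assumption that the agent cache is dict-like (the actual cache is a TTL cache, so eviction counts would come from the caching library's own hooks); all names here are illustrative:

```python
# Minimal sketch of cache instrumentation, assuming a dict-like agent
# cache; names are illustrative, not the module's actual API.
class InstrumentedCache:
    """Wraps a dict-like cache and counts hits and misses."""

    def __init__(self, backing=None):
        self._cache = {} if backing is None else backing
        self.hits = 0
        self.misses = 0

    def get(self, key):
        """Return the cached value, or None on a miss."""
        if key in self._cache:
            self.hits += 1
            return self._cache[key]
        self.misses += 1
        return None

    def put(self, key, value):
        """Store a value in the underlying cache."""
        self._cache[key] = value

    @property
    def hit_rate(self):
        """Fraction of lookups that were hits (0.0 when unused)."""
        total = self.hits + self.misses
        return self.hits / total if total else 0.0
```

The counters can then be exported through whatever metrics endpoint the service already exposes.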

74-100: Good implementation of agent caching with session persistence!

The caching logic properly handles both new and existing conversations. The session persistence is correctly enabled to maintain conversation state.

Note: The TODO comment indicates a planned refactor to ReActAgent; ensure this is tracked in your issue system.

Would you like me to create an issue to track the TODO for moving to ReActAgent?

src/app/endpoints/streaming_query.py (1)

123-182: Consider refactoring the complex event processing logic.

While the metadata extraction logic is correct, the function has become quite complex (as indicated by the pylint disable). Consider extracting the metadata parsing logic into a separate helper function for better maintainability.

+def extract_metadata_from_tool_response(tool_response) -> dict[str, dict]:
+    """Extract metadata from knowledge search tool responses."""
+    metadata_map = {}
+    if tool_response.tool_name == "knowledge_search" and tool_response.content:
+        for text_content_item in tool_response.content:
+            if isinstance(text_content_item, TextContentItem):
+                for match in METADATA_PATTERN.findall(text_content_item.text):
+                    meta = json.loads(match.replace("'", '"'))
+                    metadata_map[meta["document_id"]] = meta
+    return metadata_map

 def stream_build_event(chunk: Any, chunk_id: int, metadata_map: dict) -> str | None:
     """Build a streaming event from a chunk response.
     ...
     """
-    # pylint: disable=R1702
     if hasattr(chunk.event, "payload"):
         if chunk.event.payload.event_type == "step_progress":
             if hasattr(chunk.event.payload.delta, "text"):
                 text = chunk.event.payload.delta.text
                 return format_stream_data(
                     {
                         "event": "token",
                         "data": {
                             "id": chunk_id,
                             "role": chunk.event.payload.step_type,
                             "token": text,
                         },
                     }
                 )
         if (
             chunk.event.payload.event_type == "step_complete"
             and chunk.event.payload.step_details.step_type == "tool_execution"
         ):
             for r in chunk.event.payload.step_details.tool_responses:
-                if r.tool_name == "knowledge_search" and r.content:
-                    for text_content_item in r.content:
-                        if isinstance(text_content_item, TextContentItem):
-                            for match in METADATA_PATTERN.findall(
-                                text_content_item.text
-                            ):
-                                meta = json.loads(match.replace("'", '"'))
-                                metadata_map[meta["document_id"]] = meta
+                extracted = extract_metadata_from_tool_response(r)
+                metadata_map.update(extracted)
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 98fed29 and 636b6c9.

⛔ Files ignored due to path filters (2)
  • pdm.lock is excluded by !**/*.lock
  • uv.lock is excluded by !**/*.lock
📒 Files selected for processing (68)
  • .github/workflows/build.yaml (0 hunks)
  • .github/workflows/build_and_push_dev.yaml (1 hunks)
  • .github/workflows/build_and_push_release.yaml (2 hunks)
  • .github/workflows/build_pr.yaml (1 hunks)
  • Containerfile (1 hunks)
  • Makefile (2 hunks)
  • README.md (6 hunks)
  • docs/e2e_scenarios.md (1 hunks)
  • docs/openapi.json (7 hunks)
  • lightspeed-stack.yaml (1 hunks)
  • pyproject.toml (2 hunks)
  • src/app/endpoints/authorized.py (1 hunks)
  • src/app/endpoints/config.py (3 hunks)
  • src/app/endpoints/feedback.py (1 hunks)
  • src/app/endpoints/health.py (2 hunks)
  • src/app/endpoints/models.py (2 hunks)
  • src/app/endpoints/query.py (5 hunks)
  • src/app/endpoints/streaming_query.py (7 hunks)
  • src/app/main.py (2 hunks)
  • src/app/routers.py (2 hunks)
  • src/auth/__init__.py (1 hunks)
  • src/auth/interface.py (1 hunks)
  • src/auth/k8s.py (1 hunks)
  • src/auth/noop.py (1 hunks)
  • src/auth/noop_with_token.py (1 hunks)
  • src/auth/utils.py (1 hunks)
  • src/client.py (1 hunks)
  • src/configuration.py (2 hunks)
  • src/constants.py (1 hunks)
  • src/lightspeed_stack.py (3 hunks)
  • src/models/config.py (4 hunks)
  • src/models/responses.py (1 hunks)
  • src/utils/auth.py (0 hunks)
  • src/utils/checks.py (1 hunks)
  • src/utils/common.py (3 hunks)
  • src/utils/endpoints.py (1 hunks)
  • src/utils/mcp_headers.py (1 hunks)
  • src/utils/types.py (1 hunks)
  • src/version.py (1 hunks)
  • tests/e2e/features/steps/common.py (1 hunks)
  • tests/e2e/features/steps/common_http.py (7 hunks)
  • tests/e2e/features/steps/llm_query_response.py (2 hunks)
  • tests/e2e/gen_scenario_list.py (1 hunks)
  • tests/integration/test_configuration.py (3 hunks)
  • tests/unit/__init__.py (1 hunks)
  • tests/unit/app/endpoints/test_authorized.py (1 hunks)
  • tests/unit/app/endpoints/test_config.py (2 hunks)
  • tests/unit/app/endpoints/test_health.py (2 hunks)
  • tests/unit/app/endpoints/test_info.py (1 hunks)
  • tests/unit/app/endpoints/test_models.py (1 hunks)
  • tests/unit/app/endpoints/test_query.py (15 hunks)
  • tests/unit/app/endpoints/test_streaming_query.py (12 hunks)
  • tests/unit/app/test_routers.py (3 hunks)
  • tests/unit/auth/__init__.py (1 hunks)
  • tests/unit/auth/test_auth.py (1 hunks)
  • tests/unit/auth/test_k8s.py (1 hunks)
  • tests/unit/auth/test_noop.py (1 hunks)
  • tests/unit/auth/test_noop_with_token.py (1 hunks)
  • tests/unit/auth/test_utils.py (1 hunks)
  • tests/unit/models/test_config.py (14 hunks)
  • tests/unit/models/test_responses.py (2 hunks)
  • tests/unit/test_client.py (3 hunks)
  • tests/unit/test_configuration.py (12 hunks)
  • tests/unit/utils/test_auth.py (0 hunks)
  • tests/unit/utils/test_checks.py (1 hunks)
  • tests/unit/utils/test_common.py (13 hunks)
  • tests/unit/utils/test_endpoints.py (1 hunks)
  • tests/unit/utils/test_types.py (1 hunks)
💤 Files with no reviewable changes (3)
  • tests/unit/utils/test_auth.py
  • .github/workflows/build.yaml
  • src/utils/auth.py
✅ Files skipped from review due to trivial changes (7)
  • tests/unit/auth/__init__.py
  • src/version.py
  • tests/unit/app/endpoints/test_info.py
  • docs/e2e_scenarios.md
  • tests/integration/test_configuration.py
  • tests/e2e/gen_scenario_list.py
  • .github/workflows/build_pr.yaml
🚧 Files skipped from review as they are similar to previous changes (6)
  • pyproject.toml
  • lightspeed-stack.yaml
  • src/lightspeed_stack.py
  • src/models/config.py
  • tests/unit/models/test_config.py
  • src/constants.py
🧰 Additional context used
🧬 Code Graph Analysis (25)
tests/unit/auth/test_utils.py (1)
src/auth/utils.py (1)
  • extract_user_token (7-26)
tests/unit/models/test_responses.py (1)
src/models/responses.py (4)
  • QueryResponse (25-46)
  • StatusResponse (215-244)
  • AuthorizedResponse (247-268)
  • UnauthorizedResponse (271-285)
src/configuration.py (1)
src/models/config.py (2)
  • Customization (161-176)
  • AuthenticationConfiguration (141-158)
src/app/endpoints/health.py (1)
src/client.py (3)
  • LlamaStackClientHolder (19-52)
  • get_client (46-52)
  • get_client (81-87)
src/app/endpoints/config.py (3)
src/utils/endpoints.py (1)
  • check_configuration_loaded (10-16)
src/models/config.py (1)
  • Configuration (179-195)
src/configuration.py (1)
  • configuration (47-52)
tests/unit/test_client.py (2)
src/client.py (4)
  • LlamaStackClientHolder (19-52)
  • AsyncLlamaStackClientHolder (55-87)
  • load (24-44)
  • load (60-79)
src/models/config.py (1)
  • LLamaStackConfiguration (64-92)
src/app/endpoints/authorized.py (2)
src/auth/__init__.py (1)
  • get_auth_dependency (14-38)
src/models/responses.py (3)
  • AuthorizedResponse (247-268)
  • UnauthorizedResponse (271-285)
  • ForbiddenResponse (288-300)
tests/unit/__init__.py (1)
src/configuration.py (2)
  • configuration (47-52)
  • init_from_dict (42-44)
src/app/endpoints/models.py (4)
src/client.py (3)
  • LlamaStackClientHolder (19-52)
  • get_client (46-52)
  • get_client (81-87)
src/configuration.py (2)
  • configuration (47-52)
  • llama_stack_configuration (63-68)
src/models/responses.py (1)
  • ModelsResponse (8-11)
src/utils/endpoints.py (1)
  • check_configuration_loaded (10-16)
src/auth/__init__.py (5)
src/auth/interface.py (1)
  • AuthInterface (8-13)
src/configuration.py (2)
  • configuration (47-52)
  • authentication_configuration (87-92)
src/auth/noop.py (1)
  • NoopAuthDependency (18-42)
src/auth/noop_with_token.py (1)
  • NoopWithTokenAuthDependency (18-46)
src/auth/k8s.py (1)
  • K8SAuthDependency (213-270)
tests/unit/app/endpoints/test_authorized.py (1)
src/app/endpoints/authorized.py (1)
  • authorized_endpoint_handler (34-38)
tests/unit/utils/test_types.py (1)
src/utils/types.py (3)
  • GraniteToolParser (23-37)
  • get_parser (33-37)
  • get_tool_calls (26-30)
tests/unit/auth/test_noop.py (1)
src/auth/noop.py (1)
  • NoopAuthDependency (18-42)
tests/unit/auth/test_auth.py (5)
src/auth/__init__.py (1)
  • get_auth_dependency (14-38)
src/configuration.py (2)
  • configuration (47-52)
  • authentication_configuration (87-92)
src/auth/noop.py (1)
  • NoopAuthDependency (18-42)
src/auth/noop_with_token.py (1)
  • NoopWithTokenAuthDependency (18-46)
src/auth/k8s.py (1)
  • K8SAuthDependency (213-270)
src/auth/noop.py (1)
src/auth/interface.py (1)
  • AuthInterface (8-13)
tests/unit/app/endpoints/test_config.py (2)
src/app/endpoints/config.py (1)
  • config_endpoint_handler (59-64)
src/configuration.py (2)
  • configuration (47-52)
  • AppConfig (20-100)
tests/unit/utils/test_common.py (2)
src/configuration.py (1)
  • customization (95-100)
src/utils/common.py (1)
  • register_mcp_servers_async (30-49)
src/auth/noop_with_token.py (2)
src/auth/interface.py (1)
  • AuthInterface (8-13)
src/auth/utils.py (1)
  • extract_user_token (7-26)
tests/unit/utils/test_checks.py (1)
src/utils/checks.py (3)
  • get_attribute_from_file (13-19)
  • file_check (22-27)
  • InvalidConfigurationError (9-10)
tests/unit/app/test_routers.py (1)
src/app/routers.py (1)
  • include_routers (18-34)
tests/unit/app/endpoints/test_models.py (2)
src/app/endpoints/models.py (1)
  • models_endpoint_handler (46-80)
src/configuration.py (3)
  • configuration (47-52)
  • AppConfig (20-100)
  • init_from_dict (42-44)
tests/unit/test_configuration.py (1)
src/configuration.py (9)
  • AppConfig (20-100)
  • configuration (47-52)
  • service_configuration (55-60)
  • llama_stack_configuration (63-68)
  • user_data_collection_configuration (71-76)
  • mcp_servers (79-84)
  • authentication_configuration (87-92)
  • customization (95-100)
  • load_configuration (35-40)
src/utils/common.py (2)
src/client.py (4)
  • LlamaStackClientHolder (19-52)
  • AsyncLlamaStackClientHolder (55-87)
  • get_client (46-52)
  • get_client (81-87)
src/models/config.py (1)
  • ModelContextProtocolServer (56-61)
tests/unit/auth/test_k8s.py (1)
src/auth/k8s.py (7)
  • K8sClientSingleton (31-165)
  • K8SAuthDependency (213-270)
  • ClusterIDUnavailableError (27-28)
  • get_authn_api (98-105)
  • get_authz_api (108-115)
  • _get_cluster_id (128-152)
  • get_cluster_id (155-165)
tests/unit/app/endpoints/test_query.py (6)
src/app/endpoints/query.py (5)
  • get_agent (74-100)
  • query_endpoint_handler (104-155)
  • store_transcript (308-356)
  • retrieve_response (202-263)
  • get_rag_toolgroups (359-374)
src/models/requests.py (2)
  • QueryRequest (60-133)
  • Attachment (11-55)
src/models/config.py (1)
  • ModelContextProtocolServer (56-61)
src/configuration.py (5)
  • AppConfig (20-100)
  • init_from_dict (42-44)
  • user_data_collection_configuration (71-76)
  • mcp_servers (79-84)
  • llama_stack_configuration (63-68)
tests/unit/app/endpoints/test_streaming_query.py (3)
  • prepare_agent_mocks (79-84)
  • identifier (346-347)
  • identifier (389-390)
src/client.py (2)
  • get_client (46-52)
  • get_client (81-87)
🪛 checkmake (0.2.2)
Makefile

[warning] 74-74: Target body for "verify" exceeds allowed length of 5 (6).

(maxbodylength)

🪛 markdownlint-cli2 (0.17.2)
README.md

17-17: Unordered list indentation
Expected: 2; Actual: 4

(MD007, ul-indent)


18-18: Unordered list indentation
Expected: 2; Actual: 4

(MD007, ul-indent)


19-19: Unordered list indentation
Expected: 2; Actual: 4

(MD007, ul-indent)


21-21: Unordered list indentation
Expected: 2; Actual: 4

(MD007, ul-indent)


22-22: Unordered list indentation
Expected: 2; Actual: 4

(MD007, ul-indent)


24-24: Unordered list indentation
Expected: 2; Actual: 4

(MD007, ul-indent)


25-25: Unordered list indentation
Expected: 2; Actual: 4

(MD007, ul-indent)


27-27: Unordered list indentation
Expected: 2; Actual: 4

(MD007, ul-indent)


28-28: Unordered list indentation
Expected: 2; Actual: 4

(MD007, ul-indent)


29-29: Unordered list indentation
Expected: 2; Actual: 4

(MD007, ul-indent)


248-248: Fenced code blocks should have a language specified

(MD040, fenced-code-language)


256-256: Fenced code blocks should have a language specified

(MD040, fenced-code-language)


266-266: Fenced code blocks should have a language specified

(MD040, fenced-code-language)


282-282: Bare URL used

(MD034, no-bare-urls)


283-283: Bare URL used

(MD034, no-bare-urls)


318-318: Unordered list style
Expected: asterisk; Actual: dash

(MD004, ul-style)


319-319: Unordered list style
Expected: asterisk; Actual: dash

(MD004, ul-style)


320-320: Unordered list style
Expected: asterisk; Actual: dash

(MD004, ul-style)


321-321: Unordered list style
Expected: asterisk; Actual: dash

(MD004, ul-style)


322-322: Unordered list style
Expected: asterisk; Actual: dash

(MD004, ul-style)

🪛 LanguageTool
README.md

[style] ~191-~191: Try using a more formal synonym for ‘check’.
Context: ...on in your browser does not work please check that in the config file host option l...

(VERIFY)


[style] ~252-~252: Consider using a different verb to strengthen your wording.
Context: ... make distribution-archives ``` Please make sure that the archive was really built to av...

(MAKE_SURE_ENSURE)

🔇 Additional comments (113)
.github/workflows/build_and_push_release.yaml (2)

27-29: Explicitly register binfmt to avoid sporadic cross-arch build failures

qemu-user-static is necessary, but on fresh runners the binfmt entries are not always activated.
A short registration step right after the install prevents “exec format error” surprises:

 sudo apt install -y buildah qemu-user-static
+# register qemu handlers for cross-arch builds
+sudo update-binfmts --enable qemu-aarch64 || true

41-42: Good use of archs – multi-arch manifest will be produced
No issues spotted here.

.github/workflows/build_and_push_dev.yaml (2)

23-27: Same binfmt registration remark applies here

Add the update-binfmts (or similar) command after installing qemu-user-static for consistent arm64 builds.


37-44: Multi-arch configuration looks correct

archs: amd64, arm64 together with OCI output matches the release workflow – nice consistency.

tests/e2e/features/steps/common.py (1)

3-3: LGTM: Valid type checking workaround.

The pyright ignore directive is appropriate for addressing attribute access issues with the behave import.

src/utils/types.py (1)

26-30: LGTM: Tool call extraction logic is correct.

The implementation correctly handles the case where output_message might be None or have no tool_calls, returning an empty list as fallback.
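The fallback behavior described above can be sketched as follows (a standalone approximation, not the exact GraniteToolParser method):

```python
# Standalone approximation of the described fallback: an absent message
# or empty tool_calls attribute yields an empty list.
def get_tool_calls(output_message):
    if output_message is None:
        return []
    tool_calls = getattr(output_message, "tool_calls", None)
    return tool_calls if tool_calls else []
```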

tests/unit/auth/test_utils.py (1)

7-11: LGTM: Good test coverage for valid token extraction.

The test correctly verifies the happy path of token extraction from a Bearer token.

tests/unit/app/endpoints/test_health.py (2)

101-105: Test updates correctly reflect the client holder refactoring.

The mocking approach has been properly updated to use the singleton LlamaStackClientHolder.get_client() method instead of the previous direct client creation functions. This maintains the same test coverage while adapting to the new centralized client management pattern.


152-155: Exception handling test properly updated for new client pattern.

The test correctly mocks the new singleton holder's get_client() method to simulate connection failures, maintaining the same test coverage for error scenarios.

src/app/routers.py (2)

14-14: Import added for new authorized endpoint.

The import for the authorized endpoint is correctly added to support the new authorization functionality.


28-34: Router organization improved with logical prefix grouping.

The reorganization appropriately groups versioned business logic endpoints (with /v1 prefix) separately from infrastructure endpoints (health, authorized) that don't require versioning. The explanatory comment adds helpful context for this decision.

src/app/endpoints/health.py (2)

14-14: Import updated for centralized client management.

The import correctly switches to use the singleton LlamaStackClientHolder instead of direct client creation functions, aligning with the improved client management architecture.


32-32: Client retrieval updated to use singleton pattern.

The client retrieval now uses the singleton holder's get_client() method, which provides better centralized client management and consistent configuration across the application.

tests/unit/models/test_responses.py (3)

1-6: Imports updated for new authorization response models.

The imports correctly include the new AuthorizedResponse and UnauthorizedResponse models to support testing the new authorization functionality.


38-49: Test coverage added for AuthorizedResponse model.

The test class provides appropriate coverage for the AuthorizedResponse model, verifying correct attribute initialization and following the established testing patterns.


51-59: Test coverage added for UnauthorizedResponse model.

The test class provides appropriate coverage for the UnauthorizedResponse model, verifying correct attribute initialization and following the established testing patterns.

src/configuration.py (3)

9-9: Imports added for new configuration models.

The imports correctly include the new Customization and AuthenticationConfiguration models to support the extended configuration interface.

Also applies to: 14-14


86-92: Authentication configuration property added.

The property follows the established pattern with proper assertion checks and type annotations. The Optional return type correctly reflects that authentication configuration may not be present.


94-100: Customization configuration property added.

The property follows the established pattern with proper assertion checks and type annotations. The Optional return type correctly reflects that customization configuration may not be present.

tests/unit/__init__.py (2)

5-35: Well-structured test configuration setup.

The comprehensive configuration dictionary covers all necessary service parameters for testing. The structure is logical and includes proper defaults for test environments.


37-39: Good practice: Configuration initialization before module imports.

The comment correctly identifies the dependency requirement and the initialization is properly placed before any imports that might need the configuration.

Containerfile (1)

54-54: Appropriate change from CMD to ENTRYPOINT.

This change ensures the Python script always runs as the main executable, allowing command-line arguments (like the new --data-collector option) to be passed at runtime while maintaining the base command structure.

src/auth/utils.py (1)

7-26: Solid implementation of token extraction logic.

The function correctly handles Authorization header parsing, validates the Bearer token format, and provides appropriate error responses. The error handling is comprehensive and follows FastAPI conventions.
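As a rough illustration of that parsing logic (the real extract_user_token raises FastAPI's HTTPException; ValueError stands in here to keep the sketch dependency-free):

```python
# Rough sketch of Bearer-token extraction from an Authorization header.
# The real function raises FastAPI's HTTPException with a 400 status;
# ValueError is used here to avoid the FastAPI dependency.
def extract_bearer_token(headers: dict) -> str:
    authorization = headers.get("Authorization")
    if authorization is None:
        raise ValueError("No Authorization header found")
    parts = authorization.split()
    if len(parts) != 2 or parts[0].lower() != "bearer":
        raise ValueError("No token found in Authorization header")
    return parts[1]
```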

tests/unit/test_client.py (3)

5-5: Correct import update for client holder pattern.

The import correctly switches from direct function calls to the new singleton holder classes, aligning with the architectural refactoring.


10-10: Helpful addition of test docstrings.

The descriptive docstrings clarify the purpose of each test function, improving code maintainability.

Also applies to: 23-23, 36-36, 53-53, 66-66, 79-79


17-18: Consistent refactoring to client holder pattern.

The refactoring consistently replaces direct function calls with the new holder pattern: instantiate the holder, then call load() (with await for async). This maintains the same test logic while adopting the new architecture.

Also applies to: 30-31, 48-49, 60-61, 73-74, 91-92

src/app/endpoints/feedback.py (3)

11-11: Correct adoption of new authentication dependency pattern.

The change from direct import to using get_auth_dependency() factory function aligns with the broader authentication refactor, allowing for dynamic selection of authentication modules based on configuration.

Also applies to: 25-25


13-18: Good addition of response models for better API documentation.

The expanded imports include UnauthorizedResponse and ForbiddenResponse models, which will improve API documentation and standardize error responses.


30-37: Enhanced response schema with proper error codes.

The addition of 400 and 403 status codes with corresponding response models improves API documentation and provides clear error handling specifications for authentication and authorization failures.

tests/unit/utils/test_types.py (1)

40-48: LGTM! Well-structured test with proper mocking.

The test effectively verifies that get_tool_calls returns the expected tool calls when the completion message has them. The use of Mock objects is appropriate for testing.

tests/unit/auth/test_noop.py (2)

8-21: LGTM! Comprehensive test coverage for default behavior.

The test properly verifies that NoopAuthDependency returns the expected default values when no user_id is provided in the query parameters. The mock request setup is appropriate.


24-37: LGTM! Good test coverage for custom user ID scenario.

The test effectively verifies that the dependency correctly extracts and returns the custom user ID from query parameters while maintaining default values for username and user token.

src/app/endpoints/config.py (1)

61-62: LGTM! Good addition of configuration validation.

The addition of check_configuration_loaded(configuration) ensures that the endpoint properly validates configuration state before responding, which improves error handling and consistency.

src/app/endpoints/authorized.py (1)

17-30: LGTM! Well-structured response configuration.

The response configuration properly defines the expected responses for different HTTP status codes with appropriate models and descriptions.

src/models/responses.py (3)

247-268: LGTM! Well-structured authorization response model.

The AuthorizedResponse model is properly defined with clear field types, good documentation, and appropriate examples for API documentation. The structure aligns well with the endpoint's needs.


271-285: LGTM! Appropriate unauthorized response model.

The UnauthorizedResponse model properly represents authentication failures with a clear detail field and good documentation examples.


288-300: LGTM! Good use of inheritance for forbidden response.

The ForbiddenResponse inheriting from UnauthorizedResponse makes sense since both have the same structure but represent different authorization states. The example message clearly distinguishes between the two scenarios.

src/auth/__init__.py (1)

26-38: LGTM! Clean pattern matching implementation.

The pattern matching approach for selecting authentication dependencies is clean and maintainable. The error handling with logging and appropriate exception raising is well implemented.

tests/unit/app/endpoints/test_authorized.py (2)

9-31: LGTM! Well-structured test with proper mocking.

The test properly mocks the auth dependency using AsyncMock and verifies the expected response structure. The Request object construction is minimal but sufficient for testing purposes.


33-54: LGTM! Comprehensive unauthorized access test.

The test properly simulates an HTTPException from the auth dependency and verifies that the same exception is propagated. The error details are correctly validated.

tests/e2e/features/steps/llm_query_response.py (3)

4-4: LGTM! Good fix for type checking issues.

The pyright ignore comment appropriately addresses the attribute access issue with behave imports.


39-47: LGTM! Excellent documentation improvement.

The enhanced docstring provides clear and comprehensive documentation of the function's behavior, making it much easier for developers to understand what the test step does.


52-53: LGTM! Important defensive programming addition.

The explicit assertion for context.table prevents potential runtime errors and provides a clear error message when the fragments table is missing from the scenario.

src/app/endpoints/models.py (4)

48-48: LGTM! Important configuration validation.

Adding the configuration check prevents runtime errors and provides clear error messages when configuration is missing.


50-52: LGTM! Good logging addition.

The logging provides useful debugging information about the Llama stack configuration state.


53-80: LGTM! Excellent error handling improvements.

The comprehensive error handling distinguishes between connection errors and other exceptions, providing specific error messages and appropriate HTTP status codes. The singleton client holder usage is also appropriate for resource management.


41-41: LGTM! Good OpenAPI schema update.

The addition of the 503 status code to the OpenAPI response schema accurately reflects the new error handling for connection failures.

src/utils/endpoints.py (2)

10-16: LGTM - Configuration validation is properly implemented.

The function correctly validates configuration presence and raises appropriate HTTP 500 error with descriptive message when configuration is missing.


19-50: System prompt logic is well-structured with proper precedence handling.

The function correctly implements the precedence order: query request > configuration > default, with proper validation for disabled system prompt customization. The error messages are clear and helpful.
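The precedence order can be sketched roughly like this (function and parameter names are hypothetical, and the default prompt is a placeholder):

```python
DEFAULT_SYSTEM_PROMPT = "You are a helpful assistant."  # placeholder value

# Illustrative sketch of the precedence order described above
# (query request > configuration > default), including the check for
# disabled system prompt customization.
def resolve_system_prompt(request_prompt, config_prompt, customization_enabled=True):
    if request_prompt is not None:
        if not customization_enabled:
            raise ValueError("System prompt customization is disabled")
        return request_prompt
    if config_prompt is not None:
        return config_prompt
    return DEFAULT_SYSTEM_PROMPT
```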

src/utils/checks.py (2)

9-11: LGTM - Custom exception is properly defined.

The custom exception provides clear context for configuration validation failures.


22-27: File validation logic is correct and comprehensive.

The function properly validates both file existence and readability with descriptive error messages.
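A hedged sketch of that existence-plus-readability check (the real file_check raises InvalidConfigurationError; ValueError is used here to keep the snippet self-contained):

```python
import os

# Sketch of the validation described above: reject paths that are not
# regular files or are not readable by the current process.
def file_check(path: str, description: str) -> None:
    if not os.path.isfile(path):
        raise ValueError(f"{description} file '{path}' does not exist")
    if not os.access(path, os.R_OK):
        raise ValueError(f"{description} file '{path}' is not readable")
```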

tests/unit/app/endpoints/test_config.py (1)

43-46: Test configuration includes new authentication and customization fields.

The test properly includes the new authentication and customization fields to match the expanded configuration schema.

src/auth/noop_with_token.py (2)

18-26: Class structure is well-defined with proper interface implementation.

The class correctly implements the AuthInterface with proper initialization and documentation.


27-46: Authentication logic is correct with appropriate security warnings.

The method properly:

  • Logs security warnings about insecure mode
  • Extracts user token using the utility function
  • Allows user_id override from query parameters
  • Returns the expected tuple format
src/auth/noop.py (2)

18-24: Class structure is consistent with other auth implementations.

The class correctly implements the AuthInterface with proper initialization and documentation.


25-42: No-op authentication logic is appropriate for development use.

The method properly:

  • Logs security warnings about insecure mode
  • Extracts user_id from query parameters with default fallback
  • Returns NO_USER_TOKEN constant indicating no token available
  • Maintains consistent return type with other auth implementations
tests/unit/utils/test_common.py (4)

34-34: LGTM: Client holder pattern correctly implemented

The change from direct client getter to singleton pattern client.LlamaStackClientHolder.get_client is consistent with the architectural refactoring mentioned in the AI summary.


69-70: LGTM: API interface updated correctly

The changes from tools to toolgroups API and toolgroup_id to provider_resource_id attribute reflect the updated client API and data model changes.


45-45: LGTM: Configuration model extended with customization field

The addition of customization=None to all test configurations is consistent with the extended configuration model introduced in the PR.

Also applies to: 85-85, 129-129, 176-176, 230-230, 275-275


253-254: LGTM: Async client holder pattern correctly implemented

The change to use client.AsyncLlamaStackClientHolder.get_client for library client configuration is consistent with the architectural changes.

tests/unit/app/test_routers.py (4)

25-25: LGTM: MockFastAPI enhanced to track router prefixes

The change from simple list to tuple tracking (router, prefix) enables proper testing of router prefixes, which is essential for API versioning.

Also applies to: 29-29


31-35: LGTM: Helper methods for router introspection

The get_routers() and get_router_prefix() methods provide clean interfaces for testing router configuration.


44-44: LGTM: Router count and authorized router correctly added

The router count update from 8 to 9 and inclusion of authorized.router is consistent with the changes described in the AI summary.

Also applies to: 53-53, 62-62


56-71: LGTM: Comprehensive prefix testing

The new test_check_prefixes() function ensures proper API versioning by verifying that /v1 prefix is applied to versioned endpoints while health and authorized routers remain unversioned.

Makefile (4)

5-5: LGTM: Python registry configuration

The PYTHON_REGISTRY variable provides configurable upload target for package distribution.


11-12: LGTM: Data collector support

The run-data-collector target enables local execution of the new data collector functionality using the --data-collector flag.


69-69: LGTM: Docstyle scope restriction

Restricting pydocstyle to the src directory is appropriate since test files don't need the same documentation standards.


82-87: LGTM: Python package distribution support

The distribution-archives and upload-distribution-archives targets provide proper workflow for Python package publishing using standard tools.

tests/unit/utils/test_checks.py (7)

9-16: LGTM: Well-structured test fixture

The input_file fixture correctly uses tmp_path to create temporary files with known content for testing.


18-32: LGTM: Comprehensive edge case testing

The test covers both empty dictionary and non-existent key scenarios, ensuring robust handling of missing data.


34-46: LGTM: Proper file reading verification

The test verifies that file content is read correctly and trailing whitespace is handled as expected.


48-59: LGTM: Exception handling validation

The test properly verifies that FileNotFoundError is raised with the correct filename when the file doesn't exist.


61-65: LGTM: Positive path testing

The test verifies that file_check succeeds for valid, readable files without raising exceptions.


67-71: LGTM: Non-existent file handling

The test correctly verifies that InvalidConfigurationError is raised for non-existent files.


73-77: LGTM: Unreadable file simulation

The test uses mocking to simulate file access issues and verifies appropriate exception handling.

tests/unit/app/endpoints/test_models.py (1)

98-135: LGTM: Successful models list retrieval test

The test correctly mocks the client and configuration to simulate successful model retrieval with an empty list.

src/utils/common.py (2)

59-63: LGTM! Correctly updated to use the new toolgroups API

The changes from tools.list() to toolgroups.list() and accessing provider_resource_id instead of toolgroup_id are consistent with the new API structure.


87-91: LGTM! Sync version correctly mirrors the async implementation

The sync version of the toolgroups API changes is correctly implemented and consistent with the async version.

tests/unit/auth/test_noop_with_token.py (1)

10-31: LGTM! Well-structured test for default user authentication

The test correctly validates the extraction of user ID, username, and token from the authorization header with default values.

src/client.py (2)

19-53: LGTM! Well-structured singleton client holder

The implementation correctly encapsulates client initialization and provides proper error handling. The singleton pattern ensures a single client instance across the application.


19-88: Ensure thread‐safety in the Singleton metaclass

The current Singleton implementation in src/utils/types.py uses a shared _instances dict without synchronization. In a multithreaded context, two threads could both pass the if cls not in _instances check and create separate instances. To address this:

  • Import the threading module and add a class‐level lock (e.g. threading.RLock()).
  • Wrap the instantiation block in with cls._lock: to serialize access.

Suggested changes:

--- a/src/utils/types.py
+++ b/src/utils/types.py
@@
 class Singleton(type):
     """Metaclass for Singleton support."""
-
-    _instances = {}  # type: ignore
+    _instances = {}  # type: ignore
+    _lock = __import__("threading").RLock()  # ensure only one thread initializes an instance

     def __call__(cls, *args, **kwargs):  # type: ignore
         """Ensure a single instance is created."""
-        if cls not in cls._instances:
-            cls._instances[cls] = super(Singleton, cls).__call__(*args, **kwargs)
+        with cls._lock:
+            if cls not in cls._instances:
+                cls._instances[cls] = super(Singleton, cls).__call__(*args, **kwargs)
         return cls._instances[cls]
  • File to update: src/utils/types.py
  • Note: threading is part of the Python standard library, so this change adds no new dependency.
README.md (1)

312-354: LGTM! Comprehensive documentation for the new Data Collector Service

The documentation clearly explains the purpose, features, configuration, and usage of the new data collector service. The configuration example is well-structured and includes all necessary parameters.

src/utils/mcp_headers.py (1)

76-89: LGTM! Well-implemented URL validation and toolgroup mapping

The function correctly validates URLs and maps toolgroup names to their corresponding URLs from the configuration. The logic handles both URL and toolgroup name keys appropriately.

tests/e2e/features/steps/common_http.py (3)

36-52: Good addition of input validation and documentation.

The assertion ensures that the JSON payload is present before attempting to parse it, preventing unclear JSON parsing errors. The enhanced docstring clearly describes the function's requirements and behavior.


61-84: Proper input validation for table parameters.

The assertion ensures that request parameters are provided through context.table, preventing AttributeError when iterating over None. The improved documentation clearly explains the expected input format.


249-266: Good addition of payload validation.

The assertion ensures that the JSON payload is present before attempting to parse it, providing clearer error messages. The enhanced documentation clearly describes the function's behavior.

tests/unit/auth/test_k8s.py (6)

18-62: Well-structured mock classes for Kubernetes API responses.

The mock classes appropriately simulate the Kubernetes API response structure, with proper handling of authenticated vs non-authenticated states.


64-69: Good singleton pattern test.

Properly verifies that multiple instantiations return the same object instance.


71-101: Comprehensive test for valid token authentication.

The test properly mocks both TokenReview and SubjectAccessReview, and verifies that correct user information is returned for valid tokens.


103-133: Good test coverage for invalid token scenario.

Properly verifies that invalid tokens result in HTTP 403 Forbidden responses.


135-172: Good test for kube:admin special handling.

Properly verifies that the cluster ID is used as the user_uid when the user is kube:admin, which is an important security consideration.


188-225: Excellent error handling coverage.

The test thoroughly covers various failure scenarios including KeyError, TypeError, ApiException, and generic exceptions, ensuring robust error handling in cluster ID retrieval.

tests/unit/test_configuration.py (4)

8-41: Good test coverage for new configuration properties.

The additions properly verify that accessing authentication_configuration and customization properties before loading configuration raises the expected exceptions.


71-71: Consistent test data with updated configuration schema.

Adding "customization": None to config dictionaries ensures tests remain valid with the updated configuration schema.

Also applies to: 133-133, 241-241, 277-277


150-150: Good practice: explicit UTF-8 encoding.

Explicitly specifying UTF-8 encoding when writing files prevents potential encoding issues across different platforms.

Also applies to: 182-182


335-437: Comprehensive test coverage for new customization features.

The new tests properly cover:

  1. Exception handling for unloaded configuration properties
  2. Loading system prompts from file paths
  3. Loading inline system prompts

All test cases are well-structured and provide good coverage.

docs/openapi.json (4)

10-10: Appropriate version bump.

The version bump from 0.0.1 to 0.1.0 correctly reflects the addition of new features (authentication and customization).


438-479: Well-defined authentication configuration schema.

The schema properly defines authentication settings with appropriate validation constraints for URIs and file paths.


538-567: Flexible customization schema design.

The schema properly supports both file-based and inline system prompts with appropriate validation.


503-526: Proper integration of new configuration properties.

The authentication and customization properties are correctly added as optional fields with sensible defaults (noop authentication module).

src/auth/k8s.py (2)

168-195: Well-implemented token validation function.

The function properly handles token validation with appropriate error handling for both API exceptions and unexpected errors.


213-271: Robust Kubernetes authentication implementation.

The dependency properly implements the full authentication and authorization flow:

  • Token extraction and validation
  • Special handling for kube:admin user
  • SubjectAccessReview for RBAC authorization
  • Appropriate error handling with 403 responses
tests/unit/utils/test_endpoints.py (1)

1-143: Well-structured test coverage for the get_system_prompt utility!

The test file provides comprehensive coverage for all scenarios of the get_system_prompt function, including default prompts, customized prompts, query-specific prompts, and the disabled query prompt scenario. The fixtures are well-organized and the test cases properly validate both successful paths and error conditions.

src/app/endpoints/query.py (2)

115-155: Excellent error handling and client management improvements!

The addition of APIConnectionError handling provides better user experience when Llama Stack is unavailable. The use of LlamaStackClientHolder singleton and proper auth tuple unpacking are good architectural improvements.


202-263: Well-implemented MCP headers integration!

The function properly handles MCP headers from multiple sources, maintains backward compatibility, and correctly integrates MCP servers into the toolgroups. The serialization of headers into the agent's extra_headers is clean and follows the expected format.

src/app/endpoints/streaming_query.py (3)

41-42: Cache configuration matches the non-streaming endpoint.

Good consistency between the streaming and non-streaming endpoints with identical cache configuration.


45-70: Correct async implementation of agent caching!

The async version properly mirrors the synchronous implementation with appropriate use of async/await patterns and AsyncAgent.


186-258: Well-implemented streaming with metadata collection!

The endpoint properly handles connection errors, generates SSE-formatted events, and collects metadata during streaming. The inclusion of referenced documents in the end event is a nice enhancement for the client.

tests/unit/app/endpoints/test_query.py (3)

55-61: Good test hygiene with cache cleanup!

The fixture properly cleans up the agent cache after each test, preventing test pollution.


867-1136: Excellent test coverage for agent caching functionality!

The tests comprehensively cover all aspects of the get_agent function including cache hits/misses, various shield configurations, multiple MCP servers, and session persistence. The mocking is properly set up and assertions are thorough.


551-756: Thorough testing of MCP headers integration!

The tests properly verify MCP header handling including authorization token formatting, header merging, and filtering of unknown servers. The assertions correctly check the serialized header format in agent.extra_headers.

tests/unit/app/endpoints/test_streaming_query.py (2)

24-47: Excellent test coverage for metadata extraction!

The test data includes both valid and invalid metadata formats, properly testing the robustness of the metadata extraction logic. The test correctly verifies that only documents with properly formatted metadata (containing both "docs_url" and "title") are included in the referenced documents.

Also applies to: 229-235


822-1115: Proper async implementation of agent caching tests!

The tests correctly use async/await patterns and AsyncMock for testing the async agent caching functionality. All scenarios from the synchronous tests are properly covered with appropriate async adaptations.

Comment on lines +30 to +33
run: |
echo "DEV_TAG=dev-$(date +%Y%m%d)-$(git rev-parse --short HEAD)" >> $GITHUB_ENV
echo "dev image tag: ${{ env.DEV_TAG }}"
- name: Build image with Buildah

⚠️ Potential issue

Bug: ${{ env.DEV_TAG }} isn’t available inside the same step

The variable is written to $GITHUB_ENV during this step and becomes available only after the step finishes, so the echo prints an empty string. Use the shell variable instead:

-echo "dev image tag: ${{ env.DEV_TAG }}"
+echo "dev image tag: $DEV_TAG"

Committable suggestion skipped: line range outside the PR's diff.

🤖 Prompt for AI Agents
In .github/workflows/build_and_push_dev.yaml around lines 30 to 33, the
environment variable DEV_TAG is written to $GITHUB_ENV but referenced in the
same step using ${{ env.DEV_TAG }}, which is not yet available. To fix this,
replace ${{ env.DEV_TAG }} with the shell variable $DEV_TAG within the same run
step to correctly print the tag value.

Comment on lines +29 to +35
app.add_middleware(
CORSMiddleware,
allow_origins=["*"],
allow_credentials=True,
allow_methods=["*"],
allow_headers=["*"],
)

🛠️ Refactor suggestion

Reconsider overly permissive CORS settings for security.

The current CORS configuration allows all origins, methods, and headers, which poses security risks in production environments. Consider restricting these settings based on your actual requirements.

 app.add_middleware(
     CORSMiddleware,
-    allow_origins=["*"],
+    allow_origins=["http://localhost:3000", "https://yourdomain.com"],  # Specify actual origins
     allow_credentials=True,
-    allow_methods=["*"],
+    allow_methods=["GET", "POST", "PUT", "DELETE"],  # Specify required methods
-    allow_headers=["*"],
+    allow_headers=["Authorization", "Content-Type"],  # Specify required headers
 )

Alternatively, consider making CORS settings configurable through your configuration system.

🤖 Prompt for AI Agents
In src/app/main.py around lines 29 to 35, the CORS middleware is configured with
overly permissive settings allowing all origins, methods, and headers. To fix
this, restrict allow_origins, allow_methods, and allow_headers to only those
necessary for your application based on actual requirements. Alternatively,
modify the code to load these CORS settings from a configuration file or
environment variables to make them configurable and more secure in production.

def authorized_endpoint_handler(_request: Request) -> AuthorizedResponse:
"""Handle request to the /authorized endpoint."""
# Ignore the user token, we should not return it in the response
user_id, user_name, _ = asyncio.run(auth_dependency(_request))

⚠️ Potential issue

Potential issue with asyncio.run() usage in web server context.

Using asyncio.run() in a FastAPI endpoint handler can be problematic because it creates a new event loop, but FastAPI already runs in an existing event loop. This could lead to runtime errors.

Consider making the handler async or ensuring the auth dependency is called properly:

-@router.post("/authorized", responses=authorized_responses)
-def authorized_endpoint_handler(_request: Request) -> AuthorizedResponse:
-    """Handle request to the /authorized endpoint."""
-    # Ignore the user token, we should not return it in the response
-    user_id, user_name, _ = asyncio.run(auth_dependency(_request))
-    return AuthorizedResponse(user_id=user_id, username=user_name)
+@router.post("/authorized", responses=authorized_responses)
+async def authorized_endpoint_handler(_request: Request) -> AuthorizedResponse:
+    """Handle request to the /authorized endpoint."""
+    # Ignore the user token, we should not return it in the response
+    user_id, user_name, _ = await auth_dependency(_request)
+    return AuthorizedResponse(user_id=user_id, username=user_name)
🤖 Prompt for AI Agents
In src/app/endpoints/authorized.py at line 37, replace the use of asyncio.run()
to call the async auth_dependency function because FastAPI already runs inside
an event loop and using asyncio.run() will cause runtime errors. Instead, make
the endpoint handler async and await the auth_dependency call directly to
properly integrate with FastAPI's async event loop.

Comment on lines +9 to +13
def test_get_auth_dependency_noop():
"""Test getting Noop authentication dependency."""
configuration.authentication_configuration.module = AUTH_MOD_NOOP
auth_dependency = get_auth_dependency()
assert isinstance(auth_dependency, noop.NoopAuthDependency)

🛠️ Refactor suggestion

Consider test isolation and missing error case coverage.

The test directly modifies global configuration state, which could lead to test pollution between tests. Also, there's no test coverage for the error case when an unsupported module is configured.

Consider adding proper test isolation and error case coverage:

+import pytest
+
+def test_get_auth_dependency_unsupported_module():
+    """Test getting authentication dependency for unsupported module."""
+    configuration.authentication_configuration.module = "unsupported"
+    with pytest.raises(ValueError, match="Unsupported authentication module"):
+        get_auth_dependency()
+
+@pytest.fixture(autouse=True)
+def reset_configuration():
+    """Reset configuration after each test."""
+    yield
+    # Reset to a known state after each test
+    configuration.authentication_configuration.module = AUTH_MOD_NOOP
🤖 Prompt for AI Agents
In tests/unit/auth/test_auth.py around lines 9 to 13, the test modifies global
configuration directly, risking test pollution, and lacks coverage for
unsupported module errors. Refactor the test to isolate configuration changes,
such as using setup and teardown methods or context managers to restore original
state after the test. Additionally, add a new test case that sets an unsupported
authentication module and asserts that the appropriate error is raised to cover
the error scenario.

virtual_path: str = constants.DEFAULT_VIRTUAL_PATH,
) -> AuthInterface:
"""Select the configured authentication dependency interface."""
module = configuration.authentication_configuration.module # pyright: ignore

⚠️ Potential issue

Address the type safety issue with authentication_configuration.

The pyright: ignore comment suggests there's a type safety issue. The code assumes configuration.authentication_configuration is not None, but based on the configuration property type hint Optional[AuthenticationConfiguration], it could be None.

Add proper null checking:

-module = configuration.authentication_configuration.module  # pyright: ignore
+auth_config = configuration.authentication_configuration
+if auth_config is None:
+    err_msg = "Authentication configuration is not loaded"
+    logger.error(err_msg)
+    raise ValueError(err_msg)
+
+module = auth_config.module
🤖 Prompt for AI Agents
In src/auth/__init__.py at line 18, the code accesses
configuration.authentication_configuration.module without checking if
authentication_configuration is None, causing a type safety issue. Fix this by
adding a null check to ensure authentication_configuration is not None before
accessing its module attribute, and handle the None case appropriately to avoid
runtime errors.

Comment on lines +42 to +45
AsyncLlamaStackAsLibraryClient, AsyncLlamaStackClientHolder().get_client()
)
await client.initialize()
await _register_mcp_toolgroups_async(client, configuration.mcp_servers, logger)

⚠️ Potential issue

Potential issue with double initialization of async library client

The async library client is being initialized twice - once in AsyncLlamaStackClientHolder.load() and again here with await client.initialize(). This could lead to unexpected behavior or resource leaks.

-        client = cast(
-            AsyncLlamaStackAsLibraryClient, AsyncLlamaStackClientHolder().get_client()
-        )
-        await client.initialize()
+        # Client is already initialized in AsyncLlamaStackClientHolder.load()
+        client = cast(
+            AsyncLlamaStackAsLibraryClient, AsyncLlamaStackClientHolder().get_client()
+        )
🤖 Prompt for AI Agents
In src/utils/common.py around lines 42 to 45, the async library client is being
initialized twice: once in AsyncLlamaStackClientHolder.load() and again with
await client.initialize(). To fix this, remove the redundant await
client.initialize() call here to avoid double initialization and potential
resource leaks.

Comment on lines +78 to +79
async def test_noop_with_token_auth_dependency_no_bearer():
"""Test the NoopWithTokenAuthDependency class with no token."""

⚠️ Potential issue

Fix docstring to match test behavior

The docstring incorrectly states "with no token" but the test is actually checking for a non-Bearer authorization header.

-async def test_noop_with_token_auth_dependency_no_bearer():
-    """Test the NoopWithTokenAuthDependency class with no token."""
+async def test_noop_with_token_auth_dependency_no_bearer():
+    """Test the NoopWithTokenAuthDependency class with non-Bearer authorization header."""
🤖 Prompt for AI Agents
In tests/unit/auth/test_noop_with_token.py around lines 78 to 79, update the
docstring of the test_noop_with_token_auth_dependency_no_bearer function to
accurately describe that the test checks for a non-Bearer authorization header
instead of stating "with no token". Modify the docstring to reflect this
behavior clearly.

Comment on lines +38 to +53
mcp_headers_string = request.headers.get("MCP-HEADERS", "")
mcp_headers = {}
if mcp_headers_string:
try:
mcp_headers = json.loads(mcp_headers_string)
except json.decoder.JSONDecodeError as e:
logger.error("MCP headers decode error: %s", e)

if not isinstance(mcp_headers, dict):
logger.error(
"MCP headers wrong type supplied (mcp headers must be a dictionary), "
"but type %s was supplied",
type(mcp_headers),
)
mcp_headers = {}
return mcp_headers

🛠️ Refactor suggestion

Consider size validation for MCP headers

The function parses JSON from user-provided headers without validating the size. Large JSON payloads could potentially cause memory issues or DoS attacks.

 def extract_mcp_headers(request: Request) -> dict[str, dict[str, str]]:
     """Extract mcp headers from MCP-HEADERS header.
 
     Args:
         request: The FastAPI request object
 
     Returns:
         The mcp headers dictionary, or empty dictionary if not found or on json decoding error
     """
     mcp_headers_string = request.headers.get("MCP-HEADERS", "")
     mcp_headers = {}
+    
+    # Validate header size (e.g., 10KB limit)
+    MAX_HEADER_SIZE = 10 * 1024  # 10KB
+    if len(mcp_headers_string) > MAX_HEADER_SIZE:
+        logger.error("MCP headers exceed maximum allowed size of %d bytes", MAX_HEADER_SIZE)
+        return {}
+    
     if mcp_headers_string:
         try:
             mcp_headers = json.loads(mcp_headers_string)
🤖 Prompt for AI Agents
In src/utils/mcp_headers.py around lines 38 to 53, the code parses JSON from the
MCP-HEADERS request header without checking its size, which could lead to memory
issues or DoS attacks. Add a size validation step before parsing the JSON string
by checking the length of mcp_headers_string and rejecting or logging an error
if it exceeds a reasonable threshold. This prevents processing excessively large
payloads and improves security.

Comment on lines +281 to 285
"description": "Service Unavailable",
"detail": {
"response": "Configuration is no loaded"
}
}

⚠️ Potential issue

Fix typo in error message.

There's a typo in the 503 response description.

-                            "response": "Configuration is no loaded"
+                            "response": "Configuration is not loaded"
🤖 Prompt for AI Agents
In docs/openapi.json around lines 281 to 285, there is a typo in the 503 error
response message where "Configuration is no loaded" should be corrected to
"Configuration is not loaded". Update the "response" field in the "detail"
object to fix this typo.

Comment on lines +142 to +146
except TypeError as e:
logger.error(
"Failed to get cluster_id, version object is: %s", version_data
)
raise ClusterIDUnavailableError("Failed to get cluster ID") from e

⚠️ Potential issue

Fix undefined variable in error logging.

The variable version_data is not defined in this exception handler scope, which will cause a NameError if this exception path is triggered.

         except TypeError as e:
             logger.error(
-                "Failed to get cluster_id, version object is: %s", version_data
+                "Failed to get cluster_id from version object due to TypeError"
             )
             raise ClusterIDUnavailableError("Failed to get cluster ID") from e
🤖 Prompt for AI Agents
In src/auth/k8s.py around lines 142 to 146, the variable version_data used in
the logger.error call is not defined in the except block, causing a NameError.
To fix this, ensure that version_data is defined or accessible in this scope
before logging it, or replace it with a variable that is available and contains
relevant information about the version object.

- Bumped versions for `astroid` (3.3.10 to 3.3.11) and `certifi` (2025.7.9 to 2025.7.14) in the lock file, including updated source URLs and hashes.
- Removed outdated `typing-extensions` package entry from the lock file.
- Cleaned up import statements in `config.py` to resolve merge conflicts and improve readability.