UN-3314 [FIX] Monkey-patch litellm cohere embed timeout for Bedrock embeddings #1848

Merged
hari-kuriakose merged 33 commits into main from fix/litellm-cohere-embed-timeout
Mar 23, 2026

Conversation

@pk-zipstack
Contributor

@pk-zipstack pk-zipstack commented Mar 11, 2026

What

Monkey-patch litellm's cohere embed handler to correctly forward the timeout parameter to client.post() calls, fixing "Connection timed out after None seconds" errors when indexing large documents with AWS Bedrock embedding models.

Why

litellm (v1.82.3) has a bug in litellm/llms/cohere/embed/handler.py where both embedding() and async_embedding() receive a timeout parameter but never forward it to client.post(). This causes the timeout to default to None, which surfaces as:

Connection timed out after None seconds.

This affects all Bedrock Cohere embedding operations (e.g. cohere.embed-multilingual-v3) and is especially visible with large documents. The bug is present on litellm's latest main branch as well — no upstream fix exists.

How

  • Added a monkey-patch module (unstract/sdk1/patches/litellm_cohere_timeout.py) that replaces the affected functions with versions that correctly pass timeout=timeout to client.post()
  • Patches three targets: the cohere handler module, the async variant, and bedrock's direct import binding (cohere_embedding)
  • Includes a version guard that skips the patch entirely if litellm is upgraded past 1.82.3, with a DeprecationWarning to prompt verification
  • Each patched function has inline comments marking the single line changed (# ONLY CHANGE)
  • Patch is activated via side-effect import from unstract.sdk1.embedding
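
Why more than one target needs rebinding can be illustrated with stand-in modules. The sketch below is not the PR's code: `fake_handler`, `fake_bedrock`, both function bodies, and the `INSTALLED_VERSION` value are hypothetical, and a plain dict stands in for the HTTP call. It only shows that rebinding a function on its defining module leaves a name that another module imported directly still pointing at the original.

```python
import types

# Hypothetical stand-in for the module that defines the buggy function.
fake_handler = types.ModuleType("fake_handler")


def original_embedding(timeout=None):
    # Mimics the bug: timeout is accepted but never forwarded.
    return {"timeout_sent": None}


def patched_embedding(timeout=None):
    # Identical to the original apart from forwarding the timeout.
    return {"timeout_sent": timeout}


fake_handler.embedding = original_embedding

# Hypothetical stand-in for a module that ran
# `from fake_handler import embedding as cohere_embedding` at import time.
fake_bedrock = types.ModuleType("fake_bedrock")
fake_bedrock.cohere_embedding = fake_handler.embedding

INSTALLED_VERSION = "1.82.3"  # assumed value for this demo
PATCHED_VERSION = "1.82.3"

# Version guard: only rebind on the exact version the copy was taken from.
if INSTALLED_VERSION == PATCHED_VERSION:
    fake_handler.embedding = patched_embedding
    # The direct-import binding is a separate name and is still stale here.
    binding_still_stale = fake_bedrock.cohere_embedding is original_embedding
    fake_bedrock.cohere_embedding = patched_embedding

result = fake_bedrock.cohere_embedding(timeout=600)
```

In this sketch `binding_still_stale` is `True` after the first rebind, which is the situation the separate patch of bedrock's `cohere_embedding` binding appears to address.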

Can this PR break any existing features. If yes, please list possible items. If no, please explain why. (PS: Admins do not merge the PR without this section filled)

No. The patched functions are exact copies of litellm 1.82.3's originals with only timeout=timeout added to client.post() calls. litellm is pinned at 1.82.3 in sdk1, so the source won't change. If litellm is later upgraded, the version guard skips the patch entirely and emits a warning.

Database Migrations

  • None

Env Config

  • None

Relevant Docs

  • N/A

Related Issues or PRs

  • Fixes Bedrock embedding timeout issue reported on staging with large documents (e.g. Large_1040.pdf)

Dependencies Versions

  • litellm: 1.82.3 (pinned, bug present)
  • httpx: 0.28.1

Notes on Testing

  • 6 unit tests added covering:
    • Sync path: timeout value (600.0), None timeout, httpx.Timeout object all forwarded correctly
    • Async path: timeout value forwarded correctly
    • Monkey-patch wiring: cohere handler and bedrock handler both point to patched functions
  • All tests pass: uv run pytest tests/patches/test_litellm_cohere_timeout.py -v
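
The shape of the sync assertions can be sketched without litellm. Everything named here is hypothetical: `embed_with_timeout` stands in for the patched handler, the URL is a placeholder, and a bare `object()` stands in for an `httpx.Timeout` instance. The point is only the pattern of reading the exact `timeout` keyword that `client.post()` received.

```python
from unittest.mock import MagicMock


def embed_with_timeout(client, url, payload, timeout=None):
    """Hypothetical stand-in for a patched handler that forwards timeout."""
    return client.post(url, json=payload, timeout=timeout)


def timeout_seen_by_post(timeout):
    """Call the handler with a mock client and return what post() was given."""
    client = MagicMock()
    embed_with_timeout(
        client, "https://example.invalid/embed", {"texts": ["hi"]}, timeout=timeout
    )
    client.post.assert_called_once()
    return client.post.call_args.kwargs["timeout"]


timeout_object = object()  # stands in for an httpx.Timeout instance
results = [timeout_seen_by_post(t) for t in (600, None, timeout_object)]
```

Checking the received value by identity (`is`) rather than only checking that `post()` was called is what distinguishes a forwarded timeout from a dropped one.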

Screenshots

N/A

Checklist

I have read and understood the Contribution Guidelines.

🤖 Generated with Claude Code

pk-zipstack and others added 3 commits March 10, 2026 14:44
litellm's cohere embed handler (1.80.0) receives a timeout parameter
but doesn't pass it to client.post(), causing "Connection timed out
after None seconds" on large Bedrock embedding requests.

This adds a monkey-patch that replaces the affected functions with
versions that correctly forward timeout. Includes version guard,
source comments, and unit tests.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…ions

- Version guard now skips patch entirely when litellm > 1.80.0
  instead of just warning
- Test assertions now check exact timeout value received by
  client.post(), not just that it was called
- Inline comments at client.post() calls marked with ONLY CHANGE

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@coderabbitai
Contributor

coderabbitai Bot commented Mar 11, 2026

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.


Walkthrough

Adds a new patches package and a litellm Cohere timeout monkey-patch (sync + async) that forwards timeout arguments to underlying HTTP client calls; the patch is applied by a side-effect import in the SDK embedding module and unit tests verify timeout propagation and wiring.

Changes

| Cohort | File(s) | Summary |
| --- | --- | --- |
| Patch package initializer | `unstract/sdk1/src/unstract/sdk1/patches/__init__.py` | New package initializer documenting that patches are applied via side-effect imports and noting activation from the SDK embedding module. |
| Cohere timeout monkey-patch | `unstract/sdk1/src/unstract/sdk1/patches/litellm_cohere_timeout.py` | New monkey-patch adding `_patched_async_embedding` and `_patched_embedding` that forward timeout to underlying HTTP client `post()` calls, install into litellm handlers under a version guard, and emit a RuntimeWarning when not applied. |
| Activation via embedding module | `unstract/sdk1/src/unstract/sdk1/embedding.py` | Single-line side-effect import added to load and apply the patch when the embedding module is imported. |
| Tests for the patch | `unstract/sdk1/tests/patches/test_litellm_cohere_timeout.py` | New tests covering sync and async embedding paths, asserting timeout propagation (numeric, None, httpx.Timeout) to HTTP client `post()` and verifying litellm handler replacement. |
| Project metadata | `requirements.txt`, `pyproject.toml` | Manifest files present in the diff set (no functional edits reported). |

Sequence Diagram(s)

sequenceDiagram
    participant User
    participant SDK as "SDK embedding\n(unstract.sdk1.embedding)"
    participant Patch as "Patch module\n(litellm_cohere_timeout)"
    participant Litellm as "litellm embed\nhandler (wrapped)"
    participant HTTP as "HTTP client\n(e.g., httpx)"

    User->>SDK: call embed(inputs, timeout=...)
    SDK->>Patch: import (side-effect) / use patched handler
    Patch->>Litellm: invoke wrapped handler
    Litellm->>HTTP: client.post(..., timeout=passed_timeout)
    HTTP-->>Litellm: response
    Litellm-->>Patch: embedding result
    Patch-->>SDK: return embeddings
    SDK-->>User: return embeddings

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (2 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Description check | ✅ Passed | The pull request description is comprehensive and follows the required template with all major sections filled out including What, Why, How, risk assessment, testing notes, and dependencies. |
| Title check | ✅ Passed | The title directly describes the main change: adding a monkey-patch for litellm cohere embed timeout to fix Bedrock embeddings issues, which is the primary focus of the changeset. |

✏️ Tip: You can configure your own custom pre-merge checks in the settings.


pre-commit-ci Bot and others added 3 commits March 11, 2026 10:45
- Remove useless self-assignment `model = model`
- Use int timeout values in tests to avoid float equality checks
- Use `is` identity checks instead of `==` for timeout assertions
- Replace async mock_post with AsyncMock to properly use async features

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Contributor

@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 1

🧹 Nitpick comments (1)
unstract/sdk1/tests/patches/test_litellm_cohere_timeout.py (1)

7-10: Exercise the production import hook in one test.

Because this file imports the patch module directly, the wiring checks only prove direct-import behavior. Add one integration test that imports unstract.sdk1.embedding and asserts the handler is patched, so Line 7 in unstract/sdk1/src/unstract/sdk1/embedding.py cannot regress silently.

Also applies to: 194-203

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@unstract/sdk1/tests/patches/test_litellm_cohere_timeout.py` around lines 7 -
10, Add an integration test that imports unstract.sdk1.embedding (instead of
importing the patch module directly) and asserts the production import hook
applied the patch: after importing unstract.sdk1.embedding, verify the module's
embedding handler uses the patched implementations by checking that its async
and sync embedding callables resolve to the symbols _patched_async_embedding and
_patched_embedding (or that their identities/reference equality match those
functions imported from unstract.sdk1.patches.litellm_cohere_timeout); update or
add this assertion into test_litellm_cohere_timeout.py alongside the existing
direct-import checks so the production import wiring (line referencing
unstract.sdk1.embedding) is exercised and cannot regress.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@unstract/sdk1/src/unstract/sdk1/patches/litellm_cohere_timeout.py`:
- Around line 26-43: The patch currently imports private litellm.llms.* modules
before checking _SKIP_PATCH and uses a loose version check, so move all
private/internal imports (e.g., validate_environment, CohereEmbeddingConfig,
AsyncHTTPHandler, HTTPHandler, get_async_httpx_client, CohereEmbeddingRequest,
EmbeddingResponse and any litellm.llms.* import lines) to after the guard that
computes _SKIP_PATCH, change the version gate from a ">" check to an exact
equality check against the known compatible LiteLLM version, and when skipping
emit a visible warning via warnings.warn (not just DeprecationWarning swallowed
by default) so callers know the patch was skipped; ensure functions/classes
referenced in the patch (validate_environment, CohereEmbeddingConfig,
AsyncHTTPHandler, HTTPHandler, get_async_httpx_client, CohereEmbeddingRequest,
EmbeddingResponse) are only imported after the guard.


ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: f70cb6f9-69dd-431a-97bc-906d92c597ff

📥 Commits

Reviewing files that changed from the base of the PR and between c41e05e and dcd0663.

📒 Files selected for processing (5)
  • unstract/sdk1/src/unstract/sdk1/embedding.py
  • unstract/sdk1/src/unstract/sdk1/patches/__init__.py
  • unstract/sdk1/src/unstract/sdk1/patches/litellm_cohere_timeout.py
  • unstract/sdk1/tests/patches/__init__.py
  • unstract/sdk1/tests/patches/test_litellm_cohere_timeout.py

Comment thread unstract/sdk1/src/unstract/sdk1/patches/litellm_cohere_timeout.py Outdated
Contributor

@coderabbitai coderabbitai Bot left a comment

♻️ Duplicate comments (1)
unstract/sdk1/src/unstract/sdk1/patches/litellm_cohere_timeout.py (1)

26-64: ⚠️ Potential issue | 🟠 Major

Move the private LiteLLM imports behind the guard and gate on the exact tested version.

Lines 26-43 still import private litellm.llms.* modules before _SKIP_PATCH is evaluated, so a future LiteLLM reorg can fail at import time instead of cleanly skipping. Also, Line 56 only skips > 1.80.0; for a copied private implementation, != is safer. The skip warning on Lines 58-64 uses DeprecationWarning, which is usually hidden for imported library code, so operators may never notice that the patch was not applied. With the provided dependency context showing LiteLLM 1.81.7, this path would currently skip.

🔧 Suggested fix
 import importlib.metadata
 import json
 import logging
 import warnings
 from collections.abc import Callable

 import httpx
 import litellm
-import litellm.llms.bedrock.embed.embedding as _bedrock_embed
-import litellm.llms.cohere.embed.handler as _cohere_handler
 from litellm.litellm_core_utils.litellm_logging import (
     Logging as LiteLLMLoggingObj,
 )
-from litellm.llms.cohere.embed.handler import (
-    validate_environment,
-)
-from litellm.llms.cohere.embed.v1_transformation import (
-    CohereEmbeddingConfig,
-)
-from litellm.llms.custom_httpx.http_handler import (
-    AsyncHTTPHandler,
-    HTTPHandler,
-    get_async_httpx_client,
-)
-from litellm.types.llms.bedrock import CohereEmbeddingRequest
-from litellm.types.utils import EmbeddingResponse
 from packaging.version import Version

 logger = logging.getLogger(__name__)

 _DEFAULT_TIMEOUT = httpx.Timeout(None)

 _PATCHED_LITELLM_VERSION = "1.80.0"
 _litellm_version = importlib.metadata.version("litellm")
-_SKIP_PATCH = Version(_litellm_version) > Version(_PATCHED_LITELLM_VERSION)
+_SKIP_PATCH = Version(_litellm_version) != Version(_PATCHED_LITELLM_VERSION)
 if _SKIP_PATCH:
     warnings.warn(
         "litellm_cohere_timeout patch was SKIPPED — not applied. "
         f"Current litellm version: {_litellm_version}. "
         f"Patch was written for: {_PATCHED_LITELLM_VERSION}. "
         "Please verify the upstream fix and remove this module.",
-        DeprecationWarning,
+        RuntimeWarning,
         stacklevel=2,
     )
+else:
+    import litellm.llms.bedrock.embed.embedding as _bedrock_embed
+    import litellm.llms.cohere.embed.handler as _cohere_handler
+    from litellm.llms.cohere.embed.handler import validate_environment
+    from litellm.llms.cohere.embed.v1_transformation import CohereEmbeddingConfig
+    from litellm.llms.custom_httpx.http_handler import (
+        AsyncHTTPHandler,
+        HTTPHandler,
+        get_async_httpx_client,
+    )
+    from litellm.types.llms.bedrock import CohereEmbeddingRequest
+    from litellm.types.utils import EmbeddingResponse
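
The difference between the two version gates discussed in this comment can be sketched as follows. This is an illustration only: `parse_version` is a simplified, hypothetical helper for plain `X.Y.Z` strings (the patch itself uses `packaging.version.Version`), and the three version strings are chosen to cover older, exact, and newer installs.

```python
def parse_version(text):
    """Simplified parser for plain X.Y.Z strings (illustrative assumption)."""
    return tuple(int(part) for part in text.split("."))


PATCHED = parse_version("1.80.0")


def skip_with_greater_than(installed):
    # Original gate: skip only when the installed version is newer.
    return parse_version(installed) > PATCHED


def skip_with_not_equal(installed):
    # Suggested gate: skip on anything except the exact tested version.
    return parse_version(installed) != PATCHED


decisions = {
    installed: (skip_with_greater_than(installed), skip_with_not_equal(installed))
    for installed in ("1.79.0", "1.80.0", "1.81.7")
}
```

The two gates disagree only for versions older than the tested one: with `>` a copied implementation would still be installed on `1.79.0`, while `!=` skips it there as well.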
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@unstract/sdk1/src/unstract/sdk1/patches/litellm_cohere_timeout.py` around
lines 26 - 64, The current file imports private litellm modules before computing
_SKIP_PATCH, which can cause import-time failures; move all imports of
litellm.llms.* and litellm.litellm_core_utils.* (symbols: _bedrock_embed,
_cohere_handler, LiteLLMLoggingObj, validate_environment, CohereEmbeddingConfig,
AsyncHTTPHandler, HTTPHandler, get_async_httpx_client, CohereEmbeddingRequest,
EmbeddingResponse) so they occur only after computing _litellm_version and
_SKIP_PATCH; change the version check to use equality against
_PATCHED_LITELLM_VERSION (i.e. apply the patch only when Version(_litellm_version) ==
Version(_PATCHED_LITELLM_VERSION), computing _SKIP_PATCH with !=) and replace the
DeprecationWarning with a visible warning class such as UserWarning or
RuntimeWarning so operators see the warning when the patch is skipped.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 651cf418-be2b-4238-bfe1-13ad72694bba

📥 Commits

Reviewing files that changed from the base of the PR and between dcd0663 and b93ac45.

📒 Files selected for processing (2)
  • unstract/sdk1/src/unstract/sdk1/patches/litellm_cohere_timeout.py
  • unstract/sdk1/tests/patches/test_litellm_cohere_timeout.py
🚧 Files skipped from review as they are similar to previous changes (1)
  • unstract/sdk1/tests/patches/test_litellm_cohere_timeout.py

pk-zipstack and others added 2 commits March 12, 2026 09:29
- Move all litellm.llms.* imports inside the `if not _SKIP_PATCH`
  block so they're only loaded when the patch will be applied
- Change version gate from `>` to exact equality (`!=`) — patch
  only applies on the exact known-affected version (1.80.0)
- Use RuntimeWarning instead of DeprecationWarning for skip message
  so it's visible by default (DeprecationWarning is filtered)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
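
The visibility difference behind the last bullet of this commit can be reproduced in isolation. The sketch below is an approximation, not the patch's code: it clears the active filters and installs a single `ignore` rule for `DeprecationWarning`, which mimics how Python's default filter list treats that category for code outside `__main__`. The warning text is a placeholder.

```python
import warnings


def visible_categories():
    """Emit both warning types and return the categories that get through."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.resetwarnings()
        # Approximates the default rule that hides DeprecationWarning
        # when it is raised from imported library code.
        warnings.filterwarnings("ignore", category=DeprecationWarning)
        warnings.warn("patch skipped", DeprecationWarning)
        warnings.warn("patch skipped", RuntimeWarning)
    return [w.category for w in caught]


shown = visible_categories()
```

Only the `RuntimeWarning` is recorded here, which matches the reasoning given for switching the skip message away from `DeprecationWarning`.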
Contributor

@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@unstract/sdk1/src/unstract/sdk1/patches/litellm_cohere_timeout.py`:
- Around line 30-41: The version guard (_PATCHED_LITELLM_VERSION,
_litellm_version, _SKIP_PATCH) causes the patch to be skipped because sdk1 uses
LiteLLM 1.81.7; update the constant _PATCHED_LITELLM_VERSION to "1.81.7" (and
run tests/verify that the Cohere timeout fix is effective in that release) or,
alternatively, revert sdk1's litellm dependency to "1.80.0" so the existing
patch applies—ensure the chosen approach makes _SKIP_PATCH False so the patch
code executes.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 0422bc8a-b039-47cf-aeeb-e54e2c5640e2

📥 Commits

Reviewing files that changed from the base of the PR and between b93ac45 and f6181a6.

📒 Files selected for processing (1)
  • unstract/sdk1/src/unstract/sdk1/patches/litellm_cohere_timeout.py

Contributor

@hari-kuriakose hari-kuriakose left a comment

@pk-zipstack LGTM overall.

However, I'm requesting a few critical changes.

pk-zipstack and others added 2 commits March 12, 2026 21:31
Co-authored-by: Hari John Kuriakose <hari@zipstack.com>
Signed-off-by: Praveen Kumar <praveen@zipstack.com>
@greptile-apps
Contributor

greptile-apps Bot commented Mar 12, 2026

Greptile Summary

This PR introduces a monkey-patch (unstract/sdk1/patches/litellm_cohere_timeout.py) to fix a known bug in litellm 1.82.3 where timeout is received by the cohere embed handler but never forwarded to client.post(), causing "Connection timed out after None seconds" errors on AWS Bedrock Cohere embedding calls with large documents. The patch is activated via a side-effect import in unstract.sdk1.embedding and is guarded by a strict version equality check so it is automatically skipped (with a log warning) if litellm is upgraded.

  • The core patch logic is correct: both _patched_embedding (sync) and _patched_async_embedding (async) are faithful copies of the litellm 1.82.3 originals with only timeout=timeout added to client.post(), and both module-level and direct-import bindings (_bedrock_embed.cohere_embedding) are updated.
  • A duplicate "pytest-asyncio>=0.23.0" entry is left in pyproject.toml alongside the newly added "pytest-asyncio>=0.24.0" — the old entry should be removed.
  • The PR description claims an async timeout-forwarding test exists ("Async path: timeout value forwarded correctly"), but the test file only verifies patch wiring by identity (handler.async_embedding is _patched_async_embedding); there is no test that actually exercises _patched_async_embedding and asserts the timeout reaches client.post().
  • The uv.lock files for backend, platform-service, prompt-service, filesystem, and workers are all updated to reflect the new dependency, which is expected for lock-file synchronisation.

Confidence Score: 3/5

  • The patch itself is safe, but the test suite has a gap (no async path coverage) and a stale duplicate dependency entry should be cleaned up before merging.
  • The core fix is straightforward and low-risk — it is a minimal, well-commented change to an exact copy of the upstream function with a version guard preventing accidental application to a different litellm release. However, confidence is reduced because: (1) the async timeout-forwarding code path has no direct test coverage despite the PR description claiming otherwise, and (2) a duplicate pytest-asyncio specifier was introduced in pyproject.toml.
  • unstract/sdk1/tests/patches/test_litellm_cohere_timeout.py — async timeout forwarding is untested. unstract/sdk1/pyproject.toml — duplicate pytest-asyncio dependency entry.

Important Files Changed

| Filename | Overview |
| --- | --- |
| `unstract/sdk1/src/unstract/sdk1/patches/litellm_cohere_timeout.py` | New monkey-patch module for litellm 1.82.3 cohere embed timeout bug; correctly imports module references and wires patches, with a version guard that skips and logs a warning for any other litellm version. |
| `unstract/sdk1/tests/patches/test_litellm_cohere_timeout.py` | 6 unit tests covering sync timeout forwarding (3 cases) and patch wiring (3 cases); async timeout forwarding is NOT tested despite the PR description claiming it is, and the top-level import of patched names will fail with ImportError when litellm != 1.82.3. |
| `unstract/sdk1/src/unstract/sdk1/embedding.py` | Adds a side-effect import of the patch module; no logic changes to the Embedding class itself. |
| `unstract/sdk1/pyproject.toml` | Adds pytest-asyncio>=0.24.0 as a test dependency but leaves the old >=0.23.0 entry in place, resulting in a duplicate specifier for the same package. |
| `unstract/sdk1/src/unstract/sdk1/patches/__init__.py` | New package init with a docstring noting that patches are only activated via unstract.sdk1.embedding; no code changes. |

Sequence Diagram

sequenceDiagram
    participant App
    participant embedding.py
    participant litellm_cohere_timeout.py
    participant litellm cohere handler
    participant litellm bedrock handler

    App->>embedding.py: import unstract.sdk1.embedding
    embedding.py->>litellm_cohere_timeout.py: side-effect import (noqa: F401)
    litellm_cohere_timeout.py->>litellm_cohere_timeout.py: check litellm version == 1.82.3
    alt version matches
        litellm_cohere_timeout.py->>litellm cohere handler: handler.embedding = _patched_embedding
        litellm_cohere_timeout.py->>litellm cohere handler: handler.async_embedding = _patched_async_embedding
        litellm_cohere_timeout.py->>litellm bedrock handler: cohere_embedding = _patched_embedding
    else version mismatch
        litellm_cohere_timeout.py->>litellm_cohere_timeout.py: logger.warning("patch SKIPPED")
    end
    App->>embedding.py: Embedding.get_embedding(text)
    embedding.py->>litellm: litellm.embedding(model, input, timeout=...)
    litellm->>litellm cohere handler: embedding(..., timeout=600)
    litellm cohere handler->>litellm cohere handler: client.post(..., timeout=600)  ← patched

Comments Outside Diff (1)

  1. unstract/sdk1/tests/patches/test_litellm_cohere_timeout.py, line 1-10 (link)

    No async timeout-forwarding test despite PR description claiming one exists

    The PR description's "Notes on Testing" states: "Async path: timeout value forwarded correctly". However, examining the test file, there is no test that actually invokes _patched_async_embedding and verifies that timeout is forwarded to client.post(). The TestMonkeyPatchApplied.test_cohere_handler_patched test only verifies identity (handler.async_embedding is _patched_async_embedding), not that the function body actually forwards the timeout.

    The async code path has a real bug risk — _patched_async_embedding was written to be identical to the sync fix, and the absence of a coverage test for it means any accidental regression would go undetected. A test analogous to test_timeout_passed_to_client_post (using AsyncMock and pytest.mark.asyncio) should cover this path.
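
    A test of the kind described above might look roughly like the sketch below. It does not import the real patch: `async_embed_with_timeout` is a hypothetical stand-in for `_patched_async_embedding`, the URL is a placeholder, and `asyncio.run` is used in place of `pytest.mark.asyncio` so the snippet runs on its own.

```python
import asyncio
from unittest.mock import AsyncMock, MagicMock


async def async_embed_with_timeout(client, url, payload, timeout=None):
    """Hypothetical stand-in for the patched async handler."""
    return await client.post(url, json=payload, timeout=timeout)


async def check_async_timeout_forwarded(timeout):
    """Await the handler with a mock client and return post()'s timeout."""
    client = MagicMock()
    client.post = AsyncMock(return_value={"embeddings": []})
    await async_embed_with_timeout(
        client, "https://example.invalid/embed", {"texts": ["hi"]}, timeout=timeout
    )
    client.post.assert_awaited_once()
    return client.post.await_args.kwargs["timeout"]


forwarded = asyncio.run(check_async_timeout_forwarded(600))
```

    In the real test suite the stand-in would be replaced by the patched function itself, with the mock client passed through whatever parameter the handler accepts for it.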

Prompt To Fix All With AI
This is a comment left during a code review.
Path: unstract/sdk1/pyproject.toml
Line: 74-76

Comment:
**Duplicate `pytest-asyncio` dependency entries**

Both `"pytest-asyncio>=0.23.0"` (line 74) and `"pytest-asyncio>=0.24.0"` (line 76) are present in the `test` extras. The intent appears to be to raise the minimum version to `0.24.0`, but the old entry was not removed. While tools like `uv`/`pip` will satisfy both constraints by picking `>=0.24.0`, having two entries for the same package is misleading and may confuse future maintainers. The `>=0.23.0` entry should be removed.

```suggestion
    "pytest-asyncio>=0.24.0",
```

How can I resolve this? If you propose a fix, please make it concise.


Reviews (17): Last reviewed commit: "Merge branch 'fix/litellm-cohere-embed-t..." | Re-trigger Greptile

Comment thread unstract/sdk1/tests/patches/test_litellm_cohere_timeout.py
Contributor

@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 2

🧹 Nitpick comments (1)
unstract/sdk1/src/unstract/sdk1/patches/litellm_cohere_timeout.py (1)

15-15: Consider adding a TODO marker for discoverability.

Adding TODO: makes this line searchable via grep/IDE tooling, helping ensure timely removal once upstream fixes the bug.

-Remove this patch when litellm ships a fix upstream.
+TODO: Remove this patch when litellm ships a fix upstream.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@unstract/sdk1/src/unstract/sdk1/patches/litellm_cohere_timeout.py` at line
15, Update the top-of-file note in litellm_cohere_timeout.py so it includes a
searchable TODO marker; specifically replace or edit the comment "Remove this
patch when litellm ships a fix upstream." to start with "TODO:" (e.g., "TODO:
Remove this patch when litellm ships a fix upstream.") so it is discoverable via
grep/IDE and clearly signals removal once the upstream bug in litellm is fixed.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@unstract/sdk1/src/unstract/sdk1/patches/litellm_cohere_timeout.py`:
- Around line 42-65: The code references module aliases `_cohere_handler` and
`_bedrock_embed` later but only imports individual symbols (e.g.,
validate_environment, CohereEmbeddingConfig, AsyncHTTPHandler,
get_async_httpx_client, LiteLLMLoggingObj, CohereEmbeddingRequest,
EmbeddingResponse); add explicit module imports for the cohere and bedrock embed
modules (import the modules as `_cohere_handler` and `_bedrock_embed`) inside
the same else block so the later references to `_cohere_handler` and
`_bedrock_embed` resolve at runtime.
- Around line 30-32: The version guard currently pins the patch to
_PATCHED_LITELLM_VERSION = "1.80.0" causing _SKIP_PATCH to be true at runtime
for the dependency _litellm_version (which is 1.81.7), so the timeout patch is
skipped; either update _PATCHED_LITELLM_VERSION to "1.81.7" if the timeout bug
still exists in that release, or remove this patch block entirely if the
upstream bug is fixed—modify the constant _PATCHED_LITELLM_VERSION accordingly
and verify the behavior of _SKIP_PATCH (which compares Version(_litellm_version)
!= Version(_PATCHED_LITELLM_VERSION)) so the patch will run only when intended.


ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 42ebcad6-b1fa-4a6e-91c5-b75e12a511b5

📥 Commits

Reviewing files that changed from the base of the PR and between f6181a6 and 644da84.

📒 Files selected for processing (1)
  • unstract/sdk1/src/unstract/sdk1/patches/litellm_cohere_timeout.py

Comment thread unstract/sdk1/src/unstract/sdk1/patches/litellm_cohere_timeout.py Outdated
Comment thread unstract/sdk1/src/unstract/sdk1/patches/litellm_cohere_timeout.py
Comment thread unstract/sdk1/src/unstract/sdk1/patches/litellm_cohere_timeout.py
Comment thread unstract/sdk1/src/unstract/sdk1/patches/litellm_cohere_timeout.py Outdated
Contributor

@coderabbitai coderabbitai Bot left a comment


♻️ Duplicate comments (1)
unstract/sdk1/src/unstract/sdk1/patches/litellm_cohere_timeout.py (1)

48-65: ⚠️ Potential issue | 🔴 Critical

Import the modules you rebind later.

This block imports symbols from the Cohere and Bedrock modules, but lines 213-215 assign through _cohere_handler and _bedrock_embed, which are never defined. On LiteLLM 1.81.7, importing this patch will raise NameError before the monkey-patch is applied, breaking the side-effect activation path from unstract.sdk1.embedding.

🐛 Proposed fix
     import httpx
     import litellm
+    import litellm.llms.bedrock.embed.embedding as _bedrock_embed
+    import litellm.llms.cohere.embed.handler as _cohere_handler
     from litellm.litellm_core_utils.litellm_logging import (
         Logging as LiteLLMLoggingObj,
     )
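
To illustrate why the patch has to rebind through module objects, and why the PR description lists bedrock's direct import binding (`cohere_embedding`) as a separate target, here is a small sketch using stand-in modules. The `handler` and `consumer` modules and both functions are hypothetical; no litellm code is involved.

```python
import types

# Hypothetical stand-ins for the handler module and a module that imported from it.
handler = types.ModuleType("handler")
consumer = types.ModuleType("consumer")


def original_embedding() -> str:
    return "original"


def patched_embedding() -> str:
    return "patched"


handler.embedding = original_embedding
# Simulates `from handler import embedding as cohere_embedding` in the consumer.
consumer.cohere_embedding = handler.embedding

# Rebinding on the handler module alone does not touch the consumer's own name.
handler.embedding = patched_embedding
stale = consumer.cohere_embedding()

# The consumer's binding has to be rebound as well.
consumer.cohere_embedding = patched_embedding
fresh = consumer.cohere_embedding()
```

After the first rebind `stale` is still the original function's result, which is why a patch that only touches the handler module would leave the Bedrock path unpatched.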
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@unstract/sdk1/src/unstract/sdk1/patches/litellm_cohere_timeout.py` around
lines 48 - 65, The patch imports Cohere/Bedrock symbols but later rebinds
globals like _cohere_handler and _bedrock_embed without first defining or
importing them, causing a NameError on import; to fix, import or define the
original symbols before rebinding (e.g., import the existing cohere handler and
bedrock embed functions used in lines that assign to _cohere_handler and
_bedrock_embed), then perform the monkey-patch assignments; locate the rebinding
logic targeting _cohere_handler and _bedrock_embed and ensure the original
symbols are referenced (via imports or safe getattr fallbacks) so the module can
be imported without raising NameError.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: e8692340-9a40-4b25-ac09-56dbde0e3fb8

📥 Commits

Reviewing files that changed from the base of the PR and between 644da84 and 419d5e9.

📒 Files selected for processing (1)
  • unstract/sdk1/src/unstract/sdk1/patches/litellm_cohere_timeout.py

pk-zipstack and others added 2 commits March 12, 2026 23:12
- Restore `_bedrock_embed` and `_cohere_handler` imports that were
  silently removed by ruff auto-fix (marked with noqa: F811)
- Add test verifying patch activation through the production import
  path (unstract.sdk1.embedding) per reviewer feedback
- Tests are tox-compatible — already covered by [testenv:sdk1] in
  tox.ini

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Comment thread unstract/sdk1/src/unstract/sdk1/patches/litellm_cohere_timeout.py Outdated
Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>
Signed-off-by: Praveen Kumar <praveen@zipstack.com>
Comment thread unstract/sdk1/tests/patches/test_litellm_cohere_timeout.py
pk-zipstack and others added 2 commits March 13, 2026 15:15
Patch validate_environment alongside CohereEmbeddingConfig to
isolate tests from real litellm code. Prevents test breakage if
litellm changes validate_environment behavior in future versions.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Comment thread unstract/sdk1/src/unstract/sdk1/patches/litellm_cohere_timeout.py
Comment thread unstract/sdk1/tests/patches/test_litellm_cohere_timeout.py Outdated
pk-zipstack and others added 2 commits March 13, 2026 15:27
Replace asyncio.run() wrapper with @pytest.mark.asyncio to avoid
RuntimeError when pytest-asyncio is present with a running event
loop. Add pytest-asyncio>=0.24.0 to test dependencies.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
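
The failure mode named in this commit can be reproduced with the standard library alone. The sketch below is an illustration, not the project's test code: `embed_stub` is a hypothetical coroutine, and pytest-asyncio is not involved. It only shows that `asyncio.run()` refuses to start while another event loop is already running.

```python
import asyncio


async def embed_stub() -> str:
    """Hypothetical coroutine standing in for an async embedding call."""
    return "ok"


async def call_run_inside_loop() -> str:
    """Call asyncio.run() while a loop is running and return the error text."""
    coro = embed_stub()
    try:
        asyncio.run(coro)
    except RuntimeError as exc:
        coro.close()  # avoid a "coroutine was never awaited" warning
        return str(exc)
    return "no error"


# With no loop running, asyncio.run() works as usual.
result_outside = asyncio.run(embed_stub())
# From inside a running loop, asyncio.run() raises RuntimeError.
error_inside = asyncio.run(call_run_inside_loop())
```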
Comment thread unstract/sdk1/tests/patches/test_litellm_cohere_timeout.py
pk-zipstack and others added 5 commits March 18, 2026 12:39
DeprecationWarning is suppressed by default in production, so operators
would not notice if the cohere timeout patch silently stops applying
after a litellm upgrade. Switch to logger.warning for visibility.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
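
A rough sketch of the switch this commit describes, assuming a module-level logger. The logger name `patch_guard` and the function `notify_patch_skipped` are hypothetical, and the message wording is illustrative rather than copied from the patch.

```python
import logging

logger = logging.getLogger("patch_guard")  # hypothetical logger name


def notify_patch_skipped(installed: str, pinned: str) -> None:
    """Report a skipped patch through logging instead of DeprecationWarning."""
    # Python's default warning filters hide DeprecationWarning unless it is
    # triggered by code running in __main__, so a log record is easier to notice.
    logger.warning(
        "cohere timeout patch skipped: litellm %s != pinned %s; "
        "verify whether the upstream bug is fixed",
        installed,
        pinned,
    )
```

A WARNING-level record goes through whatever handlers the deployment has configured, so it does not depend on the process's warning filters.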
Async embedding paths are not used in our codebase. Removing the async
tests avoids the pytest-asyncio dependency and lock file churn across
6 services.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Reverting lock files to their original branch state. Removing
pytest-asyncio from pyproject.toml caused lock file mismatch
that broke Docker builds using --locked.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The lock file was missing unstract-core and had stale package
versions, causing --locked builds to fail.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Contributor

@hari-kuriakose hari-kuriakose left a comment


@pk-zipstack LGTM overall

@pk-zipstack pk-zipstack changed the title [FIX] Monkey-patch litellm cohere embed timeout for Bedrock embeddings UN-3314 [FIX] Monkey-patch litellm cohere embed timeout for Bedrock embeddings Mar 23, 2026
@sonarqubecloud

@github-actions
Contributor

Test Results

Summary
  • Runner Tests: 11 passed, 0 failed (11 total)
  • SDK1 Tests: 98 passed, 0 failed (98 total)

Runner Tests - Full Report
| filepath | function | passed | SUBTOTAL |
| --- | --- | --- | --- |
| `runner/src/unstract/runner/clients/test_docker.py` | `test_logs` | 1 | 1 |
| `runner/src/unstract/runner/clients/test_docker.py` | `test_cleanup` | 1 | 1 |
| `runner/src/unstract/runner/clients/test_docker.py` | `test_cleanup_skip` | 1 | 1 |
| `runner/src/unstract/runner/clients/test_docker.py` | `test_client_init` | 1 | 1 |
| `runner/src/unstract/runner/clients/test_docker.py` | `test_get_image_exists` | 1 | 1 |
| `runner/src/unstract/runner/clients/test_docker.py` | `test_get_image` | 1 | 1 |
| `runner/src/unstract/runner/clients/test_docker.py` | `test_get_container_run_config` | 1 | 1 |
| `runner/src/unstract/runner/clients/test_docker.py` | `test_get_container_run_config_without_mount` | 1 | 1 |
| `runner/src/unstract/runner/clients/test_docker.py` | `test_run_container` | 1 | 1 |
| `runner/src/unstract/runner/clients/test_docker.py` | `test_get_image_for_sidecar` | 1 | 1 |
| `runner/src/unstract/runner/clients/test_docker.py` | `test_sidecar_container` | 1 | 1 |
| TOTAL | | 11 | 11 |
SDK1 Tests - Full Report
| filepath | function | passed | SUBTOTAL |
| --- | --- | --- | --- |
| `tests/patches/test_litellm_cohere_timeout.py` | `TestPatchedEmbeddingSyncTimeoutForwarding.test_timeout_passed_to_client_post` | 1 | 1 |
| `tests/patches/test_litellm_cohere_timeout.py` | `TestPatchedEmbeddingSyncTimeoutForwarding.test_none_timeout_passed_to_client_post` | 1 | 1 |
| `tests/patches/test_litellm_cohere_timeout.py` | `TestPatchedEmbeddingSyncTimeoutForwarding.test_httpx_timeout_object_forwarded` | 1 | 1 |
| `tests/patches/test_litellm_cohere_timeout.py` | `TestMonkeyPatchApplied.test_cohere_handler_patched` | 1 | 1 |
| `tests/patches/test_litellm_cohere_timeout.py` | `TestMonkeyPatchApplied.test_bedrock_handler_patched` | 1 | 1 |
| `tests/patches/test_litellm_cohere_timeout.py` | `TestMonkeyPatchApplied.test_patch_module_loaded_via_embedding_import` | 1 | 1 |
| `tests/test_llm_compat.py` | `TestLLMCompatFromLlm.test_from_llm_reuses_llm_instance` | 1 | 1 |
| `tests/test_llm_compat.py` | `TestLLMCompatFromLlm.test_from_llm_returns_llmcompat_instance` | 1 | 1 |
| `tests/test_llm_compat.py` | `TestLLMCompatFromLlm.test_from_llm_sets_model_name` | 1 | 1 |
| `tests/test_llm_compat.py` | `TestLLMCompatFromLlm.test_from_llm_does_not_call_init` | 1 | 1 |
| `tests/test_llm_compat.py` | `TestLLMCompatDelegation.test_complete_delegates_to_llm` | 1 | 1 |
| `tests/test_llm_compat.py` | `TestLLMCompatDelegation.test_chat_delegates_to_llm_complete` | 1 | 1 |
| `tests/test_llm_compat.py` | `TestLLMCompatDelegation.test_chat_forwards_kwargs_to_llm` | 1 | 1 |
| `tests/test_llm_compat.py` | `TestLLMCompatDelegation.test_complete_forwards_kwargs_to_llm` | 1 | 1 |
| `tests/test_llm_compat.py` | `TestLLMCompatDelegation.test_acomplete_delegates_to_llm` | 1 | 1 |
| `tests/test_llm_compat.py` | `TestLLMCompatDelegation.test_achat_delegates_to_llm_acomplete` | 1 | 1 |
| `tests/test_llm_compat.py` | `TestLLMCompatDelegation.test_stream_chat_not_implemented` | 1 | 1 |
| `tests/test_llm_compat.py` | `TestLLMCompatDelegation.test_stream_complete_not_implemented` | 1 | 1 |
| `tests/test_llm_compat.py` | `TestLLMCompatDelegation.test_astream_chat_not_implemented` | 1 | 1 |
| `tests/test_llm_compat.py` | `TestLLMCompatDelegation.test_astream_complete_not_implemented` | 1 | 1 |
| `tests/test_llm_compat.py` | `TestLLMCompatDelegation.test_metadata_returns_emulated_type` | 1 | 1 |
| `tests/test_llm_compat.py` | `TestLLMCompatDelegation.test_get_model_name_delegates` | 1 | 1 |
| `tests/test_llm_compat.py` | `TestLLMCompatDelegation.test_get_metrics_delegates` | 1 | 1 |
| `tests/test_llm_compat.py` | `TestLLMCompatDelegation.test_test_connection_delegates` | 1 | 1 |
| `tests/test_llm_compat.py` | `TestEmulatedTypes.test_message_role_values` | 1 | 1 |
| `tests/test_llm_compat.py` | `TestEmulatedTypes.test_chat_message_defaults` | 1 | 1 |
| `tests/test_llm_compat.py` | `TestEmulatedTypes.test_chat_response_message_access` | 1 | 1 |
| `tests/test_llm_compat.py` | `TestEmulatedTypes.test_completion_response_text` | 1 | 1 |
| `tests/test_llm_compat.py` | `TestEmulatedTypes.test_llm_metadata_defaults` | 1 | 1 |
| `tests/test_llm_compat.py` | `TestMessagesToPrompt.test_single_user_message` | 1 | 1 |
| `tests/test_llm_compat.py` | `TestMessagesToPrompt.test_none_content_becomes_empty_string` | 1 | 1 |
| `tests/test_llm_compat.py` | `TestMessagesToPrompt.test_preserves_all_messages` | 1 | 1 |
| `tests/test_llm_compat.py` | `TestMessagesToPrompt.test_multi_turn_conversation` | 1 | 1 |
| `tests/test_llm_compat.py` | `TestMessagesToPrompt.test_empty_messages_returns_empty_string` | 1 | 1 |
| `tests/test_llm_compat.py` | `TestMessagesToPrompt.test_string_role_fallback` | 1 | 1 |
| `tests/test_platform.py` | `TestPlatformHelperRetry.test_success_on_first_attempt` | 2 | 2 |
| `tests/test_platform.py` | `TestPlatformHelperRetry.test_retry_on_connection_error` | 2 | 2 |
| `tests/test_platform.py` | `TestPlatformHelperRetry.test_non_retryable_http_error` | 1 | 1 |
| `tests/test_platform.py` | `TestPlatformHelperRetry.test_retryable_http_errors` | 3 | 3 |
| `tests/test_platform.py` | `TestPlatformHelperRetry.test_post_method_retry` | 1 | 1 |
| `tests/test_platform.py` | `TestPlatformHelperRetry.test_retry_logging` | 1 | 1 |
| `tests/test_prompt.py` | `TestPromptToolRetry.test_success_on_first_attempt` | 1 | 1 |
| `tests/test_prompt.py` | `TestPromptToolRetry.test_retry_on_errors` | 2 | 2 |
| `tests/test_prompt.py` | `TestPromptToolRetry.test_wrapper_methods_retry` | 4 | 4 |
| `tests/utils/test_retry_utils.py` | `TestIsRetryableError.test_connection_error_is_retryable` | 1 | 1 |
| `tests/utils/test_retry_utils.py` | `TestIsRetryableError.test_timeout_is_retryable` | 1 | 1 |
| `tests/utils/test_retry_utils.py` | `TestIsRetryableError.test_http_error_retryable_status_codes` | 3 | 3 |
| `tests/utils/test_retry_utils.py` | `TestIsRetryableError.test_http_error_non_retryable_status_codes` | 5 | 5 |
| `tests/utils/test_retry_utils.py` | `TestIsRetryableError.test_http_error_without_response` | 1 | 1 |
| `tests/utils/test_retry_utils.py` | `TestIsRetryableError.test_os_error_retryable_errno` | 5 | 5 |
| `tests/utils/test_retry_utils.py` | `TestIsRetryableError.test_os_error_non_retryable_errno` | 1 | 1 |
| `tests/utils/test_retry_utils.py` | `TestIsRetryableError.test_other_exception_not_retryable` | 1 | 1 |
| `tests/utils/test_retry_utils.py` | `TestCalculateDelay.test_exponential_backoff_without_jitter` | 1 | 1 |
| `tests/utils/test_retry_utils.py` | `TestCalculateDelay.test_exponential_backoff_with_jitter` | 1 | 1 |
| `tests/utils/test_retry_utils.py` | `TestCalculateDelay.test_max_delay_cap` | 1 | 1 |
| `tests/utils/test_retry_utils.py` | `TestCalculateDelay.test_max_delay_cap_with_jitter` | 1 | 1 |
| `tests/utils/test_retry_utils.py` | `TestRetryWithExponentialBackoff.test_successful_call_first_attempt` | 1 | 1 |
| `tests/utils/test_retry_utils.py` | `TestRetryWithExponentialBackoff.test_retry_after_transient_failure` | 1 | 1 |
| `tests/utils/test_retry_utils.py` | `TestRetryWithExponentialBackoff.test_max_retries_exceeded` | 1 | 1 |
| `tests/utils/test_retry_utils.py` | `TestRetryWithExponentialBackoff.test_retry_with_custom_predicate` | 1 | 1 |
| `tests/utils/test_retry_utils.py` | `TestRetryWithExponentialBackoff.test_no_retry_with_predicate_false` | 1 | 1 |
| `tests/utils/test_retry_utils.py` | `TestRetryWithExponentialBackoff.test_exception_not_in_tuple_not_retried` | 1 | 1 |
| `tests/utils/test_retry_utils.py` | `TestCreateRetryDecorator.test_default_configuration` | 1 | 1 |
| `tests/utils/test_retry_utils.py` | `TestCreateRetryDecorator.test_environment_variable_configuration` | 1 | 1 |
| `tests/utils/test_retry_utils.py` | `TestCreateRetryDecorator.test_invalid_max_retries` | 1 | 1 |
| `tests/utils/test_retry_utils.py` | `TestCreateRetryDecorator.test_invalid_base_delay` | 1 | 1 |
| `tests/utils/test_retry_utils.py` | `TestCreateRetryDecorator.test_invalid_multiplier` | 1 | 1 |
| `tests/utils/test_retry_utils.py` | `TestCreateRetryDecorator.test_jitter_values` | 2 | 2 |
| `tests/utils/test_retry_utils.py` | `TestCreateRetryDecorator.test_custom_exceptions_only` | 1 | 1 |
| `tests/utils/test_retry_utils.py` | `TestCreateRetryDecorator.test_custom_predicate_only` | 1 | 1 |
| `tests/utils/test_retry_utils.py` | `TestCreateRetryDecorator.test_both_exceptions_and_predicate` | 1 | 1 |
| `tests/utils/test_retry_utils.py` | `TestCreateRetryDecorator.test_exceptions_match_but_predicate_false` | 1 | 1 |
| `tests/utils/test_retry_utils.py` | `TestPreconfiguredDecorators.test_retry_platform_service_call_exists` | 1 | 1 |
| `tests/utils/test_retry_utils.py` | `TestPreconfiguredDecorators.test_retry_prompt_service_call_exists` | 1 | 1 |
| `tests/utils/test_retry_utils.py` | `TestPreconfiguredDecorators.test_platform_service_decorator_retries_on_connection_error` | 1 | 1 |
| `tests/utils/test_retry_utils.py` | `TestPreconfiguredDecorators.test_prompt_service_decorator_retries_on_timeout` | 1 | 1 |
| `tests/utils/test_retry_utils.py` | `TestRetryLogging.test_warning_logged_on_retry` | 1 | 1 |
| `tests/utils/test_retry_utils.py` | `TestRetryLogging.test_info_logged_on_success_after_retry` | 1 | 1 |
| `tests/utils/test_retry_utils.py` | `TestRetryLogging.test_exception_logged_on_giving_up` | 1 | 1 |
| TOTAL | | 98 | 98 |

@hari-kuriakose hari-kuriakose merged commit dcb226b into main Mar 23, 2026
9 checks passed
@hari-kuriakose hari-kuriakose deleted the fix/litellm-cohere-embed-timeout branch March 23, 2026 10:21
chandrasekharan-zipstack pushed a commit that referenced this pull request May 20, 2026
…1976)

Hotfix for cloud v0.159.3 (OSS v0.163.4). Customer scanner flagged
litellm 1.82.3 for CVE-2026-42208 (SQL injection in litellm proxy auth
path, affects 1.81.16-1.83.6). We do not use litellm.proxy, but
vulnerability scanners flag the installed package regardless of which
code path is reachable.

Bump to 1.83.10 — the exact version recommended by the upstream advisory
(v1.83.10-stable) and the smallest jump that clears the CVE range while
keeping python-dotenv==1.0.1 compatible (1.83.14 would force bumping
python-dotenv across 7+ pyproject.toml files). Only tiktoken needed to
move 0.9 -> 0.12 to satisfy litellm's pin.

Switch source back to PyPI now that the PyPI quarantine is over,
reversing the temporary fork in #1873.

Cohere embed timeout patch: verified that
litellm/llms/cohere/embed/handler.py is byte-identical between v1.82.3,
v1.83.10-stable, and v1.83.14-stable (the timeout-not-forwarded bug
fixed in #1848 is still present upstream — BerriAI/litellm#14635 remains
OPEN). Version guard bumped 1.82.3 -> 1.83.10; 6/6 patch tests pass on
the new version, confirming the monkey-patch still binds correctly.
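
One way a byte-identical check like the one described above could be scripted is to compare content digests of the same file taken from each release. This is a sketch under assumptions: the commit does not say how the comparison was done, and `sha256_of` and `files_identical` are hypothetical helper names.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the SHA-256 hex digest of a file's raw bytes."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def files_identical(paths: list[Path]) -> bool:
    """Return True when every file has the same digest (an empty list counts as True)."""
    return len({sha256_of(path) for path in paths}) <= 1
```

The caller would pass the paths of `handler.py` extracted from each release being compared.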

Other cleanup from #1873:
- Drop git apt-install from worker-unified and tool Dockerfiles (no
  git-sourced deps remain in any uv.lock)
- Bump tool versions: structure 0.0.100 -> 0.0.101,
  classifier 0.0.79 -> 0.0.80, text_extractor 0.0.75 -> 0.0.76

Note on root uv.lock churn: the v0.163.4 root uv.lock had a pre-existing
corruption (banks v2.4.1 entry pointing at banks-2.2.0 wheel) that
blocked incremental resolution. Regenerated from scratch.

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
jaseemjaskp pushed a commit that referenced this pull request May 21, 2026
* [HOTFIX] Use importlib.util.find_spec for pluggable worker discovery (#1918)

* [FIX] Use importlib.util.find_spec for pluggable worker discovery

_verify_pluggable_worker_exists() previously checked for the literal file
`pluggable_worker/<name>/worker.py` on disk, which breaks when the plugin
has been compiled to a .so (Nuitka, Cython, or any C extension) — the
module is perfectly importable but the pre-check rejects it because only
the .py extension is considered.

Replace the filesystem check with importlib.util.find_spec(), which is
Python's standard way to ask "is this module resolvable by the import
system?". It honors every registered finder — source .py, compiled .so,
bytecode .pyc, namespace packages, zipimports — so the function now
matches what its docstring claims: verifying the module can be loaded,
not that a specific file extension is present.

Behavior is preserved for existing deployments:
- Images with no `pluggable_worker/<name>/` subpackage → find_spec
  raises ModuleNotFoundError (ImportError subclass) → returns False.
- Images with source .py → find_spec resolves the .py → returns True.
- Images with compiled .so → find_spec resolves the .so → returns True.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* [FIX] Handle ValueError from find_spec in pluggable worker verification

Greptile-flagged edge case: importlib.util.find_spec() can raise
ValueError (not just ImportError) when sys.modules has a partially
initialised module entry with __spec__ = None from a prior failed import.
Broaden the except to catch both.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
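
The two commits above can be summarized in a short sketch. It is an illustration rather than the actual `_verify_pluggable_worker_exists()` body: the function name `module_is_importable` is hypothetical, and the dotted module name would be supplied by the caller.

```python
import importlib.util


def module_is_importable(module_name: str) -> bool:
    """Return True if the import system can resolve module_name."""
    try:
        return importlib.util.find_spec(module_name) is not None
    except (ImportError, ValueError):
        # ImportError: a parent package is missing (ModuleNotFoundError).
        # ValueError: sys.modules holds an entry whose __spec__ is None.
        return False
```

Because `find_spec` consults every registered finder, the same check covers source `.py` files and compiled extension modules without inspecting file extensions.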

* [FIX] Resolve api-deployment worker directory from enum import path

worker.py:452 did worker_type.value.replace("-", "_") to derive the
on-disk dir name. All WorkerType enum values already use underscores,
so the replace was a no-op; for API_DEPLOYMENT whose dir is
"api-deployment" (hyphen), it resolved to "api_deployment" and the
os.path.exists() check failed. Boot then logged a spurious
"❌ Worker directory not found: /app/api_deployment" at ERROR level.

The task registration path (builder + celery autodiscover via
to_import_path) is unaffected, so this was purely log noise — but
noise at ERROR level that masks real failures in log scans.

Fix: derive the directory from the authoritative to_import_path()
which already handles the hyphen case (api_deployment -> api-deployment).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
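
A sketch of the mismatch described above, with loudly hypothetical details: the enum below is a two-member stand-in, `CALLBACK` is invented for contrast, and the `"<directory>.tasks"` shape returned by `to_import_path()` is an assumption, since the commit only says that method already maps `api_deployment` to `api-deployment`.

```python
from enum import Enum


class WorkerType(Enum):
    """Hypothetical subset of the worker enum; values use underscores."""

    API_DEPLOYMENT = "api_deployment"
    CALLBACK = "callback"  # invented member, for contrast only

    def to_import_path(self) -> str:
        # Assumed mapping: only api_deployment lives in a hyphenated directory.
        if self is WorkerType.API_DEPLOYMENT:
            directory = "api-deployment"
        else:
            directory = self.value
        return f"{directory}.tasks"


def directory_old(worker_type: WorkerType) -> str:
    # The replace is a no-op because enum values contain no hyphens.
    return worker_type.value.replace("-", "_")


def directory_new(worker_type: WorkerType) -> str:
    # Derive the directory from the import path instead of the raw value.
    return worker_type.to_import_path().split(".")[0]
```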

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* [HOTFIX] Add IAM Role / Instance Profile auth mode to AWS Bedrock adapter (#1944)

* [FEAT] Allow Bedrock to fall through to boto3's default credential chain

Match the S3/MinIO connector pattern: when AWS access keys are left blank
on the Bedrock LLM and embedding adapter forms, drop them from the kwargs
dict so boto3's default credential chain handles authentication. This
unlocks IAM role / instance profile / IRSA / AWS Profile scenarios on
hosts that already have ambient AWS credentials (e.g. EKS workers with
IRSA, EC2 with an instance profile).

- llm1/static/bedrock.json: clarify access-key descriptions to mention
  IRSA and instance profile (already non-required at v0.163.2 base).
- embedding1/static/bedrock.json: drop aws_access_key_id and
  aws_secret_access_key from top-level required; same description fix;
  expose aws_profile_name for parity with the LLM form.
- base1.py: AWSBedrockLLMParameters and AWSBedrockEmbeddingParameters
  now strip empty access-key values from the validated kwargs before
  returning, so empty strings don't override boto3's default chain.
  AWSBedrockEmbeddingParameters fields gain explicit None defaults
  and an aws_profile_name field.

Backward-compatible: existing adapters with access keys filled in
continue to work unchanged.
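
As a rough illustration of the stripping step, assuming the validated parameters are a plain dict (the helper name `drop_blank_access_keys` is hypothetical, not the adapter's actual code):

```python
from typing import Any

_ACCESS_KEY_FIELDS = ("aws_access_key_id", "aws_secret_access_key")


def drop_blank_access_keys(kwargs: dict[str, Any]) -> dict[str, Any]:
    """Return a copy of ``kwargs`` without empty access-key entries."""
    return {
        name: value
        for name, value in kwargs.items()
        # Only the two access-key fields are filtered; other empty values stay.
        if not (name in _ACCESS_KEY_FIELDS and not value)
    }
```

Omitting the keys entirely, rather than passing empty strings, is what lets boto3 fall back to its default credential chain.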

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* [FEAT] Add Authentication Type selector to Bedrock adapter form

Add an explicit `auth_type` selector with two options, making the auth
choice clear to users:

- "Access Keys" (default): existing flow, keys required
- "IAM Role / Instance Profile (on-prem AWS only)": no fields; relies on
  boto3's default credential chain (IRSA on EKS, task role on ECS,
  instance profile on EC2). Description on the selector explicitly notes
  this option is only for AWS-hosted Unstract deployments.

The form-only auth_type field is stripped before LiteLLM validation in
both AWSBedrockLLMParameters.validate() and
AWSBedrockEmbeddingParameters.validate(). Empty access keys continue to
be stripped so boto3 falls through to the default chain even when the
access_keys arm is selected without values (matches the S3/MinIO
connector pattern).

Backward-compatible: legacy adapters without auth_type behave as
"Access Keys" mode (the default), and existing keys are forwarded
unchanged.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* [REVIEW] Address Bedrock auth_type review feedback

Fixes the P0/P1 issues raised by greptile-apps and jaseemjaskp on
PR #1944.

Behaviour fixes:
- Stale-key leak in IAM Role mode: switching an existing adapter from
  Access Keys to IAM Role would carry truthy stored access keys through
  the strip-empty-only loop, so boto3 silently authenticated with the
  old long-lived credentials instead of falling through to the host's
  IRSA / instance-profile identity. Both LLM and embedding paths were
  affected.
- Silent acceptance of unknown auth_type: a typo (e.g. "access_key") or
  a malformed payload from a non-UI client passed through the dict
  comprehension untouched, with no enum guard.
- Cross-field validation gap: explicit Access Keys mode with blank or
  whitespace-only values silently fell through to the default
  credential chain instead of surfacing the misconfiguration.

Implementation:
- Add a module-level _resolve_bedrock_aws_credentials helper used by
  both AWSBedrockLLMParameters.validate() and
  AWSBedrockEmbeddingParameters.validate(), so the auth-type contract
  is expressed once.
  - Validates auth_type against an allowlist (None | "access_keys" |
    "iam_role"); raises ValueError on anything else.
  - iam_role: unconditionally drops aws_access_key_id and
    aws_secret_access_key.
  - access_keys (explicit): requires non-blank values; raises ValueError
    if either is empty or whitespace-only.
  - Legacy (auth_type absent): retains the lenient strip behaviour so
    pre-PR adapter configurations continue to deserialise unchanged.
- Restore aws_region_name as required (no `= None` default) on
  AWSBedrockEmbeddingParameters; only credentials may legitimately be
  absent.
- Drop the orphan aws_profile_name field from
  embedding1/static/bedrock.json: it was added for parity with the LLM
  form but lives outside the auth_type oneOf and contradicts the
  selector's "no further input" semantics. The LLM form already had
  aws_profile_name pre-PR and is left alone for backwards compatibility.
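
The contract listed above could look roughly like the sketch below. It is not the actual `_resolve_bedrock_aws_credentials` implementation: the function name, error messages, and the choice to return a copy (rather than mutate the caller's dict) are assumptions made for illustration.

```python
from typing import Any

_CREDENTIAL_KEYS = ("aws_access_key_id", "aws_secret_access_key")
_ALLOWED_AUTH_TYPES = (None, "access_keys", "iam_role")


def resolve_bedrock_aws_credentials(params: dict[str, Any]) -> dict[str, Any]:
    """Return a copy of ``params`` with auth_type applied and removed."""
    resolved = dict(params)  # copy, so the caller's dict is left untouched
    auth_type = resolved.pop("auth_type", None)

    if auth_type not in _ALLOWED_AUTH_TYPES:
        raise ValueError(f"Unknown auth_type: {auth_type!r}")

    if auth_type == "iam_role":
        # Drop keys unconditionally so stale stored keys cannot leak through.
        for key in _CREDENTIAL_KEYS:
            resolved.pop(key, None)
    elif auth_type == "access_keys":
        # Explicit mode: both keys must be present and non-blank.
        for key in _CREDENTIAL_KEYS:
            value = resolved.get(key)
            if not isinstance(value, str) or not value.strip():
                raise ValueError(
                    f"{key} is required when auth_type is 'access_keys'"
                )
    else:
        # Legacy (auth_type absent): lenient strip of empty values only.
        for key in _CREDENTIAL_KEYS:
            if not resolved.get(key):
                resolved.pop(key, None)
    return resolved
```

For example, a payload of `{"auth_type": "iam_role", "aws_access_key_id": "OLD", ...}` comes back without either key, which is the stale-key case described under the behaviour fixes.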

Tests:
- New tests/test_bedrock_adapter.py covers 15 cases across LLM and
  embedding adapters: legacy-no-auth-type, explicit access_keys with
  valid/blank/whitespace keys, iam_role with stale/no keys, unknown
  auth_type rejection, cross-field validation, and preservation of
  unrelated params (model_id, aws_profile_name, region, thinking).

Skipped (P2 nice-to-have):
- Comment-scope clarification, MinIO reference rewording,
  validate-mutates-caller's-dict, and the LLM form description nit
  about aws_profile_name visibility. These don't change behaviour
  and can be addressed in a follow-up.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>

* [HOTFIX] Bump litellm to 1.83.10 from PyPI to clear CVE-2026-42208 (#1976)

Hotfix for cloud v0.159.3 (OSS v0.163.4). Customer scanner flagged
litellm 1.82.3 for CVE-2026-42208 (SQL injection in litellm proxy auth
path, affects 1.81.16-1.83.6). We do not use litellm.proxy, but
vulnerability scanners flag the installed package regardless of which
code path is reachable.

Bump to 1.83.10 — the exact version recommended by the upstream advisory
(v1.83.10-stable) and the smallest jump that clears the CVE range while
keeping python-dotenv==1.0.1 compatible (1.83.14 would force bumping
python-dotenv across 7+ pyproject.toml files). Only tiktoken needed to
move 0.9 -> 0.12 to satisfy litellm's pin.

Switch source back to PyPI now that the PyPI quarantine is over,
reversing the temporary fork in #1873.

Cohere embed timeout patch: verified that
litellm/llms/cohere/embed/handler.py is byte-identical between v1.82.3,
v1.83.10-stable, and v1.83.14-stable (the timeout-not-forwarded bug
fixed in #1848 is still present upstream — BerriAI/litellm#14635 remains
OPEN). Version guard bumped 1.82.3 -> 1.83.10; 6/6 patch tests pass on
the new version, confirming the monkey-patch still binds correctly.

Other cleanup from #1873:
- Drop git apt-install from worker-unified and tool Dockerfiles (no
  git-sourced deps remain in any uv.lock)
- Bump tool versions: structure 0.0.100 -> 0.0.101,
  classifier 0.0.79 -> 0.0.80, text_extractor 0.0.75 -> 0.0.76

Note on root uv.lock churn: the v0.163.4 root uv.lock had a pre-existing
corruption (banks v2.4.1 entry pointing at banks-2.2.0 wheel) that
blocked incremental resolution. Regenerated from scratch.

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* [FIX] Align cohere patch docstring with version-guard semantics

Reviewer flagged that the docstring claimed the patch is "confirmed in
every release between 1.82.3 and 1.83.14-stable", but the guard at
_PATCHED_LITELLM_VERSION activates only on the exact pinned version. A
future maintainer reading the old text could reasonably expect bumping
to e.g. 1.83.11 to keep the fix active; in reality it silently turns
off.

Rewritten to reference _PATCHED_LITELLM_VERSION as the single source of
truth and to drop the rot-prone "as of 2026-05-20" calendar date.
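
The exact-pin semantics can be sketched as below. Only `_PATCHED_LITELLM_VERSION`, the 1.83.10 pin, and the use of a `DeprecationWarning` come from this PR; the function name `should_apply_patch` and the warning text are assumptions.

```python
import warnings

_PATCHED_LITELLM_VERSION = "1.83.10"


def should_apply_patch(installed_version: str) -> bool:
    """Apply the patch only on the exact version it was verified against."""
    if installed_version == _PATCHED_LITELLM_VERSION:
        return True
    # Any other version, newer or older, turns the patch off and says so.
    warnings.warn(
        f"litellm {installed_version} differs from the verified version "
        f"{_PATCHED_LITELLM_VERSION}; cohere embed timeout patch skipped. "
        "Re-verify the upstream handler before bumping the pin.",
        DeprecationWarning,
        stacklevel=2,
    )
    return False
```

Under this sketch, bumping to 1.83.11 without updating the constant disables the fix, which is the behaviour the docstring needed to describe.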

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Chandrasekharan M <117059509+chandrasekharan-zipstack@users.noreply.github.com>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
harini-venkataraman pushed a commit that referenced this pull request May 21, 2026