UN-3314 [FIX] Monkey-patch litellm cohere embed timeout for Bedrock embeddings#1848
Conversation
litellm's cohere embed handler (1.80.0) receives a timeout parameter but doesn't pass it to client.post(), causing "Connection timed out after None seconds" on large Bedrock embedding requests. This adds a monkey-patch that replaces the affected functions with versions that correctly forward timeout. Includes version guard, source comments, and unit tests. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…ions - Version guard now skips patch entirely when litellm > 1.80.0 instead of just warning - Test assertions now check exact timeout value received by client.post(), not just that it was called - Inline comments at client.post() calls marked with ONLY CHANGE Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
|
Note Reviews pausedIt looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the Use the following commands to manage reviews:
Use the checkboxes below for quick actions:
WalkthroughAdds a new patches package and a litellm Cohere timeout monkey-patch (sync + async) that forwards timeout arguments to underlying HTTP client calls; the patch is applied by a side-effect import in the SDK embedding module and unit tests verify timeout propagation and wiring. Changes
Sequence Diagram(s)sequenceDiagram
participant User
participant SDK as "SDK embedding\n(unstract.sdk1.embedding)"
participant Patch as "Patch module\n(litellm_cohere_timeout)"
participant Litellm as "litellm embed\nhandler (wrapped)"
participant HTTP as "HTTP client\n(e.g., httpx)"
User->>SDK: call embed(inputs, timeout=...)
SDK->>Patch: import (side-effect) / use patched handler
Patch->>Litellm: invoke wrapped handler
Litellm->>HTTP: client.post(..., timeout=passed_timeout)
HTTP-->>Litellm: response
Litellm-->>Patch: embedding result
Patch-->>SDK: return embeddings
SDK-->>User: return embeddings
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~20 minutes 🚥 Pre-merge checks | ✅ 2 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
for more information, see https://pre-commit.ci
- Remove useless self-assignment `model = model` - Use int timeout values in tests to avoid float equality checks - Use `is` identity checks instead of `==` for timeout assertions - Replace async mock_post with AsyncMock to properly use async features Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
There was a problem hiding this comment.
Actionable comments posted: 1
🧹 Nitpick comments (1)
unstract/sdk1/tests/patches/test_litellm_cohere_timeout.py (1)
7-10: Exercise the production import hook in one test.Because this file imports the patch module directly, the wiring checks only prove direct-import behavior. Add one integration test that imports
unstract.sdk1.embeddingand asserts the handler is patched, so Line 7 inunstract/sdk1/src/unstract/sdk1/embedding.pycannot regress silently.Also applies to: 194-203
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@unstract/sdk1/tests/patches/test_litellm_cohere_timeout.py` around lines 7 - 10, Add an integration test that imports unstract.sdk1.embedding (instead of importing the patch module directly) and asserts the production import hook applied the patch: after importing unstract.sdk1.embedding, verify the module's embedding handler uses the patched implementations by checking that its async and sync embedding callables resolve to the symbols _patched_async_embedding and _patched_embedding (or that their identities/reference equality match those functions imported from unstract.sdk1.patches.litellm_cohere_timeout); update or add this assertion into test_litellm_cohere_timeout.py alongside the existing direct-import checks so the production import wiring (line referencing unstract.sdk1.embedding) is exercised and cannot regress.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@unstract/sdk1/src/unstract/sdk1/patches/litellm_cohere_timeout.py`:
- Around line 26-43: The patch currently imports private litellm.llms.* modules
before checking _SKIP_PATCH and uses a loose version check, so move all
private/internal imports (e.g., validate_environment, CohereEmbeddingConfig,
AsyncHTTPHandler, HTTPHandler, get_async_httpx_client, CohereEmbeddingRequest,
EmbeddingResponse and any litellm.llms.* import lines) to after the guard that
computes _SKIP_PATCH, change the version gate from a ">" check to an exact
equality check against the known compatible LiteLLM version, and when skipping
emit a visible warning via warnings.warn (not just DeprecationWarning swallowed
by default) so callers know the patch was skipped; ensure functions/classes
referenced in the patch (validate_environment, CohereEmbeddingConfig,
AsyncHTTPHandler, HTTPHandler, get_async_httpx_client, CohereEmbeddingRequest,
EmbeddingResponse) are only imported after the guard.
---
Nitpick comments:
In `@unstract/sdk1/tests/patches/test_litellm_cohere_timeout.py`:
- Around line 7-10: Add an integration test that imports unstract.sdk1.embedding
(instead of importing the patch module directly) and asserts the production
import hook applied the patch: after importing unstract.sdk1.embedding, verify
the module's embedding handler uses the patched implementations by checking that
its async and sync embedding callables resolve to the symbols
_patched_async_embedding and _patched_embedding (or that their
identities/reference equality match those functions imported from
unstract.sdk1.patches.litellm_cohere_timeout); update or add this assertion into
test_litellm_cohere_timeout.py alongside the existing direct-import checks so
the production import wiring (line referencing unstract.sdk1.embedding) is
exercised and cannot regress.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: f70cb6f9-69dd-431a-97bc-906d92c597ff
📒 Files selected for processing (5)
unstract/sdk1/src/unstract/sdk1/embedding.pyunstract/sdk1/src/unstract/sdk1/patches/__init__.pyunstract/sdk1/src/unstract/sdk1/patches/litellm_cohere_timeout.pyunstract/sdk1/tests/patches/__init__.pyunstract/sdk1/tests/patches/test_litellm_cohere_timeout.py
There was a problem hiding this comment.
♻️ Duplicate comments (1)
unstract/sdk1/src/unstract/sdk1/patches/litellm_cohere_timeout.py (1)
26-64:⚠️ Potential issue | 🟠 MajorMove the private LiteLLM imports behind the guard and gate on the exact tested version.
Lines 26-43 still import private
litellm.llms.*modules before_SKIP_PATCHis evaluated, so a future LiteLLM reorg can fail at import time instead of cleanly skipping. Also, Line 56 only skips> 1.80.0; for a copied private implementation,!=is safer. The skip warning on Lines 58-64 usesDeprecationWarning, which is usually hidden for imported library code, so operators may never notice that the patch was not applied. With the provided dependency context showing LiteLLM 1.81.7, this path would currently skip.🔧 Suggested fix
import importlib.metadata import json import logging import warnings from collections.abc import Callable import httpx import litellm -import litellm.llms.bedrock.embed.embedding as _bedrock_embed -import litellm.llms.cohere.embed.handler as _cohere_handler from litellm.litellm_core_utils.litellm_logging import ( Logging as LiteLLMLoggingObj, ) -from litellm.llms.cohere.embed.handler import ( - validate_environment, -) -from litellm.llms.cohere.embed.v1_transformation import ( - CohereEmbeddingConfig, -) -from litellm.llms.custom_httpx.http_handler import ( - AsyncHTTPHandler, - HTTPHandler, - get_async_httpx_client, -) -from litellm.types.llms.bedrock import CohereEmbeddingRequest -from litellm.types.utils import EmbeddingResponse from packaging.version import Version logger = logging.getLogger(__name__) _DEFAULT_TIMEOUT = httpx.Timeout(None) _PATCHED_LITELLM_VERSION = "1.80.0" _litellm_version = importlib.metadata.version("litellm") -_SKIP_PATCH = Version(_litellm_version) > Version(_PATCHED_LITELLM_VERSION) +_SKIP_PATCH = Version(_litellm_version) != Version(_PATCHED_LITELLM_VERSION) if _SKIP_PATCH: warnings.warn( "litellm_cohere_timeout patch was SKIPPED — not applied. " f"Current litellm version: {_litellm_version}. " f"Patch was written for: {_PATCHED_LITELLM_VERSION}. " "Please verify the upstream fix and remove this module.", - DeprecationWarning, + RuntimeWarning, stacklevel=2, ) +else: + import litellm.llms.bedrock.embed.embedding as _bedrock_embed + import litellm.llms.cohere.embed.handler as _cohere_handler + from litellm.llms.cohere.embed.handler import validate_environment + from litellm.llms.cohere.embed.v1_transformation import CohereEmbeddingConfig + from litellm.llms.custom_httpx.http_handler import ( + AsyncHTTPHandler, + HTTPHandler, + get_async_httpx_client, + ) + from litellm.types.llms.bedrock import CohereEmbeddingRequest + from litellm.types.utils import EmbeddingResponse🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@unstract/sdk1/src/unstract/sdk1/patches/litellm_cohere_timeout.py` around lines 26 - 64, The current file imports private litellm modules before computing _SKIP_PATCH, which can cause import-time failures; move all imports of litellm.llms.* and litellm.litellm_core_utils.* (symbols: _bedrock_embed, _cohere_handler, LiteLLMLoggingObj, validate_environment, CohereEmbeddingConfig, AsyncHTTPHandler, HTTPHandler, get_async_httpx_client, CohereEmbeddingRequest, EmbeddingResponse) so they occur only after computing _litellm_version and _SKIP_PATCH; change the version check to use equality against _PATCHED_LITELLM_VERSION (i.e. skip only when Version(_litellm_version) == Version(_PATCHED_LITELLM_VERSION) or gate with != as requested) and replace the DeprecationWarning with a visible warning class such as UserWarning or RuntimeWarning so operators see the warning when the patch is skipped.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Duplicate comments:
In `@unstract/sdk1/src/unstract/sdk1/patches/litellm_cohere_timeout.py`:
- Around line 26-64: The current file imports private litellm modules before
computing _SKIP_PATCH, which can cause import-time failures; move all imports of
litellm.llms.* and litellm.litellm_core_utils.* (symbols: _bedrock_embed,
_cohere_handler, LiteLLMLoggingObj, validate_environment, CohereEmbeddingConfig,
AsyncHTTPHandler, HTTPHandler, get_async_httpx_client, CohereEmbeddingRequest,
EmbeddingResponse) so they occur only after computing _litellm_version and
_SKIP_PATCH; change the version check to use equality against
_PATCHED_LITELLM_VERSION (i.e. skip only when Version(_litellm_version) ==
Version(_PATCHED_LITELLM_VERSION) or gate with != as requested) and replace the
DeprecationWarning with a visible warning class such as UserWarning or
RuntimeWarning so operators see the warning when the patch is skipped.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 651cf418-be2b-4238-bfe1-13ad72694bba
📒 Files selected for processing (2)
unstract/sdk1/src/unstract/sdk1/patches/litellm_cohere_timeout.pyunstract/sdk1/tests/patches/test_litellm_cohere_timeout.py
🚧 Files skipped from review as they are similar to previous changes (1)
- unstract/sdk1/tests/patches/test_litellm_cohere_timeout.py
- Move all litellm.llms.* imports inside the `if not _SKIP_PATCH` block so they're only loaded when the patch will be applied - Change version gate from `>` to exact equality (`!=`) — patch only applies on the exact known-affected version (1.80.0) - Use RuntimeWarning instead of DeprecationWarning for skip message so it's visible by default (DeprecationWarning is filtered) Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
for more information, see https://pre-commit.ci
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@unstract/sdk1/src/unstract/sdk1/patches/litellm_cohere_timeout.py`:
- Around line 30-41: The version guard (_PATCHED_LITELLM_VERSION,
_litellm_version, _SKIP_PATCH) causes the patch to be skipped because sdk1 uses
LiteLLM 1.81.7; update the constant _PATCHED_LITELLM_VERSION to "1.81.7" (and
run tests/verify that the Cohere timeout fix is effective in that release) or,
alternatively, revert sdk1's litellm dependency to "1.80.0" so the existing
patch applies—ensure the chosen approach makes _SKIP_PATCH False so the patch
code executes.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 0422bc8a-b039-47cf-aeeb-e54e2c5640e2
📒 Files selected for processing (1)
unstract/sdk1/src/unstract/sdk1/patches/litellm_cohere_timeout.py
hari-kuriakose
left a comment
There was a problem hiding this comment.
@pk-zipstack LGTM overall.
However request few critical changes though.
Co-authored-by: Hari John Kuriakose <hari@zipstack.com> Signed-off-by: Praveen Kumar <praveen@zipstack.com>
for more information, see https://pre-commit.ci
Greptile SummaryThis PR introduces a monkey-patch (
Confidence Score: 3/5
Important Files Changed
Sequence DiagramsequenceDiagram
participant App
participant embedding.py
participant litellm_cohere_timeout.py
participant litellm cohere handler
participant litellm bedrock handler
App->>embedding.py: import unstract.sdk1.embedding
embedding.py->>litellm_cohere_timeout.py: side-effect import (noqa: F401)
litellm_cohere_timeout.py->>litellm_cohere_timeout.py: check litellm version == 1.82.3
alt version matches
litellm_cohere_timeout.py->>litellm cohere handler: handler.embedding = _patched_embedding
litellm_cohere_timeout.py->>litellm cohere handler: handler.async_embedding = _patched_async_embedding
litellm_cohere_timeout.py->>litellm bedrock handler: cohere_embedding = _patched_embedding
else version mismatch
litellm_cohere_timeout.py->>litellm_cohere_timeout.py: logger.warning("patch SKIPPED")
end
App->>embedding.py: Embedding.get_embedding(text)
embedding.py->>litellm: litellm.embedding(model, input, timeout=...)
litellm->>litellm cohere handler: embedding(..., timeout=600)
litellm cohere handler->>litellm cohere handler: client.post(..., timeout=600) ← patched
|
There was a problem hiding this comment.
Actionable comments posted: 2
🧹 Nitpick comments (1)
unstract/sdk1/src/unstract/sdk1/patches/litellm_cohere_timeout.py (1)
15-15: Consider adding a TODO marker for discoverability.Adding
TODO:makes this line searchable via grep/IDE tooling, helping ensure timely removal once upstream fixes the bug.-Remove this patch when litellm ships a fix upstream. +TODO: Remove this patch when litellm ships a fix upstream.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@unstract/sdk1/src/unstract/sdk1/patches/litellm_cohere_timeout.py` at line 15, Update the top-of-file note in litellm_cohere_timeout.py so it includes a searchable TODO marker; specifically replace or edit the comment "Remove this patch when litellm ships a fix upstream." to start with "TODO:" (e.g., "TODO: Remove this patch when litellm ships a fix upstream.") so it is discoverable via grep/IDE and clearly signals removal once the upstream bug in litellm is fixed.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@unstract/sdk1/src/unstract/sdk1/patches/litellm_cohere_timeout.py`:
- Around line 42-65: The code references module aliases `_cohere_handler` and
`_bedrock_embed` later but only imports individual symbols (e.g.,
validate_environment, CohereEmbeddingConfig, AsyncHTTPHandler,
get_async_httpx_client, LiteLLMLoggingObj, CohereEmbeddingRequest,
EmbeddingResponse); add explicit module imports for the cohere and bedrock embed
modules (import the modules as `_cohere_handler` and `_bedrock_embed`) inside
the same else block so the later references to `_cohere_handler` and
`_bedrock_embed` resolve at runtime.
- Around line 30-32: The version guard currently pins the patch to
_PATCHED_LITELLM_VERSION = "1.80.0" causing _SKIP_PATCH to be true at runtime
for the dependency _litellm_version (which is 1.81.7), so the timeout patch is
skipped; either update _PATCHED_LITELLM_VERSION to "1.81.7" if the timeout bug
still exists in that release, or remove this patch block entirely if the
upstream bug is fixed—modify the constant _PATCHED_LITELLM_VERSION accordingly
and verify the behavior of _SKIP_PATCH (which compares Version(_litellm_version)
!= Version(_PATCHED_LITELLM_VERSION)) so the patch will run only when intended.
---
Nitpick comments:
In `@unstract/sdk1/src/unstract/sdk1/patches/litellm_cohere_timeout.py`:
- Line 15: Update the top-of-file note in litellm_cohere_timeout.py so it
includes a searchable TODO marker; specifically replace or edit the comment
"Remove this patch when litellm ships a fix upstream." to start with "TODO:"
(e.g., "TODO: Remove this patch when litellm ships a fix upstream.") so it is
discoverable via grep/IDE and clearly signals removal once the upstream bug in
litellm is fixed.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 42ebcad6-b1fa-4a6e-91c5-b75e12a511b5
📒 Files selected for processing (1)
unstract/sdk1/src/unstract/sdk1/patches/litellm_cohere_timeout.py
There was a problem hiding this comment.
♻️ Duplicate comments (1)
unstract/sdk1/src/unstract/sdk1/patches/litellm_cohere_timeout.py (1)
48-65:⚠️ Potential issue | 🔴 CriticalImport the modules you rebind later.
This block imports symbols from the Cohere and Bedrock modules, but lines 213-215 assign through
_cohere_handlerand_bedrock_embed, which are never defined. On LiteLLM1.81.7, importing this patch will raiseNameErrorbefore the monkey-patch is applied, breaking the side-effect activation path fromunstract.sdk1.embedding.🐛 Proposed fix
import httpx import litellm + import litellm.llms.bedrock.embed.embedding as _bedrock_embed + import litellm.llms.cohere.embed.handler as _cohere_handler from litellm.litellm_core_utils.litellm_logging import ( Logging as LiteLLMLoggingObj, )🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@unstract/sdk1/src/unstract/sdk1/patches/litellm_cohere_timeout.py` around lines 48 - 65, The patch imports Cohere/Bedrock symbols but later rebinds globals like _cohere_handler and _bedrock_embed without first defining or importing them, causing a NameError on import; to fix, import or define the original symbols before rebinding (e.g., import the existing cohere handler and bedrock embed functions used in lines that assign to _cohere_handler and _bedrock_embed), then perform the monkey-patch assignments; locate the rebinding logic targeting _cohere_handler and _bedrock_embed and ensure the original symbols are referenced (via imports or safe getattr fallbacks) so the module can be imported without raising NameError.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Duplicate comments:
In `@unstract/sdk1/src/unstract/sdk1/patches/litellm_cohere_timeout.py`:
- Around line 48-65: The patch imports Cohere/Bedrock symbols but later rebinds
globals like _cohere_handler and _bedrock_embed without first defining or
importing them, causing a NameError on import; to fix, import or define the
original symbols before rebinding (e.g., import the existing cohere handler and
bedrock embed functions used in lines that assign to _cohere_handler and
_bedrock_embed), then perform the monkey-patch assignments; locate the rebinding
logic targeting _cohere_handler and _bedrock_embed and ensure the original
symbols are referenced (via imports or safe getattr fallbacks) so the module can
be imported without raising NameError.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: e8692340-9a40-4b25-ac09-56dbde0e3fb8
📒 Files selected for processing (1)
unstract/sdk1/src/unstract/sdk1/patches/litellm_cohere_timeout.py
- Restore `_bedrock_embed` and `_cohere_handler` imports that were silently removed by ruff auto-fix (marked with noqa: F811) - Add test verifying patch activation through the production import path (unstract.sdk1.embedding) per reviewer feedback - Tests are tox-compatible — already covered by [testenv:sdk1] in tox.ini Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
for more information, see https://pre-commit.ci
Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com> Signed-off-by: Praveen Kumar <praveen@zipstack.com>
Patch validate_environment alongside CohereEmbeddingConfig to isolate tests from real litellm code. Prevents test breakage if litellm changes validate_environment behavior in future versions. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
for more information, see https://pre-commit.ci
Replace asyncio.run() wrapper with @pytest.mark.asyncio to avoid RuntimeError when pytest-asyncio is present with a running event loop. Add pytest-asyncio>=0.24.0 to test dependencies. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
DeprecationWarning is suppressed by default in production, so operators would not notice if the cohere timeout patch silently stops applying after a litellm upgrade. Switch to logger.warning for visibility. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Async embedding paths are not used in our codebase. Removing the async tests avoids the pytest-asyncio dependency and lock file churn across 6 services. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Reverting lock files to their original branch state. Removing pytest-asyncio from pyproject.toml caused lock file mismatch that broke Docker builds using --locked. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The lock file was missing unstract-core and had stale package versions, causing --locked builds to fail. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
hari-kuriakose
left a comment
There was a problem hiding this comment.
@pk-zipstack LGTM overall
|
Test ResultsSummary
Runner Tests - Full Report
SDK1 Tests - Full Report
|
…1976) Hotfix for cloud v0.159.3 (OSS v0.163.4). Customer scanner flagged litellm 1.82.3 for CVE-2026-42208 (SQL injection in litellm proxy auth path, affects 1.81.16-1.83.6). We do not use litellm.proxy, but vulnerability scanners flag the installed package regardless of which code path is reachable. Bump to 1.83.10 — the exact version recommended by the upstream advisory (v1.83.10-stable) and the smallest jump that clears the CVE range while keeping python-dotenv==1.0.1 compatible (1.83.14 would force bumping python-dotenv across 7+ pyproject.toml files). Only tiktoken needed to move 0.9 -> 0.12 to satisfy litellm's pin. Switch source back to PyPI now that the PyPI quarantine is over, reversing the temporary fork in #1873. Cohere embed timeout patch: verified that litellm/llms/cohere/embed/handler.py is byte-identical between v1.82.3, v1.83.10-stable, and v1.83.14-stable (the timeout-not-forwarded bug fixed in #1848 is still present upstream — BerriAI/litellm#14635 remains OPEN). Version guard bumped 1.82.3 -> 1.83.10; 6/6 patch tests pass on the new version, confirming the monkey-patch still binds correctly. Other cleanup from #1873: - Drop git apt-install from worker-unified and tool Dockerfiles (no git-sourced deps remain in any uv.lock) - Bump tool versions: structure 0.0.100 -> 0.0.101, classifier 0.0.79 -> 0.0.80, text_extractor 0.0.75 -> 0.0.76 Note on root uv.lock churn: the v0.163.4 root uv.lock had a pre-existing corruption (banks v2.4.1 entry pointing at banks-2.2.0 wheel) that blocked incremental resolution. Regenerated from scratch. Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* [HOTFIX] Use importlib.util.find_spec for pluggable worker discovery (#1918) * [FIX] Use importlib.util.find_spec for pluggable worker discovery _verify_pluggable_worker_exists() previously checked for the literal file `pluggable_worker/<name>/worker.py` on disk, which breaks when the plugin has been compiled to a .so (Nuitka, Cython, or any C extension) — the module is perfectly importable but the pre-check rejects it because only the .py extension is considered. Replace the filesystem check with importlib.util.find_spec(), which is Python's standard way to ask "is this module resolvable by the import system?". It honors every registered finder — source .py, compiled .so, bytecode .pyc, namespace packages, zipimports — so the function now matches what its docstring claims: verifying the module can be loaded, not that a specific file extension is present. Behavior is preserved for existing deployments: - Images with no `pluggable_worker/<name>/` subpackage → find_spec raises ModuleNotFoundError (ImportError subclass) → returns False. - Images with source .py → find_spec resolves the .py → returns True. - Images with compiled .so → find_spec resolves the .so → returns True. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * [FIX] Handle ValueError from find_spec in pluggable worker verification Greptile-flagged edge case: importlib.util.find_spec() can raise ValueError (not just ImportError) when sys.modules has a partially initialised module entry with __spec__ = None from a prior failed import. Broaden the except to catch both. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * [FIX] Resolve api-deployment worker directory from enum import path worker.py:452 did worker_type.value.replace("-", "_") to derive the on-disk dir name. All WorkerType enum values already use underscores, so the replace was a no-op; for API_DEPLOYMENT whose dir is "api-deployment" (hyphen), it resolved to "api_deployment" and the os.path.exists() check failed. Boot then logged a spurious "❌ Worker directory not found: /app/api_deployment" at ERROR level. The task registration path (builder + celery autodiscover via to_import_path) is unaffected, so this was purely log noise — but noise at ERROR level that masks real failures in log scans. Fix: derive the directory from the authoritative to_import_path() which already handles the hyphen case (api_deployment -> api-deployment). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * [HOTFIX] Add IAM Role / Instance Profile auth mode to AWS Bedrock adapter (#1944) * [FEAT] Allow Bedrock to fall through to boto3's default credential chain Match the S3/MinIO connector pattern: when AWS access keys are left blank on the Bedrock LLM and embedding adapter forms, drop them from the kwargs dict so boto3's default credential chain handles authentication. This unlocks IAM role / instance profile / IRSA / AWS Profile scenarios on hosts that already have ambient AWS credentials (e.g. EKS workers with IRSA, EC2 with an instance profile). - llm1/static/bedrock.json: clarify access-key descriptions to mention IRSA and instance profile (already non-required at v0.163.2 base). - embedding1/static/bedrock.json: drop aws_access_key_id and aws_secret_access_key from top-level required; same description fix; expose aws_profile_name for parity with the LLM form. - base1.py: AWSBedrockLLMParameters and AWSBedrockEmbeddingParameters now strip empty access-key values from the validated kwargs before returning, so empty strings don't override boto3's default chain. AWSBedrockEmbeddingParameters fields gain explicit None defaults and an aws_profile_name field. Backward-compatible: existing adapters with access keys filled in continue to work unchanged. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * [FEAT] Add Authentication Type selector to Bedrock adapter form Add an explicit `auth_type` selector with two options, making the auth choice clear to users: - "Access Keys" (default): existing flow, keys required - "IAM Role / Instance Profile (on-prem AWS only)": no fields; relies on boto3's default credential chain (IRSA on EKS, task role on ECS, instance profile on EC2). Description on the selector explicitly notes this option is only for AWS-hosted Unstract deployments. The form-only auth_type field is stripped before LiteLLM validation in both AWSBedrockLLMParameters.validate() and AWSBedrockEmbeddingParameters. validate(). Empty access keys continue to be stripped so boto3 falls through to the default chain even when the access_keys arm is selected without values (matches the S3/MinIO connector pattern). Backward-compatible: legacy adapters without auth_type behave as "Access Keys" mode (the default), and existing keys are forwarded unchanged. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * [REVIEW] Address Bedrock auth_type review feedback Fixes the P0/P1 issues raised by greptile-apps and jaseemjaskp on PR #1944. Behaviour fixes: - Stale-key leak in IAM Role mode: switching an existing adapter from Access Keys to IAM Role would carry truthy stored access keys through the strip-empty-only loop, so boto3 silently authenticated with the old long-lived credentials instead of falling through to the host's IRSA / instance-profile identity. Both LLM and embedding paths were affected. - Silent acceptance of unknown auth_type: a typo (e.g. "access_key") or a malformed payload from a non-UI client passed through the dict comprehension untouched, with no enum guard. - Cross-field validation gap: explicit Access Keys mode with blank or whitespace-only values silently fell through to the default credential chain instead of surfacing the misconfiguration. Implementation: - Add a module-level _resolve_bedrock_aws_credentials helper used by both AWSBedrockLLMParameters.validate() and AWSBedrock EmbeddingParameters.validate(), so the auth-type contract is expressed once. - Validates auth_type against an allowlist (None | "access_keys" | "iam_role"); raises ValueError on anything else. - iam_role: unconditionally drops aws_access_key_id and aws_secret_access_key. - access_keys (explicit): requires non-blank values; raises ValueError if either is empty or whitespace-only. - Legacy (auth_type absent): retains the lenient strip behaviour so pre-PR adapter configurations continue to deserialise unchanged. - Restore aws_region_name as required (no `= None` default) on AWSBedrockEmbeddingParameters; only credentials may legitimately be absent. - Drop the orphan aws_profile_name field from embedding1/static/bedrock.json: it was added for parity with the LLM form but lives outside the auth_type oneOf and contradicts the selector's "no further input" semantics. The LLM form already had aws_profile_name pre-PR and is left alone for backwards compatibility. Tests: - New tests/test_bedrock_adapter.py covers 15 cases across LLM and embedding adapters: legacy-no-auth-type, explicit access_keys with valid/blank/whitespace keys, iam_role with stale/no keys, unknown auth_type rejection, cross-field validation, and preservation of unrelated params (model_id, aws_profile_name, region, thinking). Skipped (P2 nice-to-have): - Comment-scope clarification, MinIO reference rewording, validate-mutates-caller'\''s-dict, and the LLM form description nit about aws_profile_name visibility. These don'\''t change behaviour and can be addressed in a follow-up. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> * [HOTFIX] Bump litellm to 1.83.10 from PyPI to clear CVE-2026-42208 (#1976) Hotfix for cloud v0.159.3 (OSS v0.163.4). Customer scanner flagged litellm 1.82.3 for CVE-2026-42208 (SQL injection in litellm proxy auth path, affects 1.81.16-1.83.6). We do not use litellm.proxy, but vulnerability scanners flag the installed package regardless of which code path is reachable. Bump to 1.83.10 — the exact version recommended by the upstream advisory (v1.83.10-stable) and the smallest jump that clears the CVE range while keeping python-dotenv==1.0.1 compatible (1.83.14 would force bumping python-dotenv across 7+ pyproject.toml files). Only tiktoken needed to move 0.9 -> 0.12 to satisfy litellm's pin. Switch source back to PyPI now that the PyPI quarantine is over, reversing the temporary fork in #1873. Cohere embed timeout patch: verified that litellm/llms/cohere/embed/handler.py is byte-identical between v1.82.3, v1.83.10-stable, and v1.83.14-stable (the timeout-not-forwarded bug fixed in #1848 is still present upstream — BerriAI/litellm#14635 remains OPEN). Version guard bumped 1.82.3 -> 1.83.10; 6/6 patch tests pass on the new version, confirming the monkey-patch still binds correctly. Other cleanup from #1873: - Drop git apt-install from worker-unified and tool Dockerfiles (no git-sourced deps remain in any uv.lock) - Bump tool versions: structure 0.0.100 -> 0.0.101, classifier 0.0.79 -> 0.0.80, text_extractor 0.0.75 -> 0.0.76 Note on root uv.lock churn: the v0.163.4 root uv.lock had a pre-existing corruption (banks v2.4.1 entry pointing at banks-2.2.0 wheel) that blocked incremental resolution. Regenerated from scratch. Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * [FIX] Align cohere patch docstring with version-guard semantics Reviewer flagged that the docstring claimed the patch is "confirmed in every release between 1.82.3 and 1.83.14-stable", but the guard at _PATCHED_LITELLM_VERSION activates only on the exact pinned version. A future maintainer reading the old text could reasonably expect bumping to e.g. 1.83.11 to keep the fix active; in reality it silently turns off. Rewritten to reference _PATCHED_LITELLM_VERSION as the single source of truth and to drop the rot-prone "as of 2026-05-20" calendar date. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> --------- Co-authored-by: Chandrasekharan M <117059509+chandrasekharan-zipstack@users.noreply.github.com> Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
* [HOTFIX] Use importlib.util.find_spec for pluggable worker discovery (#1918) * [FIX] Use importlib.util.find_spec for pluggable worker discovery _verify_pluggable_worker_exists() previously checked for the literal file `pluggable_worker/<name>/worker.py` on disk, which breaks when the plugin has been compiled to a .so (Nuitka, Cython, or any C extension) — the module is perfectly importable but the pre-check rejects it because only the .py extension is considered. Replace the filesystem check with importlib.util.find_spec(), which is Python's standard way to ask "is this module resolvable by the import system?". It honors every registered finder — source .py, compiled .so, bytecode .pyc, namespace packages, zipimports — so the function now matches what its docstring claims: verifying the module can be loaded, not that a specific file extension is present. Behavior is preserved for existing deployments: - Images with no `pluggable_worker/<name>/` subpackage → find_spec raises ModuleNotFoundError (ImportError subclass) → returns False. - Images with source .py → find_spec resolves the .py → returns True. - Images with compiled .so → find_spec resolves the .so → returns True. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * [FIX] Handle ValueError from find_spec in pluggable worker verification Greptile-flagged edge case: importlib.util.find_spec() can raise ValueError (not just ImportError) when sys.modules has a partially initialised module entry with __spec__ = None from a prior failed import. Broaden the except to catch both. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * [FIX] Resolve api-deployment worker directory from enum import path worker.py:452 did worker_type.value.replace("-", "_") to derive the on-disk dir name. All WorkerType enum values already use underscores, so the replace was a no-op; for API_DEPLOYMENT whose dir is "api-deployment" (hyphen), it resolved to "api_deployment" and the os.path.exists() check failed. Boot then logged a spurious "❌ Worker directory not found: /app/api_deployment" at ERROR level. The task registration path (builder + celery autodiscover via to_import_path) is unaffected, so this was purely log noise — but noise at ERROR level that masks real failures in log scans. Fix: derive the directory from the authoritative to_import_path() which already handles the hyphen case (api_deployment -> api-deployment). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * [HOTFIX] Add IAM Role / Instance Profile auth mode to AWS Bedrock adapter (#1944) * [FEAT] Allow Bedrock to fall through to boto3's default credential chain Match the S3/MinIO connector pattern: when AWS access keys are left blank on the Bedrock LLM and embedding adapter forms, drop them from the kwargs dict so boto3's default credential chain handles authentication. This unlocks IAM role / instance profile / IRSA / AWS Profile scenarios on hosts that already have ambient AWS credentials (e.g. EKS workers with IRSA, EC2 with an instance profile). - llm1/static/bedrock.json: clarify access-key descriptions to mention IRSA and instance profile (already non-required at v0.163.2 base). - embedding1/static/bedrock.json: drop aws_access_key_id and aws_secret_access_key from top-level required; same description fix; expose aws_profile_name for parity with the LLM form. - base1.py: AWSBedrockLLMParameters and AWSBedrockEmbeddingParameters now strip empty access-key values from the validated kwargs before returning, so empty strings don't override boto3's default chain. AWSBedrockEmbeddingParameters fields gain explicit None defaults and an aws_profile_name field. Backward-compatible: existing adapters with access keys filled in continue to work unchanged. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * [FEAT] Add Authentication Type selector to Bedrock adapter form Add an explicit `auth_type` selector with two options, making the auth choice clear to users: - "Access Keys" (default): existing flow, keys required - "IAM Role / Instance Profile (on-prem AWS only)": no fields; relies on boto3's default credential chain (IRSA on EKS, task role on ECS, instance profile on EC2). Description on the selector explicitly notes this option is only for AWS-hosted Unstract deployments. The form-only auth_type field is stripped before LiteLLM validation in both AWSBedrockLLMParameters.validate() and AWSBedrockEmbeddingParameters. validate(). Empty access keys continue to be stripped so boto3 falls through to the default chain even when the access_keys arm is selected without values (matches the S3/MinIO connector pattern). Backward-compatible: legacy adapters without auth_type behave as "Access Keys" mode (the default), and existing keys are forwarded unchanged. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * [REVIEW] Address Bedrock auth_type review feedback Fixes the P0/P1 issues raised by greptile-apps and jaseemjaskp on PR #1944. Behaviour fixes: - Stale-key leak in IAM Role mode: switching an existing adapter from Access Keys to IAM Role would carry truthy stored access keys through the strip-empty-only loop, so boto3 silently authenticated with the old long-lived credentials instead of falling through to the host's IRSA / instance-profile identity. Both LLM and embedding paths were affected. - Silent acceptance of unknown auth_type: a typo (e.g. "access_key") or a malformed payload from a non-UI client passed through the dict comprehension untouched, with no enum guard. - Cross-field validation gap: explicit Access Keys mode with blank or whitespace-only values silently fell through to the default credential chain instead of surfacing the misconfiguration. Implementation: - Add a module-level _resolve_bedrock_aws_credentials helper used by both AWSBedrockLLMParameters.validate() and AWSBedrock EmbeddingParameters.validate(), so the auth-type contract is expressed once. - Validates auth_type against an allowlist (None | "access_keys" | "iam_role"); raises ValueError on anything else. - iam_role: unconditionally drops aws_access_key_id and aws_secret_access_key. - access_keys (explicit): requires non-blank values; raises ValueError if either is empty or whitespace-only. - Legacy (auth_type absent): retains the lenient strip behaviour so pre-PR adapter configurations continue to deserialise unchanged. - Restore aws_region_name as required (no `= None` default) on AWSBedrockEmbeddingParameters; only credentials may legitimately be absent. - Drop the orphan aws_profile_name field from embedding1/static/bedrock.json: it was added for parity with the LLM form but lives outside the auth_type oneOf and contradicts the selector's "no further input" semantics. The LLM form already had aws_profile_name pre-PR and is left alone for backwards compatibility. Tests: - New tests/test_bedrock_adapter.py covers 15 cases across LLM and embedding adapters: legacy-no-auth-type, explicit access_keys with valid/blank/whitespace keys, iam_role with stale/no keys, unknown auth_type rejection, cross-field validation, and preservation of unrelated params (model_id, aws_profile_name, region, thinking). Skipped (P2 nice-to-have): - Comment-scope clarification, MinIO reference rewording, validate-mutates-caller'\''s-dict, and the LLM form description nit about aws_profile_name visibility. These don'\''t change behaviour and can be addressed in a follow-up. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> * [HOTFIX] Bump litellm to 1.83.10 from PyPI to clear CVE-2026-42208 (#1976) Hotfix for cloud v0.159.3 (OSS v0.163.4). Customer scanner flagged litellm 1.82.3 for CVE-2026-42208 (SQL injection in litellm proxy auth path, affects 1.81.16-1.83.6). We do not use litellm.proxy, but vulnerability scanners flag the installed package regardless of which code path is reachable. Bump to 1.83.10 — the exact version recommended by the upstream advisory (v1.83.10-stable) and the smallest jump that clears the CVE range while keeping python-dotenv==1.0.1 compatible (1.83.14 would force bumping python-dotenv across 7+ pyproject.toml files). Only tiktoken needed to move 0.9 -> 0.12 to satisfy litellm's pin. Switch source back to PyPI now that the PyPI quarantine is over, reversing the temporary fork in #1873. Cohere embed timeout patch: verified that litellm/llms/cohere/embed/handler.py is byte-identical between v1.82.3, v1.83.10-stable, and v1.83.14-stable (the timeout-not-forwarded bug fixed in #1848 is still present upstream — BerriAI/litellm#14635 remains OPEN). Version guard bumped 1.82.3 -> 1.83.10; 6/6 patch tests pass on the new version, confirming the monkey-patch still binds correctly. Other cleanup from #1873: - Drop git apt-install from worker-unified and tool Dockerfiles (no git-sourced deps remain in any uv.lock) - Bump tool versions: structure 0.0.100 -> 0.0.101, classifier 0.0.79 -> 0.0.80, text_extractor 0.0.75 -> 0.0.76 Note on root uv.lock churn: the v0.163.4 root uv.lock had a pre-existing corruption (banks v2.4.1 entry pointing at banks-2.2.0 wheel) that blocked incremental resolution. Regenerated from scratch. Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * [FIX] Align cohere patch docstring with version-guard semantics Reviewer flagged that the docstring claimed the patch is "confirmed in every release between 1.82.3 and 1.83.14-stable", but the guard at _PATCHED_LITELLM_VERSION activates only on the exact pinned version. A future maintainer reading the old text could reasonably expect bumping to e.g. 1.83.11 to keep the fix active; in reality it silently turns off. Rewritten to reference _PATCHED_LITELLM_VERSION as the single source of truth and to drop the rot-prone "as of 2026-05-20" calendar date. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> --------- Co-authored-by: Chandrasekharan M <117059509+chandrasekharan-zipstack@users.noreply.github.com> Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>



What
Monkey-patch litellm's cohere embed handler to correctly forward the
timeoutparameter toclient.post()calls, fixing "Connection timed out after None seconds" errors when indexing large documents with AWS Bedrock embedding models.Why
litellm (v1.82.3) has a bug in
litellm/llms/cohere/embed/handler.pywhere bothembedding()andasync_embedding()receive atimeoutparameter but never forward it toclient.post(). This causes the timeout to default toNone, which surfaces as:This affects all Bedrock Cohere embedding operations (e.g.
cohere.embed-multilingual-v3) and is especially visible with large documents. The bug is present on litellm's latestmainbranch as well — no upstream fix exists.How
unstract/sdk1/patches/litellm_cohere_timeout.py) that replaces the affected functions with versions that correctly passtimeout=timeouttoclient.post()cohere_embedding)DeprecationWarningto prompt verification# ONLY CHANGE)unstract.sdk1.embeddingCan this PR break any existing features. If yes, please list possible items. If no, please explain why. (PS: Admins do not merge the PR without this section filled)
No. The patched functions are exact copies of litellm 1.82.3's originals with only
timeout=timeoutadded toclient.post()calls. litellm is pinned at1.82.3in sdk1, so the source won't change. If litellm is later upgraded, the version guard skips the patch entirely and emits a warning.Database Migrations
Env Config
Relevant Docs
Related Issues or PRs
Large_1040.pdf)Dependencies Versions
litellm: 1.82.3 (pinned, bug present)httpx: 0.28.1Notes on Testing
uv run pytest tests/patches/test_litellm_cohere_timeout.py -vScreenshots
N/A
Checklist
I have read and understood the Contribution Guidelines.
🤖 Generated with Claude Code