[MISC] Handle reasoning models in OpenAI-compatible LLM adapter#1984
[MISC] Handle reasoning models in OpenAI-compatible LLM adapter#1984chandrasekharan-zipstack wants to merge 2 commits into
Conversation
gpt-5 and o-series reject `temperature` != 1 and require `max_completion_tokens` over `max_tokens`. litellm cannot auto-correct these for the generic `custom_openai` provider since it has no model metadata, so the adapter now detects reasoning models by name and reshapes the sampling params itself. Adds an `enable_reasoning` toggle for `reasoning_effort` tuning and as an override for reasoning models served under non-standard names. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
|
No actionable comments were generated in the recent review. 🎉 ℹ️ Recent review info⚙️ Run configurationConfiguration used: Organization UI Review profile: CHILL Plan: Pro Run ID: 📒 Files selected for processing (1)
Summary by CodeRabbit
WalkthroughAdds reasoning-model autodetection and sampling-parameter compatibility to the OpenAI-compatible adapter, introduces ChangesOpenAI Reasoning Model Support
🎯 3 (Moderate) | ⏱️ ~20 minutes 🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 1
🧹 Nitpick comments (1)
unstract/sdk1/tests/test_openai_compatible_adapter.py (1)
153-170: ⚡ Quick winAdd a regression test for explicit disable precedence.
Please add a case where payload includes
enable_reasoning=Falseandreasoning_effort="high"and assert reasoning transforms do not run. This guards the explicit-toggle contract.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@unstract/sdk1/tests/test_openai_compatible_adapter.py` around lines 153 - 170, Add a new test (e.g., test_openai_compatible_validate_reasoning_toggle_explicit_disable) that calls OpenAICompatibleLLMParameters.validate with a payload including model="gateway-reasoner", reasoning_effort="high", and enable_reasoning=False; assert that the reasoning transforms do not run by checking that "max_completion_tokens" is not added to the validated dict, that original params like "temperature" are preserved, and that "enable_reasoning" is removed from the final validated output (use the same assertion style as test_openai_compatible_validate_reasoning_toggle_overrides_unknown_name to locate expectations).
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@unstract/sdk1/src/unstract/sdk1/adapters/base1.py`:
- Around line 395-399: The code auto-enables reasoning by checking
adapter_metadata.get("reasoning_effort") even when the caller explicitly set
enable_reasoning=False; change the logic so enable_reasoning is only
auto-enabled when the caller did not provide the enable_reasoning key.
Concretely, replace the current two-line block that sets enable_reasoning from
adapter_metadata.get("enable_reasoning", False) and then flips it if
adapter_metadata.get("reasoning_effort") with a check like: set enable_reasoning
to adapter_metadata.get("enable_reasoning", False) but only set enable_reasoning
= True from adapter_metadata.get("reasoning_effort") when "enable_reasoning" is
not present in adapter_metadata; reference symbols: adapter_metadata,
enable_reasoning, reasoning_effort.
---
Nitpick comments:
In `@unstract/sdk1/tests/test_openai_compatible_adapter.py`:
- Around line 153-170: Add a new test (e.g.,
test_openai_compatible_validate_reasoning_toggle_explicit_disable) that calls
OpenAICompatibleLLMParameters.validate with a payload including
model="gateway-reasoner", reasoning_effort="high", and enable_reasoning=False;
assert that the reasoning transforms do not run by checking that
"max_completion_tokens" is not added to the validated dict, that original params
like "temperature" are preserved, and that "enable_reasoning" is removed from
the final validated output (use the same assertion style as
test_openai_compatible_validate_reasoning_toggle_overrides_unknown_name to
locate expectations).
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 7230f98d-a344-4b23-9868-fe271bf52f93
📒 Files selected for processing (3)
unstract/sdk1/src/unstract/sdk1/adapters/base1.pyunstract/sdk1/src/unstract/sdk1/adapters/llm1/static/custom_openai.jsonunstract/sdk1/tests/test_openai_compatible_adapter.py
| # `reasoning_effort` applies only when reasoning is explicitly enabled. | ||
| enable_reasoning = adapter_metadata.get("enable_reasoning", False) | ||
| if not enable_reasoning and adapter_metadata.get("reasoning_effort"): | ||
| enable_reasoning = True | ||
|
|
There was a problem hiding this comment.
Respect explicit enable_reasoning=False instead of auto-enabling from stale reasoning_effort.
Line 397-Line 399 currently flips reasoning on whenever reasoning_effort is present, even if the caller explicitly sets enable_reasoning: false. That can silently apply reasoning-only param rewrites against user intent.
Suggested fix
- enable_reasoning = adapter_metadata.get("enable_reasoning", False)
- if not enable_reasoning and adapter_metadata.get("reasoning_effort"):
- enable_reasoning = True
+ enable_reasoning = adapter_metadata.get("enable_reasoning", False)
+ has_explicit_toggle = "enable_reasoning" in adapter_metadata
+ has_reasoning_effort = (
+ "reasoning_effort" in adapter_metadata
+ and adapter_metadata.get("reasoning_effort") is not None
+ )
+ # Only infer reasoning during re-validation flows where the control
+ # toggle is absent.
+ if not has_explicit_toggle and has_reasoning_effort:
+ enable_reasoning = True🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@unstract/sdk1/src/unstract/sdk1/adapters/base1.py` around lines 395 - 399,
The code auto-enables reasoning by checking
adapter_metadata.get("reasoning_effort") even when the caller explicitly set
enable_reasoning=False; change the logic so enable_reasoning is only
auto-enabled when the caller did not provide the enable_reasoning key.
Concretely, replace the current two-line block that sets enable_reasoning from
adapter_metadata.get("enable_reasoning", False) and then flips it if
adapter_metadata.get("reasoning_effort") with a check like: set enable_reasoning
to adapter_metadata.get("enable_reasoning", False) but only set enable_reasoning
= True from adapter_metadata.get("reasoning_effort") when "enable_reasoning" is
not present in adapter_metadata; reference symbols: adapter_metadata,
enable_reasoning, reasoning_effort.
|
| Filename | Overview |
|---|---|
| unstract/sdk1/src/unstract/sdk1/adapters/base1.py | Adds _is_reasoning_model() and _apply_reasoning_param_compat() helpers, plus reasoning_effort field and updated validate() logic to OpenAICompatibleLLMParameters. Logic is well-structured; detection regex is correctly anchored against look-alike names. |
| unstract/sdk1/src/unstract/sdk1/adapters/llm1/static/custom_openai.json | Adds enable_reasoning toggle and conditional reasoning_effort property via JSON Schema allOf/if/then. The second allOf entry (enable_reasoning=false → empty then) is a no-op and can be removed. |
| unstract/sdk1/tests/test_openai_compatible_adapter.py | Well-rounded tests cover the happy path, look-alike rejection, prefixed model IDs, and the toggle override. No gaps observed. |
Flowchart
%%{init: {'theme': 'neutral'}}%%
flowchart TD
A[adapter_metadata] --> B[validate_model\nprefix custom_openai/]
B --> C{enable_reasoning\nflag set?}
C -- No --> D{reasoning_effort\nalready present?}
D -- Yes --> E[enable_reasoning = True]
D -- No --> F[enable_reasoning = False]
C -- Yes --> E
E --> G[set reasoning_effort\ndefault medium]
F --> H[exclude reasoning_effort\nfrom validation_metadata]
G --> I[OpenAICompatibleLLMParameters\nPydantic validation]
H --> I
I --> J[model_dump → validated dict]
J --> K{enable_reasoning?}
K -- Yes --> L[inject reasoning_effort\ninto validated]
K -- No --> M[pop reasoning_effort\nfrom validated]
L --> N{enable_reasoning OR\n_is_reasoning_model?}
M --> N
N -- Yes --> O[_apply_reasoning_param_compat\ndrop temperature\nrename max_tokens→max_completion_tokens]
N -- No --> P[return validated unchanged]
O --> Q[return validated]
Prompt To Fix All With AI
Fix the following 2 code review issues. Work through them one at a time, proposing concise fixes.
---
### Issue 1 of 2
unstract/sdk1/src/unstract/sdk1/adapters/base1.py:353
The `o[1-9]` pattern only matches single-digit o-series model names. If OpenAI releases two-digit variants (e.g. `o10`, `o11`), they will silently bypass auto-detection and params will be sent verbatim, reproducing the 400s this PR was written to fix. The `(?:^|/)` prefix anchor already prevents false positives like `some-service-o10`, so extending to `o[1-9][0-9]*` is safe.
```suggestion
re.compile(r"(?:^|/)o[1-9][0-9]*(?=$|[-.])"),
```
### Issue 2 of 2
unstract/sdk1/src/unstract/sdk1/adapters/llm1/static/custom_openai.json:94-107
The second `allOf` entry (when `enable_reasoning` is `false`, apply empty `then`) is a no-op — JSON Schema's `if/then` already passes vacuously when the `then` block has no constraints. Removing it avoids schema clutter and eliminates the same vacuous-true hazard present in the first entry (when `enable_reasoning` is absent, `properties: { enable_reasoning: { const: false } }` also evaluates as vacuously true, triggering the empty `then` harmlessly but unexpectedly).
```suggestion
]
}
```
Reviews (2): Last reviewed commit: "[MISC] Use pytest.approx for float compa..." | Re-trigger Greptile
| { | ||
| "if": { | ||
| "properties": { | ||
| "enable_reasoning": { | ||
| "const": true | ||
| } | ||
| } | ||
| }, |
There was a problem hiding this comment.
In JSON Schema, a
properties constraint in an if block only applies to properties that are present in the instance — if enable_reasoning is absent, the constraint is vacuously true, the if evaluates to true, and reasoning_effort becomes unexpectedly required. Any caller that omits enable_reasoning (e.g. an HTML checkbox that sends nothing when unchecked, or a programmatic API call without the field) will receive an unexpected validation error about reasoning_effort. Adding "required": ["enable_reasoning"] inside the if block pins the condition to instances where the property is actually present.
| { | |
| "if": { | |
| "properties": { | |
| "enable_reasoning": { | |
| "const": true | |
| } | |
| } | |
| }, | |
| { | |
| "if": { | |
| "required": ["enable_reasoning"], | |
| "properties": { | |
| "enable_reasoning": { | |
| "const": true | |
| } | |
| } | |
| }, |
Prompt To Fix With AI
This is a comment left during a code review.
Path: unstract/sdk1/src/unstract/sdk1/adapters/llm1/static/custom_openai.json
Line: 67-74
Comment:
In JSON Schema, a `properties` constraint in an `if` block only applies to properties that are **present** in the instance — if `enable_reasoning` is absent, the constraint is vacuously true, the `if` evaluates to `true`, and `reasoning_effort` becomes unexpectedly required. Any caller that omits `enable_reasoning` (e.g. an HTML checkbox that sends nothing when unchecked, or a programmatic API call without the field) will receive an unexpected validation error about `reasoning_effort`. Adding `"required": ["enable_reasoning"]` inside the `if` block pins the condition to instances where the property is actually present.
```suggestion
{
"if": {
"required": ["enable_reasoning"],
"properties": {
"enable_reasoning": {
"const": true
}
}
},
```
How can I resolve this? If you propose a fix, please make it concise.Resolves SonarCloud python:S1244 (float equality check). Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
|
Test ResultsSummary
Runner Tests - Full Report
SDK1 Tests - Full Report
|
|
Duplicate of #1983 |



What
The OpenAI-compatible LLM adapter now supports reasoning models (gpt-5, o-series). It detects them by model name and reshapes sampling params, and adds an
Enable Reasoningtoggle forreasoning_efforttuning.Why
Testing the OpenAI-compatible adapter against a reasoning model failed with provider 400s:
temperature!= 1 rejected, andmax_tokensnot supported (max_completion_tokensrequired). litellm auto-corrects these for providers it has model metadata for (openai/,azure/), but the genericcustom_openaiprovider has none, so the params were sent verbatim and rejected.How
base1.py: added_is_reasoning_model()(name-based detection, anchored regex to avoid look-alike ids) and_apply_reasoning_param_compat()which dropstemperatureand renamesmax_tokens→max_completion_tokens.OpenAICompatibleLLMParameters.validate()applies this for recognized reasoning models, or when reasoning is explicitly enabled.custom_openai.json: addedenable_reasoningtoggle + conditionalreasoning_effort, mirroring the OpenAI adapter schema.Can this PR break any existing features. If yes, please list possible items. If no, please explain why. (PS: Admins do not merge the PR without this section filled)
No. Non-reasoning OpenAI-compatible models are unaffected —
temperatureandmax_tokensflow through unchanged as before. The param reshaping is gated to recognized reasoning model names or the explicit toggle.Database Migrations
Env Config
Relevant Docs
Related Issues or PRs
Dependencies Versions
Notes on Testing
test_openai_compatible_adapter.py: auto-detect of reasoning models, look-alike name rejection, prefixed model ids, toggle override for non-standard names.cd unstract/sdk1 && uv run pytest tests/test_openai_compatible_adapter.pyScreenshots
Checklist
I have read and understood the Contribution Guidelines.
🤖 Generated with Claude Code