test_generate_from_raw_with_format flaky due to missing MAX_NEW_TOKENS

## `test_generate_from_raw_with_format` flaky due to missing MAX_NEW_TOKENS

### Description

`test/backends/test_openai_vllm.py::test_generate_from_raw_with_format` fails intermittently with truncated JSON output.

The test doesn't set `MAX_NEW_TOKENS`, so generation falls back to vLLM's `SamplingParams` default of **16 tokens**. A valid JSON response (e.g. `{"name": "get_sum", "value": 2}`) needs roughly 20 or more tokens, so the output is cut off before the closing brace and fails validation.

The chat-based equivalent (`test_format` in the same file) already sets `MAX_NEW_TOKENS: 256` and passes reliably.

### Error

```
pydantic_core._pydantic_core.ValidationError: 1 validation error for Answer
  Invalid JSON: EOF while parsing an object at line 3 column 12
  input_value='{ \n  "name": "get_sum",\n  "value": 2'
```
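
The failure can be reproduced without a model or GPU. The sketch below feeds the truncated string from the error to the standard-library `json` parser, which is an assumption standing in for the pydantic validation used by the test. The helper name `parses_as_json` is hypothetical.

```python
import json

# Truncated output copied from the error above: generation stopped
# before the closing brace was emitted.
truncated = '{ \n  "name": "get_sum",\n  "value": 2'


def parses_as_json(text: str) -> bool:
    """Return True if `text` is a complete JSON document (hypothetical helper)."""
    try:
        json.loads(text)
        return True
    except json.JSONDecodeError:
        return False


print(parses_as_json(truncated))           # False: the object is never closed
print(parses_as_json(truncated + "\n}"))   # True: one more token would have sufficed
```

Only the final `}` is missing, which is consistent with the output hitting a token cap rather than the model producing malformed JSON.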

### Fix

Add `model_options={ModelOption.MAX_NEW_TOKENS: 256}` to the `generate_from_raw` call in the test, matching what `test_format` already does.
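
To illustrate why the explicit option matters, here is a minimal sketch of the fallback behaviour described above. Everything in it is a stand-in: the `ModelOption` enum, `resolve_max_tokens`, and `DEFAULT_MAX_TOKENS` are hypothetical and do not reproduce the project's or vLLM's actual code. Only the values 16 and 256 come from this issue.

```python
from enum import Enum
from typing import Optional


class ModelOption(Enum):
    """Hypothetical stand-in for the project's ModelOption keys."""
    MAX_NEW_TOKENS = "max_new_tokens"


# The vLLM SamplingParams default cited in this issue.
DEFAULT_MAX_TOKENS = 16


def resolve_max_tokens(model_options: Optional[dict] = None) -> int:
    """Return the generation cap: the explicit option if set, else the default."""
    options = model_options or {}
    return options.get(ModelOption.MAX_NEW_TOKENS, DEFAULT_MAX_TOKENS)


# Without the option, the cap is the 16-token default (too small for the JSON answer).
print(resolve_max_tokens())                                   # 16
# With the option used by test_format, the cap is large enough.
print(resolve_max_tokens({ModelOption.MAX_NEW_TOKENS: 256}))  # 256
```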

### Environment

Observed on an LSF cluster (CUDA, vLLM 0.13+).


test_generate_from_raw_with_format flaky due to missing MAX_NEW_TOKENS #591

`test_generate_from_raw_with_format` flaky due to missing MAX_NEW_TOKENS

Description

Error

Fix

Environment

Metadata

Assignees

Labels

Type

Fields

Projects

Milestone

Relationships

Development

test_generate_from_raw_with_format flaky due to missing MAX_NEW_TOKENS #591

Description

test_generate_from_raw_with_format flaky due to missing MAX_NEW_TOKENS

Description

Error

Fix

Environment

Metadata

Metadata

Assignees

Labels

Type

Fields

Projects

Milestone

Relationships

Development

Issue actions

`test_generate_from_raw_with_format` flaky due to missing MAX_NEW_TOKENS