App Version
3.16.6
API Provider
LM Studio
Model Used
Qwen3-32B
🔁 Steps to Reproduce
I am trying to use a local Qwen3-32B model served via llama.cpp. To do so, I point the LM Studio integration at the local server. Everything works fine, but after 10 minutes (600 seconds) the connection is dropped and I get an "API Request Failed" message. Inference runs on the CPU and is quite slow, but I would be happy to let it crunch while I do something else. With the tiny Qwen3-0.6B model, inference is fast enough and everything works as expected (although with very mediocre results).
When the request fails, llama.cpp finishes processing the prompt anyway, so a retry succeeds because the prompt is already in the cache.
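For what it's worth, the behavior can be checked outside the app with a plain HTTP client. Below is a minimal Python sketch, assuming the llama.cpp server exposes the OpenAI-compatible `/v1/chat/completions` endpoint on its default port; the URL, port, and model name are guesses and should be adjusted to the actual setup:

```python
import requests

# Assumed endpoint: llama-server defaults to port 8080 (LM Studio uses 1234).
URL = "http://localhost:8080/v1/chat/completions"

payload = {
    "model": "Qwen3-32B",  # assumed model identifier; adjust to the loaded model
    "messages": [{"role": "user", "content": "Write a long essay about CPUs."}],
    # A non-streamed response keeps one long-lived connection open with no
    # bytes sent until generation completes, mirroring what the app does.
    "stream": False,
}

try:
    # With a 600 s read timeout, slow CPU inference that takes longer than
    # 10 minutes aborts client-side while the server keeps generating,
    # which matches the reported behavior. Raising the timeout (or removing
    # it) lets the request complete.
    resp = requests.post(URL, json=payload, timeout=600)
    print(resp.json()["choices"][0]["message"]["content"])
except requests.exceptions.Timeout:
    print("Client gave up after 600 s; the server is likely still generating.")
```

With a larger timeout value the same request completes normally, which suggests the 600-second limit lives in the client rather than in llama.cpp.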
💥 Outcome Summary (Optional)
No response
📄 Relevant Logs or Errors