Summary
When using a custom OpenAI-compatible provider, image attachments sent through OpenCode sessions are not reaching the model as usable vision input.
In my case:
- provider: longent
- model: gpt-5.4(xhigh)
- transport: OpenCode desktop / session API
The exact same provider/model can successfully answer a vision request when called directly via the provider's OpenAI-compatible /chat/completions API using image_url.
But when I send an image through OpenCode as a file part (mime: image/png), the assistant responds with:
I can't read remote-test.png here because this model doesn't support image input.
So this looks like an OpenCode adapter/translation issue in the custom provider + file attachment path, not a model capability issue.
Expected behavior
If a custom OpenAI-compatible provider/model supports image input, sending an image attachment through OpenCode should be translated into a valid multimodal request and the model should be able to read the image.
Actual behavior
OpenCode session messages with file image parts end up behaving as if the model does not support image input, even though the same model works with direct API vision calls.
Repro
1. Configure a custom OpenAI-compatible provider
Example config shape:
{
  "provider": {
    "longent": {
      "name": "Longent",
      "npm": "@ai-sdk/openai-compatible",
      "options": {
        "baseURL": "https://longent.tech/v1",
        "apiKey": "REDACTED"
      },
      "models": {
        "gpt-5.4(xhigh)": {
          "name": "GPT-5.4 XHigh",
          "options": {
            "reasoningEffort": "xhigh"
          }
        }
      }
    }
  }
}
2. Direct provider API works
A direct request to the same provider/model with image_url succeeds.
Example payload:
{
  "model": "gpt-5.4(xhigh)",
  "messages": [
    {
      "role": "user",
      "content": [
        {"type": "text", "text": "What color is this image? Reply with one word."},
        {"type": "image_url", "image_url": {"url": "https://dummyimage.com/10x10/ff0000/ffffff.png"}}
      ]
    }
  ],
  "max_tokens": 30
}
Response is correct (red / Red).
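For anyone who wants to rerun the direct check, here is a minimal TypeScript sketch of that request. The helper names `buildVisionRequest` and `askDirect` are hypothetical, and the Bearer-token header and the `choices[0].message.content` response shape are assumptions based on the usual OpenAI-compatible convention, not something confirmed for this provider.

```typescript
// Content parts of an OpenAI-compatible chat message (text plus image_url).
type ContentPart =
  | { type: "text"; text: string }
  | { type: "image_url"; image_url: { url: string } };

interface ChatRequest {
  model: string;
  messages: { role: "user"; content: ContentPart[] }[];
  max_tokens: number;
}

// Hypothetical helper: builds the same payload shape as the example above.
function buildVisionRequest(model: string, prompt: string, imageUrl: string): ChatRequest {
  return {
    model,
    messages: [
      {
        role: "user",
        content: [
          { type: "text", text: prompt },
          { type: "image_url", image_url: { url: imageUrl } },
        ],
      },
    ],
    max_tokens: 30,
  };
}

// Hypothetical helper: POSTs the payload to <baseURL>/chat/completions.
// Assumes Bearer auth and the standard chat completions response shape.
async function askDirect(baseURL: string, apiKey: string, body: ChatRequest): Promise<string> {
  const res = await fetch(`${baseURL}/chat/completions`, {
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      Authorization: `Bearer ${apiKey}`,
    },
    body: JSON.stringify(body),
  });
  if (!res.ok) throw new Error(`HTTP ${res.status}`);
  const json = (await res.json()) as { choices: { message: { content: string } }[] };
  return json.choices[0].message.content;
}
```

Calling `askDirect("https://longent.tech/v1", apiKey, buildVisionRequest("gpt-5.4(xhigh)", prompt, imageUrl))` should reproduce the direct request described in this step.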
3. OpenCode session API fails for image file attachment
Create a session, then send a message with parts:
{
  "model": {"providerID": "longent", "modelID": "gpt-5.4(xhigh)"},
  "agent": "build",
  "parts": [
    {"type": "text", "text": "What color is this image? Reply with one word."},
    {
      "type": "file",
      "mime": "image/png",
      "filename": "remote-test.png",
      "url": "https://dummyimage.com/10x10/ff0000/ffffff.png"
    }
  ]
}
Returned assistant text:
I can't read `remote-test.png` here because this model doesn't support image input.
Additional observations
- I also reproduced the issue from the OpenCode desktop app by attaching a local PNG file.
- In the recorded session message, the user input is stored as a file part with mime: image/png.
- The session itself is definitely using the custom provider/model (providerID=longent, modelID=gpt-5.4(xhigh)).
- This strongly suggests the problem is in how OpenCode maps file image parts for custom OpenAI-compatible providers.
Environment
- OpenCode desktop
- OpenCode session version observed: 1.3.13
- macOS
Hypothesis
OpenCode may be handling image attachments correctly for some built-in providers, but for @ai-sdk/openai-compatible custom providers it looks like file image parts are not being converted into the multimodal format the upstream model expects.
Possibly the provider path needs to translate image file parts into something equivalent to image_url / image content parts for OpenAI-compatible chat requests.
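To make the hypothesis concrete, here is a rough TypeScript sketch of the kind of mapping I would expect somewhere in the custom provider path. This is not OpenCode's actual code: the `SessionPart` type is only modelled on the parts in the repro payload, `toChatContent` is a hypothetical name, and the text fallback for non-image files is my own assumption.

```typescript
// Hypothetical types modelled on the session message parts shown in the repro.
type SessionPart =
  | { type: "text"; text: string }
  | { type: "file"; mime: string; filename?: string; url: string };

// Content parts of an OpenAI-compatible chat message.
type ChatContentPart =
  | { type: "text"; text: string }
  | { type: "image_url"; image_url: { url: string } };

// Hypothetical mapping: image file parts become image_url parts,
// text parts pass through, other files become a placeholder text part.
function toChatContent(parts: SessionPart[]): ChatContentPart[] {
  return parts.map((part): ChatContentPart => {
    if (part.type === "text") {
      return { type: "text", text: part.text };
    }
    if (part.mime.startsWith("image/")) {
      return { type: "image_url", image_url: { url: part.url } };
    }
    // Assumption: non-image attachments are out of scope for this sketch.
    return { type: "text", text: `[unsupported attachment: ${part.filename ?? part.mime}]` };
  });
}
```

Applied to the parts from the repro, this would yield the same content array as the direct request that works. A locally attached PNG would presumably arrive as a data: or file URL instead of an https URL, and might need to be inlined as a data URL before being sent upstream.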