Summary
When OpenCode triggers automatic context compaction near the context window limit, the compaction request is delivered to the model as a regular user message with no signal indicating it originated from the compaction system. The model treats it as a genuine user question and responds with a full summary. If compaction triggers again shortly after, the model produces a second summary, doubling token consumption. The user sees what appears to be the assistant inexplicably repeating itself.
刚才你让我"停一停,先总结现状",我总结完后你又问了一次"What did we do so far?",我又重复总结了一遍。现在这条 continue 指令出现得有点突兀
Reproduction
Have a long conversation that approaches the context window limit
Continue interacting until auto-compaction triggers
Observe that the assistant produces a summary response
Continue further until compaction triggers again
Observe a second, near-identical summary
In my session, the model received two consecutive prompts:
What did we do so far?
Continue if you have next steps, or stop and ask for clarification if you are unsure how to proceed.
Both were auto-compaction triggers, but indistinguishable from real user input. The model summarized twice and once asked for clarification on what to continue, neither of which served the actual user.
Impact
Token waste: Each compaction trigger spends output tokens on a summary the user did not request
User confusion: From the user's perspective, the assistant appears to be repeating itself for no reason
Workflow disruption: When the model interprets a compaction trigger as a user instruction, it may take unintended actions (e.g. asking clarifying questions, switching context, or attempting to continue work)
Cost: Particularly painful on expensive models (Opus, GPT-4 class) where each summary can cost meaningful amounts
Expected Behavior
The compaction trigger should be distinguishable from real user input. Options:
System message channel: Deliver compaction requests via a system/tool message rather than as a user turn, so the model can recognize the request type
Explicit marker: Wrap compaction triggers in a sentinel tag (e.g. <opencode_auto_compaction>...</opencode_auto_compaction>) similar to existing tags
Silent compaction: Perform compaction internally (summarize and replace older turns in the context) without injecting a visible user-turn message at all
User-visible indicator: Show a UI marker so the user knows compaction happened and is not surprised by the summary
Environment
OpenCode CLI
Model: claude-opus-4-7 (anthropic/claude-opus-4-7)
Platform: darwin
Additional Notes
The user noticed this only because compaction triggered twice in close succession, producing two summaries. Single-trigger cases likely go unnoticed but still waste tokens on every long session.
Summary
When OpenCode triggers automatic context compaction near the context window limit, the compaction request is delivered to the model as a regular user message with no signal indicating it originated from the compaction system. The model treats it as a genuine user question and responds with a full summary. If compaction triggers again shortly after, the model produces a second summary, doubling token consumption. The user sees what appears to be the assistant inexplicably repeating itself.
刚才你让我"停一停,先总结现状",我总结完后你又问了一次"What did we do so far?",我又重复总结了一遍。现在这条 continue 指令出现得有点突兀
Reproduction
Have a long conversation that approaches the context window limit
Continue interacting until auto-compaction triggers
Observe that the assistant produces a summary response
Continue further until compaction triggers again
Observe a second, near-identical summary
In my session, the model received two consecutive prompts:
What did we do so far?
Continue if you have next steps, or stop and ask for clarification if you are unsure how to proceed.
Both were auto-compaction triggers, but indistinguishable from real user input. The model summarized twice and once asked for clarification on what to continue, neither of which served the actual user.
Impact
Token waste: Each compaction trigger spends output tokens on a summary the user did not request
User confusion: From the user's perspective, the assistant appears to be repeating itself for no reason
Workflow disruption: When the model interprets a compaction trigger as a user instruction, it may take unintended actions (e.g. asking clarifying questions, switching context, or attempting to continue work)
Cost: Particularly painful on expensive models (Opus, GPT-4 class) where each summary can cost meaningful amounts
Expected Behavior
The compaction trigger should be distinguishable from real user input. Options:
System message channel: Deliver compaction requests via a system/tool message rather than as a user turn, so the model can recognize the request type
Explicit marker: Wrap compaction triggers in a sentinel tag (e.g. <opencode_auto_compaction>...</opencode_auto_compaction>) similar to existing tags
Silent compaction: Perform compaction internally (summarize and replace older turns in the context) without injecting a visible user-turn message at all
User-visible indicator: Show a UI marker so the user knows compaction happened and is not surprised by the summary
Environment
OpenCode CLI
Model: claude-opus-4-7 (anthropic/claude-opus-4-7)
Platform: darwin
Additional Notes
The user noticed this only because compaction triggered twice in close succession, producing two summaries. Single-trigger cases likely go unnoticed but still waste tokens on every long session.