Add pre-flight rate limiting with token accounting #2

Merged: TTK95 merged 1 commit into dev from claude/rwth-api-rate-limits-algv0 on Apr 21, 2026

TTK95 (Owner) commented on Apr 21, 2026

Extends the existing per-provider sliding-window rate limiter with token
windows and a pre-flight check() that blocks requests before they hit
the wire. Stops RWTH (and other) rate-limited providers from surfacing
generic "An internal error occurred" messages after a 429 — the limit
is learned on the first 429, persisted to opencode.json, and on
subsequent runs the pre-flight gate throws a RateLimitError with a
friendly "retry in Ns" message that the retry layer honours via resetAt.
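The pre-flight gate described above can be pictured roughly as follows. This is a minimal sketch, not the code in rate-limit.ts: the `TokenWindow` class, the `Entry` shape, the method signatures, and the message wording are all assumptions made for illustration. Only the names `check`, `tick`, `RateLimitError`, and `resetAt` come from this description.

```typescript
// Hypothetical sketch: names echo the PR description, internals are assumed.
class RateLimitError extends Error {
  readonly resetAt: number // epoch ms at which a retry may succeed
  constructor(message: string, resetAt: number) {
    super(message)
    this.name = "RateLimitError"
    this.resetAt = resetAt
  }
}

interface Entry {
  at: number // timestamp in ms when the request was counted
  tokens: number
}

class TokenWindow {
  private entries: Entry[] = []
  private readonly limit: number
  private readonly windowMs: number

  constructor(limit: number, windowMs: number) {
    this.limit = limit
    this.windowMs = windowMs
  }

  // Drop entries that have slid out of the window.
  private prune(now: number): void {
    this.entries = this.entries.filter((e) => e.at > now - this.windowMs)
  }

  // Pre-flight gate: throws before the request is sent if adding
  // `tokens` would push the window over its limit.
  check(tokens: number, now: number = Date.now()): void {
    this.prune(now)
    const used = this.entries.reduce((sum, e) => sum + e.tokens, 0)
    if (used + tokens <= this.limit) return
    // Capacity first frees up when the oldest entry leaves the window.
    // That may not free enough for this request, so a retry can hit the gate again.
    const resetAt =
      this.entries.length > 0 ? this.entries[0].at + this.windowMs : now + this.windowMs
    const seconds = Math.ceil((resetAt - now) / 1000)
    throw new RateLimitError(`Rate limit reached, retry in ${seconds}s`, resetAt)
  }

  // Count a request (with its token estimate) against the window.
  tick(tokens: number, now: number = Date.now()): void {
    this.entries.push({ at: now, tokens })
  }
}
```

In this sketch a caller would run `check()` and then `tick()` immediately before issuing the request, so a request that would exceed the window fails locally with a `resetAt` instead of producing a 429 from the provider.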

  • rate-limit.ts: token windows (minute + day), check() gate, recordUsage,
    estimateRequestTokens, configure() to seed from opencode.json options,
    onRateLimitError persists token limits too
  • provider.ts fetch wrapper: pre-flight check, RateLimit.configure from
    options.rateLimit, tick with token estimate
  • session/processor.ts: call RateLimit.recordUsage on finish-step so
    the pending estimate is replaced with actual usage
  • session/retry.ts: RateLimitError is retryable; delay() honours resetAt
  • provider/error.ts: friendly message for 429 responses; mark retryable
  • session/message-v2.ts: RateLimitError NamedError, converts to APIError
    shape in fromError so the SDK type surface is unchanged
  • config/provider.ts: tokensPerMinute / tokensPerDay schema fields
  • cli/cmd/stats.ts: RATE LIMITS section reads persisted limits
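The estimate-then-reconcile flow and the `resetAt`-aware retry delay from the list above might look like the sketch below. Everything beyond the names `estimateRequestTokens`, `tick`, `recordUsage`, `delay`, and `resetAt` is assumed: the `UsageLedger` class, the request ID parameter, the four-characters-per-token heuristic, and the backoff constants are illustrative only. Window expiry is left out for brevity.

```typescript
// Hypothetical sketch: identifiers echo the PR description, shapes are assumed.

// Rough heuristic (an assumption): about 4 characters per token.
function estimateRequestTokens(body: string): number {
  return Math.ceil(body.length / 4)
}

class UsageLedger {
  private pending = new Map<string, number>()
  private total = 0

  // Before the request: count the estimate so later checks already see it.
  tick(requestID: string, estimate: number): void {
    this.pending.set(requestID, estimate)
    this.total += estimate
  }

  // On finish-step: replace the pending estimate with the reported usage.
  recordUsage(requestID: string, actual: number): void {
    const estimate = this.pending.get(requestID) ?? 0
    this.pending.delete(requestID)
    this.total += actual - estimate
  }

  get used(): number {
    return this.total
  }
}

// Retry delay in ms: wait until resetAt when it is known and in the future,
// otherwise fall back to capped exponential backoff (constants are assumed).
function delay(attempt: number, resetAt?: number, now: number = Date.now()): number {
  if (resetAt !== undefined && resetAt > now) return resetAt - now
  return Math.min(1000 * 2 ** (attempt - 1), 30000)
}
```

Counting the estimate up front keeps concurrent requests from all passing the gate on stale totals. The later correction keeps the window from drifting when the estimate is off.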

https://claude.ai/code/session_01S3gS4AgfQQBNHvgZqkuiRa

TTK95 merged commit ba0bc34 into dev on Apr 21, 2026
TTK95 deleted the claude/rwth-api-rate-limits-algv0 branch on April 21, 2026, 07:34