Treat CAPIError: 429 429 quota exceeded as non-retryable in copilot_harness.cjs #39479

@aoxiangtianyu-go

Summary

When Copilot returns the following observed error:

CAPIError: 429 429 quota exceeded

copilot_harness.cjs currently treats the failed run as a generic partial-execution failure and retries it with --continue.

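Each retry fails with the same quota error, so in the log below the harness makes four attempts and takes 6m 59s to report a failure that was already evident after the first attempt.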

Current behavior

The Copilot CLI reports quota exhaustion:

Failed to get response from the AI model; retried 5 times (total retry wait time: 92.63 seconds) (Request-ID 2BC0:3FC2C9:DA2E94:F30DE3:6A30B419) Last error: CAPIError: 429 429 quota exceeded

Changes   +0 -0
Duration  1m 34s

Then copilot-harness treats it as partial execution and retries:

[copilot-harness] 2026-06-16T02:26:07.647Z attempt 1 failed: exitCode=1 isCAPIError400=false isMCPPolicyError=false isModelNotSupportedError=false isNullTypeToolCallError=false isAuthError=false isAuthenticationFailedError=false permissionDeniedCount=0 hasNumerousPermissionDenied=false hasOutput=true retriesRemaining=3
[copilot-harness] 2026-06-16T02:26:07.647Z attempt 1: partial execution — will retry with --continue (attempt 2/4)
[copilot-harness] 2026-06-16T02:26:07.647Z retry 1/3: sleeping 5000ms before next attempt (--continue)

The same error repeats, and the harness eventually exhausts all retries:

Failed to get response from the AI model; retried 5 times (total retry wait time: 91.86 seconds) (Request-ID 2BC0:3FC2C9:DFEEB7:F9691C:6A30B55E) Last error: CAPIError: 429 429 quota exceeded

Changes   +0 -0
Duration  6m 58s
[copilot-harness] 2026-06-16T02:31:31.753Z attempt 4 failed: exitCode=1 isCAPIError400=false isMCPPolicyError=false isModelNotSupportedError=false isNullTypeToolCallError=false isAuthError=false isAuthenticationFailedError=false permissionDeniedCount=0 hasNumerousPermissionDenied=false hasOutput=true retriesRemaining=0
[copilot-harness] 2026-06-16T02:31:31.753Z all 3 retries exhausted — giving up (exitCode=1)
[copilot-harness] 2026-06-16T02:31:31.758Z done: exitCode=1 totalDuration=6m 59s
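The sequence above is what a retry rule keyed only on exit code and hasOutput would produce. The following is a minimal sketch of such a rule, modelled on the inlined shouldRetry test helper quoted later in this issue. It is not the harness code itself: the permission-denied check is left out for brevity, MAX_RETRIES = 3 is inferred from the "all 3 retries exhausted" log line, and shouldRetryCurrent and countAttempts are hypothetical names used only for illustration.

```javascript
// Sketch only: approximates the current retry decision described in this issue.
// MAX_RETRIES = 3 is an assumption inferred from the harness log.
const MAX_RETRIES = 3;

// Hypothetical stand-in for the current rule: retry any failed run that produced output.
function shouldRetryCurrent(result, attempt) {
  if (result.exitCode === 0) return false;
  return attempt < MAX_RETRIES && result.hasOutput;
}

// Counts how many attempts are made when every attempt ends with the same result.
function countAttempts(result) {
  let attempts = 0;
  for (let attempt = 0; attempt <= MAX_RETRIES; attempt++) {
    attempts++;
    if (!shouldRetryCurrent(result, attempt)) break;
  }
  return attempts;
}

const quotaFailure = {
  exitCode: 1,
  hasOutput: true,
  output: "Last error: CAPIError: 429 429 quota exceeded",
};

console.log(countAttempts(quotaFailure)); // 4
```

Nothing in this rule inspects the error text, which is why a quota failure with hasOutput=true goes through all four attempts.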

Expected behavior

CAPIError: 429 429 quota exceeded should be treated as non-retryable by copilot_harness.cjs.

The harness should fail fast instead of retrying with --continue.

A clearer log message would be useful, for example:

attempt 1: Copilot quota exceeded — not retrying

Proposed implementation

Please update actions/setup/js/copilot_harness.cjs.

1. Add a narrow pattern for the observed error

Near the existing Copilot error patterns:

const CAPI_ERROR_400_PATTERN = /CAPIError:\s*400/;

add:

// Pattern to detect the observed Copilot/CAPI quota exhaustion error.
const CAPI_QUOTA_EXCEEDED_PATTERN = /CAPIError:\s*429\s+429\s+quota exceeded/i;
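As a quick check, both patterns can be run against the log line quoted under "Current behavior". This sketch assumes only the two regular expressions shown in this issue; isTransientCAPIError may do more than test CAPI_ERROR_400_PATTERN, so the first result is indicative rather than a statement about the helper.

```javascript
// Both patterns are copied from this issue; the log line is the first observed failure.
const CAPI_ERROR_400_PATTERN = /CAPIError:\s*400/;
const CAPI_QUOTA_EXCEEDED_PATTERN = /CAPIError:\s*429\s+429\s+quota exceeded/i;

const observedLine =
  "Failed to get response from the AI model; retried 5 times " +
  "(total retry wait time: 92.63 seconds) " +
  "(Request-ID 2BC0:3FC2C9:DA2E94:F30DE3:6A30B419) " +
  "Last error: CAPIError: 429 429 quota exceeded";

const matches = {
  capi400: CAPI_ERROR_400_PATTERN.test(observedLine),
  quotaExceeded: CAPI_QUOTA_EXCEEDED_PATTERN.test(observedLine),
};

console.log(matches); // { capi400: false, quotaExceeded: true }
```

The capi400 result is consistent with isCAPIError400=false in the harness log, and the new pattern is the only one of the two that recognizes the failure.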

2. Add a helper function

Near isTransientCAPIError, add:

/**
 * Determines if the collected output contains the observed Copilot/CAPI quota exhaustion error.
 * @param {string} output - Collected stdout+stderr from the process
 * @returns {boolean}
 */
function isCAPIQuotaExceededError(output) {
  return CAPI_QUOTA_EXCEEDED_PATTERN.test(output);
}
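One property of the proposed pattern is worth preserving if its flags are ever changed: it has no g or y flag, so RegExp.prototype.test keeps no state between calls and the helper returns the same answer however often it is called on the same output. The sketch below contrasts it with a hypothetical global variant (GLOBAL_VARIANT is not part of the proposal and exists only to show the pitfall).

```javascript
// Proposed pattern from this issue: case-insensitive, not global.
const CAPI_QUOTA_EXCEEDED_PATTERN = /CAPIError:\s*429\s+429\s+quota exceeded/i;

// Hypothetical variant with the g flag added, shown only to illustrate the pitfall.
const GLOBAL_VARIANT = /CAPIError:\s*429\s+429\s+quota exceeded/gi;

const output = "Last error: CAPIError: 429 429 quota exceeded";

// Call test() three times on the same string with each pattern.
const stable = [1, 2, 3].map(() => CAPI_QUOTA_EXCEEDED_PATTERN.test(output));
const flaky = [1, 2, 3].map(() => GLOBAL_VARIANT.test(output));

console.log(stable); // [ true, true, true ]
console.log(flaky); // [ true, false, true ]
```

The global variant resumes each search from lastIndex, so the second call starts past the match and reports false.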

3. Use the helper inside the retry loop

In the retry loop, near the existing detections:

const isCAPIError = isTransientCAPIError(result.output);
const isMCPPolicy = isMCPPolicyError(result.output);
const isModelNotSupported = isModelNotSupportedError(result.output);
const isAuthErr = isNoAuthInfoError(result.output);
const isAuthenticationFailed = isAuthenticationFailedError(result.output);
const proxyAuthDiagnostic = buildCopilotProxyAuthFailureDiagnostic(result.output, process.env);
const isNullTypeToolCall = isNullTypeToolCallError(result.output);

add:

const isQuotaExceeded = isCAPIQuotaExceededError(result.output);

Then include it in the diagnostic log:

log(
  `attempt ${attempt + 1} failed:` +
    ` exitCode=${result.exitCode}` +
    ` isCAPIError400=${isCAPIError}` +
    ` isCAPIQuotaExceededError=${isQuotaExceeded}` +
    ` isMCPPolicyError=${isMCPPolicy}` +
    ` isModelNotSupportedError=${isModelNotSupported}` +
    ` isNullTypeToolCallError=${isNullTypeToolCall}` +
    ` isAuthError=${isAuthErr}` +
    ` isAuthenticationFailedError=${isAuthenticationFailed}` +
    ` permissionDeniedCount=${permissionDeniedCount}` +
    ` hasNumerousPermissionDenied=${hasNumerousPermissionDenied}` +
    ` hasOutput=${result.hasOutput}` +
    ` retriesRemaining=${MAX_RETRIES - attempt}`
);

4. Stop before the generic partial-execution retry branch

Add this check before the existing generic partial-execution retry branch:

// Retrying the observed quota exhaustion error with --continue is not useful.
if (isQuotaExceeded) {
  log(`attempt ${attempt + 1}: Copilot quota exceeded — not retrying`);
  break;
}

This should be placed before:

if (attempt < MAX_RETRIES && result.hasOutput) {
  const reason = isCAPIError ? "CAPIError 400 (transient)" : "partial execution";
  ...
  continue;
}

Otherwise the quota failure, which has hasOutput=true in the log above, still satisfies the partial-execution condition and is retried.
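The ordering requirement can be illustrated with a stripped-down loop. This is a sketch under stated assumptions, not the real harness: runWithRetries, runAttempt, and countAttempts are hypothetical names, the loop is synchronous for brevity where the real one is likely asynchronous, MAX_RETRIES = 3 is inferred from the log, the log wording is simplified, and every other detection (MCP policy, auth, permission-denied) is omitted.

```javascript
// Sketch only: a simplified retry loop showing where the quota check sits.
const MAX_RETRIES = 3; // assumption inferred from "all 3 retries exhausted"
const CAPI_QUOTA_EXCEEDED_PATTERN = /CAPIError:\s*429\s+429\s+quota exceeded/i;

function isCAPIQuotaExceededError(output) {
  return CAPI_QUOTA_EXCEEDED_PATTERN.test(output);
}

// Hypothetical driver: runAttempt(attempt) returns { exitCode, hasOutput, output }.
function runWithRetries(runAttempt, log = () => {}) {
  let result;
  for (let attempt = 0; attempt <= MAX_RETRIES; attempt++) {
    result = runAttempt(attempt);
    if (result.exitCode === 0) break;

    // Checked first, so the partial-execution branch below never sees a quota failure.
    if (isCAPIQuotaExceededError(result.output)) {
      log(`attempt ${attempt + 1}: Copilot quota exceeded; not retrying`);
      break;
    }

    if (attempt < MAX_RETRIES && result.hasOutput) {
      log(`attempt ${attempt + 1}: partial execution; will retry with --continue`);
      continue;
    }
    break;
  }
  return result;
}

// Counts attempts when every run fails with the given output.
function countAttempts(output) {
  let calls = 0;
  runWithRetries(() => {
    calls++;
    return { exitCode: 1, hasOutput: true, output };
  });
  return calls;
}

console.log(countAttempts("Last error: CAPIError: 429 429 quota exceeded")); // 1
console.log(countAttempts("Error: connection reset by peer")); // 4
```

Swapping the two if blocks in the sketch brings the quota case back to four attempts, which is the behavior this issue reports.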

5. Export the helper for tests

At the bottom of copilot_harness.cjs, add the helper to module.exports:

isCAPIQuotaExceededError,

Test updates

Please update actions/setup/js/copilot_harness.test.cjs.

1. Import the new helper

Add isCAPIQuotaExceededError to the existing destructuring import from copilot_harness.cjs:

const {
  ...
  isCAPIQuotaExceededError,
  ...
} = require("./copilot_harness.cjs");

2. Add unit tests for the observed quota-exhaustion error

Add a new describe block near the existing CAPIError detection tests:

describe("CAPI quota-exceeded detection pattern", () => {
  it("matches the observed CAPIError 429 quota exceeded error", () => {
    expect(isCAPIQuotaExceededError("CAPIError: 429 429 quota exceeded")).toBe(true);
  });

  it("matches the observed error when embedded in Copilot CLI output", () => {
    const output =
      "Failed to get response from the AI model; retried 5 times " +
      "(Request-ID ABC123) Last error: CAPIError: 429 429 quota exceeded";
    expect(isCAPIQuotaExceededError(output)).toBe(true);
  });

  it("matches the observed error with extra spacing", () => {
    expect(isCAPIQuotaExceededError("CAPIError: 429   429   quota exceeded")).toBe(true);
  });

  it("does not match CAPIError 400", () => {
    expect(isCAPIQuotaExceededError("CAPIError: 400 Bad Request")).toBe(false);
  });

  it("does not match generic 429 output without the observed quota-exceeded message", () => {
    expect(isCAPIQuotaExceededError("CAPIError: 429 Too Many Requests")).toBe(false);
  });

  it("does not match unrelated errors", () => {
    expect(isCAPIQuotaExceededError("Error: connection reset by peer")).toBe(false);
    expect(isCAPIQuotaExceededError("Authentication failed")).toBe(false);
    expect(isCAPIQuotaExceededError("")).toBe(false);
  });
});

3. Update retry-policy tests

In the existing describe("retry policy: continue on partial execution", ...) block, add the same narrow pattern to the local test helper:

const CAPI_QUOTA_EXCEEDED_PATTERN = /CAPIError:\s*429\s+429\s+quota exceeded/i;

Update the inlined shouldRetry helper from:

function shouldRetry(result, attempt) {
  if (result.exitCode === 0) return false;
  if (hasNumerousPermissionDeniedIssues(result.output)) return false;
  return attempt < MAX_RETRIES && result.hasOutput;
}

to:

function shouldRetry(result, attempt) {
  if (result.exitCode === 0) return false;
  if (hasNumerousPermissionDeniedIssues(result.output)) return false;
  if (CAPI_QUOTA_EXCEEDED_PATTERN.test(result.output)) return false;
  return attempt < MAX_RETRIES && result.hasOutput;
}

Then add:

it("does not retry the observed CAPIError 429 quota exceeded error even when session produced output", () => {
  const result = {
    exitCode: 1,
    hasOutput: true,
    output: "Failed to get response from the AI model; retried 5 times. Last error: CAPIError: 429 429 quota exceeded",
  };

  expect(shouldRetry(result, 0)).toBe(false);
});

it("still retries generic partial-execution errors with output", () => {
  const result = {
    exitCode: 1,
    hasOutput: true,
    output: "Error: connection reset by peer",
  };

  expect(shouldRetry(result, 0)).toBe(true);
});

Validation

Please run:

make test
make build
make lint
make agent-finish
