Skip to content

Improve Copilot harness classification for opaque exitCode=1 failures#39959

Merged
pelikhan merged 5 commits into
mainfrom
copilot/aw-failures-fix-cli-exit-code
Jun 18, 2026
Merged

Improve Copilot harness classification for opaque exitCode=1 failures#39959
pelikhan merged 5 commits into
mainfrom
copilot/aw-failures-fix-cli-exit-code

Conversation

Copilot AI commented Jun 18, 2026

Copy link
Copy Markdown
Contributor

Unclassified Copilot harness failures were exhausting retries with exitCode=1 while every built-in classifier stayed false, leaving no actionable cause in the run output. This was especially visible in the chronic Daily Model Inventory failures and AOAI smoke failures, where the underlying signal was present in stderr/output but never surfaced as a distinct failure class.

  • Failure annotations

    • Add a named failureClass to Copilot harness failure logs instead of relying only on a matrix of booleans.
    • Emit a bounded outputTail for failed attempts so the last useful stderr/stdout context is preserved in the failure summary path.
  • New Copilot-specific classifications

    • Classify Copilot SDK session.idle timeouts as sdk_session_idle_timeout.
    • Classify gateway teardown messages as mcp_gateway_shutdown when they appear in agent output.
  • 429 / rate-limit handling

    • Extend Copilot/CAPI quota detection to recognize CAPIError: Too Many Requests in addition to the existing 429 429 quota exceeded form.
    • Keep runtime retry decisions and post-run error detection aligned by updating both harness-time and log-scan classifiers.
  • Representative output

    [copilot-harness] attempt 4 failed: exitCode=1 failureClass=sdk_session_idle_timeout ...
    [copilot-harness] attempt 4: outputTail="[copilot-sdk-driver-sample-node] Timeout after 60000ms waiting for session.idle"
    

This keeps the retry behavior intact for partial executions while making previously opaque failures directly attributable from the harness log alone.

Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>
Copilot AI changed the title [WIP] Fix Copilot CLI exit code with no classified error Improve Copilot harness classification for opaque exitCode=1 failures Jun 18, 2026
Copilot AI requested a review from pelikhan June 18, 2026 05:56
@pelikhan pelikhan marked this pull request as ready for review June 18, 2026 06:00
Copilot AI review requested due to automatic review settings June 18, 2026 06:00
@github-actions

github-actions Bot commented Jun 18, 2026

Copy link
Copy Markdown
Contributor

Design Decision Gate 🏗️ completed the design decision gate check.

No ADR enforcement needed: PR #39959 does not have the 'implementation' label and has 0 new lines of code in business logic directories (threshold 100).

@github-actions

github-actions Bot commented Jun 18, 2026

Copy link
Copy Markdown
Contributor

Test Quality Sentinel completed test quality analysis.

@github-actions

github-actions Bot commented Jun 18, 2026

Copy link
Copy Markdown
Contributor

PR Code Quality Reviewer completed the code quality review.

@github-actions

github-actions Bot commented Jun 18, 2026

Copy link
Copy Markdown
Contributor

🧠 Matt Pocock Skills Reviewer has completed the skills-based review. ✅

@github-actions github-actions Bot mentioned this pull request Jun 18, 2026

Copilot AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR improves diagnosability of previously opaque Copilot harness failures (notably exitCode=1 cases) by introducing an explicit failureClass and emitting a bounded outputTail, while also expanding Copilot/CAPI quota detection to recognize additional 429 “Too Many Requests” variants.

Changes:

  • Extend Copilot/CAPI quota detection to match CAPIError: Too Many Requests (and update tests + retry guard expectations accordingly).
  • Add new Copilot-specific failure classifiers for SDK session.idle timeouts and MCP gateway shutdown messages.
  • Emit a compact outputTail for failed attempts and log a named failureClass alongside existing boolean flags.
Show a summary per file
File Description
actions/setup/js/detect_agent_errors.cjs Expands the quota/rate-limit regex used for post-run log scanning (isCAPIQuotaExceededError).
actions/setup/js/detect_agent_errors.test.cjs Updates unit tests to cover the new “Too Many Requests” matching behavior.
actions/setup/js/copilot_harness.cjs Adds failure classification + output tail extraction and logs failureClass/outputTail on failed attempts.
actions/setup/js/copilot_harness.test.cjs Updates quota/rate-limit retry expectations and adds tests for the new helper classifiers/tail extraction.

Copilot's findings

Tip

Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

  • Files reviewed: 4/4 changed files
  • Comments generated: 2

Comment on lines +60 to +63
// Pattern: Copilot/CAPI quota exhaustion.
// Matches the observed error: "CAPIError: 429 429 quota exceeded".
// Quota exhaustion is a persistent, non-retryable condition.
const CAPI_QUOTA_EXCEEDED_PATTERN = /CAPIError:\s*429\s+429\s+quota exceeded/i;
const CAPI_QUOTA_EXCEEDED_PATTERN = /CAPIError:\s*(?:429\s+)?(?:429\s+quota exceeded|Too Many Requests)/i;
Comment on lines +316 to +320
* isAuthErr?: boolean,
* isAuthenticationFailed?: boolean,
* isCAPIError?: boolean,
* isMCPGatewayShutdown?: boolean,
* isMCPPolicy?: boolean,
@github-actions

Copy link
Copy Markdown
Contributor

🧪 Test Quality Sentinel Report

Test Quality Score: 90/100 — Excellent

Analyzed 7 test(s): 7 design, 0 implementation, 0 guideline violations. Minor technical test inflation detected in detect_agent_errors.test.cjs (3 additions / 1 production addition = 3:1 ratio; −10 pts).

📊 Metrics & Test Classification (7 tests analyzed)
Metric Value
New/modified tests analyzed 7
✅ Design tests (behavioral contracts) 7 (100%)
⚠️ Implementation tests (low value) 0 (0%)
Tests with error/edge cases 7 (100%)
Duplicate test clusters 0
Test inflation detected YES — detect_agent_errors.test.cjs (3 test additions / 1 production addition = 3:1; small absolute change, binary penalty applied)
🚨 Coding-guideline violations 0 (no mock libraries, no missing build tags)
Test File Classification Issues Detected
it("matches Copilot/CAPI 429 Too Many Requests output") copilot_harness.test.cjs:85 ✅ Design
it("does not retry Copilot/CAPI Too Many Requests output") copilot_harness.test.cjs:204 ✅ Design
it("classifies Copilot SDK session.idle timeouts distinctly") copilot_harness.test.cjs:257 ✅ Design
it("classifies MCP gateway shutdown distinctly when present in output") copilot_harness.test.cjs:262 ✅ Design
it("extracts a compact tail preview from large output") copilot_harness.test.cjs:267 ✅ Design
it("truncates very large output tails from the front") copilot_harness.test.cjs:272 ✅ Design
it("matches Copilot/CAPI Too Many Requests output") detect_agent_errors.test.cjs:146 ✅ Design

JavaScript: 7 (*.test.cjs). No Go or other language test files changed.

Verdict

Check passed. 0% implementation tests (threshold: 30%). All 7 new/modified tests verify observable behavioral contracts — correct error classification (isCAPIQuotaExceededError, isSDKSessionIdleTimeoutError, isMCPGatewayShutdownError, classifyCopilotFailure), retry suppression for rate-limit errors, and output tail formatting — with no mock-library misuse, no missing build tags, and no guideline violations.

References: §27740021565

🧪 Test quality analysis by Test Quality Sentinel ·

@github-actions github-actions Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

✅ Test Quality Sentinel: 90/100. Test quality is excellent — 0% of new tests are implementation tests (threshold: 30%). All 7 tests verify observable behavioral contracts.

@github-actions github-actions Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Skills-Based Review 🧠

Applied /diagnose and /tdd — no blocking issues; a few targeted improvements suggested.

📋 Key Themes & Highlights

Key Themes

  • MCP_GATEWAY_SHUTDOWN_PATTERN breadth — the bare substring /Gateway shutdown initiated/ could match informational teardown logs on clean exits; worth tightening to require corroborating JSON fields from the observed payload.
  • extractOutputTail slice(-0) footgun — silent misbehaviour when maxChars < 2; unreachable with the default of 600 but easy to guard.
  • Stale inline comment in detect_agent_errors.cjs — doesn't document the new CAPIError: Too Many Requests form that the pattern now matches.
  • Retry behaviour untested for new classessdk_session_idle_timeout and mcp_gateway_shutdown intentionally fall through to partial_execution retries, but there are no shouldRetry tests capturing that intent.

Positive Highlights

  • ✅ The quota pattern fix (detect_agent_errors.cjs) is minimal, correct, and backed by assertion-flipped tests in both test files — exactly the right footprint for a bug fix.
  • classifyCopilotFailure cleanly replaces an implicit matrix of booleans with a readable priority chain; JSDoc is thorough.
  • extractOutputTail is well-bounded (600 chars / 12 lines), handles null bytes and CRLF, and is covered by the key truncation cases.
  • ✅ The PR description precisely documents the observable before/after — making the intent reviewable without needing to run the code.

🧠 Reviewed using Matt Pocock's skills by Matt Pocock Skills Reviewer

@@ -60,7 +60,7 @@ const MODEL_NOT_SUPPORTED_PATTERN =
// Pattern: Copilot/CAPI quota exhaustion.
// Matches the observed error: "CAPIError: 429 429 quota exceeded".

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[/diagnose] The comment still only documents the original CAPIError: 429 429 quota exceeded form, but the updated pattern also matches CAPIError: 429 Too Many Requests and CAPIError: Too Many Requests. A future reader will have an incomplete picture of what the pattern detects.

💡 Suggested comment update
// Matches Copilot/CAPI quota exhaustion in observed forms:
//   "CAPIError: 429 429 quota exceeded"  – original observed form
//   "CAPIError: 429 Too Many Requests"   – HTTP status form
//   "CAPIError: Too Many Requests"       – statusless form seen in retry wrapper output
// Quota exhaustion is a persistent, non-retryable condition.

Comment thread actions/setup/js/copilot_harness.cjs Outdated
// Pattern: Copilot SDK driver timed out waiting for the session to become idle.
const SDK_SESSION_IDLE_TIMEOUT_PATTERN = /Timeout after \d+ms waiting for session\.idle/;
// Pattern: MCP gateway shutdown surfaced in agent output.
const MCP_GATEWAY_SHUTDOWN_PATTERN = /Gateway shutdown initiated/;

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[/diagnose] /Gateway shutdown initiated/ is a bare substring match that fires on any log line containing those three words — including informational teardown messages that MCP servers may emit on a clean exit. If a successful teardown log surfaces in a failed run's combined output (e.g., because the gateway correctly wound down after the copilot process crashed), the failure gets mis-classified as mcp_gateway_shutdown even when the root cause is something else.

💡 Tighten the pattern

The test fixture uses a JSON envelope: '{"message":"Gateway shutdown initiated","serversTerminated":2,"status":"closed"}'. Requiring at least one other field from that envelope reduces false positives from bare log lines:

// Require accompanying serversTerminated or status:closed to avoid matching
// informational teardown messages on clean exits.
const MCP_GATEWAY_SHUTDOWN_PATTERN = /Gateway shutdown initiated[\s\S]{0,150}(?:serversTerminated|"status"\s*:\s*"closed")/;

If the JSON format isn't guaranteed, even anchoring to the error path by requiring exitCode=1 context is already done by only calling this inside the failure branch — but the pattern itself is the first line of defence.

Comment thread actions/setup/js/copilot_harness.cjs Outdated
if (tailLines.length === 0) return "";
let tail = tailLines.join("\n");
if (tail.length > maxChars) {
tail = `…${tail.slice(-(maxChars - 1))}`;

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[/tdd] tail.slice(-(maxChars - 1)) silently misbehaves when maxChars === 1: the expression becomes tail.slice(-0), and in JavaScript slice(-0) is equivalent to slice(0) — it returns the entire string. The result is "..." + tail, which is longer than the requested maxChars. Unreachable with the default of 600, but a unit test would catch this and a guard removes the footgun for any caller that passes a small explicit value.

💡 Guard suggestion
if (tail.length > maxChars) {
  const keep = maxChars - 1; // characters available after the ellipsis
  tail = keep > 0 ? `...${tail.slice(-keep)}` : "...";
}

And a regression test:

it("never exceeds maxChars even when maxChars is 1", () => {
  const tail = extractOutputTail("abc", { maxLines: 5, maxChars: 1 });
  expect(tail.length).toBeLessThanOrEqual(1);
});

@@ -239,6 +254,30 @@ describe("copilot_harness.cjs", () => {
expect(shouldRetry(result, 1, true, 1)).toBe(false);
});

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[/tdd] The new tests confirm that sdk_session_idle_timeout and mcp_gateway_shutdown are correctly named, but there are no shouldRetry tests verifying what the harness does with them at retry time. Per the PR description, both classes intentionally fall through to the partial_execution retry path when hasOutput: true. Without an explicit test, a future guard clause that accidentally makes them non-retryable would go undetected.

💡 Suggested regression tests
it("retries SDK session.idle timeout as partial execution", () => {
  const result = {
    exitCode: 1,
    hasOutput: true,
    output: "[copilot-sdk-driver] Timeout after 60000ms waiting for session.idle",
  };
  expect(shouldRetry(result, 0)).toBe(true);
});

it("retries MCP gateway shutdown as partial execution", () => {
  const result = {
    exitCode: 1,
    hasOutput: true,
    output: '{"message":"Gateway shutdown initiated","serversTerminated":1,"status":"closed"}',
  };
  expect(shouldRetry(result, 0)).toBe(true);
});

These tests serve as executable documentation of the intentional retry-through behaviour described in the PR.

@github-actions github-actions Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

REQUEST_CHANGES — two correctness issues and three follow-ons need addressing before merge

Blocking issues:

  • CAPI_QUOTA_EXCEEDED_PATTERN now silently makes transient HTTP 429 throttles non-retryable, potentially aborting all retry attempts on a recoverable condition.
  • classifyCopilotFailure priority order buries sdk_session_idle_timeout and mcp_gateway_shutdown under permission_denied, defeating the PR's stated goal of surfacing previously opaque failures.
  • MCP_GATEWAY_SHUTDOWN_PATTERN (/Gateway shutdown initiated/) is insufficiently anchored and will misclassify any non-MCP shutdown message.

Non-blocking but worth fixing:

  • extractOutputTail doesn't normalize bare \r, garbling log output from progress-bar-style processes.
  • The inline comment and module JSDoc for CAPI_QUOTA_EXCEEDED_PATTERN still only document the original double-429 form; the new forms it matches are undocumented.
  • filter(Boolean) drops blank lines before slice(-maxLines), so maxLines silently counts non-empty lines rather than total lines.

🔎 Code quality review by PR Code Quality Reviewer

Comment thread actions/setup/js/copilot_harness.cjs Outdated
// Pattern: Copilot SDK driver timed out waiting for the session to become idle.
const SDK_SESSION_IDLE_TIMEOUT_PATTERN = /Timeout after \d+ms waiting for session\.idle/;
// Pattern: MCP gateway shutdown surfaced in agent output.
const MCP_GATEWAY_SHUTDOWN_PATTERN = /Gateway shutdown initiated/;

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

MCP_GATEWAY_SHUTDOWN_PATTERN is too generic — will misclassify any process that emits this string as an MCP failure.

💡 Details and suggested fix

The current pattern:

const MCP_GATEWAY_SHUTDOWN_PATTERN = /Gateway shutdown initiated/;

This matches any substring Gateway shutdown initiated regardless of source. Any API server, cloud SDK, or test fixture that emits this message will trigger mcp_gateway_shutdown, masking the real failure class.

The test shows the actual observed message is a JSON blob from the MCP driver:

{"message":"Gateway shutdown initiated","serversTerminated":2,"status":"closed"}

Anchor to the JSON key to reduce false positives:

// Matches the JSON payload the MCP gateway driver emits on shutdown.
const MCP_GATEWAY_SHUTDOWN_PATTERN = /"message"\s*:\s*"Gateway shutdown initiated"/;

Without this specificity, any verbose CI log from any gateway component flips this flag and swallows the true failure class.

// Matches the observed error: "CAPIError: 429 429 quota exceeded".
// Quota exhaustion is a persistent, non-retryable condition.
const CAPI_QUOTA_EXCEEDED_PATTERN = /CAPIError:\s*429\s+429\s+quota exceeded/i;
const CAPI_QUOTA_EXCEEDED_PATTERN = /CAPIError:\s*(?:429\s+)?(?:429\s+quota exceeded|Too Many Requests)/i;

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

CAPI_QUOTA_EXCEEDED_PATTERN now treats any CAPIError: Too Many Requests as non-retryable quota exhaustion — a transient throttle will burn the entire retry budget.

💡 Details and suggested fix

Old (narrow):```js
/CAPIError:\s*429\s+429\s+quota exceeded/i


New (broad):
```js
/CAPIError:\s*(?:429\s+)?(?:429\s+quota exceeded|Too Many Requests)/i

A standard HTTP 429 from a transiently rate-limited backend is now indistinguishable from true per-account quota exhaustion. Since isQuotaExceeded triggers an immediate break with no retries, a single transient 429 will abort all remaining attempts — the opposite of the intended retry behavior for recoverable conditions.

The test string "retried 5 times. Last error: CAPIError: Too Many Requests" contains the SDK's internal retry count as evidence of persistence, but the regex matches CAPIError: Too Many Requests anywhere in output with no such context requirement.

Options:

  1. Separate the two semantics: add a distinct isTransientRateLimitError that matches bare Too Many Requests and retries with back-off, while reserving capi_quota_exceeded for quota exceeded forms.
  2. Accept the broadening but update the inline comment (line 61–62) and the module-level JSDoc (line 20) to explicitly document that both throttling and quota exhaustion are treated as non-retryable — so the trade-off is visible to future reviewers.

Comment thread actions/setup/js/copilot_harness.cjs Outdated
if (detection.isNullTypeToolCall) return "null_type_tool_call";
if (detection.isAuthErr) return "no_auth_info";
if (detection.isAuthenticationFailed) return "authentication_failed";
if (detection.hasNumerousPermissionDenied) return "permission_denied";

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hasNumerousPermissionDenied outranks isSDKSessionIdleTimeout and isMCPGatewayShutdown in the priority chain — the new failure classes the PR is meant to surface get swallowed by the existing one.

💡 Details and suggested fix

Current order:

if (detection.hasNumerousPermissionDenied)  return "permission_denied";       // line 336
if (detection.isSDKSessionIdleTimeout)       return "sdk_session_idle_timeout"; // line 337
if (detection.isMCPGatewayShutdown)          return "mcp_gateway_shutdown";     // line 338

A Copilot SDK run that hits a session.idle timeout often also generates numerous permission-denied log lines as a side-effect of the shutdown sequence. Under the current ordering, classifyCopilotFailure returns "permission_denied" in that case — precisely the opaque result this PR is trying to eliminate.

The same problem applies to mcp_gateway_shutdown.

Suggested reorder — put the specific new signal classes above the generic volume-based one:

if (detection.isSDKSessionIdleTimeout)       return "sdk_session_idle_timeout";
if (detection.isMCPGatewayShutdown)          return "mcp_gateway_shutdown";
if (detection.hasNumerousPermissionDenied)   return "permission_denied";

This aligns with the PR's stated goal: make previously opaque failures directly attributable from the harness log alone.

Comment thread actions/setup/js/copilot_harness.cjs Outdated
if (typeof output !== "string" || !output) return "";
const maxChars = options?.maxChars ?? OUTPUT_TAIL_MAX_CHARS;
const maxLines = options?.maxLines ?? OUTPUT_TAIL_MAX_LINES;
const normalized = output.replace(/\0/g, "").replace(/\r\n/g, "\n").trim();

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

extractOutputTail normalizes \r\n but leaves bare \r intact — progress-bar-style output will garble the logged tail.

💡 Details and suggested fix
const normalized = output.replace(/\0/g, "").replace(/\r\n/g, "\n").trim();

Some child processes (progress bars, build tools, terminal-mode programs) use bare \r to overwrite the current line in place. Only \r\n is normalized here; lone \r characters survive into tailLines. When the tail is joined on \n and emitted as JSON.stringify(outputTail), the \r causes the beginning of the line to be visually overwritten in the CI log, hiding the context the tail was meant to surface.

Fix — add a lone-CR pass after the \r\n normalization:

const normalized = output
  .replace(/\0/g, "")
  .replace(/\r\n/g, "\n")
  .replace(/\r/g, "\n")   // bare CR → LF
  .trim();

@@ -60,7 +60,7 @@ const MODEL_NOT_SUPPORTED_PATTERN =
// Pattern: Copilot/CAPI quota exhaustion.
// Matches the observed error: "CAPIError: 429 429 quota exceeded".

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Stale comment — after the regex change it now matches three distinct forms, but the comment still only documents the original double-429 error string.

💡 Suggested update

Current (stale):

// Matches the observed error: "CAPIError: 429 429 quota exceeded".
// Quota exhaustion is a persistent, non-retryable condition.

The new regex also matches:

  • CAPIError: 429 Too Many Requests
  • CAPIError: Too Many Requests
  • CAPIError: 429 quota exceeded (single 429 + quota exceeded, as a side-effect of the optional group)

Suggested update:

// Matches Copilot/CAPI quota exhaustion and rate-limit responses:
//   "CAPIError: 429 429 quota exceeded"  (original observed form)
//   "CAPIError: 429 Too Many Requests"   (HTTP 429 form)
//   "CAPIError: Too Many Requests"       (no status code in message)
// All forms are treated as non-retryable.

The module-level JSDoc at line 20 (capi_quota_exceeded_error description) also needs the same update.

const tailLines = normalized
.split("\n")
.map(line => line.trimEnd())
.filter(Boolean)

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

filter(Boolean) removes blank lines before slice(-maxLines)maxLines counts non-empty lines, not total lines, so callers silently get fewer context lines than requested.

💡 Details
.filter(Boolean)      // blank lines dropped here
.slice(-maxLines);    // maxLines applied to surviving lines only

If the last section of output is:

Error: timeout\n\n\nStack trace line 1\nStack trace line 2

...after filter(Boolean) you have 3 non-empty lines. A caller requesting maxLines: 2 sees only the two stack trace lines with no indication that the separating blank lines — and thus the visual structure of the error — were dropped.

If blank lines should be preserved, remove filter(Boolean). If they should be collapsed, document that maxLines applies after filtering so callers aren't surprised by unexpectedly thin output.

@pelikhan

Copy link
Copy Markdown
Collaborator

@copilot run pr-finisher skill

…Error to isTransientCAPIError

Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>

Copilot AI commented Jun 18, 2026

Copy link
Copy Markdown
Contributor Author

@copilot run pr-finisher skill

Done. Addressed both reviewer threads in commit c1ea40d:

  • Updated the CAPI_QUOTA_EXCEEDED_PATTERN comment in detect_agent_errors.cjs to list both matched error forms (CAPIError: 429 429 quota exceeded and CAPIError: Too Many Requests).
  • Renamed isCAPIErrorisTransientCAPIError in the classifyCopilotFailure JSDoc @param type and in the detection object passed at the call site, matching the underlying isTransientCAPIError function name.

All 347 JS test files and Go unit tests pass locally.

@pelikhan

Copy link
Copy Markdown
Collaborator

@copilot review comments from GitHub actions bot

Copilot AI and others added 2 commits June 18, 2026 11:25
…re priority, fix bare-CR/slice-0, improve comments, add regression tests

Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>
Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>

Copilot AI commented Jun 18, 2026

Copy link
Copy Markdown
Contributor Author

@copilot review comments from GitHub actions bot

Addressed all GitHub Actions bot threads. Changes in the latest commits:

  • Blocking issues fixed:

    • MCP_GATEWAY_SHUTDOWN_PATTERN now anchors to the JSON "message" key (/"message"\s*:\s*"Gateway shutdown initiated"/) to avoid false positives from bare log lines
    • classifyCopilotFailure reordered so sdk_session_idle_timeout and mcp_gateway_shutdown outrank permission_denied, making the new signals actually reachable
    • CAPI_QUOTA_EXCEEDED_PATTERN comment and module JSDoc updated to document all three matched forms and explain the non-retryable rationale
  • Non-blocking fixes applied:

    • extractOutputTail now normalizes bare \r (progress-bar output) before splitting
    • extractOutputTail guards the slice(-0) edge case when maxChars=1
    • filter(Boolean) comment clarified to note maxLines counts non-empty lines
  • New regression tests added:

    • Priority ordering: sdk_session_idle_timeout/mcp_gateway_shutdown beats hasNumerousPermissionDenied when both flags are set
    • shouldRetry pass-through: both new failure classes remain retryable at the harness level
    • extractOutputTail never exceeds maxChars even when maxChars=1

@pelikhan pelikhan merged commit 8bd684a into main Jun 18, 2026
14 checks passed
@pelikhan pelikhan deleted the copilot/aw-failures-fix-cli-exit-code branch June 18, 2026 11:41
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants