9 changes: 5 additions & 4 deletions actions/setup/js/handle_agent_failure.cjs
@@ -1827,7 +1827,8 @@ function hasAgentTerminalReasonCompleted() {
* The log file is available in the conclusion job after the agent artifact is downloaded.
* @returns {string} Formatted context string, or empty string if no engine failure found
*/
-function buildEngineFailureContext() {
+function buildEngineFailureContext(options = {}) {

Missing @param JSDoc for the new options parameter.

💡 Suggested addition

The JSDoc block above documents @returns but does not describe the new options parameter or its suppressEngineRateLimit429 key. In a file this size, undocumented optional object params are easy to misuse.

/**
 * ...
 * @param {Object} [options]
 * @param {boolean} [options.suppressEngineRateLimit429=false] When true, skip the dedicated
 *   engine 429/rate-limit context and fall through to the generic engine failure path.
 *   Use when a higher-level signal (e.g. max-AI-credits-exceeded) is already providing
 *   the correct root-cause message to avoid conflicting guidance.
 * @returns {string} Formatted context string, or empty string if no engine failure found
 */
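
As a minimal sketch of the behavior this JSDoc would describe, the snippet below isolates the flag parsing. `resolveSuppressFlag` is a hypothetical helper, not a function in the PR; only the `options = {}` default and the `=== true` comparison are taken from the diff.

```javascript
/**
 * Hypothetical helper that isolates the flag parsing used by the diff.
 * @param {Object} [options]
 * @param {boolean} [options.suppressEngineRateLimit429=false]
 * @returns {boolean} True only when the flag is the boolean `true`.
 */
function resolveSuppressFlag(options = {}) {
  // Strict comparison: truthy non-boolean values such as "true" or 1
  // do not enable suppression.
  return options.suppressEngineRateLimit429 === true;
}

console.log(resolveSuppressFlag()); // false
console.log(resolveSuppressFlag({ suppressEngineRateLimit429: true })); // true
console.log(resolveSuppressFlag({ suppressEngineRateLimit429: "true" })); // false
```

Note that the `options = {}` default applies only to `undefined`. Passing `null` would throw a `TypeError` on the property access, which may be worth a line in the JSDoc.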

+const suppressEngineRateLimit429 = options.suppressEngineRateLimit429 === true;
// Derive agent-stdio.log path from the agent output file path (same directory)
const agentOutputFile = process.env.GH_AW_AGENT_OUTPUT;
const stdioLogPath = agentOutputFile ? path.join(path.dirname(agentOutputFile), "agent-stdio.log") : "/tmp/gh-aw/agent-stdio.log";
@@ -1858,7 +1859,7 @@ function buildEngineFailureContext() {
return "";
}

-if (hasEngineRateLimit429Signal(logContent) || hasEngineRateLimit429InOTELMirror()) {
+if (!suppressEngineRateLimit429 && (hasEngineRateLimit429Signal(logContent) || hasEngineRateLimit429InOTELMirror())) {

Suppression is incomplete: raw 429 log text still surfaces in the fallback tail output, potentially undermining the intent.

💡 Details and suggested approach

When suppressEngineRateLimit429: true, the dedicated Engine Rate Limited (HTTP 429) section is correctly skipped. However, the canonical 429 log line (e.g. "Failed to get response from the AI model; retried 5 times. Last error: CAPIError: 429 429 Sorry, you've exceeded your rate limit for utility models.") does not match any of the named error-prefix patterns (ERROR:, Error:, panic:, etc.), so it falls through to the tail-fallback path and lands verbatim inside the Last agent output code block.

The user then sees:

  • Engine Failure section — raw CAPIError: 429 text visible
  • Max AI Credits Exceeded section — correct root-cause

This is a partial suppression only. Compare with the hasToolDenialsExceeded guard which short-circuits the entire engineFailureContext to "". If the intent here is the same — the max-AI-credits signal is the only actionable one — the call site could be simplified:

// mirrors the hasToolDenialsExceeded pattern
const engineFailureContext =
  agentConclusion === "failure" && !hasToolDenialsExceeded && !maxAICreditsExceeded
    ? buildEngineFailureContext()
    : "";

If partial suppression (raw diagnostics visible, guided 429 remediation hidden) is intentional, a short comment at the call site documenting that decision would keep future engineers from tightening the guard by accident.
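
The fall-through described in this comment can be reproduced in isolation. This is a sketch under assumptions: `ASSUMED_ERROR_PREFIX_RE` and `selectDiagnosticLines` are hypothetical stand-ins, since the actual prefix patterns and tail logic in handle_agent_failure.cjs are not shown in this diff.

```javascript
// Assumption: named error prefixes are matched at the start of a line.
const ASSUMED_ERROR_PREFIX_RE = /^(?:ERROR:|Error:|panic:)/;

/**
 * Hypothetical selection step: prefer lines with a known error prefix,
 * otherwise fall back to the last `tailSize` non-empty lines.
 */
function selectDiagnosticLines(logContent, tailSize = 10) {
  const lines = logContent.split("\n").filter(line => line.trim() !== "");
  const prefixed = lines.filter(line => ASSUMED_ERROR_PREFIX_RE.test(line));
  return prefixed.length > 0 ? prefixed : lines.slice(-tailSize);
}

const rateLimitLine =
  "Failed to get response from the AI model; retried 5 times. Last error: CAPIError: 429 429 Sorry, you've exceeded your rate limit for utility models.";

// The line starts with "Failed to get response", so no prefix matches
// and the tail fallback returns it verbatim.
console.log(selectDiagnosticLines(rateLimitLine + "\n"));
```

Under these assumptions, the raw 429 text only stays out of the result when some other line carries a recognized prefix, which is why skipping the dedicated 429 branch does not by itself hide it.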

core.info("Detected engine HTTP 429/rate-limit signal — using dedicated context message");
return buildEngineRateLimit429Context(engineLabel);
}
@@ -2717,7 +2718,7 @@ async function main() {
// Suppress when tool-denials-exceeded is present: the engine termination is a
// direct consequence of the SDK hitting the denial threshold, so the tool-denials
// context is the more actionable signal.
-const engineFailureContext = agentConclusion === "failure" && !hasToolDenialsExceeded ? buildEngineFailureContext() : "";
+const engineFailureContext = agentConclusion === "failure" && !hasToolDenialsExceeded ? buildEngineFailureContext({ suppressEngineRateLimit429: maxAICreditsExceeded }) : "";

// Build timeout context
const timeoutContext = buildTimeoutContext(isTimedOut, timeoutMinutes);
@@ -2941,7 +2942,7 @@ async function main() {
// Suppress when tool-denials-exceeded is present: the engine termination is a
// direct consequence of the SDK hitting the denial threshold, so the tool-denials
// context is the more actionable signal.
-const engineFailureContext = agentConclusion === "failure" && !hasToolDenialsExceeded ? buildEngineFailureContext() : "";
+const engineFailureContext = agentConclusion === "failure" && !hasToolDenialsExceeded ? buildEngineFailureContext({ suppressEngineRateLimit429: maxAICreditsExceeded }) : "";

// Build timeout context
const timeoutContext = buildTimeoutContext(isTimedOut, timeoutMinutes);
7 changes: 7 additions & 0 deletions actions/setup/js/handle_agent_failure.test.cjs
@@ -1490,6 +1490,13 @@ describe("handle_agent_failure", () => {
expect(result).not.toContain("Last agent output");
});

+it("suppresses engine 429 context when max-ai-credits-exceeded takes precedence", () => {

[/tdd] The test covers suppressEngineRateLimit429: true but there is no complementary test for the default case — verifying that when the option is absent (or false) and a 429 signal is present, the 429-specific context is still returned.

💡 Suggested test to add nearby
it("returns engine 429 context when suppressEngineRateLimit429 is false and signal is present", () => {
  fs.writeFileSync(stdioLogPath, "...CAPIError: 429...\n");
  const result = buildEngineFailureContext({ suppressEngineRateLimit429: false });
  expect(result).toContain("Engine Rate Limited (HTTP 429)");
});

This guards against a future change that accidentally hard-codes the flag to true or otherwise always skips the 429 branch; the existing suppression test alone would still pass after that regression.
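
To illustrate why the suppression test alone is not enough, here is a minimal sketch with two hypothetical stand-ins for `buildEngineFailureContext`: one that honors the flag and one that always suppresses. Both functions and their simplified return strings are assumptions, not the real implementation.

```javascript
// Honors the flag, as the diff intends.
function contextHonoringFlag(has429Signal, options = {}) {
  const suppress = options.suppressEngineRateLimit429 === true;
  return has429Signal && !suppress ? "Engine Rate Limited (HTTP 429)" : "Engine Failure";
}

// Regression: the flag is effectively hard-coded to true.
function contextAlwaysSuppressing() {
  return "Engine Failure";
}

// Mirrors the existing test: suppression requested, 429 context must be absent.
const passesSuppressionCheck = build =>
  !build(true, { suppressEngineRateLimit429: true }).includes("Engine Rate Limited (HTTP 429)");

// Mirrors the suggested test: no suppression, 429 context must be present.
const passesDefaultCheck = build =>
  build(true, { suppressEngineRateLimit429: false }).includes("Engine Rate Limited (HTTP 429)");

console.log(passesSuppressionCheck(contextAlwaysSuppressing)); // true: regression goes unnoticed
console.log(passesDefaultCheck(contextAlwaysSuppressing)); // false: regression is caught
```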

+fs.writeFileSync(stdioLogPath, "Failed to get response from the AI model; retried 5 times. Last error: CAPIError: 429 429 Sorry, you've exceeded your rate limit for utility models.\n");
+const result = buildEngineFailureContext({ suppressEngineRateLimit429: true });
+expect(result).not.toContain("Engine Rate Limited (HTTP 429)");
+expect(result).toContain("Engine Failure");

[/tdd] The "Engine Failure" string is a magic literal — if the fallback context heading is ever renamed, this assertion will silently lose its value or break unexpectedly.

💡 Suggestion

If a constant (or an exported string) is already used elsewhere for this heading, reference it here. Otherwise, consider asserting on a more stable/specific sub-string from the fallback context body rather than a section heading. For example:

// Instead of: expect(result).toContain("Engine Failure");
// Prefer something derived from a known constant, e.g.:
const { ENGINE_FAILURE_HEADING } = require("./constants");
expect(result).toContain(ENGINE_FAILURE_HEADING);

At minimum, add a brief comment explaining why the fallback falls through to this template when the 429 branch is suppressed, so future readers know the assertion is intentional.
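
A minimal sketch of the constant-based approach, assuming the heading were defined once and shared with the test. `ENGINE_FAILURE_HEADING`, its value, and `buildFallbackContextSketch` are hypothetical; the diff does not show how the real heading is defined or confirm that such a constant exists.

```javascript
// Hypothetical shared constant; the real heading text may differ.
const ENGINE_FAILURE_HEADING = "Engine Failure";

// Hypothetical fallback builder that renders the heading from the constant.
function buildFallbackContextSketch(lastOutput) {
  return `\n**${ENGINE_FAILURE_HEADING}**\n\nLast agent output:\n${lastOutput}\n`;
}

const result = buildFallbackContextSketch("Agent terminated unexpectedly");

// The check tracks the constant, so renaming the heading in one place
// keeps the test meaningful.
console.log(result.includes(ENGINE_FAILURE_HEADING)); // true
```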

Test asserts the right output header, but does not verify whether the raw 429 log text is absent from the fallback result.

💡 Details

With suppressEngineRateLimit429: true and this log content, the ENGINE_RATE_LIMIT_429_RE check is bypassed, the line does not match any named-prefix pattern (ERROR:, Error:, etc.), and the fallback tail path runs. The result therefore contains "Engine Failure" (asserted ✓) and the raw text "CAPIError: 429 429 Sorry, you've exceeded your rate limit for utility models." inside the Last agent output code block (not asserted).

If the intent of the suppression is to remove all 429 noise from the failure report, the following assertion would catch a regression:

expect(result).not.toContain("CAPIError");

If the raw diagnostics are intentionally kept visible (only the actionable guided 429 section is suppressed), add a brief comment to the test explaining this so the assertion boundary is clear to future readers.
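
If full suppression is the goal, one option is to drop rate-limit lines from the tail before rendering it. The sketch below assumes a simplified pattern; `ASSUMED_RATE_LIMIT_429_RE` and `stripRateLimitLines` are hypothetical and are not the PR's `ENGINE_RATE_LIMIT_429_RE` or any existing helper.

```javascript
// Assumption: a simplified pattern for 429/rate-limit lines.
const ASSUMED_RATE_LIMIT_429_RE = /\b429\b|rate limit/i;

/**
 * Hypothetical filter: removes lines that look like 429/rate-limit errors
 * so they do not reappear in the fallback tail output.
 */
function stripRateLimitLines(logContent) {
  return logContent
    .split("\n")
    .filter(line => !ASSUMED_RATE_LIMIT_429_RE.test(line))
    .join("\n");
}

const log = [
  "Starting agent run",
  "Failed to get response from the AI model; retried 5 times. Last error: CAPIError: 429 429 Sorry, you've exceeded your rate limit for utility models.",
].join("\n");

console.log(stripRateLimitLines(log)); // Starting agent run
```

The trade-off is that filtering hides raw diagnostics that may still help with debugging, so it only makes sense if the max-AI-credits message is meant to be the sole signal.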

+});

it("returns dedicated context when 429/rate-limit is only present in OTLP mirror", () => {
fs.writeFileSync(stdioLogPath, "Agent terminated unexpectedly without clear error details\n");
fs.writeFileSync(