
Expose verifier feedback properties for hint generation#14

Merged: dzorlu merged 1 commit into deniz/fleet_client from deniz/hint-feedback on Mar 17, 2026

@dzorlu (Collaborator) commented Mar 16, 2026

Summary

  • Add _tool_errors, _verifier_stdout, _verifier_error instance vars to FleetTaskEnv
  • Accumulate tool errors during step_async() (both MCP errors and exceptions)
  • Capture verifier stdout/error in _compute_reward() after verifier runs
  • Expose via properties: verifier_stdout, verifier_error, tool_errors_list

This enables SkyRL to build hints from failed rollout feedback without an LLM call. It is used by the fleet-ai/SkyRL#hint-augmentation PR.
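The shape described above could look roughly like the following. This is a minimal sketch, not the code from this PR: `FeedbackState` and `build_hint` are hypothetical names, the return types are inferred from the test plan (None or [] when nothing went wrong), and the hint format is invented for illustration.

```python
from typing import List, Optional


class FeedbackState:
    """Hypothetical stand-in for the feedback fields added to FleetTaskEnv."""

    def __init__(self) -> None:
        self._tool_errors: List[str] = []
        self._verifier_stdout: Optional[str] = None
        self._verifier_error: Optional[str] = None

    @property
    def verifier_stdout(self) -> Optional[str]:
        return self._verifier_stdout

    @property
    def verifier_error(self) -> Optional[str]:
        return self._verifier_error

    @property
    def tool_errors_list(self) -> List[str]:
        # Return a copy so callers cannot mutate the internal list.
        return list(self._tool_errors)


def build_hint(env: FeedbackState) -> Optional[str]:
    """Hypothetical hint builder: plain string formatting, no LLM call."""
    parts = []
    if env.tool_errors_list:
        parts.append("Tool errors: " + "; ".join(env.tool_errors_list))
    if env.verifier_error:
        parts.append("Verifier error: " + env.verifier_error)
    if env.verifier_stdout:
        parts.append("Verifier output: " + env.verifier_stdout)
    # No feedback at all means there is nothing to hint about.
    return "\n".join(parts) if parts else None
```

Returning None when there is no feedback lets the caller skip hint augmentation for rollouts that produced nothing to report.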

Test plan

  • Verify existing tests still pass
  • Verify properties return None/[] when no errors occur
  • Verify tool errors accumulate across multiple steps
  • Verify verifier stdout is captured after _compute_reward()
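The last three checks in the test plan could be exercised along these lines. This is a sketch under assumptions: `FakeTaskEnv` is a hypothetical test double rather than the real `FleetTaskEnv`, the `{"error": ...}` tool-result shape is invented, and `_compute_reward()` is assumed to be a coroutine.

```python
import asyncio
from typing import Any, Dict, List, Optional


class FakeTaskEnv:
    """Hypothetical test double. It is NOT the real FleetTaskEnv."""

    def __init__(self) -> None:
        self._tool_errors: List[str] = []
        self._verifier_stdout: Optional[str] = None
        self._verifier_error: Optional[str] = None

    @property
    def verifier_stdout(self) -> Optional[str]:
        return self._verifier_stdout

    @property
    def verifier_error(self) -> Optional[str]:
        return self._verifier_error

    @property
    def tool_errors_list(self) -> List[str]:
        return list(self._tool_errors)

    async def step_async(self, tool_result: Dict[str, Any]) -> None:
        # Assumed result shape: a non-empty "error" key marks a failed call.
        error = tool_result.get("error")
        if error:
            self._tool_errors.append(str(error))

    async def _compute_reward(self) -> float:
        # Stand-in for running the verifier and keeping what it printed.
        self._verifier_stdout = "check failed: missing row"
        return 0.0


async def run_test_plan() -> FakeTaskEnv:
    env = FakeTaskEnv()

    # Properties return None / [] when no errors occur.
    assert env.verifier_stdout is None
    assert env.verifier_error is None
    assert env.tool_errors_list == []

    # Tool errors accumulate across multiple steps.
    await env.step_async({"error": "tool not found"})
    await env.step_async({"output": "ok"})
    await env.step_async({"error": "timeout"})
    assert env.tool_errors_list == ["tool not found", "timeout"]

    # Verifier stdout is captured after _compute_reward().
    await env._compute_reward()
    assert env.verifier_stdout == "check failed: missing row"
    return env


asyncio.run(run_test_plan())
```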

🤖 Generated with Claude Code

@cursor (bot) left a comment

Cursor Bugbot has reviewed your changes and found 2 potential issues.

Comment thread src/envs/fleet_env/task_env.py
Comment thread src/envs/fleet_env/task_env.py
Add _tool_errors, _verifier_stdout, _verifier_error to FleetTaskEnv so
SkyRL can build hints from failed rollout feedback without an LLM call.

- Tool errors accumulated in step_async() on MCP errors and exceptions
- Verifier stdout/error captured in _compute_reward() after verifier runs
- Verifier exceptions also captured in _verifier_error (not just failures)
- All feedback properties reset in reset_async() to prevent cross-episode leakage
- Properties: verifier_stdout, verifier_error, tool_errors_list

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
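The two behaviours the commit message adds (recording a verifier exception in `_verifier_error`, and clearing all feedback in `reset_async()`) might be sketched as follows. `EpisodeFeedback`, the injected `verifier` callable, its `(reward, stdout)` return shape, and the 0.0 fallback reward are all assumptions made for illustration; the actual `task_env.py` code is not shown on this page.

```python
import asyncio
from typing import Callable, List, Optional, Tuple


class EpisodeFeedback:
    """Hypothetical sketch; the verifier callable is an assumed interface."""

    def __init__(self, verifier: Callable[[], Tuple[float, str]]) -> None:
        self._verifier = verifier
        self._tool_errors: List[str] = []
        self._verifier_stdout: Optional[str] = None
        self._verifier_error: Optional[str] = None

    @property
    def verifier_error(self) -> Optional[str]:
        return self._verifier_error

    @property
    def verifier_stdout(self) -> Optional[str]:
        return self._verifier_stdout

    async def reset_async(self) -> None:
        # Clear everything so feedback from one episode cannot leak
        # into hints built for the next one.
        self._tool_errors = []
        self._verifier_stdout = None
        self._verifier_error = None

    async def _compute_reward(self) -> float:
        try:
            reward, stdout = self._verifier()
        except Exception as exc:
            # A crashing verifier is recorded too, not only a failing one.
            self._verifier_error = f"{type(exc).__name__}: {exc}"
            return 0.0  # assumed fallback reward
        self._verifier_stdout = stdout
        return reward


def crashing_verifier() -> Tuple[float, str]:
    raise RuntimeError("boom")


async def demo() -> Tuple[Optional[str], Optional[str]]:
    env = EpisodeFeedback(crashing_verifier)
    await env._compute_reward()
    before_reset = env.verifier_error
    await env.reset_async()
    return before_reset, env.verifier_error
```

Calling `asyncio.run(demo())` returns the error string recorded before the reset and the cleared value after it.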
@dzorlu force-pushed the deniz/hint-feedback branch from 4d9d1c9 to 3566e7c on March 16, 2026 01:27
@dzorlu changed the base branch from main to deniz/fleet_client on March 16, 2026 01:27
@dzorlu merged commit 438c7b3 into deniz/fleet_client on Mar 17, 2026
1 check passed