Rebuilt carry-over logic to fit the new infrastructure. Also added two standalone plotting tests. #11
Conversation
- Add explicit timeline vs episode metrics in MetricsTracker, including `total_time_hours` and `episode_*` mirrors for all counters.
- Update environment reset/step to preserve state across episodes when carry-over is enabled, while resetting episode metrics and continuing prices.
- Track episode metrics alongside cumulative ones in job management and baseline steps.
- Ensure plotting uses episode series for dashboards and cumulative series for savings.
- Adjust sanity tests to validate carry-over continuity and enforce a real state change.
- Introduce `test_plotter.py` with unit tests for dashboard and cumulative savings plots, including save/no-save paths and suffix handling.
- Provide a MockEnv and synthetic episode cost data to exercise plotter APIs deterministically.
- Add `test_plot_dummy.py` as a runnable sample that generates dashboard and cumulative plots from a dummy env for quick visual inspection.
- Mirror the refactored episode-scoped metrics in the test fixtures to match current plotting behavior.
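The synthetic episode cost data mentioned above can be sketched as a seeded generator, so every test run sees identical values. This is a hypothetical reconstruction based on the field names visible in this review (`agent_cost`, `baseline_cost`, `baseline_cost_off`, `jobs_completed`, `baseline_jobs_completed`); the actual fixture in `test_plotter.py` may differ.

```python
import numpy as np

def make_dummy_episode_costs(n_episodes=24, seed=123):
    """Deterministic per-episode cost records for exercising plotter APIs."""
    rng = np.random.default_rng(seed)  # fixed seed => reproducible data
    episode_costs = []
    agent_cost = 90_000.0
    for _ in range(n_episodes):
        baseline = rng.normal(95_000, 3_000)
        # always-off baseline is strictly cheaper than always-on
        baseline_off = baseline - rng.uniform(2_000, 7_000)
        agent_cost *= rng.uniform(0.992, 0.998)  # simulate gradual learning
        agent = agent_cost + rng.normal(0, 1_000)
        base_jobs = int(rng.integers(4_000, 6_000))
        agent_jobs = int(base_jobs * rng.uniform(0.95, 1.05))
        episode_costs.append({
            "agent_cost": float(agent),
            "baseline_cost": float(baseline),
            "baseline_cost_off": float(baseline_off),
            "jobs_completed": float(agent_jobs),
            "baseline_jobs_completed": float(base_jobs),
        })
    return episode_costs

costs = make_dummy_episode_costs()
```

Because the generator is seeded, two calls with the same arguments produce identical lists, which keeps plot-content assertions stable across runs.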
📝 Walkthrough

This PR introduces episode-level metric tracking and a carry-over-state feature to ComputeClusterEnv. Episode metrics are tracked across baseline job submissions, job completions, queue management, and costs. The environment's reset method now supports preserving timeline state across episodes via the `carry_over_state` flag.
Sequence Diagram(s)

```mermaid
sequenceDiagram
    participant User
    participant ComputeClusterEnv
    participant MetricsTracker
    participant Timeline
    User->>ComputeClusterEnv: __init__(carry_over_state=?)
    ComputeClusterEnv->>MetricsTracker: reset_timeline_metrics()
    MetricsTracker->>MetricsTracker: init timeline metrics
    ComputeClusterEnv->>MetricsTracker: reset_episode_metrics()
    MetricsTracker->>MetricsTracker: init episode metrics
    ComputeClusterEnv->>Timeline: initialize state
    User->>ComputeClusterEnv: reset(options)
    alt carry_over_state == False
        ComputeClusterEnv->>MetricsTracker: reset_timeline_metrics()
        ComputeClusterEnv->>MetricsTracker: reset_episode_metrics()
        ComputeClusterEnv->>ComputeClusterEnv: _reset_timeline_state(start_index)
        ComputeClusterEnv->>Timeline: reset all state
    else carry_over_state == True
        ComputeClusterEnv->>MetricsTracker: reset_episode_metrics()
        ComputeClusterEnv->>Timeline: preserve prices, reset episode-level state
    end
    ComputeClusterEnv-->>User: initial observation
    loop step interactions
        User->>ComputeClusterEnv: step(action)
        ComputeClusterEnv->>Timeline: process jobs, update nodes
        ComputeClusterEnv->>MetricsTracker: increment episode metrics<br/>(episode_jobs_completed, episode_total_cost, etc.)
        ComputeClusterEnv->>ComputeClusterEnv: compute step_reward
        ComputeClusterEnv-->>User: state, reward, terminated, truncated, info
    end
```
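The reset branch in the diagram above can be sketched as a minimal Python class. This is an illustrative stand-in, not the repository's actual code: `MetricsTracker`, `price_index`, and the field names are assumptions based on the walkthrough and diagram.

```python
class MetricsTracker:
    """Toy tracker with timeline-scoped and episode-scoped counters."""
    def reset_timeline_metrics(self):
        self.total_time_hours = 0.0        # cumulative across episodes
    def reset_episode_metrics(self):
        self.episode_jobs_completed = 0    # episode_* mirror of a counter
        self.episode_total_cost = 0.0

class ComputeClusterEnv:
    def __init__(self, carry_over_state=False):
        self.carry_over_state = carry_over_state
        self.metrics = MetricsTracker()
        self.metrics.reset_timeline_metrics()
        self.metrics.reset_episode_metrics()
        self.price_index = 0               # position in the price timeline
        self.current_hour = 0

    def reset(self, start_index=0):
        if self.carry_over_state:
            # keep prices/timeline where they are; only episode scope resets
            self.metrics.reset_episode_metrics()
        else:
            self.metrics.reset_timeline_metrics()
            self.metrics.reset_episode_metrics()
            self.price_index = start_index # rewind the timeline entirely
        self.current_hour = 0              # episode clock always restarts
        return {"hour": self.current_hour, "price_index": self.price_index}

env = ComputeClusterEnv(carry_over_state=True)
env.price_index = 500                      # pretend 500 hours have elapsed
obs = env.reset()
```

With carry-over enabled, the reset above keeps `price_index` at 500 while the episode clock and episode metrics start fresh.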
Estimated Code Review Effort: 🎯 3 (Moderate) | ⏱️ ~25 minutes
🚥 Pre-merge checks | ✅ 1 | ❌ 2
❌ Failed checks (1 warning, 1 inconclusive)
✅ Passed checks (1 passed)
Actionable comments posted: 1
🤖 Fix all issues with AI agents
In `@src/plotter.py`:
- Around line 170-180: The conditionals for showing reward panels are inverted;
remove the negation so panels are shown when flags are True. In the block that
checks env.plot_config.plot_eff_reward, plot_price_reward, plot_idle_penalty,
plot_job_age_penalty, and plot_total_reward, change each "if not
env.plot_config.<flag>" to "if env.plot_config.<flag>" so add_panel("...", ...,
...) is called when the corresponding flag is True (references:
env.plot_config.plot_eff_reward, plot_price_reward, plot_idle_penalty,
plot_job_age_penalty, plot_total_reward and the add_panel calls).
🧹 Nitpick comments (5)
test/test_plot_dummy.py (3)
92-105: Avoid double-assigning `episode_running_jobs_counts`. It's set twice; removing the first assignment keeps intent clear.
♻️ Suggested cleanup

```diff
- # approximate "running jobs" as used_nodes for the dummy
- self.metrics.episode_running_jobs_counts = self.metrics.episode_used_nodes.copy()
```
124-136: Rename unused loop variable (static analysis). Ruff flags `i` as unused; `_` is the conventional placeholder.

♻️ Suggested cleanup

```diff
- for i in range(n_episodes):
+ for _ in range(n_episodes):
```
160-195: Consider removing the commented-out legacy generator. Keeping large commented blocks in tests can confuse future readers.

🧹 Suggested cleanup

```diff
-#def make_dummy_episode_costs(n_episodes=24, seed=123):
-#    """
-#    Produce a plausible episode_costs list:
-#    - baseline_cost is highest
-#    - baseline_cost_off is a bit lower
-#    - agent_cost starts near baseline and improves slightly over episodes
-#    - includes jobs_completed and baseline_jobs_completed so that
-#      the jobs & combined panels in plot_cumulative_savings can be tested.
-#    """
-#    rng = np.random.default_rng(seed)
-#
-#    episode_costs = []
-#    agent_cost = 90_000.0
-#    for i in range(n_episodes):
-#        baseline = rng.normal(95_000, 3_000)
-#        baseline_off = baseline - rng.uniform(2_000, 7_000)
-#
-#        # simulate learning: agent gets a bit cheaper over time
-#        agent_cost *= rng.uniform(0.992, 0.998)  # gradual improvement
-#        agent = agent_cost + rng.normal(0, 1_000)
-#
-#        # jobs: baseline slightly worse or similar, agent slightly improving
-#        base_jobs = rng.integers(4_000, 6_000)
-#        # allow agent to be a bit better or worse, but centered near baseline
-#        agent_jobs = int(base_jobs * rng.uniform(0.95, 1.05))
-#
-#        episode_costs.append({
-#            "agent_cost": float(agent),
-#            "baseline_cost": float(baseline),
-#            "baseline_cost_off": float(baseline_off),
-#            "jobs_completed": float(agent_jobs),
-#            "baseline_jobs_completed": float(base_jobs),
-#        })
-#
-#    return episode_costs
```

test/test_plotter.py (1)
185-207: Test name is now misleading after logic change. The test `test_skip_all_panels` now enables all reward panels (`plot_*_reward=True`) before setting skip flags. The name suggests "skip all panels" but the setup enables reward-related panels. Consider renaming to clarify the intent (e.g., `test_skip_all_data_panels` or `test_skip_panels_with_rewards_enabled`).

💡 Suggested rename

```diff
-    def test_skip_all_panels(self):
+    def test_skip_all_data_panels_with_rewards_enabled(self):
```

test/test_sanity_env.py (1)
145-200: Well-structured carry-over continuity test. The test correctly:

- Captures initial state after reset
- Steps the environment and snapshots post-action state
- Asserts at least one state change occurred (preventing false positives)
- Resets and verifies timeline state is preserved while `current_hour` resets to 0

One minor improvement: the unused loop variables (`obs`, `r`, `info`) flagged by static analysis can be prefixed with underscores.

🧹 Optional: Silence linter warnings for unused variables

```diff
 for action in actions:
-    obs, r, term, trunc, info = env.step(action)
+    _obs, _r, term, trunc, _info = env.step(action)
     if term or trunc:
         break
```
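The continuity-test pattern praised above can be sketched against a toy stateful object. This is an illustration of the test structure only; `ToyEnv` and its fields are stand-ins, not the real ComputeClusterEnv.

```python
class ToyEnv:
    """Minimal env whose timeline state survives a carry-over reset."""
    def __init__(self):
        self.price_index = 0   # timeline state (should carry over)
        self.current_hour = 0  # episode clock (should always reset)
    def step(self):
        self.price_index += 1
        self.current_hour += 1
    def reset(self, carry_over=True):
        if not carry_over:
            self.price_index = 0
        self.current_hour = 0

env = ToyEnv()
before = env.price_index
for _ in range(3):
    env.step()
# Guard against a vacuous test: require a real state change before reset.
assert env.price_index != before
snapshot = env.price_index
env.reset(carry_over=True)
# Timeline state is preserved; only the episode clock restarts.
assert env.price_index == snapshot
assert env.current_hour == 0
```

The "require a real state change" assertion is the key safeguard: without it, a broken `step()` that mutates nothing would make the preservation check pass trivially.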
```python
# Booleans at this place are inverted
if not env.plot_config.plot_eff_reward:
    add_panel("Efficiency reward (%)", env.metrics.episode_eff_rewards, "score", None)
if not env.plot_config.plot_price_reward:
    add_panel("Price reward (%)", env.metrics.episode_price_rewards, "score", None)
if not env.plot_config.plot_idle_penalty:
    add_panel("Idle penalty (%)", env.metrics.episode_idle_penalties, "score", None)
if not env.plot_config.plot_job_age_penalty:
    add_panel("Job-age penalty (%)", env.metrics.episode_job_age_penalties, "score", None)
if not env.plot_config.plot_total_reward:
    add_panel("Total reward", getattr(env.metrics, "episode_rewards", None), "reward", None)
```
🧩 Analysis chain

🏁 Script executed:
```shell
# First, find the PlotConfig definition
fd -t f -e py | xargs rg -l "class PlotConfig" | head -5
```
Repository: FairRootGroup/powersched
Length of output: 86

🏁 Script executed:
```shell
# Search for PlotConfig definition
rg "class PlotConfig" -A 20 --type py
```
Repository: FairRootGroup/powersched
Length of output: 753

🏁 Script executed:
```shell
# Also search for where these flags are defined
rg "plot_eff_reward|plot_price_reward|plot_idle_penalty" --type py -B 2 -A 2
```
Repository: FairRootGroup/powersched
Length of output: 3561

🏁 Script executed:
```shell
# Read test_plotter.py to see the "Skip all panels" context
rg "Skip all panels" -B 5 -A 15 src/test_plotter.py
```
Repository: FairRootGroup/powersched
Length of output: 127

🏁 Script executed:
```shell
# Check test_plot_dummy.py to understand DummyEnv behavior
cat test/test_plot_dummy.py
```
Repository: FairRootGroup/powersched
Length of output: 9071

🏁 Script executed:
```shell
# Look for any documentation or comments about PlotConfig semantics
rg "PlotConfig" -B 3 -A 8 src/plot_config.py
```
Repository: FairRootGroup/powersched
Length of output: 367

🏁 Script executed:
```shell
# Find the test file (it's in test directory)
fd -t f "test_plotter.py"
```
Repository: FairRootGroup/powersched
Length of output: 88

🏁 Script executed:
```shell
# Read the test_plotter.py to see the "Skip all panels" comment
cat test/test_plotter.py | head -100
```
Repository: FairRootGroup/powersched
Length of output: 4497

🏁 Script executed:
```shell
# Continue reading test_plotter.py to find the "Skip all panels" section
sed -n '100,250p' test/test_plotter.py
```
Repository: FairRootGroup/powersched
Length of output: 5410

🏁 Script executed:
```shell
# Read plotter.py to see the full context around lines 170-180
sed -n '150,200p' src/plotter.py
```
Repository: FairRootGroup/powersched
Length of output: 2087

🏁 Script executed:
```shell
# Check the plot_dashboard function to see if it calls both plot.py and uses plotter logic
rg "def plot_dashboard" -A 50 src/plotter.py | head -80
```
Repository: FairRootGroup/powersched
Length of output: 2616

🏁 Script executed:
```shell
# Check if test_plot_dummy.py main() calls plot_dashboard
rg "plot_dashboard" test/test_plot_dummy.py -B 2 -A 2
```
Repository: FairRootGroup/powersched
Length of output: 533
Remove negation from reward panel conditionals; flag semantics are inverted.
The plot_eff_reward, plot_price_reward, plot_idle_penalty, plot_job_age_penalty, and plot_total_reward flags are negated in plotter.py (lines 170-180), causing panels to be skipped when flags are True—the opposite of the intended behavior. This contradicts src/plot.py, which treats these flags correctly as "show when True," and breaks test_plot_dummy.py, which sets all flags to True expecting those panels to appear.
Remove the not operator from all five conditionals:
Fix

```diff
-    if not env.plot_config.plot_eff_reward:
+    if env.plot_config.plot_eff_reward:
         add_panel("Efficiency reward (%)", env.metrics.episode_eff_rewards, "score", None)
-    if not env.plot_config.plot_price_reward:
+    if env.plot_config.plot_price_reward:
         add_panel("Price reward (%)", env.metrics.episode_price_rewards, "score", None)
-    if not env.plot_config.plot_idle_penalty:
+    if env.plot_config.plot_idle_penalty:
         add_panel("Idle penalty (%)", env.metrics.episode_idle_penalties, "score", None)
-    if not env.plot_config.plot_job_age_penalty:
+    if env.plot_config.plot_job_age_penalty:
         add_panel("Job-age penalty (%)", env.metrics.episode_job_age_penalties, "score", None)
-    if not env.plot_config.plot_total_reward:
+    if env.plot_config.plot_total_reward:
         add_panel("Total reward", getattr(env.metrics, "episode_rewards", None), "reward", None)
```

📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
```python
# Booleans at this place are inverted
if env.plot_config.plot_eff_reward:
    add_panel("Efficiency reward (%)", env.metrics.episode_eff_rewards, "score", None)
if env.plot_config.plot_price_reward:
    add_panel("Price reward (%)", env.metrics.episode_price_rewards, "score", None)
if env.plot_config.plot_idle_penalty:
    add_panel("Idle penalty (%)", env.metrics.episode_idle_penalties, "score", None)
if env.plot_config.plot_job_age_penalty:
    add_panel("Job-age penalty (%)", env.metrics.episode_job_age_penalties, "score", None)
if env.plot_config.plot_total_reward:
    add_panel("Total reward", getattr(env.metrics, "episode_rewards", None), "reward", None)
```
enlorenz: Due to historic reasons, the booleans are inverted. We need to accept this at this place.

coderabbitai: @enlorenz, thank you for the clarification! I understand now—the boolean inversion is intentional and part of the existing design. I've noted this for future reviews.
✏️ Learnings added
Learnt from: enlorenz
Repo: FairRootGroup/powersched PR: 11
File: src/plotter.py:170-180
Timestamp: 2026-01-22T09:30:34.088Z
Learning: In src/plotter.py, the PlotConfig flags plot_eff_reward, plot_price_reward, plot_idle_penalty, plot_job_age_penalty, and plot_total_reward are intentionally inverted (checked with `not` operator) due to historic reasons. These flags act as skip flags rather than show flags, and this behavior should be preserved.
Note: Learnings are effective only in the context of similar code segments. To apply general coding guidelines, please configure review instructions. You can manage existing learnings in the app.
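The skip-flag convention recorded in the learning above can be illustrated with a small self-contained sketch. `PlotConfig` and `add_panel` here are simplified stand-ins for the real code, kept only to show the inverted semantics.

```python
from dataclasses import dataclass

@dataclass
class PlotConfig:
    """Simplified stand-in: each plot_* flag is a SKIP flag, not a show flag."""
    plot_eff_reward: bool = False   # False => the panel IS shown
    plot_total_reward: bool = True  # True  => the panel is skipped

panels = []

def add_panel(title):
    panels.append(title)

cfg = PlotConfig()
# Booleans are inverted on purpose (historic reasons, see learning above).
if not cfg.plot_eff_reward:
    add_panel("Efficiency reward (%)")
if not cfg.plot_total_reward:
    add_panel("Total reward")
```

Running this adds only the efficiency panel, since its skip flag is False, which is the behavior the maintainer asked to preserve.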
🧠 Learnings used
Learnt from: CR
Repo: FairRootGroup/powersched PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-12-04T08:51:13.779Z
Learning: Applies to train.py : Structure evaluation output to include per-episode metrics (cost, savings, completion rate, wait time, queue size) and cumulative analysis (total savings, monthly reduction, annual projections)