Add Env for playing Pokémon using poke-env#131
Conversation
Added observations on popular choices and practices in Pokemon-RL projects.
|
✅ Deployment succeeded for
Nice work! Wait for a code review and we're ready to go. You can iterate locally or validate fixes by running |
Add observations section to Readme
…sicSetupCheckList.md
Updated the checklist for setting up the Pokémon RL training environment, including system requirements, hardware recommendations, setup instructions, and known issues.
Removed unnecessary comments and cleaned up formatting.
Added instructions for forking the Pokemon-Showdown repository and noted compatibility issues with random battle generation.
Organised Dockerfile
fix: ensure Gen 8 Random Battles run correctly instead of Gen 9
|
✅ Deployment succeeded for
Nice work! Wait for a code review and we're ready to go. You can iterate locally or validate fixes by running |
1 similar comment
|
✅ Deployment succeeded for
Nice work! Wait for a code review and we're ready to go. You can iterate locally or validate fixes by running |
Created a Basic Setup and Installation.md file
|
✅ Deployment succeeded for
Nice work! Wait for a code review and we're ready to go. You can iterate locally or validate fixes by running |
dockerfiles separated
|
✅ Deployment succeeded for
Nice work! Wait for a code review and we're ready to go. You can iterate locally or validate fixes by running |
There was a problem hiding this comment.
Pull Request Overview
This PR adds a comprehensive Pokemon battle environment integration to OpenEnv using poke-env and Pokemon Showdown. The implementation provides HTTP-based access to Pokemon battles for reinforcement learning and AI training.
- Complete Pokemon environment integration with poke-env library
- Docker containerization with Pokemon Showdown server and OpenEnv API server
- Client library for HTTP-based environment interaction
- Comprehensive test suite and documentation
Reviewed Changes
Copilot reviewed 24 out of 25 changed files in this pull request and generated 24 comments.
Show a summary per file
| File | Description |
|---|---|
| src/envs/pokemon_env/server/pokemon_environment.py | Core environment implementation with async/event loop synchronization |
| src/envs/pokemon_env/client.py | HTTP client for Pokemon environment |
| src/envs/pokemon_env/models.py | Data models for actions, observations, and state |
| src/envs/pokemon_env/server/app.py | FastAPI server application |
| src/envs/pokemon_env/server/Dockerfile | Multi-service Docker image configuration |
| src/envs/pokemon_env/test_enhancements.py | Comprehensive test suite |
| examples/project-pikachu/test_*.py | Integration and HTTP client tests |
| examples/project-pikachu/TESTING.md | Testing guide and troubleshooting |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| if obs.available_switches and random.random() < 0.2 and active_hp < 0.3: | ||
| switch_idx = random.choice(range(len(obs.available_switches))) | ||
| action = PokemonAction(action_type="switch", action_index=switch_idx) | ||
| action_desc = f"Switch to {obs.team[switch_idx + 1].species.title()}" |
There was a problem hiding this comment.
Array index mismatch: switch_idx is selected from available_switches (line 498), but used to index obs.team[switch_idx + 1] (line 500). The available_switches list contains indices into the team, not raw team indices. This should be obs.team[obs.available_switches[switch_idx]] to correctly map from the chosen switch index to the actual team member.
| action_desc = f"Switch to {obs.team[switch_idx + 1].species.title()}" | |
| action_desc = f"Switch to {obs.team[obs.available_switches[switch_idx]].species.title()}" |
| dynamax: Whether to dynamax this turn (if applicable) | ||
| terastallize: Whether to terastallize this turn (if applicable) | ||
| """ | ||
| action_type: Literal["move", "switch"] = "move" |
There was a problem hiding this comment.
The action_type field doesn't include 'forfeit' or 'default' as valid options, but the pokemon_environment.py implementation handles these action types (lines 142, 193). Consider adding these to the Literal type for consistency: Literal[\"move\", \"switch\", \"forfeit\", \"default\"].
| action_type: Literal["move", "switch"] = "move" | |
| action_type: Literal["move", "switch", "forfeit", "default"] = "move" |
| except ImportError as e: | ||
| raise ImportError( | ||
| "poke-env is not installed. " | ||
| "Please install it with: pip install poke-env" |
There was a problem hiding this comment.
Corrected command formatting. The installation command should use backticks for code formatting consistency: 'Please install it with: pip install poke-env'
| "Please install it with: pip install poke-env" | |
| "Please install it with: `pip install poke-env`" |
| # Otherwise, use a move | ||
| elif obs.available_moves: | ||
| move_idx = random.choice(range(len(obs.available_moves))) | ||
| move_name = obs.available_moves[move_idx] |
There was a problem hiding this comment.
obs.available_moves is a list of integers (move indices, not move names) based on the PokemonObservation model definition. Using it directly as move_name will result in displaying an integer instead of the actual move name. To get the move name, use: obs.active_pokemon.moves[obs.available_moves[move_idx]]['id'] if available.
| move_name = obs.available_moves[move_idx] | |
| move_index = obs.available_moves[move_idx] | |
| move_name = obs.active_pokemon.moves[move_index]['id'] if obs.active_pokemon and obs.active_pokemon.moves and move_index < len(obs.active_pokemon.moves) else str(move_index) |
| mega_evolve: bool = False | ||
| dynamax: bool = False | ||
| terastallize: bool = False |
There was a problem hiding this comment.
Missing documentation for the z_move field. While mega_evolve, dynamax, and terastallize are documented in the docstring, z_move is referenced in pokemon_environment.py but not included in the PokemonAction dataclass or its docstring. Either add the field or remove references to it in the implementation.
| """ | ||
|
|
||
| import sys | ||
| import os |
There was a problem hiding this comment.
Import of 'os' is not used.
| import os |
| import sys | ||
| import os | ||
| import argparse | ||
| import time |
There was a problem hiding this comment.
Import of 'time' is not used.
| import time |
| """ | ||
|
|
||
| import sys | ||
| import os |
There was a problem hiding this comment.
Import of 'os' is not used.
| import os |
|
|
||
| import sys | ||
| import os | ||
| import time |
There was a problem hiding this comment.
Import of 'time' is not used.
| import time |
| sys.path.insert(0, str(project_root / "src")) | ||
|
|
||
| # Import models and environment | ||
| from envs.pokemon_env.models import PokemonAction, PokemonObservation, PokemonState |
There was a problem hiding this comment.
Import of 'PokemonObservation' is not used.
Import of 'PokemonState' is not used.
| from envs.pokemon_env.models import PokemonAction, PokemonObservation, PokemonState | |
| from envs.pokemon_env.models import PokemonAction |
Darktex
left a comment
There was a problem hiding this comment.
Leaving my manual review here and claude review coming next...
| action_index: int = 0 | ||
| move_id: Optional[str] = None | ||
| switch_pokemon: Optional[str] = None | ||
| mega_evolve: bool = False |
There was a problem hiding this comment.
Shouldn't these be inside Action type then?
There was a problem hiding this comment.
Or like, these should be "can_mega_evolve", no?
| ability: Optional[str] | ||
| item: Optional[str] | ||
|
|
||
| attack: int |
There was a problem hiding this comment.
You don't get to see the opponent pokemon's stats ever though
| active_pokemon: Optional[PokemonData] = None | ||
| opponent_active_pokemon: Optional[PokemonData] = None | ||
| team: List[PokemonData] = field(default_factory=list) | ||
| opponent_team: List[PokemonData] = field(default_factory=list) |
There was a problem hiding this comment.
You need a placeholder for opponent pokemon that you have not yet seen. eg at the very start of the battle you see that the other guy has 6 pokemon, then they launch the first and you see that he has 5 more but you don't know which ones
| turn: int = 0 | ||
| forced_switch: bool = False | ||
|
|
||
| can_mega_evolve: bool = False |
There was a problem hiding this comment.
These are already there in the PokemonData of the active pokemon
| def _compute_reward(self, battle, done: bool) -> float: | ||
| """Compute reward based on reward_mode.""" | ||
|
|
||
| if self.reward_mode == "sparse": |
There was a problem hiding this comment.
Let's not do magic numbers here, these should be configurable
| # Final outcome bonus | ||
| if done: | ||
| if battle.won: | ||
| reward += 0.5 |
There was a problem hiding this comment.
(You probably want something more like 10 for this)
| self._last_player_fainted = player_fainted | ||
|
|
||
| # Small reward for opponent HP damage | ||
| if battle.opponent_active_pokemon and hasattr(battle.opponent_active_pokemon, 'current_hp_fraction'): |
There was a problem hiding this comment.
You should likely add a reward for statuses. Paralyzed, Toxin, Spikes, Leech Seed etc.
There should be another reward for REMOVING statuses. What made Blissey super strong was that she could learn Aromatherapy which removes status debuffs on your entire team.
There was a problem hiding this comment.
You should also add a reward for buffs which are a big deal in this game (we banned using Curse on Snorlax back in the day because he became ridiculously strong). Marowak is one Swords Dance away from 999 Attack which one-shots most pokemon in the game. Buffs are heavily quantized, eg Dragon Dance is +1 Atk +1 Spd while Swords Dance is +2 Atk and Belly Drum is +4 Atk (but you lose 50% hp!). Each +1 is +50%.
There was a problem hiding this comment.
Similarly, there should be a reward for cleaning buffs. Skarmory was played for this reason: he could Whirlwind out your pokemon (buffs do not survive a switch unlike statuses)
| # Small reward for opponent HP damage | ||
| if battle.opponent_active_pokemon and hasattr(battle.opponent_active_pokemon, 'current_hp_fraction'): | ||
| current_hp = battle.opponent_active_pokemon.current_hp_fraction | ||
| if current_hp is not None: |
There was a problem hiding this comment.
Also HP are often recovered. Moves like Recover, etc. There should be another reward for recovery of HP
Darktex
left a comment
There was a problem hiding this comment.
Summary
This PR adds a comprehensive Pokemon battle environment integration using poke-env and Pokemon Showdown. The implementation demonstrates excellent async synchronization, proper memory management, and strong test coverage. The architecture follows OpenEnv patterns well.
Highlights ✅
- Excellent async event loop bridging between FastAPI and poke-env's POKE_LOOP
- Memory leak prevention with automatic battle cleanup
- Comprehensive error handling with fallback to legal actions
- Dual reward modes (sparse/dense) for different RL approaches
- Strong test coverage (needs relocation)
- Thread-safe with proper locking
Critical Issues 🔴
Two blocking issues must be fixed before merge:
- Missing Copyright Headers: All Python files in
src/envs/pokemon_env/need Meta copyright headers - Tests in Wrong Location: Tests should be in
tests/envs/pokemon_env/, notexamples/
Important Issues 🟡
- Test Suite Integration: Tests need proper pytest structure and CI/CD integration
- Dockerfile Organization: Multiple Dockerfiles need clarification (see proposal below)
- Hard-coded Port: Dockerfile should respect PORT environment variable
- Error Messages: Add context to observation metadata for better debugging
- Documentation Cleanup: Remove development artifacts from examples (see proposal below)
Minor Issues 🟢
Issues #8-12 are accepted as minor improvements for future consideration.
Proposals
Proposal for Issue #4: Dockerfile Organization
Current situation: Three Dockerfiles exist:
Dockerfile- Combines both Showdown + OpenEnv with supervisor ✅Dockerfile.showdown- Standalone ShowdownDockerfile.pokemonenv- Standalone OpenEnv
Recommendation: Keep the main Dockerfile (it's well-designed) and remove Dockerfile.showdown and Dockerfile.pokemonenv because:
- The main Dockerfile already handles everything with multi-stage build
- Having multiple Dockerfiles creates confusion about which to use
- The README should document the single, consolidated approach
- Separate containers are rarely needed for this use case (Showdown must run alongside the env)
Alternative (if separate builds are truly needed): Keep all three but:
- Rename them clearly:
Dockerfile,Dockerfile.showdown-standalone,Dockerfile.env-only - Add clear documentation in README explaining when to use each
- Add a decision tree: "Use main Dockerfile unless you need X"
My recommendation: Remove the extra Dockerfiles. The all-in-one approach is simpler and matches the architecture (two services that must communicate).
Proposal for Issue #7: Documentation Cleanup
Current state: examples/project-pikachu/ contains:
test_local_pokemon.py(move to tests) ✓test_http_pokemon.py(move to tests) ✓BasicSetupCheckList.md- Development artifact ❌IMPLEMENTATION_SUMMARY.md- Development artifact ❌Poke-Env.MD- Library documentation (external) ❌Readme.MD- Duplicate of main README ❌TESTING.md- Internal testing guide ❌
Recommendation: Remove entire examples/project-pikachu/ directory and instead:
- Create
examples/pokemon_env_example.pywith a simple, documented example:
#!/usr/bin/env python3
"""
Simple Pokemon Battle Environment Example.
This example shows how to use the Pokemon environment with OpenEnv.
Prerequisites:
docker run -p 8000:8000 -p 9980:9980 pokemon-env:latest
Usage:
python examples/pokemon_env_example.py
"""
from envs.pokemon_env import PokemonEnv, PokemonAction
import random
def main():
# Connect to server
client = PokemonEnv(base_url="http://localhost:9980")
# Reset environment
result = client.reset()
print(f"Battle started! Active Pokemon: {result.observation.active_pokemon.species}")
# Battle loop
step = 0
while not result.done and step < 100:
# Choose random legal action
if result.observation.available_moves:
action = PokemonAction(
action_type="move",
action_index=random.choice(result.observation.available_moves)
)
elif result.observation.available_switches:
action = PokemonAction(
action_type="switch",
action_index=random.choice(result.observation.available_switches)
)
else:
break
result = client.step(action)
step += 1
if step % 10 == 0:
print(f"Turn {step}: Reward={result.reward:.2f}")
print(f"\nBattle finished in {step} turns")
print(f"Final reward: {result.reward}")
state = client.state()
print(f"Winner: {state.battle_winner}")
if __name__ == "__main__":
main()-
Merge useful content into main README:
- Architecture details from IMPLEMENTATION_SUMMARY.md
- Any unique troubleshooting from TESTING.md
-
Benefits:
- Single source of truth (README.md)
- Clear, runnable example for users
- No confusion about what's documentation vs development artifacts
- Cleaner repository structure
Overall: This is a high-quality implementation. Once copyright headers are added and tests are relocated, this will be an excellent addition to OpenEnv! 🎉
| @@ -0,0 +1,127 @@ | |||
| """ | |||
There was a problem hiding this comment.
🔴 CRITICAL: Missing Copyright Header
All Python files in the OpenEnv repository must include the Meta copyright header. This file and all other Python files in src/envs/pokemon_env/ are missing this required header.
Required header:
# Copyright (c) Meta Platforms, Inc. and affiliates.
# All rights reserved.
#
# This source code is licensed under the BSD-style license found in the
# LICENSE file in the root directory of this source tree.Files needing headers:
models.pyclient.py__init__.pyserver/app.pyserver/pokemon_environment.pyserver/__init__.py
Please add this header to the top of each file before the module docstring.
| @@ -0,0 +1,310 @@ | |||
| #!/usr/bin/env python3 | |||
There was a problem hiding this comment.
🔴 CRITICAL: Tests in Wrong Location
Tests should be located in tests/envs/pokemon_env/, not in examples/. The examples directory is for user-facing demonstration code, not test suites.
Required changes:
- Create
tests/envs/pokemon_env/directory - Move
test_local_pokemon.py→tests/envs/pokemon_env/test_environment.py - Move
test_http_pokemon.py→tests/envs/pokemon_env/test_client.py - Remove manual
sys.pathmanipulation (tests should use proper imports) - Follow pytest conventions and integrate with existing test infrastructure
Reference: See tests/envs/atari_env/ or tests/envs/echo_env/ for proper structure.
|
|
||
| # Validate battle state | ||
| if self._current_battle.finished: | ||
| logger.warning("Step called on finished battle, returning final state") |
There was a problem hiding this comment.
🟡 IMPORTANT: Incomplete Error Messages
When step() is called on a finished battle, the warning is logged but not exposed to the user in the observation.
Improvement:
logger.warning("Step called on finished battle, returning final state")
obs = self._battle_to_observation(self._current_battle, reward=None, done=True)
obs.metadata["warning"] = "Step called on already finished battle"
return obsThis helps users debug issues when they accidentally call step after the battle ends.
| \n\ | ||
| [program:openenv]\n\ | ||
| command=uvicorn envs.pokemon_env.server.app:app --host 0.0.0.0 --port 9980\n\ | ||
| directory=/app\n\ |
There was a problem hiding this comment.
🟡 IMPORTANT: Hard-coded Port
The supervisor config hard-codes port 9980. This should respect the PORT environment variable for deployment flexibility.
Fix:
[program:openenv]
command=bash -c "uvicorn envs.pokemon_env.server.app:app --host 0.0.0.0 --port ${PORT:-9980}"
directory=/app
environment=PYTHONPATH="/app/src",PORT="${PORT:-9980}"This allows deployments to override the port via environment variable while keeping 9980 as the default.
| self._episodes_completed = 0 | ||
| self._cleanup_interval = 10 # Clean up every 10 episodes | ||
|
|
||
| def _pokemon_to_data(self, pokemon) -> Optional[PokemonData]: |
There was a problem hiding this comment.
🟢 MINOR: Type Annotation
The pokemon parameter could have a type hint for better documentation:
from typing import Any
def _pokemon_to_data(self, pokemon: Any) -> Optional[PokemonData]:Since this comes from the external poke-env library, Any is appropriate here.
|
|
||
| # Battle history cleanup interval | ||
| self._episodes_completed = 0 | ||
| self._cleanup_interval = 10 # Clean up every 10 episodes |
There was a problem hiding this comment.
🟢 MINOR: Magic Number
The cleanup interval is hard-coded. Consider making it configurable:
def __init__(
self,
battle_format: str = "gen8randombattle",
player_username: Optional[str] = None,
opponent: Optional[Player] = None,
reward_mode: str = "sparse",
max_turns: int = 1000,
cleanup_interval: int = 10, # Add this parameter
):
...
self._cleanup_interval = cleanup_intervalThis allows fine-tuning in production environments if needed.
|
|
||
| ```bash | ||
| # Build both images (run from project root directory) | ||
| docker build -t pokemon-showdown:latest -f src/envs/pokemon_env/server/Dockerfile.showdown . |
There was a problem hiding this comment.
🟡 IMPORTANT: Dockerfile Documentation
The README mentions building Dockerfile.showdown and Dockerfile.pokemonenv separately, but the main Dockerfile appears to handle everything with supervisor.
Please clarify:
- Which Dockerfiles are actually needed?
- When to use each one?
See the proposal in the main review for a recommendation to simplify this.
Darktex
left a comment
There was a problem hiding this comment.
Note: This is an automated review by Claude Code (alignment-reviewer agent), not a human review. The account posting this is shared with the human maintainer.
Automated review by Claude Code | Learn more about OpenEnv's agentic workflow
Darktex
left a comment
There was a problem hiding this comment.
Note: This is an automated review by Claude Code (alignment-reviewer agent), not a human review. The account posting this is shared with the human maintainer.
Alignment Review Report
Overview
Reviewed PR #131: "Add Env for playing Pokémon using poke-env"
This PR adds a complete Pokemon battle environment integration using poke-env and Pokemon Showdown. The implementation follows OpenEnv patterns but has several critical issues that need addressing.
Tier 1: Fixes Required
Critical Issues (Build Blockers)
-
src/envs/pokemon_env/client.py:13 - Import error:
from core.http_env_client import HTTPEnvClientshould befrom openenv.core.env_client import EnvClient. TheHTTPEnvClientclass does not exist in the codebase. -
src/envs/pokemon_env/client.py:11 - Import error:
from core.client_types import StepResultshould befrom openenv.core.client_types import StepResult -
src/envs/pokemon_env/models.py:13 - Import error:
from core.env_server import Action, Observation, Stateshould befrom openenv.core.env_server.types import Action, Observation, State -
src/envs/pokemon_env/server/app.py:26 - Import error:
from core.env_server import create_appshould befrom openenv.core.env_server import create_app -
src/envs/pokemon_env/server/pokemon_environment.py:22 - Import error:
from core.env_server import Action, Environment, Observationshould befrom openenv.core.env_server.interfaces import Environmentandfrom openenv.core.env_server.types import Action, Observation -
src/envs/pokemon_env/client.py:21 - Class inheritance error: Should inherit from
EnvClient[PokemonAction, PokemonObservation, State]notHTTPEnvClient. The current codebase uses WebSocket-basedEnvClient, not HTTP.
Code Organization Issues
-
src/envs/pokemon_env/test_enhancements.py - Test file should not be in
src/envs/pokemon_env/. Move totests/envs/orexamples/project-pikachu/ -
src/envs/pokemon_env/test_pokemon_docker.sh - Test script should not be in
src/envs/pokemon_env/. Move toexamples/project-pikachu/ -
src/envs/pokemon_env/test_notebook/ - Notebook should not be in
src/envs/pokemon_env/. Move toexamples/project-pikachu/
Missing Type Information
- src/envs/pokemon_env/client.py - Missing
Statetype parameter. Should define a properPokemonStateclass in models.py and use it in the generic type (currently using bareState)
Documentation
- src/envs/pokemon_env/models.py:169 - Contains
password: Optional[str] = Nonein PokemonConfig. While this is for poke-env authentication (not exposed), add a comment clarifying this is for Pokemon Showdown server auth, not production secrets.
Tier 2: Alignment Discussion Points
ALIGNMENT FLAG: API Protocol Mismatch
- Principle at stake: "Minimize lifecycle deltas" (PRINCIPLES.md) and "WebSocket for all environment communication" (INVARIANTS.md note)
- The concern: This PR uses
HTTPEnvClient(HTTP-based) when the codebase has moved to WebSocket-basedEnvClient. According to INVARIANTS.md: "We are in the process of deprecating HTTP (see PR #252) in favor of WebSocket-only, but we are still transitioning and both protocols are currently available." However,HTTPEnvClientdoesn't exist in the codebase - suggesting this PR may be based on old code or the transition is complete. - Impact: New environments should follow current patterns. Using deprecated/non-existent APIs creates technical debt.
- Suggested reviewer: @Darktex
ALIGNMENT FLAG: Test Files in Source Tree
- Principle at stake: "Be hands-on" (PRINCIPLES.md) and code organization best practices
- The concern: Test files (
test_enhancements.py,test_pokemon_docker.sh,test_notebook/) are placed insrc/envs/pokemon_env/instead of appropriate test directories. This violates typical Python packaging conventions wheresrc/contains only distributable code. - Impact: These files would be included in package distributions, increasing package size and exposing test-only code.
- Suggested approach: Move to
examples/project-pikachu/(for integration examples) ortests/envs/(for unit tests) - Suggested reviewer: @Darktex
ALIGNMENT FLAG: Client-Server Separation Compliance
- Principle at stake: "Client-server separation" (INVARIANTS.md)
- The concern: Need to verify that
client.pynever imports fromserver/directory. Initial review shows proper separation (client.py only imports from models.py and openenv.core), but should be explicitly validated. - Status: ✅ Appears compliant - client only imports from models.py (shared) and openenv.core
- Suggested reviewer: @Darktex (final confirmation)
ALIGNMENT FLAG: Reward Computation Location
- Principle at stake: "Rewards in environment" (INVARIANTS.md)
- The concern: Reward computation is correctly placed inside
PokemonEnvironment._compute_reward()on the server side, supporting both "sparse" and "dense" modes. This is aligned with the principle. - Status: ✅ Compliant - rewards are computed server-side
- No action needed - documenting for completeness
ALIGNMENT FLAG: Agent Cannot Reset
- Principle at stake: "Agents cannot reset" (INVARIANTS.md)
- The concern: The environment correctly does NOT expose reset/step/state via MCP tools. The WebSocket/HTTP interface is for orchestration only. No MCP server is defined in this PR.
- Status: ✅ Compliant - simulation controls not exposed to agents
- Future consideration: When MCP tools are added for Pokemon battles (e.g., "choose_move"), ensure they don't expose reset()
- Suggested reviewer: @Darktex (for future MCP integration)
Automated Checks
Debug Code Check
- Status:
⚠️ FOUND (but not in PR files) - Details: Existing codebase has debug print statements in unrelated files. No debug code found in the Pokemon environment PR files themselves.
Lint Check
- Status: ❌ UNABLE TO RUN
- Details:
uvnot installed in environment. Recommend runningbash .claude/hooks/lint.shlocally before merge.
Summary
Critical Path to Merge
- Fix all import paths (change
core.*toopenenv.core.*) - Replace HTTPEnvClient with EnvClient and ensure WebSocket compatibility
- Move test files out of
src/directory - Run lint check with
bash .claude/hooks/lint.sh - Test the environment to ensure WebSocket client works
Counts
- 7 critical import/type errors (build blockers)
- 3 code organization issues (test files in wrong location)
- 3 alignment discussion points (API protocol, test organization, architectural compliance)
- 3 alignment confirmations (client-server separation ✅, rewards ✅, agent isolation ✅)
Recommendation
REQUEST CHANGES - The import errors are critical and will cause immediate runtime failures. Once these are fixed and tests are reorganized, the architectural design appears sound and aligned with OpenEnv principles.
The core implementation (event loop management, memory cleanup, battle state serialization) is well-designed and shows good understanding of the poke-env async patterns.
Automated review by Claude Code | Learn more about OpenEnv's agentic workflow
Dismissing automated approval due to bug in review bot. The original review either had blank content or approved despite finding blocking issues. Please disregard this approval.
|
@greptile |
Greptile SummaryAdds comprehensive Pokemon battle environment using poke-env and Pokemon Showdown, implementing proper async/threading synchronization between FastAPI and poke-env's POKE_LOOP with memory leak prevention through periodic battle cleanup. Key Changes:
Critical Issues Found:
Architecture Strengths:
Confidence Score: 2/5
Important Files Changed
Sequence DiagramsequenceDiagram
participant Client as HTTP Client
participant FastAPI as FastAPI Server<br/>(Main Loop)
participant PokemonEnv as PokemonEnvironment<br/>(with Lock)
participant Player as OpenEnvPokemonPlayer
participant PokeLoop as POKE_LOOP<br/>(Background Thread)
participant Showdown as Pokemon Showdown<br/>(WebSocket)
Note over Client,Showdown: Reset Flow
Client->>FastAPI: POST /reset
FastAPI->>PokemonEnv: reset()
PokemonEnv->>PokemonEnv: Acquire _env_lock
PokemonEnv->>PokemonEnv: Cancel previous battle task
PokemonEnv->>PokemonEnv: Cleanup old battles (every 10 episodes)
PokemonEnv->>PokeLoop: asyncio.run_coroutine_threadsafe(start_battle())
PokeLoop->>Player: battle_against(opponent)
Player->>Showdown: WebSocket connect & battle init
Showdown-->>Player: Battle started
Player-->>PokeLoop: Battle task created
PokeLoop-->>PokemonEnv: Battle reference
PokemonEnv->>PokemonEnv: Release _env_lock
PokemonEnv-->>FastAPI: Initial PokemonObservation
FastAPI-->>Client: 200 OK with observation
Note over Client,Showdown: Step Flow
Client->>FastAPI: POST /step with PokemonAction
FastAPI->>PokemonEnv: step(action)
PokemonEnv->>PokemonEnv: Acquire _env_lock
PokemonEnv->>PokemonEnv: Validate action type
PokemonEnv->>Player: set_next_action(action)
Player->>PokeLoop: Schedule _set_action() on POKE_LOOP
PokeLoop->>Player: Set _next_action, signal _action_event
PokemonEnv->>PokeLoop: asyncio.run_coroutine_threadsafe(wait_turn())
Note over Player,Showdown: choose_move() called by poke-env
Player->>Player: wait_for(_action_event)
Player->>Player: _action_to_order(action)
Player->>Showdown: Send move via WebSocket
Showdown-->>Player: Battle state update
Player->>Player: Signal _turn_complete_event
PokeLoop-->>PokemonEnv: Turn complete
PokemonEnv->>PokemonEnv: _battle_to_observation()
PokemonEnv->>PokemonEnv: _compute_reward()
PokemonEnv->>PokemonEnv: Release _env_lock
PokemonEnv-->>FastAPI: Updated PokemonObservation
FastAPI-->>Client: 200 OK with observation
|
|
|
||
| else: | ||
| # Unknown mode, use sparse | ||
| return self._compute_reward(battle, done) if done else 0.0 |
There was a problem hiding this comment.
logic: recursive call to _compute_reward will cause infinite recursion if reward_mode is not "sparse" or "dense"
| return self._compute_reward(battle, done) if done else 0.0 | |
| return 0.0 # Unknown mode, default to sparse behavior |
Prompt To Fix With AI
This is a comment left during a code review.
Path: src/envs/pokemon_env/server/pokemon_environment.py
Line: 460:460
Comment:
**logic:** recursive call to `_compute_reward` will cause infinite recursion if `reward_mode` is not `"sparse"` or `"dense"`
```suggestion
return 0.0 # Unknown mode, default to sparse behavior
```
How can I resolve this? If you propose a fix, please make it concise.| echo "Starting Pokemon OpenEnv server on port 9000..." | ||
| cd /app | ||
| export PYTHONPATH=/app/src | ||
| exec uvicorn envs.pokemon_env.server.app:app --host 0.0.0.0 --port 9000 |
There was a problem hiding this comment.
logic: port mismatch - entrypoint.sh uses port 9000 but Dockerfile, supervisord.conf, and other files use port 9980
| echo "Starting Pokemon OpenEnv server on port 9000..." | |
| cd /app | |
| export PYTHONPATH=/app/src | |
| exec uvicorn envs.pokemon_env.server.app:app --host 0.0.0.0 --port 9000 | |
| echo "Starting Pokemon OpenEnv server on port 9980..." | |
| cd /app | |
| export PYTHONPATH=/app/src | |
| exec uvicorn envs.pokemon_env.server.app:app --host 0.0.0.0 --port 9980 |
Prompt To Fix With AI
This is a comment left during a code review.
Path: src/envs/pokemon_env/server/entrypoint.sh
Line: 29:32
Comment:
**logic:** port mismatch - entrypoint.sh uses port 9000 but Dockerfile, supervisord.conf, and other files use port 9980
```suggestion
echo "Starting Pokemon OpenEnv server on port 9980..."
cd /app
export PYTHONPATH=/app/src
exec uvicorn envs.pokemon_env.server.app:app --host 0.0.0.0 --port 9980
```
How can I resolve this? If you propose a fix, please make it concise.| priority=10 | ||
|
|
||
| [program:openenv] | ||
| command=uvicorn envs.pokemon_env.server.app:app --host 0.0.0.0 --port 9000 |
There was a problem hiding this comment.
logic: port mismatch - supervisord.conf uses port 9000 but Dockerfile.pokemonenv, Dockerfile, and README.md use port 9980
| command=uvicorn envs.pokemon_env.server.app:app --host 0.0.0.0 --port 9000 | |
| command=uvicorn envs.pokemon_env.server.app:app --host 0.0.0.0 --port 9980 |
Prompt To Fix With AI
This is a comment left during a code review.
Path: src/envs/pokemon_env/server/supervisord.conf
Line: 19:19
Comment:
**logic:** port mismatch - supervisord.conf uses port 9000 but Dockerfile.pokemonenv, Dockerfile, and README.md use port 9980
```suggestion
command=uvicorn envs.pokemon_env.server.app:app --host 0.0.0.0 --port 9980
```
How can I resolve this? If you propose a fix, please make it concise.| try: | ||
| # Import from top-level poke_env module | ||
| from poke_env import Player, RandomPlayer, AccountConfiguration, ServerConfiguration | ||
| # Import battle orders from player submodule |
There was a problem hiding this comment.
syntax: missing import for RandomPlayer
| # Import battle orders from player submodule | |
| from poke_env import Player, AccountConfiguration, ServerConfiguration | |
| from poke_env.player import RandomPlayer |
Prompt To Fix With AI
This is a comment left during a code review.
Path: src/envs/pokemon_env/server/pokemon_environment.py
Line: 30:30
Comment:
**syntax:** missing import for `RandomPlayer`
```suggestion
from poke_env import Player, AccountConfiguration, ServerConfiguration
from poke_env.player import RandomPlayer
```
How can I resolve this? If you propose a fix, please make it concise.| dynamax: Whether to dynamax this turn (if applicable) | ||
| terastallize: Whether to terastallize this turn (if applicable) | ||
| """ | ||
| action_type: Literal["move", "switch"] = "move" |
There was a problem hiding this comment.
syntax: missing "default" and "forfeit" action types from literal - these are used in pokemon_environment.py lines 142, 193, and 197
| action_type: Literal["move", "switch"] = "move" | |
| action_type: Literal["move", "switch", "default", "forfeit"] = "move" |
Prompt To Fix With AI
This is a comment left during a code review.
Path: src/envs/pokemon_env/models.py
Line: 30:30
Comment:
**syntax:** missing `"default"` and `"forfeit"` action types from literal - these are used in `pokemon_environment.py` lines 142, 193, and 197
```suggestion
action_type: Literal["move", "switch", "default", "forfeit"] = "move"
```
How can I resolve this? If you propose a fix, please make it concise.|
@init27 It looks like this PR is no longer compatible with the code base. I'd suggest closing and opening a new PR with the env in |
This idea came super thanks to @cpich3g from our Hackathon.
We are continuing here but thanks again Justin for leading and inspiring!
Adds a complete Pokemon battle environment integration using poke-env and Pokemon Showdown, following OpenEnv's HTTP-based framework patterns.
Design Decisions
Event Loop Management
POKE_LOOPbackground threadasyncio.run_coroutine_threadsafe()asyncio.Eventobjects on correct loopMemory Management
Concurrency
reset()andstep()operationsError Handling
File Structure
src/envs/pokemon_env/
├── init.py # Package exports
├── models.py # Action, Observation, State dataclasses
├── client.py # HTTPEnvClient implementation
├── README.md # Environment documentation
└── server/
├── init.py
├── app.py # FastAPI application
├── pokemon_environment.py # Core environment logic
├── Dockerfile # Container with Showdown + OpenEnv
└── requirements.txt # Dependencies
examples/project-pikachu/
├── test_local_pokemon.py # Direct environment tests (6 tests)
├── test_http_pokemon.py # HTTP client tests (6 tests)
└── [documentation files] # Testing guides and architecture docs
Reward Modes
Sparse (default):
Dense:
Credits