Add Env for playing Pokémon using poke-env by init27 · Pull Request #131 · huggingface/OpenEnv

init27 · 2025-11-02T03:56:57Z

This idea came super thanks to @cpich3g from our Hackathon.
We are continuing here but thanks again Justin for leading and inspiring!

Adds a complete Pokemon battle environment integration using poke-env and Pokemon Showdown, following OpenEnv's HTTP-based framework patterns.

Design Decisions

Event Loop Management

poke-env runs on dedicated POKE_LOOP background thread
FastAPI runs on main uvicorn event loop
Proper async synchronization via asyncio.run_coroutine_threadsafe()
Lazy initialization of asyncio.Event objects on correct loop

Memory Management

Automatic battle cleanup every 10 episodes (prevents accumulation)
Previous battle task cancellation on reset
Stable memory usage over 100+ episodes

Concurrency

Single lock for both reset() and step() operations
Prevents race conditions in battle state access
Thread-safe action queuing

Error Handling

Illegal move validation with fallback to random legal action
Battle state validation before step execution
Comprehensive None checks for Pokemon attributes
Graceful handling of finished battles

File Structure

src/envs/pokemon_env/
├── init.py # Package exports
├── models.py # Action, Observation, State dataclasses
├── client.py # HTTPEnvClient implementation
├── README.md # Environment documentation
└── server/
├── init.py
├── app.py # FastAPI application
├── pokemon_environment.py # Core environment logic
├── Dockerfile # Container with Showdown + OpenEnv
└── requirements.txt # Dependencies

examples/project-pikachu/
├── test_local_pokemon.py # Direct environment tests (6 tests)
├── test_http_pokemon.py # HTTP client tests (6 tests)
└── [documentation files] # Testing guides and architecture docs

Reward Modes

Sparse (default):

+1.0 for win, -1.0 for loss, 0.0 otherwise

Dense:

+0.2 per opponent Pokemon fainted
-0.2 per own Pokemon fainted
+0.05 per HP% damage dealt
+0.5 bonus for win, -0.5 penalty for loss

Credits

https://github.com/hsahovic/poke-env
https://github.com/smogon/pokemon-showdown - Battle engine

…integration

Project pikachu - First Draft

Add Surfiniaburger to contributors list

…nEnv into project-pikachu

github-actions · 2025-11-02T03:57:29Z

⚠️ Deployment failed for pokemon_env

Space repo:
Live URL:

Please resolve your environment.

You can iterate locally or validate fixes by running scripts/deploy_to_hf.sh --env "pokemon_env".

Added observations on popular choices and practices in Pokemon-RL projects.

github-actions · 2025-11-02T10:18:31Z

✅ Deployment succeeded for pokemon_env

Space repo: https://huggingface.co/spaces/openenv-testing/pokemon_env-pr-131
Live URL: https://openenv-testing-pokemon-env-pr-131.hf.space

Nice work! Wait for a code review and we're ready to go.

You can iterate locally or validate fixes by running scripts/deploy_to_hf.sh --env "pokemon_env".

Add observations section to Readme

…sicSetupCheckList.md

Updated the checklist for setting up the Pokémon RL training environment, including system requirements, hardware recommendations, setup instructions, and known issues.

Removed unnecessary comments and cleaned up formatting.

Added instructions for forking the Pokemon-Showdown repository and noted compatibility issues with random battle generation.

Organised Dockerfile

fix: ensure Gen 8 Random Battles run correctly instead of Gen 9

github-actions · 2025-11-05T16:45:52Z

✅ Deployment succeeded for pokemon_env

Space repo: https://huggingface.co/spaces/openenv-testing/pokemon_env-pr-131
Live URL: https://openenv-testing-pokemon-env-pr-131.hf.space

Nice work! Wait for a code review and we're ready to go.

You can iterate locally or validate fixes by running scripts/deploy_to_hf.sh --env "pokemon_env".

github-actions · 2025-11-05T16:45:53Z

✅ Deployment succeeded for pokemon_env

Space repo: https://huggingface.co/spaces/openenv-testing/pokemon_env-pr-131
Live URL: https://openenv-testing-pokemon-env-pr-131.hf.space

Nice work! Wait for a code review and we're ready to go.

You can iterate locally or validate fixes by running scripts/deploy_to_hf.sh --env "pokemon_env".

Created a Basic Setup and Installation.md file

github-actions · 2025-11-05T18:25:14Z

✅ Deployment succeeded for pokemon_env

Space repo: https://huggingface.co/spaces/openenv-testing/pokemon_env-pr-131
Live URL: https://openenv-testing-pokemon-env-pr-131.hf.space

Nice work! Wait for a code review and we're ready to go.

You can iterate locally or validate fixes by running scripts/deploy_to_hf.sh --env "pokemon_env".

dockerfiles separated

github-actions · 2025-11-05T22:14:44Z

✅ Deployment succeeded for pokemon_env

Space repo: https://huggingface.co/spaces/openenv-testing/pokemon_env-pr-131
Live URL: https://openenv-testing-pokemon-env-pr-131.hf.space

Nice work! Wait for a code review and we're ready to go.

You can iterate locally or validate fixes by running scripts/deploy_to_hf.sh --env "pokemon_env".

Copilot

Pull Request Overview

This PR adds a comprehensive Pokemon battle environment integration to OpenEnv using poke-env and Pokemon Showdown. The implementation provides HTTP-based access to Pokemon battles for reinforcement learning and AI training.

Complete Pokemon environment integration with poke-env library
Docker containerization with Pokemon Showdown server and OpenEnv API server
Client library for HTTP-based environment interaction
Comprehensive test suite and documentation

Reviewed Changes

Copilot reviewed 24 out of 25 changed files in this pull request and generated 24 comments.

Show a summary per file

File	Description
src/envs/pokemon_env/server/pokemon_environment.py	Core environment implementation with async/event loop synchronization
src/envs/pokemon_env/client.py	HTTP client for Pokemon environment
src/envs/pokemon_env/models.py	Data models for actions, observations, and state
src/envs/pokemon_env/server/app.py	FastAPI server application
src/envs/pokemon_env/server/Dockerfile	Multi-service Docker image configuration
src/envs/pokemon_env/test_enhancements.py	Comprehensive test suite
examples/project-pikachu/test_*.py	Integration and HTTP client tests
examples/project-pikachu/TESTING.md	Testing guide and troubleshooting

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Copilot · 2025-11-07T18:23:30Z

+            if obs.available_switches and random.random() < 0.2 and active_hp < 0.3:
+                switch_idx = random.choice(range(len(obs.available_switches)))
+                action = PokemonAction(action_type="switch", action_index=switch_idx)
+                action_desc = f"Switch to {obs.team[switch_idx + 1].species.title()}"


Array index mismatch: switch_idx is selected from available_switches (line 498), but used to index obs.team[switch_idx + 1] (line 500). The available_switches list contains indices into the team, not raw team indices. This should be obs.team[obs.available_switches[switch_idx]] to correctly map from the chosen switch index to the actual team member.

Suggested change

action_desc = f"Switch to {obs.team[switch_idx + 1].species.title()}"

action_desc = f"Switch to {obs.team[obs.available_switches[switch_idx]].species.title()}"

Copilot · 2025-11-07T18:23:30Z

+        dynamax: Whether to dynamax this turn (if applicable)
+        terastallize: Whether to terastallize this turn (if applicable)
+    """
+    action_type: Literal["move", "switch"] = "move"


The action_type field doesn't include 'forfeit' or 'default' as valid options, but the pokemon_environment.py implementation handles these action types (lines 142, 193). Consider adding these to the Literal type for consistency: Literal[\"move\", \"switch\", \"forfeit\", \"default\"].

Suggested change

action_type: Literal["move", "switch"] = "move"

action_type: Literal["move", "switch", "forfeit", "default"] = "move"

Copilot · 2025-11-07T18:23:31Z

+except ImportError as e:
+    raise ImportError(
+        "poke-env is not installed. "
+        "Please install it with: pip install poke-env"


Corrected command formatting. The installation command should use backticks for code formatting consistency: 'Please install it with: pip install poke-env'

Suggested change

"Please install it with: pip install poke-env"

"Please install it with: `pip install poke-env`"

Copilot · 2025-11-07T18:23:31Z

+            # Otherwise, use a move
+            elif obs.available_moves:
+                move_idx = random.choice(range(len(obs.available_moves)))
+                move_name = obs.available_moves[move_idx]


obs.available_moves is a list of integers (move indices, not move names) based on the PokemonObservation model definition. Using it directly as move_name will result in displaying an integer instead of the actual move name. To get the move name, use: obs.active_pokemon.moves[obs.available_moves[move_idx]]['id'] if available.

Suggested change

move_name = obs.available_moves[move_idx]

move_index = obs.available_moves[move_idx]

move_name = obs.active_pokemon.moves[move_index]['id'] if obs.active_pokemon and obs.active_pokemon.moves and move_index < len(obs.active_pokemon.moves) else str(move_index)

Copilot · 2025-11-07T18:23:31Z

+    mega_evolve: bool = False
+    dynamax: bool = False
+    terastallize: bool = False


Missing documentation for the z_move field. While mega_evolve, dynamax, and terastallize are documented in the docstring, z_move is referenced in pokemon_environment.py but not included in the PokemonAction dataclass or its docstring. Either add the field or remove references to it in the implementation.

Copilot · 2025-11-07T18:23:35Z

+"""
+
+import sys
+import os


Import of 'os' is not used.

Suggested change

import os

Copilot · 2025-11-07T18:23:35Z

+import sys
+import os
+import argparse
+import time


Import of 'time' is not used.

Suggested change

import time

Copilot · 2025-11-07T18:23:36Z

+"""
+
+import sys
+import os


Import of 'os' is not used.

Suggested change

import os

Copilot · 2025-11-07T18:23:36Z

+
+import sys
+import os
+import time


Import of 'time' is not used.

Suggested change

import time

Copilot · 2025-11-07T18:23:36Z

+sys.path.insert(0, str(project_root / "src"))
+
+# Import models and environment
+from envs.pokemon_env.models import PokemonAction, PokemonObservation, PokemonState


Import of 'PokemonObservation' is not used.
Import of 'PokemonState' is not used.

Suggested change

from envs.pokemon_env.models import PokemonAction, PokemonObservation, PokemonState

from envs.pokemon_env.models import PokemonAction

Darktex

Leaving my manual review here and claude review coming next...

Darktex · 2025-11-07T20:09:12Z

+    action_index: int = 0
+    move_id: Optional[str] = None
+    switch_pokemon: Optional[str] = None
+    mega_evolve: bool = False


Shouldn't these be inside Action type then?

Or like, these should be "can_mega_evolve", no?

Darktex · 2025-11-07T20:09:52Z

+    ability: Optional[str]
+    item: Optional[str]
+
+    attack: int


You don't get to see the opponent pokemon's stats ever though

Darktex · 2025-11-07T20:11:18Z

+    active_pokemon: Optional[PokemonData] = None
+    opponent_active_pokemon: Optional[PokemonData] = None
+    team: List[PokemonData] = field(default_factory=list)
+    opponent_team: List[PokemonData] = field(default_factory=list)


You need a placeholder for opponent pokemon that you have not yet seen. eg at the very start of the battle you see that the other guy has 6 pokemon, then they launch the first and you see that he has 5 more but you don't know which ones

Darktex · 2025-11-07T20:11:44Z

+    turn: int = 0
+    forced_switch: bool = False
+
+    can_mega_evolve: bool = False


These are already there in the PokemonData of the active pokemon

Darktex · 2025-11-07T20:14:35Z

+    def _compute_reward(self, battle, done: bool) -> float:
+        """Compute reward based on reward_mode."""
+
+        if self.reward_mode == "sparse":


Let's not do magic numbers here, these should be configurable

Darktex · 2025-11-07T20:14:58Z

+            # Final outcome bonus
+            if done:
+                if battle.won:
+                    reward += 0.5


(You probably want something more like 10 for this)

Darktex · 2025-11-07T20:16:11Z

+            self._last_player_fainted = player_fainted
+
+            # Small reward for opponent HP damage
+            if battle.opponent_active_pokemon and hasattr(battle.opponent_active_pokemon, 'current_hp_fraction'):


You should likely add a reward for statuses. Paralyzed, Toxin, Spikes, Leech Seed etc.

There should be another reward for REMOVING statuses. What made Blissey super strong was that she could learn Aromatherapy which removes status debuffs on your entire team.

You should also add a reward for buffs which are a big deal in this game (we banned using Curse on Snorlax back in the day because he became ridiculously strong). Marowak is one Swords Dance away from 999 Attack which one-shots most pokemon in the game. Buffs are heavily quantized, eg Dragon Dance is +1 Atk +1 Spd while Swords Dance is +2 Atk and Belly Drum is +4 Atk (but you lose 50% hp!). Each +1 is +50%.

Similarly, there should be a reward for cleaning buffs. Skarmory was played for this reason: he could Whirlwind out your pokemon (buffs do not survive a switch unlike statuses)

Darktex · 2025-11-07T20:18:08Z

+            # Small reward for opponent HP damage
+            if battle.opponent_active_pokemon and hasattr(battle.opponent_active_pokemon, 'current_hp_fraction'):
+                current_hp = battle.opponent_active_pokemon.current_hp_fraction
+                if current_hp is not None:


Also HP are often recovered. Moves like Recover, etc. There should be another reward for recovery of HP

Darktex

Summary

This PR adds a comprehensive Pokemon battle environment integration using poke-env and Pokemon Showdown. The implementation demonstrates excellent async synchronization, proper memory management, and strong test coverage. The architecture follows OpenEnv patterns well.

Highlights ✅

Excellent async event loop bridging between FastAPI and poke-env's POKE_LOOP
Memory leak prevention with automatic battle cleanup
Comprehensive error handling with fallback to legal actions
Dual reward modes (sparse/dense) for different RL approaches
Strong test coverage (needs relocation)
Thread-safe with proper locking

Critical Issues 🔴

Two blocking issues must be fixed before merge:

Missing Copyright Headers: All Python files in src/envs/pokemon_env/ need Meta copyright headers
Tests in Wrong Location: Tests should be in tests/envs/pokemon_env/, not examples/

Important Issues 🟡

Test Suite Integration: Tests need proper pytest structure and CI/CD integration
Dockerfile Organization: Multiple Dockerfiles need clarification (see proposal below)
Hard-coded Port: Dockerfile should respect PORT environment variable
Error Messages: Add context to observation metadata for better debugging
Documentation Cleanup: Remove development artifacts from examples (see proposal below)

Minor Issues 🟢

Issues #8-12 are accepted as minor improvements for future consideration.

Proposals

Proposal for Issue #4: Dockerfile Organization

Current situation: Three Dockerfiles exist:

Dockerfile - Combines both Showdown + OpenEnv with supervisor ✅
Dockerfile.showdown - Standalone Showdown
Dockerfile.pokemonenv - Standalone OpenEnv

Recommendation: Keep the main Dockerfile (it's well-designed) and remove Dockerfile.showdown and Dockerfile.pokemonenv because:

The main Dockerfile already handles everything with multi-stage build
Having multiple Dockerfiles creates confusion about which to use
The README should document the single, consolidated approach
Separate containers are rarely needed for this use case (Showdown must run alongside the env)

Alternative (if separate builds are truly needed): Keep all three but:

Rename them clearly: Dockerfile, Dockerfile.showdown-standalone, Dockerfile.env-only
Add clear documentation in README explaining when to use each
Add a decision tree: "Use main Dockerfile unless you need X"

My recommendation: Remove the extra Dockerfiles. The all-in-one approach is simpler and matches the architecture (two services that must communicate).

Proposal for Issue #7: Documentation Cleanup

Current state: examples/project-pikachu/ contains:

test_local_pokemon.py (move to tests) ✓
test_http_pokemon.py (move to tests) ✓
BasicSetupCheckList.md - Development artifact ❌
IMPLEMENTATION_SUMMARY.md - Development artifact ❌
Poke-Env.MD - Library documentation (external) ❌
Readme.MD - Duplicate of main README ❌
TESTING.md - Internal testing guide ❌

Recommendation: Remove entire examples/project-pikachu/ directory and instead:

Create examples/pokemon_env_example.py with a simple, documented example:

#!/usr/bin/env python3
"""
Simple Pokemon Battle Environment Example.

This example shows how to use the Pokemon environment with OpenEnv.

Prerequisites:
    docker run -p 8000:8000 -p 9980:9980 pokemon-env:latest

Usage:
    python examples/pokemon_env_example.py
"""

from envs.pokemon_env import PokemonEnv, PokemonAction
import random

def main():
    # Connect to server
    client = PokemonEnv(base_url="http://localhost:9980")
    
    # Reset environment
    result = client.reset()
    print(f"Battle started! Active Pokemon: {result.observation.active_pokemon.species}")
    
    # Battle loop
    step = 0
    while not result.done and step < 100:
        # Choose random legal action
        if result.observation.available_moves:
            action = PokemonAction(
                action_type="move",
                action_index=random.choice(result.observation.available_moves)
            )
        elif result.observation.available_switches:
            action = PokemonAction(
                action_type="switch",
                action_index=random.choice(result.observation.available_switches)
            )
        else:
            break
        
        result = client.step(action)
        step += 1
        
        if step % 10 == 0:
            print(f"Turn {step}: Reward={result.reward:.2f}")
    
    print(f"\nBattle finished in {step} turns")
    print(f"Final reward: {result.reward}")
    state = client.state()
    print(f"Winner: {state.battle_winner}")

if __name__ == "__main__":
    main()

Merge useful content into main README:
- Architecture details from IMPLEMENTATION_SUMMARY.md
- Any unique troubleshooting from TESTING.md
Benefits:
- Single source of truth (README.md)
- Clear, runnable example for users
- No confusion about what's documentation vs development artifacts
- Cleaner repository structure

Overall: This is a high-quality implementation. Once copyright headers are added and tests are relocated, this will be an excellent addition to OpenEnv! 🎉

Darktex · 2025-11-07T20:25:53Z

@@ -0,0 +1,127 @@
+"""


🔴 CRITICAL: Missing Copyright Header

All Python files in the OpenEnv repository must include the Meta copyright header. This file and all other Python files in src/envs/pokemon_env/ are missing this required header.

Required header:

# Copyright (c) Meta Platforms, Inc. and affiliates. # All rights reserved. # # This source code is licensed under the BSD-style license found in the # LICENSE file in the root directory of this source tree.

Files needing headers:

models.py

client.py

__init__.py

server/app.py

server/pokemon_environment.py

server/__init__.py

Please add this header to the top of each file before the module docstring.

Darktex · 2025-11-07T20:25:54Z

@@ -0,0 +1,310 @@
+#!/usr/bin/env python3


🔴 CRITICAL: Tests in Wrong Location

Tests should be located in tests/envs/pokemon_env/, not in examples/. The examples directory is for user-facing demonstration code, not test suites.

Required changes:

Create tests/envs/pokemon_env/ directory

Move test_local_pokemon.py → tests/envs/pokemon_env/test_environment.py

Move test_http_pokemon.py → tests/envs/pokemon_env/test_client.py

Remove manual sys.path manipulation (tests should use proper imports)

Follow pytest conventions and integrate with existing test infrastructure

Reference: See tests/envs/atari_env/ or tests/envs/echo_env/ for proper structure.

Darktex · 2025-11-07T20:25:54Z

+
+            # Validate battle state
+            if self._current_battle.finished:
+                logger.warning("Step called on finished battle, returning final state")


🟡 IMPORTANT: Incomplete Error Messages

When step() is called on a finished battle, the warning is logged but not exposed to the user in the observation.

Improvement:

logger.warning("Step called on finished battle, returning final state") obs = self._battle_to_observation(self._current_battle, reward=None, done=True) obs.metadata["warning"] = "Step called on already finished battle" return obs

This helps users debug issues when they accidentally call step after the battle ends.

Darktex · 2025-11-07T20:25:54Z

+\n\
+[program:openenv]\n\
+command=uvicorn envs.pokemon_env.server.app:app --host 0.0.0.0 --port 9980\n\
+directory=/app\n\


🟡 IMPORTANT: Hard-coded Port

The supervisor config hard-codes port 9980. This should respect the PORT environment variable for deployment flexibility.

Fix:

[program:openenv] command=bash -c "uvicorn envs.pokemon_env.server.app:app --host 0.0.0.0 --port ${PORT:-9980}" directory=/app environment=PYTHONPATH="/app/src",PORT="${PORT:-9980}"

This allows deployments to override the port via environment variable while keeping 9980 as the default.

Darktex · 2025-11-07T20:25:54Z

+        self._episodes_completed = 0
+        self._cleanup_interval = 10  # Clean up every 10 episodes
+
+    def _pokemon_to_data(self, pokemon) -> Optional[PokemonData]:


🟢 MINOR: Type Annotation

The pokemon parameter could have a type hint for better documentation:

from typing import Any def _pokemon_to_data(self, pokemon: Any) -> Optional[PokemonData]:

Since this comes from the external poke-env library, Any is appropriate here.

Darktex · 2025-11-07T20:25:54Z

+
+        # Battle history cleanup interval
+        self._episodes_completed = 0
+        self._cleanup_interval = 10  # Clean up every 10 episodes


🟢 MINOR: Magic Number

The cleanup interval is hard-coded. Consider making it configurable:

def __init__( self, battle_format: str = "gen8randombattle", player_username: Optional[str] = None, opponent: Optional[Player] = None, reward_mode: str = "sparse", max_turns: int = 1000, cleanup_interval: int = 10, # Add this parameter ): ... self._cleanup_interval = cleanup_interval

This allows fine-tuning in production environments if needed.

Darktex · 2025-11-07T20:25:54Z

+
+```bash
+# Build both images (run from project root directory)
+docker build -t pokemon-showdown:latest -f src/envs/pokemon_env/server/Dockerfile.showdown .


🟡 IMPORTANT: Dockerfile Documentation

The README mentions building Dockerfile.showdown and Dockerfile.pokemonenv separately, but the main Dockerfile appears to handle everything with supervisor.

Please clarify:

Which Dockerfiles are actually needed?

When to use each one?

See the proposal in the main review for a recommendation to simplify this.

Darktex

Note: This is an automated review by Claude Code (alignment-reviewer agent), not a human review. The account posting this is shared with the human maintainer.

Automated review by Claude Code | Learn more about OpenEnv's agentic workflow

Darktex

Note: This is an automated review by Claude Code (alignment-reviewer agent), not a human review. The account posting this is shared with the human maintainer.

Alignment Review Report

Overview

Reviewed PR #131: "Add Env for playing Pokémon using poke-env"

This PR adds a complete Pokemon battle environment integration using poke-env and Pokemon Showdown. The implementation follows OpenEnv patterns but has several critical issues that need addressing.

Tier 1: Fixes Required

Critical Issues (Build Blockers)

src/envs/pokemon_env/client.py:13 - Import error: from core.http_env_client import HTTPEnvClient should be from openenv.core.env_client import EnvClient. The HTTPEnvClient class does not exist in the codebase.
src/envs/pokemon_env/client.py:11 - Import error: from core.client_types import StepResult should be from openenv.core.client_types import StepResult
src/envs/pokemon_env/models.py:13 - Import error: from core.env_server import Action, Observation, State should be from openenv.core.env_server.types import Action, Observation, State
src/envs/pokemon_env/server/app.py:26 - Import error: from core.env_server import create_app should be from openenv.core.env_server import create_app
src/envs/pokemon_env/server/pokemon_environment.py:22 - Import error: from core.env_server import Action, Environment, Observation should be from openenv.core.env_server.interfaces import Environment and from openenv.core.env_server.types import Action, Observation
src/envs/pokemon_env/client.py:21 - Class inheritance error: Should inherit from EnvClient[PokemonAction, PokemonObservation, State] not HTTPEnvClient. The current codebase uses WebSocket-based EnvClient, not HTTP.

Code Organization Issues

src/envs/pokemon_env/test_enhancements.py - Test file should not be in src/envs/pokemon_env/. Move to tests/envs/ or examples/project-pikachu/
src/envs/pokemon_env/test_pokemon_docker.sh - Test script should not be in src/envs/pokemon_env/. Move to examples/project-pikachu/
src/envs/pokemon_env/test_notebook/ - Notebook should not be in src/envs/pokemon_env/. Move to examples/project-pikachu/

Missing Type Information

src/envs/pokemon_env/client.py - Missing State type parameter. Should define a proper PokemonState class in models.py and use it in the generic type (currently using bare State)

Documentation

src/envs/pokemon_env/models.py:169 - Contains password: Optional[str] = None in PokemonConfig. While this is for poke-env authentication (not exposed), add a comment clarifying this is for Pokemon Showdown server auth, not production secrets.

Tier 2: Alignment Discussion Points

ALIGNMENT FLAG: API Protocol Mismatch

Principle at stake: "Minimize lifecycle deltas" (PRINCIPLES.md) and "WebSocket for all environment communication" (INVARIANTS.md note)
The concern: This PR uses HTTPEnvClient (HTTP-based) when the codebase has moved to WebSocket-based EnvClient. According to INVARIANTS.md: "We are in the process of deprecating HTTP (see PR #252) in favor of WebSocket-only, but we are still transitioning and both protocols are currently available." However, HTTPEnvClient doesn't exist in the codebase - suggesting this PR may be based on old code or the transition is complete.
Impact: New environments should follow current patterns. Using deprecated/non-existent APIs creates technical debt.
Suggested reviewer: @Darktex

ALIGNMENT FLAG: Test Files in Source Tree

Principle at stake: "Be hands-on" (PRINCIPLES.md) and code organization best practices
The concern: Test files (test_enhancements.py, test_pokemon_docker.sh, test_notebook/) are placed in src/envs/pokemon_env/ instead of appropriate test directories. This violates typical Python packaging conventions where src/ contains only distributable code.
Impact: These files would be included in package distributions, increasing package size and exposing test-only code.
Suggested approach: Move to examples/project-pikachu/ (for integration examples) or tests/envs/ (for unit tests)
Suggested reviewer: @Darktex

ALIGNMENT FLAG: Client-Server Separation Compliance

Principle at stake: "Client-server separation" (INVARIANTS.md)
The concern: Need to verify that client.py never imports from server/ directory. Initial review shows proper separation (client.py only imports from models.py and openenv.core), but should be explicitly validated.
Status: ✅ Appears compliant - client only imports from models.py (shared) and openenv.core
Suggested reviewer: @Darktex (final confirmation)

ALIGNMENT FLAG: Reward Computation Location

Principle at stake: "Rewards in environment" (INVARIANTS.md)
The concern: Reward computation is correctly placed inside PokemonEnvironment._compute_reward() on the server side, supporting both "sparse" and "dense" modes. This is aligned with the principle.
Status: ✅ Compliant - rewards are computed server-side
No action needed - documenting for completeness

ALIGNMENT FLAG: Agent Cannot Reset

Principle at stake: "Agents cannot reset" (INVARIANTS.md)
The concern: The environment correctly does NOT expose reset/step/state via MCP tools. The WebSocket/HTTP interface is for orchestration only. No MCP server is defined in this PR.
Status: ✅ Compliant - simulation controls not exposed to agents
Future consideration: When MCP tools are added for Pokemon battles (e.g., "choose_move"), ensure they don't expose reset()
Suggested reviewer: @Darktex (for future MCP integration)

Automated Checks

Debug Code Check

Status: ⚠️ FOUND (but not in PR files)
Details: Existing codebase has debug print statements in unrelated files. No debug code found in the Pokemon environment PR files themselves.

Lint Check

Status: ❌ UNABLE TO RUN
Details: uv not installed in environment. Recommend running bash .claude/hooks/lint.sh locally before merge.

Summary

Critical Path to Merge

Fix all import paths (change core.* to openenv.core.*)
Replace HTTPEnvClient with EnvClient and ensure WebSocket compatibility
Move test files out of src/ directory
Run lint check with bash .claude/hooks/lint.sh
Test the environment to ensure WebSocket client works

Counts

7 critical import/type errors (build blockers)
3 code organization issues (test files in wrong location)
3 alignment discussion points (API protocol, test organization, architectural compliance)
3 alignment confirmations (client-server separation ✅, rewards ✅, agent isolation ✅)

Recommendation

REQUEST CHANGES - The import errors are critical and will cause immediate runtime failures. Once these are fixed and tests are reorganized, the architectural design appears sound and aligned with OpenEnv principles.

The core implementation (event loop management, memory cleanup, battle state serialization) is well-designed and shows good understanding of the poke-env async patterns.

Automated review by Claude Code | Learn more about OpenEnv's agentic workflow

Dismissing automated approval due to bug in review bot. The original review either had blank content or approved despite finding blocking issues. Please disregard this approval.

zkwentz · 2026-01-21T14:47:07Z

@greptile

greptile-apps · 2026-01-21T14:55:03Z

Greptile Summary

Adds comprehensive Pokemon battle environment using poke-env and Pokemon Showdown, implementing proper async/threading synchronization between FastAPI and poke-env's POKE_LOOP with memory leak prevention through periodic battle cleanup.

Key Changes:

Implements PokemonEnvironment with single-lock concurrency control for reset() and step()
Bridges two event loops: FastAPI's main loop and poke-env's POKE_LOOP using asyncio.run_coroutine_threadsafe()
Provides HTTP client (PokemonEnv) following OpenEnv patterns with proper serialization
Supports sparse and dense reward modes with configurable battle formats
Includes comprehensive test suites for both local and HTTP usage
Provides three Docker deployment options: monolithic, separated containers, and development setup

Critical Issues Found:

Infinite recursion bug in _compute_reward() when reward_mode is invalid (line 460)
Missing RandomPlayer import in pokemon_environment.py (line 30)
Incomplete action_type literal in models.py missing "default" and "forfeit" values
Port inconsistency: entrypoint.sh and supervisord.conf use port 9000 while all other configs use 9980

Architecture Strengths:

Proper event loop isolation prevents deadlocks
Single lock design eliminates race conditions between reset/step
Automatic cleanup every 10 episodes prevents memory growth
Comprehensive error handling with illegal move fallback to random actions

Confidence Score: 2/5

Not safe to merge - contains critical bugs including infinite recursion and port mismatches that will cause runtime failures
Score reflects presence of critical bugs: infinite recursion in reward computation, missing import, incomplete type definitions, and deployment-breaking port inconsistencies. While the architecture and async design are solid, these issues must be fixed before deployment
Pay close attention to src/envs/pokemon_env/server/pokemon_environment.py (infinite recursion + missing import), src/envs/pokemon_env/models.py (incomplete types), src/envs/pokemon_env/server/entrypoint.sh and src/envs/pokemon_env/server/supervisord.conf (port mismatches)

Important Files Changed

Filename	Overview
src/envs/pokemon_env/server/pokemon_environment.py	Core environment logic with infinite recursion bug and missing import; otherwise solid async/threading design
src/envs/pokemon_env/models.py	Data models with incomplete action_type literal definition missing "default" and "forfeit" values
src/envs/pokemon_env/server/entrypoint.sh	Port mismatch - uses 9000 instead of 9980, will cause deployment failures
src/envs/pokemon_env/server/supervisord.conf	Port mismatch - uses 9000 instead of 9980, inconsistent with other configuration
src/envs/pokemon_env/client.py	HTTP client implementation is clean with proper parsing and type handling
src/envs/pokemon_env/server/app.py	FastAPI app setup is straightforward and follows OpenEnv patterns correctly

Sequence Diagram

sequenceDiagram
    participant Client as HTTP Client
    participant FastAPI as FastAPI Server<br/>(Main Loop)
    participant PokemonEnv as PokemonEnvironment<br/>(with Lock)
    participant Player as OpenEnvPokemonPlayer
    participant PokeLoop as POKE_LOOP<br/>(Background Thread)
    participant Showdown as Pokemon Showdown<br/>(WebSocket)

    Note over Client,Showdown: Reset Flow
    Client->>FastAPI: POST /reset
    FastAPI->>PokemonEnv: reset()
    PokemonEnv->>PokemonEnv: Acquire _env_lock
    PokemonEnv->>PokemonEnv: Cancel previous battle task
    PokemonEnv->>PokemonEnv: Cleanup old battles (every 10 episodes)
    PokemonEnv->>PokeLoop: asyncio.run_coroutine_threadsafe(start_battle())
    PokeLoop->>Player: battle_against(opponent)
    Player->>Showdown: WebSocket connect & battle init
    Showdown-->>Player: Battle started
    Player-->>PokeLoop: Battle task created
    PokeLoop-->>PokemonEnv: Battle reference
    PokemonEnv->>PokemonEnv: Release _env_lock
    PokemonEnv-->>FastAPI: Initial PokemonObservation
    FastAPI-->>Client: 200 OK with observation

    Note over Client,Showdown: Step Flow
    Client->>FastAPI: POST /step with PokemonAction
    FastAPI->>PokemonEnv: step(action)
    PokemonEnv->>PokemonEnv: Acquire _env_lock
    PokemonEnv->>PokemonEnv: Validate action type
    PokemonEnv->>Player: set_next_action(action)
    Player->>PokeLoop: Schedule _set_action() on POKE_LOOP
    PokeLoop->>Player: Set _next_action, signal _action_event
    PokemonEnv->>PokeLoop: asyncio.run_coroutine_threadsafe(wait_turn())
    Note over Player,Showdown: choose_move() called by poke-env
    Player->>Player: wait_for(_action_event)
    Player->>Player: _action_to_order(action)
    Player->>Showdown: Send move via WebSocket
    Showdown-->>Player: Battle state update
    Player->>Player: Signal _turn_complete_event
    PokeLoop-->>PokemonEnv: Turn complete
    PokemonEnv->>PokemonEnv: _battle_to_observation()
    PokemonEnv->>PokemonEnv: _compute_reward()
    PokemonEnv->>PokemonEnv: Release _env_lock
    PokemonEnv-->>FastAPI: Updated PokemonObservation
    FastAPI-->>Client: 200 OK with observation

greptile-apps

_{25 files reviewed, 5 comments}

_{Edit Code Review Agent Settings | Greptile}

greptile-apps · 2026-01-21T14:55:00Z

+
+        else:
+            # Unknown mode, use sparse
+            return self._compute_reward(battle, done) if done else 0.0


logic: recursive call to _compute_reward will cause infinite recursion if reward_mode is not "sparse" or "dense"

Suggested change

return self._compute_reward(battle, done) if done else 0.0

return 0.0 # Unknown mode, default to sparse behavior

Prompt To Fix With AI

This is a comment left during a code review. Path: src/envs/pokemon_env/server/pokemon_environment.py Line: 460:460 Comment: **logic:** recursive call to `_compute_reward` will cause infinite recursion if `reward_mode` is not `"sparse"` or `"dense"` ```suggestion return 0.0 # Unknown mode, default to sparse behavior ``` How can I resolve this? If you propose a fix, please make it concise.

greptile-apps · 2026-01-21T14:55:01Z

+echo "Starting Pokemon OpenEnv server on port 9000..."
+cd /app
+export PYTHONPATH=/app/src
+exec uvicorn envs.pokemon_env.server.app:app --host 0.0.0.0 --port 9000


logic: port mismatch - entrypoint.sh uses port 9000 but Dockerfile, supervisord.conf, and other files use port 9980

Suggested change

echo "Starting Pokemon OpenEnv server on port 9000..."

cd /app

export PYTHONPATH=/app/src

exec uvicorn envs.pokemon_env.server.app:app --host 0.0.0.0 --port 9000

echo "Starting Pokemon OpenEnv server on port 9980..."

cd /app

export PYTHONPATH=/app/src

exec uvicorn envs.pokemon_env.server.app:app --host 0.0.0.0 --port 9980

Prompt To Fix With AI

This is a comment left during a code review. Path: src/envs/pokemon_env/server/entrypoint.sh Line: 29:32 Comment: **logic:** port mismatch - entrypoint.sh uses port 9000 but Dockerfile, supervisord.conf, and other files use port 9980 ```suggestion echo "Starting Pokemon OpenEnv server on port 9980..." cd /app export PYTHONPATH=/app/src exec uvicorn envs.pokemon_env.server.app:app --host 0.0.0.0 --port 9980 ``` How can I resolve this? If you propose a fix, please make it concise.

greptile-apps · 2026-01-21T14:55:01Z

+priority=10
+
+[program:openenv]
+command=uvicorn envs.pokemon_env.server.app:app --host 0.0.0.0 --port 9000


logic: port mismatch - supervisord.conf uses port 9000 but Dockerfile.pokemonenv, Dockerfile, and README.md use port 9980

Suggested change

command=uvicorn envs.pokemon_env.server.app:app --host 0.0.0.0 --port 9000

command=uvicorn envs.pokemon_env.server.app:app --host 0.0.0.0 --port 9980

Prompt To Fix With AI

This is a comment left during a code review. Path: src/envs/pokemon_env/server/supervisord.conf Line: 19:19 Comment: **logic:** port mismatch - supervisord.conf uses port 9000 but Dockerfile.pokemonenv, Dockerfile, and README.md use port 9980 ```suggestion command=uvicorn envs.pokemon_env.server.app:app --host 0.0.0.0 --port 9980 ``` How can I resolve this? If you propose a fix, please make it concise.

greptile-apps · 2026-01-21T14:55:02Z

+try:
+    # Import from top-level poke_env module
+    from poke_env import Player, RandomPlayer, AccountConfiguration, ServerConfiguration
+    # Import battle orders from player submodule


syntax: missing import for RandomPlayer

Suggested change

# Import battle orders from player submodule

from poke_env import Player, AccountConfiguration, ServerConfiguration

from poke_env.player import RandomPlayer

Prompt To Fix With AI

This is a comment left during a code review. Path: src/envs/pokemon_env/server/pokemon_environment.py Line: 30:30 Comment: **syntax:** missing import for `RandomPlayer` ```suggestion from poke_env import Player, AccountConfiguration, ServerConfiguration from poke_env.player import RandomPlayer ``` How can I resolve this? If you propose a fix, please make it concise.

greptile-apps · 2026-01-21T14:55:03Z

+        dynamax: Whether to dynamax this turn (if applicable)
+        terastallize: Whether to terastallize this turn (if applicable)
+    """
+    action_type: Literal["move", "switch"] = "move"


syntax: missing "default" and "forfeit" action types from literal - these are used in pokemon_environment.py lines 142, 193, and 197

Suggested change

action_type: Literal["move", "switch"] = "move"

action_type: Literal["move", "switch", "default", "forfeit"] = "move"

Prompt To Fix With AI

This is a comment left during a code review. Path: src/envs/pokemon_env/models.py Line: 30:30 Comment: **syntax:** missing `"default"` and `"forfeit"` action types from literal - these are used in `pokemon_environment.py` lines 142, 193, and 197 ```suggestion action_type: Literal["move", "switch", "default", "forfeit"] = "move" ``` How can I resolve this? If you propose a fix, please make it concise.

burtenshaw · 2026-02-05T13:56:37Z

@init27 It looks like this PR is no longer compatible with the code base. I'd suggest closing and opening a new PR with the env in envs/, just for simplicity.

burtenshaw · 2026-02-06T19:43:17Z

@init27 @Darktex as discussed, I'll close because they are out of sync with the current project. Hope to see them reopened.

init27 and others added 23 commits November 1, 2025 12:15

Create Readme.MD

6b64f35

Update Readme.MD

b962714

Add Poke-Env documentation for environment setup and usage

469a568

Implement a sample Pokemon Battle Environment with server and client …

f1685c1

…integration

Merge pull request #124 from cpich3g/project-pikachu

62bede1

Project pikachu - First Draft

cleanup

a151874

improve logging

1185aaa

Update Dockerfile

0abf5bb

push testing

c726d75

Add Surfiniaburger to contributors list

535e907

add logs and fix tests

9ec5b12

fix paths

8626797

add edge handling

abeb605

reward

359ef1b

move files

eef1675

Update pokemon_environment.py

31e6509

Update pokemon_environment.py

ab7d91e

Create README.md

b165e65

Update test_local_pokemon.py

376b54a

Update test_http_pokemon.py

f7abe37

Merge pull request #127 from surfiniaburger/patch-1

f270e9f

Add Surfiniaburger to contributors list

fix paths

6e9dff7

Merge branch 'project-pikachu' of https://github.com/meta-pytorch/Ope…

705d203

…nEnv into project-pikachu

meta-cla Bot added the CLA Signed This label is managed by the Meta Open Source bot. label Nov 2, 2025

init27 requested a review from zkwentz November 2, 2025 03:57

init27 requested a review from Darktex November 2, 2025 03:57

Add observations section to Readme

c3ba48c

Added observations on popular choices and practices in Pokemon-RL projects.

Merge pull request #134 from Sai-Ishaan/patch-2

7f181cb

Add observations section to Readme

Sai-Ishaan added 5 commits November 4, 2025 13:32

Rename examples/BasicSetupCheckList.md to examples/project-pikachu/Ba…

7883816

…sicSetupCheckList.md

Revise Basic Pokémon RL Training Setup Checklist

6c296dc

Updated the checklist for setting up the Pokémon RL training environment, including system requirements, hardware recommendations, setup instructions, and known issues.

Fix formatting in BasicSetupCheckList.md

c91411f

Clean up BasicSetupCheckList.md

bec696b

Removed unnecessary comments and cleaned up formatting.

Update BasicSetupCheckList with forking instructions

215257e

Added instructions for forking the Pokemon-Showdown repository and noted compatibility issues with random battle generation.

jspisak added the New Environment label Nov 4, 2025

yogesh1801 and others added 4 commits November 5, 2025 14:33

fixed gen8 random battles

bc8e0c5

Split Dockerfile into 2 files

bfd79b2

Merge pull request #153 from dcube0710/project-pikachu-simran

1f47a4e

Organised Dockerfile

Merge pull request #150 from yogesh1801/project-pikachu-yogesh

37ae400

fix: ensure Gen 8 Random Battles run correctly instead of Gen 9

Merge pull request #146 from Sai-Ishaan/project-pikachu

73bae33

Created a Basic Setup and Installation.md file

yogesh1801 and others added 2 commits November 6, 2025 03:35

dockerfiles separated

1d0aa99

Merge pull request #154 from yogesh1801/project-pikachu

55b18ff

dockerfiles separated

Darktex requested a review from Copilot November 7, 2025 18:19

Copilot AI reviewed Nov 7, 2025

View reviewed changes

Darktex requested changes Nov 7, 2025

View reviewed changes

Darktex previously approved these changes Jan 13, 2026

View reviewed changes

Darktex reviewed Jan 13, 2026

View reviewed changes

greptile-apps Bot reviewed Jan 21, 2026

View reviewed changes

burtenshaw closed this Feb 6, 2026

	action_desc = f"Switch to {obs.team[switch_idx + 1].species.title()}"
	action_desc = f"Switch to {obs.team[obs.available_switches[switch_idx]].species.title()}"

	action_type: Literal["move", "switch"] = "move"
	action_type: Literal["move", "switch", "forfeit", "default"] = "move"

	"Please install it with: pip install poke-env"
	"Please install it with: `pip install poke-env`"

	move_name = obs.available_moves[move_idx]
	move_index = obs.available_moves[move_idx]
	move_name = obs.active_pokemon.moves[move_index]['id'] if obs.active_pokemon and obs.active_pokemon.moves and move_index < len(obs.active_pokemon.moves) else str(move_index)

	from envs.pokemon_env.models import PokemonAction, PokemonObservation, PokemonState
	from envs.pokemon_env.models import PokemonAction

	return self._compute_reward(battle, done) if done else 0.0
	return 0.0 # Unknown mode, default to sparse behavior

	command=uvicorn envs.pokemon_env.server.app:app --host 0.0.0.0 --port 9000
	command=uvicorn envs.pokemon_env.server.app:app --host 0.0.0.0 --port 9980

	# Import battle orders from player submodule
	from poke_env import Player, AccountConfiguration, ServerConfiguration
	from poke_env.player import RandomPlayer

Uh oh!

Conversation

init27 commented Nov 2, 2025

Design Decisions

File Structure

Uh oh!

github-actions Bot commented Nov 2, 2025

Uh oh!

github-actions Bot commented Nov 2, 2025

Uh oh!

github-actions Bot commented Nov 5, 2025

Uh oh!

github-actions Bot commented Nov 5, 2025

Uh oh!

github-actions Bot commented Nov 5, 2025

Uh oh!

github-actions Bot commented Nov 5, 2025

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull Request Overview

Reviewed Changes

Uh oh!

Copilot AI Nov 7, 2025

Choose a reason for hiding this comment

Uh oh!

Copilot AI Nov 7, 2025

Choose a reason for hiding this comment

Uh oh!

Copilot AI Nov 7, 2025

Choose a reason for hiding this comment

Uh oh!

Copilot AI Nov 7, 2025

Choose a reason for hiding this comment

Uh oh!

Copilot AI Nov 7, 2025

Choose a reason for hiding this comment

Uh oh!

Copilot AI Nov 7, 2025

Choose a reason for hiding this comment

Uh oh!

Copilot AI Nov 7, 2025

Choose a reason for hiding this comment

Uh oh!

Copilot AI Nov 7, 2025

Choose a reason for hiding this comment

Uh oh!

Copilot AI Nov 7, 2025

Choose a reason for hiding this comment

Uh oh!

Copilot AI Nov 7, 2025

Choose a reason for hiding this comment

Uh oh!

Darktex left a comment

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Darktex left a comment

Choose a reason for hiding this comment