Merged
18 changes: 9 additions & 9 deletions README.md
@@ -1,6 +1,6 @@
# BaseAgent - SDK 3.0

-High-performance autonomous agent for [Term Challenge](https://term.challenge). **Does NOT use term_sdk** - fully autonomous with litellm.
+High-performance autonomous agent for [Term Challenge](https://term.challenge). **Does NOT use term_sdk** - fully autonomous with Chutes API.

## Installation

@@ -36,7 +36,7 @@ my-agent/
│   │   ├── loop.py           # Main loop
│   │   └── compaction.py     # Context management (MANDATORY)
│   ├── llm/
-│   │   └── client.py         # LLM client (litellm)
+│   │   └── client.py         # LLM client (Chutes API)
│   └── tools/
│       └── ...               # Available tools
├── requirements.txt          # Dependencies
@@ -77,13 +77,13 @@ AUTO_COMPACT_THRESHOLD = 0.85

## Features

-### LLM Client (litellm)
+### LLM Client (Chutes API)

```python
-from src.llm.client import LiteLLMClient
+from src.llm.client import LLMClient

-llm = LiteLLMClient(
-    model="openrouter/anthropic/claude-opus-4.5",
+llm = LLMClient(
+    model="moonshotai/Kimi-K2.5-TEE",
    temperature=0.0,
    max_tokens=16384,
)
@@ -129,7 +129,7 @@ See `src/config/defaults.py`:

```python
CONFIG = {
-    "model": "openrouter/anthropic/claude-opus-4.5",
+    "model": "moonshotai/Kimi-K2.5-TEE",
    "max_tokens": 16384,
    "max_iterations": 200,
    "auto_compact_threshold": 0.85,
@@ -142,7 +142,7 @@

| Variable | Description |
|----------|-------------|
-| `OPENROUTER_API_KEY` | OpenRouter API key |
+| `CHUTES_API_KEY` | Chutes API key |
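As a quick sketch of wiring this up (the key value is a placeholder; `--instruction` is the flag defined in `agent.py`):

```shell
# Make the Chutes API key visible to the agent process (placeholder value)
export CHUTES_API_KEY="sk-your-key-here"

# Then launch the agent, e.g.:
# python agent.py --instruction "Fix the failing tests"
```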

## Documentation

@@ -151,7 +151,7 @@ CONFIG = {
See [rules/](rules/) for comprehensive guides:

- [Architecture Patterns](rules/02-architecture-patterns.md) - **Mandatory project structure**
-- [LLM Usage Guide](rules/06-llm-usage-guide.md) - **Using litellm**
+- [LLM Usage Guide](rules/06-llm-usage-guide.md) - **Using Chutes API**
- [Best Practices](rules/05-best-practices.md)
- [Error Handling](rules/08-error-handling.md)

67 changes: 37 additions & 30 deletions agent.py
@@ -3,7 +3,7 @@
SuperAgent for Term Challenge - Entry Point (SDK 3.0 Compatible).

This agent accepts --instruction from the validator and runs autonomously.
-Uses litellm for LLM calls instead of term_sdk.
+Uses Chutes API for LLM calls instead of term_sdk.

Installation:
pip install . # via pyproject.toml
@@ -16,56 +16,61 @@
from __future__ import annotations

import argparse
-import sys
-import time
+import os
+import subprocess
+import sys
+import time
from pathlib import Path

# Add parent to path for imports
sys.path.insert(0, str(Path(__file__).parent))


# Auto-install dependencies if missing
def ensure_dependencies():
    """Install dependencies if not present."""
    try:
-        import litellm
+        import httpx
+        import pydantic
    except ImportError:
        print("[setup] Installing dependencies...", file=sys.stderr)
        agent_dir = Path(__file__).parent
        req_file = agent_dir / "requirements.txt"
        if req_file.exists():
-            subprocess.run([sys.executable, "-m", "pip", "install", "-r", str(req_file), "-q"], check=True)
+            subprocess.run(
+                [sys.executable, "-m", "pip", "install", "-r", str(req_file), "-q"], check=True
+            )
        else:
-            subprocess.run([sys.executable, "-m", "pip", "install", str(agent_dir), "-q"], check=True)
+            subprocess.run(
+                [sys.executable, "-m", "pip", "install", str(agent_dir), "-q"], check=True
+            )
        print("[setup] Dependencies installed", file=sys.stderr)


ensure_dependencies()

from src.config.defaults import CONFIG
from src.core.loop import run_agent_loop
+from src.llm.client import CostLimitExceeded, LLMClient
+from src.output.jsonl import ErrorEvent, emit
from src.tools.registry import ToolRegistry
-from src.output.jsonl import emit, ErrorEvent
-from src.llm.client import LiteLLMClient, CostLimitExceeded


class AgentContext:
    """Minimal context for agent execution (replaces term_sdk.AgentContext)."""

    def __init__(self, instruction: str, cwd: str = None):
        self.instruction = instruction
        self.cwd = cwd or os.getcwd()
        self.step = 0
        self.is_done = False
        self.history = []
        self._start_time = time.time()

    @property
    def elapsed_secs(self) -> float:
        return time.time() - self._start_time

    def shell(self, cmd: str, timeout: int = 120) -> "ShellResult":
        """Execute a shell command."""
        self.step += 1
Expand All @@ -86,20 +91,22 @@ def shell(self, cmd: str, timeout: int = 120) -> "ShellResult":
        except Exception as e:
            output = f"[ERROR] {e}"
            exit_code = -1

        shell_result = ShellResult(output=output, exit_code=exit_code)
-        self.history.append({
-            "step": self.step,
-            "command": cmd,
-            "output": output[:1000],
-            "exit_code": exit_code,
-        })
+        self.history.append(
+            {
+                "step": self.step,
+                "command": cmd,
+                "output": output[:1000],
+                "exit_code": exit_code,
+            }
+        )
        return shell_result

    def done(self):
        """Mark task as complete."""
        self.is_done = True

    def log(self, msg: str):
        """Log a message."""
        timestamp = time.strftime("%H:%M:%S")
@@ -108,13 +115,13 @@ def log(self, msg: str):

class ShellResult:
    """Result from shell command."""

    def __init__(self, output: str, exit_code: int):
        self.output = output
        self.stdout = output
        self.stderr = ""
        self.exit_code = exit_code

    def has(self, text: str) -> bool:
        return text in self.output

@@ -129,29 +136,29 @@ def main():
    parser = argparse.ArgumentParser(description="SuperAgent for Term Challenge SDK 3.0")
    parser.add_argument("--instruction", required=True, help="Task instruction from validator")
    args = parser.parse_args()

    _log("=" * 60)
-    _log("SuperAgent Starting (SDK 3.0 - litellm)")
+    _log("SuperAgent Starting (SDK 3.0 - Chutes API)")
    _log("=" * 60)
    _log(f"Model: {CONFIG['model']}")
    _log(f"Reasoning effort: {CONFIG.get('reasoning_effort', 'default')}")
    _log(f"Instruction: {args.instruction[:200]}...")
    _log("-" * 60)

    # Initialize components
    start_time = time.time()
-    llm = LiteLLMClient(
+
+    llm = LLMClient(
        model=CONFIG["model"],
        temperature=CONFIG.get("temperature"),
        max_tokens=CONFIG.get("max_tokens", 16384),
    )

    tools = ToolRegistry()
    ctx = AgentContext(instruction=args.instruction)

    _log("Components initialized")

    try:
        run_agent_loop(
            llm=llm,
15 changes: 7 additions & 8 deletions astuces/08-cost-optimization.md
@@ -2,21 +2,20 @@

## Cost Breakdown

-For Claude Sonnet via OpenRouter:
+Typical LLM pricing (varies by model):

-| Token Type | Cost per 1M |
-|------------|-------------|
-| Input tokens | $3.00 |
-| Cached input | $0.30 (90% off) |
-| Output tokens | $15.00 |
+| Token Type | Typical Cost per 1M |
+|------------|---------------------|
+| Input tokens | $1.00 - $15.00 |
+| Cached input | 10-50% of input |
+| Output tokens | $2.00 - $60.00 |

For a typical task:
- 50 turns
- 100k context average
- 500 output tokens per turn

-**Without optimization**: 50 × 100k × $3/1M = **$15 per task**
-**With 90% caching**: 50 × 100k × $0.30/1M = **$1.50 per task**
+Costs vary significantly by model choice. Kimi K2.5-TEE offers a good balance of performance and cost.
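To make the arithmetic concrete, per-task cost can be sketched as a small estimator. This is an illustration only: the $3.00/$15.00 per-1M rates and the $0.30 cached rate used below are hypothetical values drawn from the ranges in the table, not any specific provider's pricing.

```python
def estimate_task_cost(turns, avg_context_tokens, output_tokens_per_turn,
                       input_rate_per_m, output_rate_per_m,
                       cached_fraction=0.0, cached_rate_per_m=0.0):
    """Rough per-task cost estimate; all rates are USD per 1M tokens."""
    uncached = avg_context_tokens * (1 - cached_fraction)
    cached = avg_context_tokens * cached_fraction
    input_cost = turns * (uncached * input_rate_per_m + cached * cached_rate_per_m) / 1e6
    output_cost = turns * output_tokens_per_turn * output_rate_per_m / 1e6
    return input_cost + output_cost

# The "typical task" above: 50 turns, 100k average context, 500 output tokens/turn
print(estimate_task_cost(50, 100_000, 500, 3.00, 15.00))
print(estimate_task_cost(50, 100_000, 500, 3.00, 15.00,
                         cached_fraction=0.9, cached_rate_per_m=0.30))
```

Under these illustrative rates, heavy prompt caching cuts the per-task cost by several times, which is why the optimization strategies below focus on cache hit rate.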

## Optimization Strategies

125 changes: 125 additions & 0 deletions docs/README.md
@@ -0,0 +1,125 @@
# BaseAgent Documentation

> **Professional documentation for the BaseAgent autonomous coding assistant**

BaseAgent is a high-performance autonomous agent designed for the [Term Challenge](https://term.challenge). It leverages LLM-driven decision making with advanced context management and cost optimization techniques.

---

## Table of Contents

### Getting Started
- [Overview](./overview.md) - What is BaseAgent and core design principles
- [Installation](./installation.md) - Prerequisites and setup instructions
- [Quick Start](./quickstart.md) - Your first task in 5 minutes

### Core Concepts
- [Architecture](./architecture.md) - Technical architecture and system design
- [Configuration](./configuration.md) - All configuration options explained
- [Usage Guide](./usage.md) - Command-line interface and options

### Reference
- [Tools Reference](./tools.md) - Available tools and their parameters
- [Context Management](./context-management.md) - Token management and compaction
- [Best Practices](./best-practices.md) - Optimal usage patterns

### LLM Providers
- [Chutes API Integration](./chutes-integration.md) - Using Chutes as your LLM provider

---

## Quick Navigation

| Document | Description |
|----------|-------------|
| [Overview](./overview.md) | High-level introduction and design principles |
| [Installation](./installation.md) | Step-by-step setup guide |
| [Quick Start](./quickstart.md) | Get running in minutes |
| [Architecture](./architecture.md) | Technical deep-dive with diagrams |
| [Configuration](./configuration.md) | Environment variables and settings |
| [Usage](./usage.md) | CLI commands and examples |
| [Tools](./tools.md) | Complete tools reference |
| [Context Management](./context-management.md) | Memory and token optimization |
| [Best Practices](./best-practices.md) | Tips for optimal performance |
| [Chutes Integration](./chutes-integration.md) | Chutes API setup and usage |

---

## Architecture at a Glance

```mermaid
graph TB
subgraph User["User Interface"]
CLI["CLI (agent.py)"]
end

subgraph Core["Core Engine"]
Loop["Agent Loop"]
Context["Context Manager"]
Cache["Prompt Cache"]
end

subgraph LLM["LLM Layer"]
Client["LiteLLM Client"]
Provider["Provider (Chutes/OpenRouter)"]
end

subgraph Tools["Tool System"]
Registry["Tool Registry"]
Shell["shell_command"]
Files["read_file / write_file"]
Search["grep_files / list_dir"]
end

CLI --> Loop
Loop --> Context
Loop --> Cache
Loop --> Client
Client --> Provider
Loop --> Registry
Registry --> Shell
Registry --> Files
Registry --> Search
```

---

## Key Features

- **Fully Autonomous** - No user confirmation required; makes decisions independently
- **LLM-Driven** - All decisions made by the language model, not hardcoded logic
- **Prompt Caching** - 90%+ cache hit rate for significant cost reduction
- **Context Management** - Intelligent pruning and compaction for long tasks
- **Self-Verification** - Automatic validation before task completion
- **Multi-Provider** - Supports Chutes AI, OpenRouter, and litellm-compatible providers

---

## Project Structure

```
baseagent/
├── agent.py                  # Entry point
├── src/
│   ├── core/
│   │   ├── loop.py           # Main agent loop
│   │   └── compaction.py     # Context management
│   ├── llm/
│   │   └── client.py         # LLM client (litellm)
│   ├── config/
│   │   └── defaults.py       # Configuration
│   ├── tools/                # Tool implementations
│   ├── prompts/
│   │   └── system.py         # System prompt
│   └── output/
│       └── jsonl.py          # JSONL event emission
├── rules/                    # Development guidelines
├── astuces/                  # Implementation techniques
└── docs/                     # This documentation
```

---

## License

MIT License - See [LICENSE](../LICENSE) for details.