Feature Request: browser automation plugin (kimi-browser) via MCP #945

## Motivation

Kimi Code currently has excellent tools for reading and modifying local code, plus `WebSearch` and `FetchURL` for text-based web access. However, neither the built-in tools nor the official plugins can drive a web browser: navigate to a page, take a screenshot, click an element, fill a form, or verify a local web app in a headless browser.

Other coding agents are moving in this direction:

- **Codex** ships a Computer Use plugin that can operate desktop apps and browsers (screen recording + accessibility permissions).
- **Claude** exposes a native `computer_use` tool that combines screenshots with coordinate-based actions.
- Standalone MCP servers such as [`microsoft/playwright-mcp`](https://github.com/microsoft/playwright-mcp) already expose browser control to agents.

For many dev tasks, a lightweight headless-browser capability is enough and avoids the heavy permissions of full desktop control:

- Verify a checkout page after local changes.
- Screenshot a UI component rendered in Storybook.
- Fill out a form to reproduce a bug.
- Scrape dynamic content that `FetchURL` cannot retrieve.

## Proposal

Add an official Kimi Code plugin, tentatively named `kimi-browser`, that exposes browser automation tools through an MCP server backed by Playwright (or Puppeteer).

### Plugin location

```text
plugins/official/kimi-browser/
├── kimi.plugin.json
├── SKILL.md
└── bin/
    └── kimi-browser.mjs   # MCP server entry
```

### Exposed MCP tools (initial set)

| Tool | Purpose |
|------|---------|
| `mcp__kimi-browser__navigate` | Open a URL in a headless browser context. |
| `mcp__kimi-browser__screenshot` | Capture the current viewport or a specific element. |
| `mcp__kimi-browser__click` | Click an element by selector or coordinates. |
| `mcp__kimi-browser__type` | Type text into an input field. |
| `mcp__kimi-browser__scroll` | Scroll the page or an element. |
| `mcp__kimi-browser__evaluate` | Run a JS snippet in the page context and return the result. |
| `mcp__kimi-browser__close` | Close the browser context and release resources. |
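
To make the table more concrete, here is a minimal sketch of how the tool names and argument shapes might be declared and checked before a call reaches the browser. The `mcp__kimi-browser__` prefix comes from the table above; the argument fields (`url`, `selector`, `x`, `y`, `text`, `deltaY`, `expression`) and the helpers `qualifiedName` and `validateArgs` are assumptions for illustration, not an agreed schema.

```javascript
// Hypothetical argument shapes for the initial tool set. Field names are
// illustrative assumptions, not a confirmed schema.
const SERVER_NAME = "kimi-browser";

const TOOL_ARGS = {
  navigate: { required: ["url"], optional: [] },
  screenshot: { required: [], optional: ["selector"] },
  click: { required: [], optional: ["selector", "x", "y"] },
  type: { required: ["selector", "text"], optional: [] },
  scroll: { required: [], optional: ["selector", "deltaY"] },
  evaluate: { required: ["expression"], optional: [] },
  close: { required: [], optional: [] },
};

// Build the fully qualified name the agent sees, e.g. mcp__kimi-browser__click.
function qualifiedName(tool) {
  return `mcp__${SERVER_NAME}__${tool}`;
}

// Return a list of problems with a call's arguments (empty list means valid).
function validateArgs(tool, args) {
  if (!Object.prototype.hasOwnProperty.call(TOOL_ARGS, tool)) {
    return [`unknown tool: ${tool}`];
  }
  const spec = TOOL_ARGS[tool];
  const problems = [];
  for (const key of spec.required) {
    if (!(key in args)) problems.push(`missing required argument: ${key}`);
  }
  const allowed = new Set([...spec.required, ...spec.optional]);
  for (const key of Object.keys(args)) {
    if (!allowed.has(key)) problems.push(`unexpected argument: ${key}`);
  }
  // Per the table, click targets either a selector or a coordinate pair.
  if (tool === "click") {
    const hasSelector = "selector" in args;
    const hasX = "x" in args;
    const hasY = "y" in args;
    if (hasX !== hasY) {
      problems.push("x and y must be given together");
    } else if (hasSelector === hasX) {
      problems.push("click needs either selector, or x and y");
    }
  }
  return problems;
}
```

For example, `validateArgs("click", { x: 10, y: 20 })` returns an empty list, while `validateArgs("navigate", {})` reports the missing `url`. Rejecting unexpected keys up front keeps malformed calls from reaching the page at all.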

### How it fits Kimi Code

- **No core changes**: the plugin only declares an MCP server in its manifest, matching the existing `kimi-datasource` pattern.
- **Reuses existing permission model**: users approve `mcp__kimi-browser__*` calls just like any other MCP tool.
- **Optional install**: users who do not need browser automation can simply not install it.
- **Cross-platform**: headless Playwright works on macOS, Linux, and Windows without requiring screen-recording or accessibility permissions.
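
The `mcp__kimi-browser__*` approval mentioned above can be pictured as a wildcard match on the tool name. The sketch below only illustrates that idea; how Kimi Code actually stores and evaluates approvals is not described in this issue, and `matchesApproval` is a hypothetical helper.

```javascript
// Hypothetical helper: does a tool name match an approval pattern in which
// `*` stands for any run of characters? Not Kimi Code's actual implementation.
function matchesApproval(pattern, toolName) {
  // Escape every regex metacharacter so the pattern is treated literally...
  const escaped = pattern.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
  // ...then turn each escaped `\*` back into a "match anything" wildcard.
  const regex = new RegExp("^" + escaped.replace(/\\\*/g, ".*") + "$");
  return regex.test(toolName);
}
```

With this helper, the pattern `mcp__kimi-browser__*` covers every tool in the plugin, while a pattern such as `mcp__kimi-browser__screenshot` approves only that one tool.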

## Security considerations

- Browser automation can interact with signed-in sessions and external sites, so all tool calls should require explicit approval by default.
- The plugin should default to headless mode and isolate each session (clean context, no persistent cookies unless configured).
- Network access should respect the user environment; the plugin should not bypass proxy or firewall settings.
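
One way to keep these defaults from being weakened by accident is to resolve them in a single place. The sketch below assumes hypothetical option names (`headless`, `persistCookies`, `requireApproval`) and a hypothetical `resolveSessionOptions` helper; none of these are part of an existing Kimi Code or Playwright API. In Playwright terms, an isolated session would likely map to a fresh browser context per session, since contexts do not share cookies with each other.

```javascript
// Hypothetical option names. The safe value applies unless the user
// explicitly opts out with a boolean.
const SAFE_DEFAULTS = Object.freeze({
  headless: true,        // no visible browser window by default
  persistCookies: false, // each session starts from a clean context
  requireApproval: true, // every tool call asks for explicit approval
});

function resolveSessionOptions(userConfig = {}) {
  const resolved = { ...SAFE_DEFAULTS };
  for (const key of Object.keys(SAFE_DEFAULTS)) {
    // Accept only explicit booleans so a typo or wrong type in the
    // user's config cannot silently weaken a default.
    if (typeof userConfig[key] === "boolean") {
      resolved[key] = userConfig[key];
    }
  }
  return resolved;
}
```

Calling `resolveSessionOptions({ persistCookies: true })` keeps `headless` and `requireApproval` at their safe values and changes only the cookie behaviour; unknown keys are ignored.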

## Scope questions for maintainers

Before I start implementing a PR, I would like to confirm a few things:

1. Is an official `kimi-browser` plugin aligned with Kimi Code’s roadmap, or is browser/computer-use automation being handled differently?
2. Should the plugin ship its own browser binary via `playwright install`, or should it expect the user to have Chromium/Chrome already installed?
3. Are there naming conventions or manifest requirements for official plugins beyond what `kimi-datasource` demonstrates?
4. Would the team prefer a minimal initial PR (e.g., navigate + screenshot only) or a more complete tool set from the start?

I am happy to iterate on the design and provide a proof-of-concept once the direction is confirmed.

