Motivation
Kimi Code currently has excellent tools for reading and modifying local code, plus WebSearch and FetchURL for text-based web access. However, there is no built-in or official-plugin way to interact with a web browser visually: for example, to navigate to a page, take a screenshot, click an element, fill a form, or verify a local web app in a headless browser.
Other coding agents are moving in this direction:
- Codex ships a Computer Use plugin that can operate desktop apps and browsers (screen recording + accessibility permissions).
- Claude exposes a native computer_use tool that combines screenshots with coordinate-based actions.
- The community has produced MCP servers such as microsoft/playwright-mcp that expose browser control to agents.
For many dev tasks, a lightweight headless-browser capability is enough and avoids the heavy permissions of full desktop control:
- Verify a checkout page after local changes.
- Screenshot a UI component rendered in Storybook.
- Fill out a form to reproduce a bug.
- Scrape dynamic content that FetchURL cannot retrieve.
Proposal
Add an official Kimi Code plugin, tentatively named kimi-browser, that exposes browser automation tools through an MCP server backed by Playwright (or Puppeteer).
Plugin location
plugins/official/kimi-browser/
├── kimi.plugin.json
├── SKILL.md
└── bin/
    └── kimi-browser.mjs  # MCP server entry
Exposed MCP tools (initial set)
| Tool | Purpose |
| --- | --- |
| mcp__kimi-browser__navigate | Open a URL in a headless browser context. |
| mcp__kimi-browser__screenshot | Capture the current viewport or a specific element. |
| mcp__kimi-browser__click | Click an element by selector or coordinates. |
| mcp__kimi-browser__type | Type text into an input field. |
| mcp__kimi-browser__scroll | Scroll the page or an element. |
| mcp__kimi-browser__evaluate | Run a JS snippet in the page context and return the result. |
| mcp__kimi-browser__close | Close the browser context and release resources. |
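To give a feel for how thin these tools could be, here is a minimal sketch that maps each proposed tool onto a Playwright call. The `createToolHandlers` helper and the argument names (`url`, `selector`, `x`, `y`, `text`, `deltaX`, `deltaY`, `expression`) are hypothetical and only for illustration; the `page.*`, `page.mouse.*`, and `context.close()` methods are existing Playwright APIs. The sketch only scrolls the page, not a specific element, and it takes `page` and `context` as arguments so the mapping can be exercised without launching a browser.

```javascript
// Hypothetical mapping from the proposed tool names to Playwright calls.
// `page` is expected to behave like a Playwright Page and `context` like a
// Playwright BrowserContext; both are injected by the caller.
function createToolHandlers(page, context) {
  return {
    navigate: ({ url }) => page.goto(url),
    // Capture a single element when a selector is given, otherwise the viewport.
    screenshot: ({ selector } = {}) =>
      selector ? page.locator(selector).screenshot() : page.screenshot(),
    // Click by selector when present, otherwise by viewport coordinates.
    click: ({ selector, x, y }) =>
      selector ? page.click(selector) : page.mouse.click(x, y),
    type: ({ selector, text }) => page.fill(selector, text),
    // Page-level scrolling only; element scrolling is left out of this sketch.
    scroll: ({ deltaX = 0, deltaY = 0 }) => page.mouse.wheel(deltaX, deltaY),
    evaluate: ({ expression }) => page.evaluate(expression),
    close: () => context.close(),
  };
}
```

With a real browser, `page` and `context` would come from `chromium.launch()`, `browser.newContext()`, and `context.newPage()`; each handler then returns the promise produced by the underlying Playwright call.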
How it fits Kimi Code
- No core changes: the plugin only declares an MCP server in its manifest, matching the existing kimi-datasource pattern.
- Reuses existing permission model: users approve mcp__kimi-browser__* calls just like any other MCP tool.
- Optional install: users who do not need browser automation can simply not install it.
- Cross-platform: headless Playwright works on macOS, Linux, and Windows without requiring screen-recording or accessibility permissions.
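The tool names in this proposal appear to follow an `mcp__<server>__<tool>` shape, with `mcp__kimi-browser__*` as the approval pattern. The sketch below assumes that convention and shows how names could be composed and matched. Both helpers (`qualifiedToolName`, `matchesApprovalPattern`) are hypothetical, and the trailing-`*` matching rule is an assumption rather than a description of how Kimi Code actually evaluates permissions.

```javascript
// Assumed naming convention, inferred from the tool names in this proposal.
function qualifiedToolName(server, tool) {
  return `mcp__${server}__${tool}`;
}

// Assumption: only a trailing "*" acts as a wildcard; anything else is an exact match.
function matchesApprovalPattern(pattern, name) {
  return pattern.endsWith('*')
    ? name.startsWith(pattern.slice(0, -1))
    : name === pattern;
}

const tools = ['navigate', 'screenshot', 'click', 'type', 'scroll', 'evaluate', 'close'];
const names = tools.map((tool) => qualifiedToolName('kimi-browser', tool));
const allCovered = names.every((n) => matchesApprovalPattern('mcp__kimi-browser__*', n));
```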
Security considerations
- Browser automation can interact with signed-in sessions and external sites, so all tool calls should require explicit approval by default.
- The plugin should default to headless mode and isolate each session (clean context, no persistent cookies unless configured).
- Network access should respect the user environment; the plugin should not bypass proxy or firewall settings.
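The defaults above could be captured in one small function that turns user configuration into Playwright options. This is a sketch under stated assumptions: `resolveSessionOptions` and the config keys `headless` and `storageStatePath` are hypothetical, and reading `HTTPS_PROXY` / `HTTP_PROXY` from the environment is just one possible way to honor the user's proxy setup. The resulting `headless` and `proxy.server` fields are existing `chromium.launch()` options, and `storageState` is an existing `browser.newContext()` option.

```javascript
// Hypothetical helper: secure-by-default session options for the plugin.
function resolveSessionOptions(userConfig = {}, env = process.env) {
  // Headless unless the user explicitly opts out.
  const launch = { headless: userConfig.headless ?? true };

  // Assumption: pass the user's proxy environment through explicitly.
  const proxyServer =
    env.HTTPS_PROXY || env.https_proxy || env.HTTP_PROXY || env.http_proxy;
  if (proxyServer) launch.proxy = { server: proxyServer };

  // A fresh context starts with no cookies; previously saved state is
  // loaded only when the user configures a path for it.
  const context = {};
  if (userConfig.storageStatePath) {
    context.storageState = userConfig.storageStatePath;
  }
  return { launch, context };
}
```

The server would then call `chromium.launch(options.launch)` followed by `browser.newContext(options.context)`, so a session with no extra configuration is headless and starts without any persisted cookies.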
Scope questions for maintainers
Before I start implementing a PR, I would like to confirm a few things:
- Is an official kimi-browser plugin aligned with Kimi Code’s roadmap, or is browser/computer-use automation being handled differently?
- Should the plugin ship its own browser binary via playwright install, or should it expect the user to have Chromium/Chrome already installed?
- Are there naming conventions or manifest requirements for official plugins beyond what kimi-datasource demonstrates?
- Would the team prefer a minimal initial PR (e.g., navigate + screenshot only) or a more complete tool set from the start?
I am happy to iterate on the design and provide a proof-of-concept once the direction is confirmed.