Add workspace grep/search for remote workspaces #306

@chubes4

Problem

Remote Data Machine Code workspaces currently support reading files and exact-string edits, but they do not expose a grep/search tool. Agents working against remote GitHub-backed workspaces have to guess edit anchors in large files after coarse workspace_read calls.

This showed up in the PHP transformer iterator smoke loop. After prompt routing correctly sent a core/freeform DOM-to-block conversion finding to html-to-blocks-converter, the agent inspected includes/class-transform-registry.php but could only read slices of a ~138 KB file. It then guessed anchors like class Transform_Registry and public function register_transforms; both failed with old_string not found because the real symbols were class HTML_To_Blocks_Transform_Registry and get_raw_transforms().

Rejecting those edits was the correct behavior for an exact-string match, but the agent lacked the code discovery primitive it needed to ground the edit in text that actually exists in the file.

Desired capability

Add a workspace search/grep tool that works for both local and remote workspace backends.

Suggested tool name: workspace_grep or workspace_search.

Suggested inputs:

  • repo: workspace handle, e.g. html-to-blocks-converter@fix-hero-section-fallback
  • pattern: string or regex pattern
  • path: optional path prefix
  • include: optional file glob/filter
  • max_results: optional result cap
  • context_lines: optional surrounding context

Suggested output per match:

  • file path
  • line number
  • matching line
  • optional context lines
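As a rough illustration of the inputs and outputs listed above, here is a minimal Python sketch of a search over a local checkout. It is not Data Machine Code's implementation: the names `GrepMatch` and `grep_local`, the default values, and the choice to skip `.git` and non-UTF-8 files are all assumptions made for the example.

```python
from __future__ import annotations

import fnmatch
import os
import re
from dataclasses import dataclass, field


@dataclass
class GrepMatch:
    """One search hit (hypothetical shape mirroring the suggested output)."""
    path: str          # file path relative to the workspace root
    line_number: int   # 1-based line number of the match
    line: str          # the matching line
    context: list[str] = field(default_factory=list)  # optional surrounding lines


def grep_local(root, pattern, path="", include=None, max_results=100, context_lines=0):
    """Search checked-out files under root/path for a regex pattern."""
    regex = re.compile(pattern)
    matches: list[GrepMatch] = []
    for dirpath, dirnames, filenames in os.walk(os.path.join(root, path)):
        # Assumption: never descend into the git metadata directory.
        dirnames[:] = sorted(d for d in dirnames if d != ".git")
        for name in sorted(filenames):
            if include and not fnmatch.fnmatch(name, include):
                continue
            full = os.path.join(dirpath, name)
            try:
                with open(full, encoding="utf-8") as fh:
                    lines = fh.read().splitlines()
            except (UnicodeDecodeError, OSError):
                continue  # assumption: skip binary or unreadable files
            for idx, line in enumerate(lines):
                if not regex.search(line):
                    continue
                lo = max(0, idx - context_lines)
                hi = min(len(lines), idx + context_lines + 1)
                matches.append(GrepMatch(
                    path=os.path.relpath(full, root).replace(os.sep, "/"),
                    line_number=idx + 1,
                    line=line,
                    context=lines[lo:hi] if context_lines else [],
                ))
                if len(matches) >= max_results:
                    return matches  # honor the result cap
    return matches
```

With a call such as `grep_local(root, r"class \w*Transform_Registry", include="*.php")`, the agent in the failure described above would have seen the real class name before attempting an edit.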

Acceptance criteria

  • Local workspace backend can search checked-out files.
  • Remote GitHub API workspace backend can search repository contents for the active branch/ref without requiring a local checkout.
  • Tool is registered for agent use alongside workspace_ls, workspace_read, workspace_edit, and workspace_git_status.
  • Tests/smoke coverage proves search works through the remote workspace backend.
  • Error output is actionable when no matches are found.
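One way to satisfy the local and remote criteria with shared logic is to keep the matching code independent of where file contents come from. The sketch below assumes each backend can yield `(path, text)` pairs for the active ref; how the remote backend obtains those pairs is deliberately left out. `search_files` and `format_result` are hypothetical names, and the wording of the no-match message is only a suggestion.

```python
import re
from typing import Iterable, List, Tuple

Hit = Tuple[str, int, str]  # (path, 1-based line number, matching line)


def search_files(files: Iterable[Tuple[str, str]], pattern: str,
                 max_results: int = 50) -> Tuple[List[Hit], int]:
    """Search (path, text) pairs; return the hits and the number of files scanned."""
    regex = re.compile(pattern)
    hits: List[Hit] = []
    searched = 0
    for path, text in files:
        searched += 1
        for number, line in enumerate(text.splitlines(), start=1):
            if regex.search(line):
                hits.append((path, number, line))
                if len(hits) >= max_results:
                    return hits, searched
    return hits, searched


def format_result(hits: List[Hit], pattern: str, repo: str, searched: int) -> str:
    """Render hits as path:line: text, or an actionable message when empty."""
    if not hits:
        return (
            f"No matches for pattern {pattern!r} in {repo} "
            f"({searched} files searched). Try a broader pattern, remove the "
            "path/include filter, or confirm the branch/ref."
        )
    return "\n".join(f"{path}:{number}: {line.strip()}" for path, number, line in hits)
```

Because the search only consumes an iterable, the same tests can run against both backends by swapping the source of the pairs.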

Why this matters

The iterator now correctly avoids opening ungrounded PRs after failed edits, but without remote workspace search it often cannot produce grounded edits in large files. Search should let the workflow move from:

workspace_read first 800 lines of a giant file
model guesses old_string
workspace_edit fails

to:

workspace_grep pattern="hero|section|transform"
workspace_read targeted file/offset
workspace_edit exact observed anchor
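The difference between the two flows can be shown with a small in-memory simulation. Everything here is a stand-in: `SOURCE` is an invented three-line excerpt using the symbol names reported above, and `grep`, `edit_exact`, and `run_workflow` are hypothetical helpers, not the real workspace tools.

```python
import re

# Invented excerpt; only the class and method names come from the report above.
SOURCE = (
    "<?php\n"
    "class HTML_To_Blocks_Transform_Registry {\n"
    "    public function get_raw_transforms() {\n"
    "        return array();\n"
    "    }\n"
    "}\n"
)


def grep(text, pattern):
    """Return (line_number, line) for every line matching the regex."""
    regex = re.compile(pattern)
    return [(n, line) for n, line in enumerate(text.splitlines(), start=1)
            if regex.search(line)]


def edit_exact(text, old_string, new_string):
    """Exact-string edit that refuses to act on an anchor it cannot find."""
    if old_string not in text:
        raise ValueError("old_string not found")
    return text.replace(old_string, new_string, 1)


def run_workflow(source):
    """Grounded flow: search first, then edit using the observed line as the anchor."""
    hits = grep(source, "hero|section|transform")
    _, line = hits[0]
    anchor = line.strip()  # taken from the file, not guessed
    return edit_exact(source, anchor, anchor + " // grounded edit")
```

Calling `edit_exact(SOURCE, "class Transform_Registry", ...)` raises `ValueError`, which mirrors the guessed-anchor failure, while `run_workflow(SOURCE)` succeeds because its anchor was copied from a search hit.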
