feat: add DSL output format for graph tools #536

Open
indrajeet0510 wants to merge 2 commits into tirth8205:main from indrajeet0510:feature/dsl-output-mode

@indrajeet0510

Summary

Adds a format="dsl" parameter to get_impact_radius_tool,
query_graph_tool, and get_review_context_tool. When set, the graph
payload comes back as one-line strings instead of JSON dicts. Default
stays "dict" so nothing existing breaks.
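
A minimal sketch of the opt-in dispatch described above, for illustration only. The helper name `render_nodes`, the `VALID_FORMATS` tuple, and the placeholder line shape are assumptions and are not taken from the PR's code; only the `"dict"` default and the `"dsl"` value come from the description.

```python
from typing import Any

VALID_FORMATS = ("dict", "dsl")  # assumed set of accepted values


def render_nodes(nodes: list[dict[str, Any]], fmt: str = "dict") -> list[Any]:
    """Return nodes untouched for "dict", or as one-line strings for "dsl"."""
    if fmt not in VALID_FORMATS:
        raise ValueError(f"format must be one of {VALID_FORMATS}, got {fmt!r}")
    if fmt == "dict":
        # Default path: the payload is passed through unchanged,
        # so existing callers see exactly what they saw before.
        return nodes
    # Placeholder line shape (hypothetical); the real DSL carries more fields.
    return [f"{n['kind']} {n['name']}@{n['file_path']}" for n in nodes]


nodes = [{"kind": "Function", "name": "validateOrder",
          "file_path": "src/orders/validator.py"}]
default_out = render_nodes(nodes)          # same list of dicts
dsl_out = render_nodes(nodes, fmt="dsl")   # list of one-line strings
```

Keeping the default branch as a pure pass-through is what makes the "nothing existing breaks" claim easy to check in a test.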

Why

I was reading through graph.py and noticed that node_to_dict and
edge_to_dict (around line 1349) get called from roughly half of the 30
tools. Every node carries 9 JSON keys, every edge carries 8. On a
get_impact_radius that hits the default 500-node cap, that's roughly
30k tokens of structural metadata before any actual code shows up.

Two of those fields don't seem to do anything for the LLM consumer: the
internal database id, and confidence_tier (which is fully derivable
from the confidence float — same thresholds you already encode
elsewhere). And in the edge encoding, source almost always starts with
the same path as file_path, so that prefix gets duplicated on every
edge in the response.
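
The prefix duplication can be removed with a simple string check. This is a sketch under the assumption that an edge's source is a qualified name of the form `path::Class::method`; the function name `strip_file_prefix` is hypothetical and not from graph.py.

```python
def strip_file_prefix(source: str, file_path: str) -> str:
    """Drop a leading "<file_path>::" from an edge source, if present."""
    prefix = file_path + "::"
    if source.startswith(prefix):
        return source[len(prefix):]
    # Source lives in a different file: keep it fully qualified.
    return source


same_file = strip_file_prefix(
    "src/orders/validator.py::OrderValidator::validateOrder",
    "src/orders/validator.py",
)
other_file = strip_file_prefix(
    "src/billing/invoice.py::Invoice::total",
    "src/orders/validator.py",
)
```

Sources from another file are left fully qualified, so no information is lost in the uncommon case.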

Felt like there was real headroom to claw back without changing any
semantics:

# dict (unchanged, still the default)
{"id": 42, "kind": "Function", "name": "validateOrder",
 "qualified_name": "src/orders/validator.py::OrderValidator::validateOrder",
 "file_path": "src/orders/validator.py", "line_start": 142, "line_end": 178,
 "language": "python", "parent_name": "src/orders/validator.py::OrderValidator",
 "is_test": false}

# dsl (new, opt-in)
fn validateOrder@src/orders/validator.py:142-178 py parent=OrderValidator

Same information the LLM needs (the internal id is dropped), in about 4× fewer characters per node. Edges compress
roughly 2.5× once you strip the redundant file_path:: prefix from
source. Each response also carries a legend field documenting the
codes so the LLM doesn't need external docs to decode the lines.
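
One way to produce the node line shown above is sketched here. The code tables, the `LEGEND` wording, and the function name `encode_node` are assumptions for illustration; only the `fn` and `py` codes and the overall line shape appear in this description, and the handling of `is_test` is left out because the description does not show it.

```python
from typing import Any

# Assumed code tables; only "fn" and "py" appear in the PR description.
KIND_CODES = {"Function": "fn"}
LANG_CODES = {"python": "py"}

# Hypothetical legend text shipped alongside the rows.
LEGEND = "<kind> <name>@<file>:<start>-<end> <lang> [parent=<name>]"


def encode_node(node: dict[str, Any]) -> str:
    """Encode one node dict as a single DSL line."""
    kind = KIND_CODES.get(node["kind"], node["kind"])
    lang = LANG_CODES.get(node["language"], node["language"])
    line = (f"{kind} {node['name']}@{node['file_path']}:"
            f"{node['line_start']}-{node['line_end']} {lang}")
    parent = node.get("parent_name")
    if parent:
        # Keep only the last "::" segment; the file path is already on the line.
        line += f" parent={parent.rsplit('::', 1)[-1]}"
    return line


node = {
    "id": 42, "kind": "Function", "name": "validateOrder",
    "qualified_name": "src/orders/validator.py::OrderValidator::validateOrder",
    "file_path": "src/orders/validator.py", "line_start": 142, "line_end": 178,
    "language": "python",
    "parent_name": "src/orders/validator.py::OrderValidator",
    "is_test": False,
}
line = encode_node(node)
response = {"legend": LEGEND, "nodes": [line]}
```

Unknown kinds and languages fall back to their full names in this sketch, so a missing code never drops information.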

Numbers

I wrote a reproducible benchmark in scripts/bench_dsl_output.py. It
builds a synthetic 50-file backend service with cross-module fan-out and
runs get_impact_radius against it. On a typical 71-node blast radius:

  • dict response: ~14,800 tokens
  • dsl response: ~6,700 tokens
  • saves ~8,000 tokens per call

Worth flagging that per-row compression is more like 3-4×, but the
whole-response ratio comes out closer to 2.2× because the envelope
(summary, file lists, the legend itself) is identical in both modes.
On larger payloads the per-row encoding dominates and the ratio creeps
up toward the per-row number. Run the script to see for yourself.
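
The envelope effect can be written as a one-line formula. The envelope/row split below is illustrative, not measured: it is chosen so that the totals match the ~14,800 and ~6,700 figures above under an assumed 3.5× per-row compression.

```python
def whole_response_ratio(envelope: float, rows: float, per_row: float) -> float:
    """Ratio of dict-mode size to dsl-mode size when only the rows compress."""
    return (envelope + rows) / (envelope + rows / per_row)


# Assumed split: 3,460 envelope tokens + 11,340 row tokens = 14,800 total.
small = whole_response_ratio(envelope=3_460, rows=11_340, per_row=3.5)   # about 2.21
# Ten times as many row tokens with the same envelope.
large = whole_response_ratio(envelope=3_460, rows=113_400, per_row=3.5)  # about 3.26
```

As the row share grows, the fixed envelope matters less and the ratio approaches the per-row figure.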

Tests

26 new tests in tests/test_dsl_output.py covering the encoders, the
dispatch helpers, control-char sanitization (security parity with the
dict encoder — _sanitize_name is called from the DSL encoders too, so
the existing TestSanitizeName protections carry over), and the tool
integration. All pass. ruff check and
mypy --ignore-missing-imports --no-strict-optional are both clean on the
modified files.
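
To illustrate why sanitization matters more in a line-oriented format, here is a sketch using a stand-in sanitizer. `sanitize_name` below is not the project's `_sanitize_name` (its exact behavior is not shown in this description), and `encode_row` is a hypothetical simplified encoder.

```python
def sanitize_name(name: str) -> str:
    """Stand-in sanitizer: replace ASCII control characters with spaces."""
    return "".join(" " if ord(ch) < 32 or ord(ch) == 127 else ch for ch in name)


def encode_row(name: str, file_path: str) -> str:
    """Simplified one-line encoder that sanitizes the name first."""
    return f"fn {sanitize_name(name)}@{file_path}"


# A hostile identifier that tries to smuggle in a second, fake row.
hostile = "evil\nfn fake@other.py"
row = encode_row(hostile, "src/app.py")
```

Without the sanitizer, the embedded newline would split the output into two rows and the second would look like a legitimate node.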

Heads up: there are 7 unrelated failures in tests/test_main.py
(TestApplyToolFilter and one in TestLongRunningToolsAreAsync). They
fail on a clean main too, so they don't look related to this PR.

Notes

  • Default of "dict" means every existing caller is byte-identical.
    Nothing changes unless someone explicitly opts in.
  • I left detect_changes_tool and traverse_graph_tool alone even
    though they share the same encoder bottleneck — felt cleaner as
    separate follow-up PRs once we agree on the format here.
  • Updated only the English README.md (added a short paragraph below
    the MCP tools table mentioning the new parameter). The four
    translated READMEs (Hindi, Japanese, Korean, Chinese Simplified)
    aren't updated — happy to take a follow-up for the Hindi sync if
    you'd like, but I'd defer to native speakers for the others.
  • Open to bikeshedding on the field codes / legend wording / arrow
    character / etc. The current shape was a first stab at something
    compact-but-still-readable; happy to iterate if you have preferences.

Opt-in format="dsl" parameter on get_impact_radius_tool, query_graph_tool, and get_review_context_tool returns nodes and edges as one-line strings instead of JSON dicts.

Per-row: ~4x nodes, ~2.5x edges. Full-response on a 71-node blast radius: ~8,000 tokens saved per call.

Default stays "dict" — fully backwards compatible.
indrajeet0510 force-pushed the feature/dsl-output-mode branch from 44bd88c to 84f29e6 on June 8, 2026 00:10