feat(gepa): add tool description optimization for multi-agent systems #8928

Ju-usc · 2025-10-10T06:12:23Z

Summary

Addresses #8706 which requested GEPA to optimize tool descriptions.

When enable_tool_optimization=True, GEPA jointly optimizes:

ReAct modules: react instructions, extract instructions, tool descriptions, and tool argument descriptions
~~Generic tool-using predictors support is TODO'd out pending DSPy trace lineage improvements.~~

All components are optimized together based on shared execution traces, enabling the reflection LM to see how components work together.

Backward compatible - enable_tool_optimization=False (default) preserves existing behavior.

Issue

Closes #8706

Changes

Core Implementation

enable_tool_optimization parameter on GEPA (default False)
Type-based detection: Identifies tool-using predictors via signature field annotations (dspy.Tool, list[dspy.Tool], dict[str, Tool]) (TODO)
Trace-based tool extraction: Discovers actual tools used at runtime from execution traces (TODO)
ToolProposer: Specialized proposer with dynamic signature generation for each tool and argument
ReAct handling: Discovers ReAct modules via isinstance() check, includes both react and extract predictors
Routing: ReAct modules → ToolProposer; regular predictors → default/custom proposer
Serialization: Tool modules stored as JSON configs with predictor instructions and tool schemas
Application: Applies optimized descriptions to dspy.Tool objects by matching tool.name
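
The "Application" step above can be sketched as follows. This is a minimal illustration, not the PR's code: `FakeTool` is a hypothetical stand-in for `dspy.Tool`, and the shape of the `optimized` dict (a `desc` string plus per-argument `description` entries) is an assumption modeled on the tool schema fields quoted later in this thread.

```python
from dataclasses import dataclass, field


@dataclass
class FakeTool:
    """Hypothetical stand-in for dspy.Tool: only the fields this sketch needs."""
    name: str
    desc: str
    args: dict = field(default_factory=dict)  # arg name -> schema-like dict


def apply_tool_config(tools, tool_config):
    """Copy optimized descriptions onto tools whose name appears in tool_config."""
    for tool in tools:
        update = tool_config.get(tool.name)
        if update is None:
            continue  # no proposal for this tool: leave it untouched
        if update.get("desc") is not None:
            tool.desc = update["desc"]
        for arg_name, arg_schema in update.get("args", {}).items():
            new_desc = arg_schema.get("description")
            if arg_name in tool.args and new_desc is not None:
                tool.args[arg_name]["description"] = new_desc


tools = [
    FakeTool("search_web", "Search the web", {"query": {"type": "string"}}),
    FakeTool("calculator", "Do math", {"expr": {"type": "string"}}),
]
optimized = {
    "search_web": {
        "desc": "Look up current facts on the public web.",
        "args": {"query": {"description": "Short keyword query."}},
    }
}
apply_tool_config(tools, optimized)
```

Matching on `tool.name` means a tool that the proposer did not mention keeps its original description, which is what makes selective updates possible.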

Testing

6 tests covering:

Skip predictors without tool annotations
ReAct module detection (single, multiple, nested)
Apply optimized ReAct descriptions
Selective optimization when LM returns None

Documentation

GEPA_Advanced.md - Tool optimization guide with usage examples
overview.md - Brief introduction linking to advanced guide

Usage Example

ReAct Agent

import dspy

def search_web(query: str) -> str:
    return f"Search results for: {query}"

search_tool = dspy.Tool(search_web, name="search_web", desc="Search the web")

agent = dspy.ReAct("question -> answer", tools=[search_tool])

gepa = dspy.GEPA(
    metric=my_metric,
    reflection_lm=dspy.LM("openai/gpt-5-mini"),
    enable_tool_optimization=True,
    auto="medium"
)

optimized = gepa.compile(agent, trainset=trainset, valset=valset)
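
The example above leaves `my_metric` undefined. Below is a minimal sketch of one, assuming GEPA calls the metric with five arguments (`gold`, `pred`, `trace`, `pred_name`, `pred_trace`) and accepts a plain float score. The exact-match rule and the `SimpleNamespace` stand-ins for DSPy examples and predictions are illustrative assumptions, used so the sketch runs without DSPy installed.

```python
from types import SimpleNamespace


def my_metric(gold, pred, trace=None, pred_name=None, pred_trace=None):
    """Return 1.0 on a case-insensitive exact match of the answer, else 0.0."""
    expected = gold.answer.strip().lower()
    actual = (getattr(pred, "answer", "") or "").strip().lower()
    return 1.0 if expected == actual else 0.0


# Stand-ins for a DSPy example and predictions (hypothetical data).
gold = SimpleNamespace(question="Capital of France?", answer="Paris")
good = SimpleNamespace(answer=" paris ")
bad = SimpleNamespace(answer="Lyon")
```

A metric that also returns textual feedback would likely give the reflection LM more to work with than a bare score, but the float form keeps the sketch short.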

Key Features

Joint Optimization: Predictor instructions and tool descriptions optimized together
Selective Updates: LM returns None for unchanged components
Multi-Agent Support: Discovers nested ReAct modules

- Add optimize_tool_descriptions parameter (default False) to GEPA
- Extract tool descriptions from all nested modules via named_sub_modules()
- Apply optimized descriptions in DspyAdapter.build_program()
- Enables holistic optimization of tools across main and subagent modules
- Tests: 4 new tests, all 16 pass (4 new + 12 existing)

Ju-usc · 2025-10-10T06:16:07Z

Apologies for accidentally closing #8927

Thank you for the thorough review, @LakshyAAAgrawal! I'll address your feedback:

1. Since tools are categorically different from prompts, they should use a different reflection meta prompt. The default reflection meta prompt is shown here https://dspy.ai/api/optimizers/GEPA/GEPA_Advanced/#default-implementation, whereas I assume that the tool must use somewhat different meta prompt. Can you implement a propose_new_texts method that mimics the default_proposer shown in the link above for all prompts, but calls to a tool description specific prompt/signature for tool evolution.
2. Can you also add some description to the documentation, explaining that this feature is beneficial for React agents.
3. (This is not a requirement to merge the PR) Would it be possible to add a simple and short tutorial demonstrating the use and performance improvement via tool evolution?

I'll start working on items 1 and 2 and update the PR soon. Please let me know if you have any specific preferences for the tutorial format!

LakshyAAAgrawal · 2025-10-10T06:20:53Z

Thanks a lot! For the tutorial, I think you can follow the current GEPA tutorial format (load a dataset, show an example from the dataset, build a dspy program, evaluate the baseline program on testset, run GEPA with new optimization settings, show the optimized programs' prompts and tool descriptions, and finally evaluate the optimized program).

Hopefully we should be able to see a nice and large gain on agentic tasks with this amazing contribution by you!

- Add ToolProposer with GenerateImprovedToolDescription signature
- Implement routing logic to separate tools from signatures
- Tools use ToolProposer, signatures use custom or parent default
- Backward compatible: preserves existing custom_instruction_proposer behavior
- Add test verifying routing splits components correctly

- Define tool functions outside class for clarity
- Match structure of simple ReAct example
- Add clear comments explaining architecture
- Make code more readable and maintainable

Ju-usc · 2025-10-10T09:40:59Z

Hi @LakshyAAAgrawal,

I've implemented the tool-specific proposer as requested! Here's what's included:

1. Tool-Specific Proposer Implementation ✅

Added GenerateImprovedToolDescriptionFromFeedback signature with a specialized reflection prompt
Implemented ToolProposer and SingleComponentToolProposer following the MultiModalInstructionProposer pattern
Routing logic in DspyAdapter that directs tools to ToolProposer and signatures to custom/default proposers
Fully backward compatible with existing custom instruction proposers
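
The routing described in the list above can be sketched roughly as follows. This is an assumption-laden illustration rather than the code in `DspyAdapter`: the `tool:` key prefix, the `is_tool_component` predicate, and both fake proposers are hypothetical names introduced only for this example.

```python
def route_and_propose(candidate, components_to_update, is_tool_component,
                      tool_proposer, instruction_proposer):
    """Split components by kind, call the matching proposer, merge results."""
    tool_keys = [c for c in components_to_update if is_tool_component(c)]
    sig_keys = [c for c in components_to_update if not is_tool_component(c)]

    new_texts = {}
    if sig_keys:
        new_texts.update(instruction_proposer(candidate, sig_keys))
    if tool_keys:
        new_texts.update(tool_proposer(candidate, tool_keys))
    return new_texts


# Hypothetical proposers that just tag the text they were asked to improve.
def fake_tool_proposer(candidate, keys):
    return {k: f"[tool] {candidate[k]}" for k in keys}


def fake_instruction_proposer(candidate, keys):
    return {k: f"[instr] {candidate[k]}" for k in keys}


candidate = {"tool:search_web": "Search the web", "react": "Answer the question."}
result = route_and_propose(
    candidate,
    components_to_update=["tool:search_web", "react"],
    is_tool_component=lambda key: key.startswith("tool:"),
    tool_proposer=fake_tool_proposer,
    instruction_proposer=fake_instruction_proposer,
)
```

Because each proposer only ever sees its own keys, a user-supplied instruction proposer keeps working unchanged when tool optimization is switched on.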

2. Documentation ✅

Added comprehensive section to GEPA_Advanced.md
Explains when to use tool optimization (ReAct agents, multi-agent systems)
Includes usage examples for both simple and nested agent architectures
Documents how to inspect optimized tool descriptions

Reflection Prompt Design:
The tool-specific prompt is intentionally open-ended to avoid prescriptive patterns that might lead to local minima. It asks the LM to identify patterns in successful/unsuccessful tool usage and extract domain-specific information, without suggesting specific heuristics.

Before I create a short tutorial (item #3), would you have any feedback on:

The reflection prompt design - is it general enough? Any improvements you'd suggest?
The implementation approach - does the routing logic make sense?
The documentation - anything unclear or missing?

Any feedback would be helpful before I invest time in the tutorial. Thank you!

Ju-usc · 2025-10-11T03:01:40Z

Wait, there is a bug in the implementation; I'm working on a fix. The tests also have to be fixed.

…euse

Tools now copy ReAct's reflective data with tool-specific annotation instead of complex trajectory extraction. This 15-line approach reuses ReAct's existing context (thoughts, tool calls, observations) and adds focused annotation for each tool.

Implementation:
- Tools receive full ReAct reflective examples (same trajectory context)
- Feedback prefixed: [Optimizing tool: 'X'] for focused optimization
- Reflection LM sees complete multi-step execution traces per tool

Benefits:
- Simpler: 15 lines vs 70+ line extraction approach
- Reuses code: No duplicate trajectory formatting logic
- Same context: Tools see full ReAct execution traces
- Clean: Removed all debug output

Tests:
- 4 focused tests following GEPA patterns (removed 1 redundant)
- 226KB fixture with 34 LM + 6 reflection calls
- All tests passing with gpt-5-nano traces

Documentation:
- Updated GEPA_Advanced.md with implementation details
- Explains reflective dataset construction approach
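
The annotation step from this commit message can be sketched as below. The example keys (`Inputs`, `Generated Outputs`, `Feedback`) and the sample data are assumptions for illustration; only the `[Optimizing tool: 'X']` prefix comes from the commit message, and a later commit in this thread changes that format.

```python
import copy


def build_tool_examples(react_examples, tool_name):
    """Copy ReAct reflective examples and prefix each Feedback with the tool name."""
    annotated = []
    for example in react_examples:
        tool_example = copy.deepcopy(example)  # keep the ReAct examples untouched
        tool_example["Feedback"] = f"[Optimizing tool: '{tool_name}'] {example['Feedback']}"
        annotated.append(tool_example)
    return annotated


# Hypothetical reflective example for a ReAct predictor.
react_examples = [
    {
        "Inputs": {"question": "Who wrote Dune?"},
        "Generated Outputs": {"trajectory": "thought -> search_web -> observation"},
        "Feedback": "Correct answer, but the search query was too broad.",
    }
]
tool_examples = build_tool_examples(react_examples, "search_web")
```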

LakshyAAAgrawal · 2025-10-11T05:07:54Z

docs/docs/api/optimizers/GEPA/GEPA_Advanced.md

+
+The `optimize_tool_descriptions` parameter enables GEPA to optimize tool descriptions in addition to signature instructions. This is particularly valuable for ReAct agents and other tool-using systems, where the quality of tool descriptions directly impacts the agent's ability to select appropriate tools for each task.
+
+Unlike signature instructions that guide reasoning strategies, tool descriptions serve a fundamentally different purpose: they help agents decide **which tool to use** in a given situation. GEPA recognizes this categorical difference and applies a specialized reflection prompt tailored for tool selection decisions.


which tool to use, when to use it, and how to use it. All three are captured by the description.

Let's avoid the word "fundamentally". One can imagine that all tool descriptions can be (and many times are) simply included in the system prompt itself.

Please also add a corresponding entry in GEPA Overview, that links to this file/section.

LakshyAAAgrawal · 2025-10-11T05:10:25Z

docs/docs/api/optimizers/GEPA/GEPA_Advanced.md

+
+Consider enabling `optimize_tool_descriptions=True` when:
+
+- **Building ReAct agents**: ReAct agents rely on tool descriptions to make action selection decisions


One should consider using this, when they use dspy.Tool anywhere in the DSPy program. Here are a few scenarios for using dspy.Tool:


LakshyAAAgrawal · 2025-10-11T05:14:40Z

docs/docs/api/optimizers/GEPA/GEPA_Advanced.md

+)
+```
+
+**Note:** Tool optimization is fully backward compatible. Existing programs without tools, or with `optimize_tool_descriptions=False`, continue to work exactly as before.


I don't think we need to inform users about backward compatibility here. It should be implicit that there should be no behaviour changes for any program not containing dspy.Tool.

LakshyAAAgrawal · 2025-10-11T05:54:35Z

dspy/teleprompt/gepa/gepa.py

            raised if a mismatch in module-level and predictor-level score is detected.
+        optimize_tool_descriptions: Whether to optimize tool descriptions for modules with tools 
+            (e.g., ReAct agents). When enabled, tool descriptions are included in the optimization 
+            process alongside signature instructions. Default is False.


Add a link to GEPA Advanced/Tool section

LakshyAAAgrawal · 2025-10-11T06:01:25Z

dspy/teleprompt/gepa/gepa_utils.py

                    )

            self.propose_new_texts = custom_propose_new_texts
+        elif self.optimize_tool_descriptions:


Edge case: What should happen when a user provides a custom proposer and also enables optimize_tool_descriptions?

LakshyAAAgrawal · 2025-10-11T06:13:31Z

dspy/teleprompt/gepa/gepa_utils.py

+                # Handle signature components - replicate proposer's default behavior
+                sig_texts = {}
+                if sig_components:
+                    from gepa.strategies.instruction_proposal import InstructionProposalSignature


This is a slight deviation from this PR, but would be a large enhancement (feel free to ignore):

Create 2 fields, self.instruction_proposal_signature and self.tool_proposer, which are initialized to the default InstructionProposalSignature and ToolProposerSignature.

Take an argument from dspy.GEPA that can override the default signature values.

LakshyAAAgrawal · 2025-10-11T06:17:15Z

dspy/teleprompt/gepa/gepa_utils.py

+        # Second pass: Process tools by copying ReAct data with annotation
+        react_module_name = None
+        for name in ret_d.keys():
+            if "react" in name.lower():


Is this robust? Might it be better to use isinstance or some other way?
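
To illustrate the concern, here is a small sketch comparing name matching with a type check. `Module` and `ReAct` are hypothetical stand-ins for `dspy.Module` and `dspy.ReAct`, and the toy `named_sub_modules` only imitates the traversal used elsewhere in this PR.

```python
class Module:
    """Hypothetical minimal module tree, standing in for dspy.Module."""

    def __init__(self, **children):
        self.children = children

    def named_sub_modules(self, prefix="self"):
        yield prefix, self
        for name, child in self.children.items():
            yield from child.named_sub_modules(f"{prefix}.{name}")


class ReAct(Module):
    """Stand-in for dspy.ReAct."""


def find_by_name(root):
    # Fragile: depends on what the user happened to call the attribute.
    return [path for path, _ in root.named_sub_modules() if "react" in path.lower()]


def find_by_type(root):
    # Robust: depends only on the module's class.
    return [path for path, m in root.named_sub_modules() if isinstance(m, ReAct)]


program = Module(
    researcher=ReAct(),       # a ReAct agent whose attribute name lacks "react"
    reactive_cache=Module(),  # not ReAct, but the name contains "react"
)
```

Here `find_by_name` misses the real agent and flags the unrelated module, while `find_by_type` finds exactly the ReAct instance.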

LakshyAAAgrawal · 2025-10-11T06:25:21Z

dspy/teleprompt/gepa/instruction_proposal.py

+
+    Your task is to write a better description for this tool.
+
+    Read the examples carefully and identify patterns in when the tool was used successfully versus when it was misused or overlooked. Identify any domain-specific information about the tool's capabilities or appropriate usage that may not be available to the assistant in the future. The assistant may have developed effective patterns for tool selection - if so, ensure the tool description supports those patterns.


Tool use. Also suggest identifying any failure modes of the tool?


LakshyAAAgrawal · 2025-10-11T06:45:56Z

Dear @Ju-usc,

This is a great PR. Thanks a lot! I have tried to be overly critical and made too many nits. Feel free to ignore if you disagree with something. Let me know if you'd like me to address anything!

Regarding the meta prompt, overall I think it looks great. However, as you build the tutorial, you may find that the reflection prompt needs tweaking, or that the content exposed in reflective_dataset for the tool is lacking or needs improvement. This is going to be an empirical exercise, which will guide what works in the reflection meta prompts. Looking forward to the tutorial on this too!

You may already have thoughts about what you'd like to show in the tutorial, but if not, you may consider building off (https://kargarisaac.medium.com/building-and-optimizing-multi-agent-rag-systems-with-dspy-and-gepa-2b88b5838ce2) by @kargarisaac.

- Add GenerateImprovedToolDescriptionFromFeedback signature documentation
- Include tool-aware metric example showing trajectory access
- Document tool prefix annotation in feedback
- Note component_selector applies to both signatures and tools
- Fix 'fundamentally' language per reviewer feedback

- Separate Pass 1 (predictor examples) and Pass 2 (tool aggregation)
- Clarify Generated Outputs includes full trajectory for ReAct
- Fix feedback annotation format to [Tool 'name' from 'predictor_key']
- Add Component Identification & Proposer Routing section
- Explain dual-proposer independence (custom proposer doesn't affect tool proposer)
- Use consistent terminology: 'predictor' and 'signature instructions'

Ju-usc · 2025-11-20T03:52:38Z

@LakshyAAAgrawal @chenmoneygithub

Thanks again for the thoughtful feedback. I've pushed the implementation to be as generic as is safely possible while preserving ReAct behavior.

Core idea: Optimize tools jointly with the predictor that uses them.

For generic tool modules, I detect predictors with dspy.Tool in their signature at compile time, then discover actual tools from traces at runtime. For ReAct, we know tools statically from module.tools. Both share the same ToolModuleProposer and update path — the only difference is when tools are discovered.

Here are my thoughts on a few design choices. Feel free to comment on these or anything else:

Prefix-based grouping (react_module: / tool_module:) — I encode module type in string keys to preserve GEPA's dict[str, str] interface. A bit hacky, but avoids changing the core adapter API.
ReAct tracing uses extract only: The extract predictor's trace already contains the full trajectory, so this avoids the duplicated-prefix issue from "Reflective dataset contains potentially redundant data for ReAct agent" (gepa-ai/gepa#97).
No separate tool tracing yet — Reflective dataset uses predictor-level traces only. Tool inputs are always captured (since dspy.Tool objects flow through predictor inputs), but tool outputs depend on how users design their module. If users need tool outputs in the reflective dataset, they can wire them back into a predictor. Is this enough for now, or should tool calls go into dspy.settings.trace?
ReAct still has special handling — Because it has 2 predictors (react + extract) that need joint optimization. Is this acceptable given the maintenance blackhole concern, or should I push for fully generic?
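
The prefix-based grouping from the first design choice can be sketched like this. The `react_module:` prefix is the one named above; the JSON layout (`react`, `extract`, `tools` keys) and the helper names are assumptions for illustration, chosen so that every candidate value stays a plain string.

```python
import json

REACT_PREFIX = "react_module:"  # prefix named in the comment above


def encode_react_module(module_path, react_instr, extract_instr, tools):
    """Pack one ReAct module's components into a single (key, JSON string) pair."""
    config = {
        "react": react_instr,
        "extract": extract_instr,
        "tools": tools,
    }
    return f"{REACT_PREFIX}{module_path}", json.dumps(config)


def decode(key, value):
    """Return (kind, module_path, payload); plain keys stay plain instructions."""
    if key.startswith(REACT_PREFIX):
        return "react_module", key[len(REACT_PREFIX):], json.loads(value)
    return "predictor", key, value


key, value = encode_react_module(
    "agent", "Think step by step.", "Extract the answer.",
    {"search_web": {"desc": "Search the web", "args": {}}},
)
candidate = {key: value, "summarize": "Summarize the passage."}
decoded = [decode(k, v) for k, v in candidate.items()]
```

The candidate remains a `dict[str, str]`, so code that only understands that interface needs no changes; the cost is that every consumer of a prefixed key has to parse JSON.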

Ran an experiment with nested ReAct + custom tool module: https://gist.github.com/Ju-usc/80b9918fe07288204579df735e084cb4

Happy to iterate!

Copilot

Pull request overview

Copilot reviewed 6 out of 6 changed files in this pull request and generated 3 comments.


Copilot · 2025-11-29T07:28:21Z

dspy/teleprompt/gepa/instruction_proposal.py

+            current_module_config = json.loads(candidate[module_key])
+
+            # Predictor keys: 1 for tool modules, 2 for ReAct modules (extra extract predictor)
+            predictor_keys = [k for k, v in current_module_config.items() if isinstance(v, str)]


Potential IndexError if predictor_keys is empty. Add validation to ensure at least one predictor key exists before accessing index 0. Consider: if not predictor_keys: logger.warning(...); continue or raising a more descriptive error.

Suggested change

-            predictor_keys = [k for k, v in current_module_config.items() if isinstance(v, str)]
+            predictor_keys = [k for k, v in current_module_config.items() if isinstance(v, str)]
+            if not predictor_keys:
+                logger.warning(f"No predictor keys found for module '{module_key}'. Skipping.")
+                continue

I don't think this is needed - config is built internally by GEPA and always contains predictor keys. Edge case seems impossible.

Copilot · 2025-11-29T07:28:21Z

dspy/teleprompt/gepa/instruction_proposal.py

+            for tool_name, tool_info in current_tools_dict.items():
+                # Update tool description if LM proposed a change
+                improved_tool_desc = getattr(result, f"improved_tool_{tool_name}_desc", None)
+                if improved_tool_desc is not None:
+                    tool_info["desc"] = improved_tool_desc
+
+                # Update arg descriptions if LM proposed changes
+                for arg_name in tool_info["args"].keys():
+                    improved_tool_arg_desc = getattr(result, f"improved_tool_{tool_name}_arg_{arg_name}_desc", None)
+                    if improved_tool_arg_desc is not None:
+                        tool_info["args"][arg_name]["description"] = improved_tool_arg_desc
+
+                improved_module_config["tools"][tool_name] = tool_info


Mutating the input data structure. tool_info is a reference to a dict in current_tools_dict (from line 460), so modifications on lines 464 and 470 mutate the original candidate data. This can cause unintended side effects across GEPA iterations. Create a deep copy: import copy and tool_info = copy.deepcopy(tool_info) after line 460.

I don't think this is an issue - candidate[module_key] is a json string, so json.loads() creates a new dict. Mutations don't affect the original.
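
The reply's point can be checked with a short sketch. The candidate contents here are hypothetical; the only claim being illustrated is that `json.loads` builds a new object graph on every call, so edits to the parsed dict cannot reach the stored JSON string.

```python
import json

# Hypothetical candidate: one module key mapped to a JSON string.
candidate = {
    "agent": json.dumps({"tools": {"search_web": {"desc": "Search the web", "args": {}}}})
}
before = candidate["agent"]

config = json.loads(candidate["agent"])  # fresh dict on every call
config["tools"]["search_web"]["desc"] = "Look up current facts."

# Parsing again yields another independent copy with the original text.
reparsed = json.loads(candidate["agent"])
```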


chenmoneygithub

Thank you very much for the update!

The direction looks great to me, but the implementation, especially how we extract tool traces from the DSPy traces, is a bit fragile. This is not directly caused by the PR, but in DSPy we are lacking this important lineage. As of now, we can probably do the following:

Replace the generic tool capture + optimization code by a TODO.
Still protect ReAct-specific logic in an if branch.
Ship the ReAct code + tutorial.

This is still more generic than the last version, and leaves decent room for us to maintain and scale to generic tool optimization. Meanwhile, we will work on a robust way to connect tool calls to predict traces, after which we can revisit the generic tool optimization.

chenmoneygithub · 2025-12-01T21:15:16Z

dspy/teleprompt/gepa/gepa_utils.py

+        self.propose_new_texts = self._build_propose_new_texts()
        self.reflection_minibatch_size = reflection_minibatch_size

+    def _build_propose_new_texts(self):


This method is huge, and it is doing two things:

Configure the proposer for predicts and tools

Create a function that generates the proposal

We don't necessarily need to create propose_component_texts on the fly, so I recommend separating propose_component_texts out to be a private method.
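
One way to read this suggestion is sketched below. The `Adapter` class, its constructor arguments, and the identity fallback proposer are all hypothetical; the sketch only shows the structural idea of keeping configuration in `_build_propose_new_texts` and moving the proposal logic into a regular private method.

```python
class Adapter:
    """Hypothetical adapter showing configuration separated from proposal logic."""

    def __init__(self, custom_proposer=None, enable_tool_optimization=False):
        self.custom_proposer = custom_proposer
        self.enable_tool_optimization = enable_tool_optimization
        # Configuration only: choose which callable the optimizer should use.
        self.propose_new_texts = self._build_propose_new_texts()

    def _build_propose_new_texts(self):
        if self.custom_proposer is None and not self.enable_tool_optimization:
            return None  # let the optimizer fall back to its default proposer
        return self._propose_component_texts

    def _propose_component_texts(self, candidate, reflective_dataset, components_to_update):
        """Proposal logic lives in a normal method instead of a nested closure."""
        proposer = self.custom_proposer or (lambda text, examples: text)
        return {
            name: proposer(candidate[name], reflective_dataset.get(name, []))
            for name in components_to_update
        }


adapter = Adapter(custom_proposer=lambda text, examples: text.upper())
out = adapter.propose_new_texts({"react": "answer"}, {}, ["react"])
```

A method is easier to test and read than a closure built on the fly, since it can be called directly without first reconstructing the enclosing state.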

chenmoneygithub · 2025-12-01T21:15:56Z

dspy/teleprompt/gepa/gepa_utils.py

+        tool_module_proposer = None
+        if self.enable_tool_optimization:
+            from dspy.teleprompt.gepa.instruction_proposal import ToolModuleProposer
+            tool_module_proposer = ToolModuleProposer()


module literal here is confusing, maybe either tool_proposer or tool_desc_proposer

yes just refactored!

chenmoneygithub · 2025-12-01T21:17:20Z

dspy/teleprompt/gepa/gepa_utils.py

        new_prog = self.student.deepcopy()
+
+        # Start with plain string instructions from candidate
+        improved_predictors = {


maybe call this predict_candidates? improved suggests that this has already been improved.

chenmoneygithub · 2025-12-01T21:19:28Z

dspy/teleprompt/gepa/gepa_utils.py

+                pred.signature = pred.signature.with_instructions(improved_predictors[name])
+
+        # Update tool descriptions
+        if improved_tools:


nit: Make this a standalone method to reduce nesting.

chenmoneygithub · 2025-12-01T21:24:00Z

dspy/teleprompt/gepa/gepa.py

+            for module_path, module in student.named_sub_modules():
+                if isinstance(module, ReAct):
+                    logger.warning(
+                        f"Detected ReAct module at '{module_path}'. Consider using "


This should be an info log instead of a warning IMO, unless we can articulate that involving tools in the optimization works strictly better than not involving tools.

chenmoneygithub · 2025-12-01T21:49:09Z

dspy/teleprompt/gepa/gepa_utils.py


        return ret_d

+    def _update_candidate_tools(self, candidate, program, trajectories) -> None:


could you help me understand the purpose of this method?

From my rough understanding, you are trying to link the tool to the predict's trace where the tool is getting used. Is that correct?

Yes that is correct, the idea was to scan traces for predictors with tool inputs and extract them for optimization.

Unlike ReAct where we can directly access module.tools, generic dspy.Tool doesn't easily expose a predictor→tool mapping. So the approach was to find Tool objects in the trace inputs, then update the candidate dict with tool configs - since that's the interface defining what gets optimized.
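
The trace-scanning idea described in this reply can be sketched as follows. This is not the removed `_update_candidate_tools` code: `Tool` is a hypothetical stand-in for `dspy.Tool`, and the trace is assumed to be a list of `(predictor_name, inputs, outputs)` tuples with predictor names in place of predictor objects.

```python
class Tool:
    """Hypothetical stand-in for dspy.Tool."""

    def __init__(self, name, desc):
        self.name, self.desc = name, desc


def collect_tools(trace):
    """Map predictor name -> {tool name: Tool} for tools found in predictor inputs."""
    found = {}
    for predictor_name, inputs, _outputs in trace:
        for value in inputs.values():
            if isinstance(value, Tool):
                candidates = [value]
            elif isinstance(value, (list, tuple)):
                candidates = [v for v in value if isinstance(v, Tool)]
            elif isinstance(value, dict):
                candidates = [v for v in value.values() if isinstance(v, Tool)]
            else:
                candidates = []
            for tool in candidates:
                found.setdefault(predictor_name, {})[tool.name] = tool
    return found


search = Tool("search_web", "Search the web")
calc = Tool("calculator", "Evaluate arithmetic")
trace = [
    ("planner", {"question": "2+2?", "tools": [search, calc]}, {"plan": "use calculator"}),
    ("writer", {"draft": "4"}, {"answer": "4"}),
]
tools_by_predictor = collect_tools(trace)
```

The sketch also hints at why reviewers called the approach fragile: it only sees tools that flow through predictor inputs, and it says nothing about what each tool returned.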


… modules
- Rename REACT_MODULE_PREFIX to TOOL_MODULE_PREFIX
- Single abstraction for tool modules (ReAct now, generic later)
- Use count-based detection for extract predictor instead of prefix check
- Update docs to reflect new naming


Ju-usc · 2025-12-02T03:04:59Z

> Thank you very much for the update!
>
> The direction looks great to me, but the implementation, especially how we extract tool trace from the DSPy traces is a bit fragile. This is not directly caused by the PR, but in DSPy we are lacking this important lineage. As of now, we can probably do the following:
>
> Replace the generic tool capture + optimization code by a TODO.
>
> Still protect ReAct-specific logic in an if branch.
>
> Ship the ReAct code + tutorial.
>
> This is still more generic than the last version, and leave a decent room for us to maintain and scale to generic tool optimization. Meanwhile, we will work on a robust way to connect tool calls to predict traces, after which we can revisit the generic tool optimization.

Thanks for all the suggestions! I've addressed everything and the PR is ready for review.

Summary of changes:

logger.warning → logger.info for ReAct detection
Extracted _propose_component_texts as private method
TODO'd out generic tool optimization (keeping ReAct-only for now)
Removed generic tool module detection, kept ReAct only
Naming: improved_predictors → predictor_candidates, improved_tools → tool_candidates
Extracted _update_tool_descriptions and _collect_tools methods
Renamed ToolModuleProposer → ToolProposer
Unified prefix: REACT_MODULE_PREFIX → TOOL_MODULE_PREFIX
Updated all docs and tests for ReAct-only support

Let me know if anything else needs to be addressed!

chenmoneygithub · 2025-12-02T22:41:27Z

@Ju-usc Thank you so much for the persistent work, and great PR! I will handle the rest of the work and play a bit more with it then merge.

Ju-usc · 2025-12-03T02:06:05Z

@LakshyAAAgrawal @chenmoneygithub This wouldn't have been possible without all your guidance and feedback! Thank you so much. Really happy to make my first open source contribution and learned a lot!

LakshyAAAgrawal · 2025-12-04T19:20:46Z

Dear @Ju-usc,

I would just like to say that this is a very very non-trivial contribution, and we all sincerely thank you for going through so many iterations to perfect the PR. Thanks a lot to @chenmoneygithub as well, for kindly offering his valuable advice to make this addition truly generalizable.

Ju-usc added 3 commits October 9, 2025 20:07

style: fix ruff formatting (trailing whitespace)

cf0be4f

style: apply ruff formatting fixes

aa53fe2

Ju-usc added 2 commits October 10, 2025 02:06

docs(gepa): clean up multi-agent example code

c4f2041

- Define tool functions outside class for clarity
- Match structure of simple ReAct example
- Add clear comments explaining architecture
- Make code more readable and maintainable

Ju-usc force-pushed the feature/tool-description-optimization branch from 197f077 to c4f2041 October 10, 2025 09:38

LakshyAAAgrawal reviewed Oct 11, 2025

docs/docs/api/optimizers/GEPA/GEPA_Advanced.md

LakshyAAAgrawal reviewed Oct 11, 2025

dspy/teleprompt/gepa/instruction_proposal.py

LakshyAAAgrawal reviewed Oct 11, 2025

dspy/teleprompt/gepa/instruction_proposal.py

Ju-usc added 7 commits October 11, 2025 17:38

fix(gepa): unify custom proposer routing for tools

04f7e3d

docs(gepa): clarify tool reflection prompt

f92e184

test: streamline GEPA tool optimization tests

7178869

fix(gepa): streamline tool proposer formatting

e34703b

test(gepa): drop legacy dummy tool fixture

3f05311

Ju-usc requested review from chenmoneygithub and Copilot November 29, 2025 07:24

Copilot started reviewing on behalf of Ju-usc November 29, 2025 07:24

Copilot finished reviewing on behalf of Ju-usc November 29, 2025 07:27

Copilot AI reviewed Nov 29, 2025

docs(gepa): replace eval() example with get_weather tool

09990a6

chenmoneygithub reviewed Dec 1, 2025


Ju-usc added 14 commits December 1, 2025 16:36

fix(gepa): change ReAct detection log from warning to info

33fc771

refactor(gepa): extract _propose_component_texts as private method

fa72fc0

refactor(gepa): TODO out generic tool module optimization, keep ReAct…

2a15e56

… only

refactor(gepa): remove generic tool module detection, keep ReAct only

59f23e5

refactor(gepa): improve naming and extract tool update methods

68d7021

refactor(gepa): remove unused TOOL_MODULE_PREFIX and rename to tool_c…

d99ba1d

…omponents

refactor(gepa): rename ToolModuleProposer to ToolProposer

3fd9a0a

docs(gepa): update tool optimization docs for ReAct-only support

7d64e7a

docs(gepa): remove CustomAgent example, keep ReAct only

3a5fb7f

docs(gepa): update enable_tool_optimization docstring for ReAct-only …

0e75d8c

…support

test(gepa): remove generic tool tests, keep ReAct-only tests

734fbdf

- Remove test_detect_single_tool, test_detect_multiple_tools
- Remove test_apply_optimized_tool_descriptions
- Update REACT_MODULE_PREFIX -> TOOL_MODULE_PREFIX
- Update docstring to reflect ReAct-only support

refactor(gepa): use local ToolProposer variable, update docs for ReAc…

1fb15ba

…t-only
- Remove self._tool_proposer instance variable
- Create ToolProposer locally when needed (stateless)
- Update overview.md to say ReAct-only instead of 'any module'

docs(gepa): update tool optimization docs for ReAct-only support

da2f6d0

- Remove generic tool module references, keep ReAct only
- Update JSON structure examples to show both react and extract predictors
- Fix comment in custom proposer example

Ju-usc requested a review from chenmoneygithub December 2, 2025 03:05

some fixes

a942246

chenmoneygithub approved these changes Dec 5, 2025


chenmoneygithub merged commit b6115d4 into stanfordnlp:main Dec 5, 2025
14 checks passed

