feat: Add progressive tool discovery system #1644
sidhantu123 wants to merge 2 commits into modelcontextprotocol:main
Implements optional progressive disclosure of MCP tools through semantic grouping
and lazy-loading. Servers can organize tools into semantic groups with gateway
tools that load actual tools on-demand, achieving significant reduction in context
token usage.
## Architecture
### Core Components
- ToolGroup: Semantic organization of related tools
- ToolGroupRegistry & ToolGroupManager: Discovery infrastructure
- Server.enable_discovery_with_groups(): Simple API to enable discovery
- is_discovery_enabled: Property to check discovery status
- Client-side tool refresh: Automatic handling via ToolListChangedNotification
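The components above can be sketched with plain dataclasses; the class names follow the PR, but every field and method body here is an illustrative assumption, not the actual implementation:

```python
from dataclasses import dataclass, field


@dataclass
class Tool:
    # Stand-in for mcp.types.Tool (the real class has more fields)
    name: str
    description: str


@dataclass
class ToolGroup:
    # Semantic grouping of related tools
    name: str
    description: str
    tools: list[Tool] = field(default_factory=list)


class ToolGroupRegistry:
    """Tracks registered groups and which ones have been lazily loaded."""

    def __init__(self, groups: list[ToolGroup]):
        self._groups = {g.name: g for g in groups}
        self._loaded: set[str] = set()

    def visible_tools(self) -> list[Tool]:
        # Unloaded groups appear as a single gateway tool each;
        # loaded groups contribute their full tool definitions.
        gateways = [
            Tool(name=f"load_{g.name}_tools", description=g.description)
            for g in self._groups.values()
            if g.name not in self._loaded
        ]
        loaded = [t for name in self._loaded for t in self._groups[name].tools]
        return gateways + loaded

    def load(self, group_name: str) -> list[Tool]:
        # Called by a gateway tool's handler to expose the group's real tools.
        self._loaded.add(group_name)
        return self._groups[group_name].tools
```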
### Key Features
- Hybrid mode: Mix direct tools with grouped tools
- Lazy loading: Tools load only when needed
- Non-blocking refresh: Tool refresh happens in background
- Backward compatible: Discovery is entirely opt-in
- No protocol changes: Works with existing MCP clients
## Usage Example
```python
from mcp.server import Server
from mcp import ToolGroup, Tool

math_group = ToolGroup(
    name='math',
    description='Mathematical operations',
    tools=[
        Tool(name='add', description='Add numbers', inputSchema={...}),
        Tool(name='subtract', description='Subtract numbers', inputSchema={...}),
    ],
)

server = Server('my-service')
server.enable_discovery_with_groups([math_group])
```
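A self-contained simulation (plain Python stand-ins, not the SDK's actual classes; `FakeDiscoveryServer` and its callback are hypothetical) shows the lazy-loading flow a client would observe:

```python
# Illustrative simulation: gateway tools are advertised first, and calling
# one loads the group's real tools and fires a list-changed callback,
# mirroring the SDK's ToolListChangedNotification.

class FakeDiscoveryServer:
    def __init__(self, groups: dict[str, list[str]]):
        self._groups = groups
        self._loaded: set[str] = set()
        self.on_tool_list_changed = None  # client-registered callback

    def list_tools(self) -> list[str]:
        gateways = [f"load_{g}_tools" for g in self._groups if g not in self._loaded]
        loaded = [t for g in self._loaded for t in self._groups[g]]
        return gateways + loaded

    def call_tool(self, name: str) -> None:
        # A gateway call marks its group as loaded and notifies the client.
        group = name.removeprefix("load_").removesuffix("_tools")
        if group in self._groups:
            self._loaded.add(group)
            if self.on_tool_list_changed:
                self.on_tool_list_changed()


server = FakeDiscoveryServer({"math": ["add", "subtract"]})
seen: list[list[str]] = []
server.on_tool_list_changed = lambda: seen.append(server.list_tools())

initial = server.list_tools()        # only the gateway is visible
server.call_tool("load_math_tools")  # lazily loads the math tools
```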
## Testing
- 5 new discovery-specific tests: All passing
- 41/44 total tests passing (3 pre-existing unrelated failures)
- Backward compatibility verified with SDK examples
- Real-world examples exercised against live weather APIs
## Files Changed
New files:
- src/mcp/server/discovery/__init__.py
- src/mcp/server/discovery/manager.py
- src/mcp/server/discovery/tool_group.py
- tests/test_discovery.py
- tests/test_discovery_integration.py
- examples/discovery/ (with server, agent, and README)
Modified files:
- src/mcp/__init__.py (export ToolGroup)
- src/mcp/client/session.py (callback support)
- src/mcp/client/session_group.py (tool refresh handling)
- src/mcp/server/lowlevel/server.py (discovery integration)
- tests/client/test_session_group.py (5 new tests)
## Benefits
- Token efficiency: Significant reduction in context token usage for large tool sets
- Scalability: Supports servers with many tools
- LLM autonomy: LLM decides which tools to load
- Clean architecture: Semantic grouping is explicit
- Backward compatible: No breaking changes, fully opt-in
Updated with latest commit: the example server now demonstrates clean, focused progressive tool discovery with minimal complexity.
(force-pushed from 06878f8 to 2883104)
Hey, thank you for putting this all together; it's an interesting take on solving this problem. For now we won't be able to merge this into the SDK, as it isn't something that's part of the spec. Although these problems are very real, the ways to solve them are actively being discussed and planned out, without any decision yet. If you'd like to track or contribute to those discussions, you can take a look at these:
That said, I definitely think the SDK should be able to support additions like this on top of the methods it provides, and it could be reworked/refactored to better support these sorts of low-level additions/modifications of server behaviour (while still staying within the bounds of the MCP spec itself). So for now I'm going to close the PR for the above reasons. But please take a look at the links above and see if there are any you'd like to contribute your ideas to; that would actually be super helpful!
[DRAFT PR FOR OBSERVABILITY]
## Overview
This PR implements optional progressive disclosure of MCP tools through semantic grouping and lazy-loading. Servers can organize tools into semantic groups with gateway tools that load actual tools on-demand, achieving significant reduction in context token usage at the start of conversations.
## Problem Being Solved
MCP servers with 50+ tools currently expose all tool definitions upfront, causing significant context bloat. This wastes valuable context budget and makes it harder for LLMs to reason effectively. Additionally, server developers struggle to unify disparate tools across different domains into a coherent organization system.
## Solution
Progressive tool discovery allows servers to expose lightweight gateway tools up front and load full tool definitions only on demand. This enables server developers to organize disparate tools (like GitHub's repo management, pull requests, code changes, and collaboration features) into semantic groups that make sense for users and LLMs.
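As a sketch of that organization (the tool and group names below are invented for illustration, not taken from any real server):

```python
# Invented tool names for a hypothetical GitHub-style MCP server.
github_groups = {
    "repos": ["create_repo", "delete_repo", "list_repos"],
    "pull_requests": ["open_pr", "merge_pr", "review_pr", "list_prs"],
    "code": ["get_file", "search_code", "commit_changes"],
    "collaboration": ["add_collaborator", "list_issues", "comment_issue"],
}

# Up front the client sees one gateway tool per group instead of every tool.
gateways = [f"load_{group}_tools" for group in github_groups]
total = sum(len(tools) for tools in github_groups.values())
print(f"{total} tools collapse to {len(gateways)} gateways")
```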
## Key Features
- `is_discovery_enabled` defaults to `False`
## Implementation
New components:
## Example Use Case
## Testing
All tests pass:
## Backward Compatibility
100% compatible:
## Benefits
### 1. Improved LLM Reasoning & Reduced Hallucination
With fewer tools in context, LLMs have better focus and make fewer mistakes about which tools are available. Significantly reduces the likelihood of LLMs hallucinating about non-existent tools or choosing irrelevant ones.
### 2. Lower API Costs
Fewer tokens per request means reduced billing for users. This is especially impactful for high-volume applications where token savings directly translate to cost savings.
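Back-of-the-envelope arithmetic makes the claim concrete (the token figures echo the 500 vs. 4,000+ numbers cited later in this PR; the price and request volume are purely hypothetical):

```python
# All numbers are illustrative assumptions, not measurements.
full_catalog_tokens = 4000     # every tool definition sent up front
gateway_tokens = 500           # gateway tools only
price_per_million = 3.00       # hypothetical $ per 1M input tokens
requests_per_day = 10_000

saved_tokens = (full_catalog_tokens - gateway_tokens) * requests_per_day
saved_dollars = saved_tokens / 1_000_000 * price_per_million
print(f"~{saved_tokens:,} input tokens and ${saved_dollars:.2f} saved per day")
```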
### 3. Faster Initial Connection
Quicker `listTools()` response times at conversation start (500 tokens instead of 4,000+). Provides a better user experience with lower latency on the critical first interaction.
### 4. Extensibility for Large Codebases
Makes it trivial to add new tools without overwhelming LLMs with choices. Enables servers to grow from 10 tools to 100+ tools without degradation in LLM performance.
### 5. Aligns with Industry Standards
Follows the same pattern as OpenAI's function calling for large tool sets. Reflects best practices in API design and progressive disclosure that users expect.
### 6. Multi-Agent Ready
Different agents can be given different tool subsets based on their responsibilities. Enables more sophisticated orchestration patterns and agent specialization.
### 7. Better Composability
Makes it easier to combine multiple MCP servers and organize their tools coherently. Solves the "tool management" problem for complex deployments with many integrated services.
## Impact
This feature addresses a fundamental scalability challenge in MCP as server toolsets grow larger. It enables the ecosystem to handle production-scale deployments with hundreds of tools while maintaining optimal LLM performance and user experience.