Commit aa0825d

IronAdamant and claude committed
v0.6.5: Add triage tool, next_steps hints, inline coupling partners
Three improvements to help LLM agents chain tools effectively:

1. next_steps in MCP responses: contextual follow-up suggestions appended to tool outputs (risk_map, diff_impact, test_gaps, etc.) so agents know what to invoke next. CLI unaffected.
2. Inline coupling partners in risk_map: top 3 co-change partners per file included in each entry. Zero extra DB queries.
3. New triage tool (#16): combined risk_map + test_gaps + stale_tests in a single read lock. One command for pre-audit prioritization.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
1 parent 06c82ab commit aa0825d

13 files changed

Lines changed: 587 additions & 14 deletions

CLAUDE.md

Lines changed: 12 additions & 6 deletions
@@ -14,10 +14,11 @@ chisel/
 metrics.py — Pure computation: churn scoring, ownership aggregation, co-change detection. _parse_iso_date shared utility.
 test_mapper.py — Test file discovery, framework detection, dependency extraction, edge building.
 impact.py — Impact analysis, risk scoring, stale test detection, reviewer suggestions. Caches failure rates.
-cli.py — argparse CLI (17 subcommands). _run_tool() shared handler. Entry point: chisel.cli:main
-schemas.py — JSON Schema definitions for all 15 tools + dispatch table. Shared by HTTP and stdio servers.
+cli.py — argparse CLI (18 subcommands). _run_tool() shared handler. Entry point: chisel.cli:main
+schemas.py — JSON Schema definitions for all 16 tools + dispatch table. Shared by HTTP and stdio servers.
 mcp_server.py — HTTP MCP server (GET /tools, /health, POST /call). ThreadedHTTPServer. dispatch_tool() shared by both servers.
 mcp_stdio.py — stdio MCP server (requires optional 'mcp' package). _configure_server() for engine lifecycle mgmt.
+next_steps.py — Contextual next-step suggestions for MCP tool responses. compute_next_steps() dispatched per tool.
 rwlock.py — Read-write lock for in-process concurrent access.
 ```

@@ -44,7 +45,10 @@ chisel/
4445
- **Unit-churn scaling**: `_UNIT_CHURN_FILE_LIMIT = 2000` in `engine.py`. Repos with more than 2000 code files skip per-function `git log -L` churn (each function spawns a subprocess). File-level churn is always computed. Validated on Grafana (21k files, 62k units in ~3 min).
4546
- **Numstat validation**: `_parse_log_output` in `git_analyzer.py` validates tab-separated fields are digits or `-` before treating them as numstat. Diff lines with tabs were being misidentified as numstat entries in `git log -L` output.
4647
- **Encoding safety**: All `subprocess.run()` calls use `encoding="utf-8", errors="replace"`. Git history may contain non-UTF-8 bytes (Latin-1 commit messages, binary diff fragments); these are replaced with `` instead of crashing. File reads in `engine.py` and `test_mapper.py` already used `errors="replace"`.
47-
- **Empty-state detection**: All 11 query tools return `{"status": "no_data", "message": "...", "hint": "chisel analyze"}` when the DB has no analysis data, instead of `[]`. `_check_analysis_data()` in `engine.py` calls `storage.has_analysis_data()` (`SELECT 1 FROM code_units LIMIT 1`). Write tools (`analyze`, `update`, `record_result`) and `stats` are unaffected. `stats` adds a `hint` key when all counts are zero. CLI detects this via `_is_no_data()` in `cli.py`.
48+
- **Empty-state detection**: All 12 query tools return `{"status": "no_data", "message": "...", "hint": "chisel analyze"}` when the DB has no analysis data, instead of `[]`. `_check_analysis_data()` in `engine.py` calls `storage.has_analysis_data()` (`SELECT 1 FROM code_units LIMIT 1`). Write tools (`analyze`, `update`, `record_result`) and `stats` are unaffected. `stats` adds a `hint` key when all counts are zero. CLI detects this via `_is_no_data()` in `cli.py`.
49+
- **Next-step suggestions**: `next_steps.py` provides `compute_next_steps(tool_name, result)` which returns contextual follow-up suggestions per tool. Integrated at the dispatch level in `mcp_server.py` — HTTP responses include `"next_steps": [...]` as a sibling to `"result"`, stdio wraps both in a `{"result": ..., "next_steps": [...]}` envelope. CLI is unaffected. Only tools with registered hint functions get suggestions; others return empty.
50+
- **Inline coupling partners**: `risk_map` includes `"coupling_partners"` (top 3 by co-commit count) in each file entry alongside the breakdown. Data is already fetched in the batch query — no extra DB calls.
51+
- **Triage tool**: Composite `triage` runs `risk_map` (top-N) + `test_gaps` (filtered to top-N files) + `stale_tests` in a single read lock. Returns a dict, not a list, so `limit` is not injected.
4852

4953
## Dev Commands
5054

@@ -66,13 +70,14 @@ impact.py → metrics.py
 metrics.py → (no internal deps)
 cli.py → engine.py, mcp_server.py, mcp_stdio.py
 schemas.py → (no internal deps)
-mcp_server.py → engine.py, schemas.py
+mcp_server.py → engine.py, next_steps.py, schemas.py
 mcp_stdio.py → engine.py, mcp_server.py, schemas.py
+next_steps.py → (no internal deps)
 ```

-## 15 MCP Tools
+## 16 MCP Tools

-`analyze`, `impact`, `suggest_tests`, `churn`, `ownership`, `coupling`, `risk_map`, `stale_tests`, `history`, `who_reviews`, `diff_impact`, `update`, `test_gaps`, `record_result`, `stats`
+`analyze`, `impact`, `suggest_tests`, `churn`, `ownership`, `coupling`, `risk_map`, `stale_tests`, `history`, `who_reviews`, `diff_impact`, `update`, `test_gaps`, `record_result`, `stats`, `triage`

 Each wired through: engine.tool_*() → CLI subcommand, HTTP POST /call, stdio MCP.

@@ -81,5 +86,6 @@ Each wired through: engine.tool_*() → CLI subcommand, HTTP POST /call, stdio M
 - **`test_gaps`**: Finds code units with zero test coverage, prioritized by churn risk. Excludes test files by default.
 - **`record_result`**: Records test pass/fail outcomes. Feeds into `suggest_tests` (failure rate boost) and `risk_map` (test instability component).
 - **`stats`**: Returns summary counts for all database tables (code units, tests, edges, commits, etc.).
+- **`triage`**: Combined risk_map + test_gaps + stale_tests for top-N riskiest files. Single command for pre-audit/refactor prioritization. Returns `{top_risk_files, test_gaps, stale_tests, summary}`.
 - **`limit` parameter**: All list-returning tools accept `limit` to cap result size.
 - **Adaptive coupling threshold**: `max(3, total_commits // 4)` — scales with project maturity.
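
The adaptive coupling threshold quoted in the last bullet is plain integer arithmetic. A minimal sketch of the formula follows; the function name `coupling_threshold` is hypothetical, and how Chisel applies the resulting value is not shown in this diff.

```python
def coupling_threshold(total_commits: int) -> int:
    """Return the documented threshold: max(3, total_commits // 4)."""
    # The floor of 3 applies to young repositories; the quarter-of-history
    # term takes over once the repository has 16 or more commits.
    return max(3, total_commits // 4)


for commits in (4, 12, 40, 400):
    print(commits, coupling_threshold(commits))
```

The floor dominates until `total_commits // 4` exceeds 3, which first happens at 16 commits.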

chisel/cli.py

Lines changed: 38 additions & 0 deletions
@@ -133,6 +133,14 @@ def create_parser():
     sub.add_parser("stats", parents=[shared],
                    help="Show database summary counts")

+    # triage
+    p_triage = sub.add_parser("triage", parents=[shared],
+                              help="Combined risk + gap + stale triage")
+    p_triage.add_argument("directory", nargs="?", default=None,
+                          help="Directory to scope (default: all)")
+    p_triage.add_argument("--top-n", type=int, default=10,
+                          help="Number of top-risk files (default: 10)")
+
     # serve
     p_serve = sub.add_parser("serve", parents=[shared],
                              help="Start HTTP server")
@@ -332,6 +340,35 @@ def fmt(_result, args):
                      fmt, use_limit=False)


+def cmd_triage(args):
+    def fmt(result, _args):
+        summary = result["summary"]
+        print(f"Triage ({summary['files_triaged']} files):")
+        print("\nTop risk files:")
+        for r in result["top_risk_files"]:
+            partners = ""
+            cp = r.get("coupling_partners", [])
+            if cp:
+                names = [p["file"] for p in cp[:2]]
+                partners = f" coupled: {', '.join(names)}"
+            print(f" {r['file_path']}: {r['risk_score']}{partners}")
+        if result["test_gaps"]:
+            print(f"\nTest gaps ({summary['total_test_gaps']}):")
+            for g in result["test_gaps"]:
+                print(f" {g['file_path']}:{g['name']} ({g['unit_type']})")
+        else:
+            print("\nNo test gaps in triaged files.")
+        if result["stale_tests"]:
+            print(f"\nStale tests ({summary['total_stale_tests']}):")
+            for s in result["stale_tests"]:
+                print(f" {s['test_id']} ({s['edge_type']})")
+        else:
+            print("\nNo stale tests found.")
+    return _run_tool(args, "tool_triage",
+                     {"directory": args.directory, "top_n": args.top_n},
+                     fmt, use_limit=False)
+
+
 def cmd_stats(args):
     return _run_tool(args, "tool_stats", {},
                      _fmt_kv("Chisel database stats:"), use_limit=False)
@@ -381,6 +418,7 @@ def cmd_serve_mcp(args):
     "update": cmd_update,
     "test-gaps": cmd_test_gaps,
     "record-result": cmd_record_result,
+    "triage": cmd_triage,
     "stats": cmd_stats,
     "serve": cmd_serve,
     "serve-mcp": cmd_serve_mcp,
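
To see how the new `triage` subcommand parses its arguments, here is a standalone sketch that mirrors only the `add_parser`/`add_argument` calls from the hunk above. The `prog="chisel"` name and `dest="command"` are assumptions for illustration, and the `shared` parent parser from the real CLI is omitted because its contents are not part of this diff.

```python
import argparse


def create_parser() -> argparse.ArgumentParser:
    # Assumed top-level parser; the real one also wires in a shared parent.
    parser = argparse.ArgumentParser(prog="chisel")
    sub = parser.add_subparsers(dest="command")

    # Same arguments as the triage subcommand added in this commit.
    p_triage = sub.add_parser("triage",
                              help="Combined risk + gap + stale triage")
    p_triage.add_argument("directory", nargs="?", default=None,
                          help="Directory to scope (default: all)")
    p_triage.add_argument("--top-n", type=int, default=10,
                          help="Number of top-risk files (default: 10)")
    return parser


scoped = create_parser().parse_args(["triage", "src/", "--top-n", "5"])
print(scoped.command, scoped.directory, scoped.top_n)

defaults = create_parser().parse_args(["triage"])
print(defaults.command, defaults.directory, defaults.top_n)
```

Because `directory` uses `nargs="?"`, omitting it yields `None`, which `cmd_triage` forwards unchanged to `tool_triage`. argparse exposes `--top-n` as the attribute `top_n`.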

chisel/engine.py

Lines changed: 25 additions & 0 deletions
@@ -296,6 +296,31 @@ def tool_record_result(self, test_id, passed, duration_ms=None):
                 self.storage.record_test_result(test_id, passed, duration_ms)
                 return {"test_id": test_id, "passed": passed, "recorded": True}

+    def tool_triage(self, directory=None, top_n=10):
+        """MCP tool: combined risk_map + test_gaps + stale_tests triage."""
+        with self._process_lock.shared():
+            with self.lock.read_lock():
+                empty = self._check_analysis_data()
+                if empty is not None:
+                    return empty
+                risk_map = self.impact.get_risk_map(directory)[:top_n]
+                test_gaps = self.impact.get_test_gaps(directory=directory)
+                stale = self.impact.detect_stale_tests()
+
+                top_files = {r["file_path"] for r in risk_map}
+                relevant_gaps = [g for g in test_gaps if g["file_path"] in top_files]
+
+                return {
+                    "top_risk_files": risk_map,
+                    "test_gaps": relevant_gaps,
+                    "stale_tests": stale,
+                    "summary": {
+                        "files_triaged": len(risk_map),
+                        "total_test_gaps": len(relevant_gaps),
+                        "total_stale_tests": len(stale),
+                    },
+                }
+
     def tool_stats(self):
         """MCP tool: get summary counts for the Chisel database."""
         with self._process_lock.shared():
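
The assembly step inside `tool_triage` can be exercised without an engine or database. The sketch below isolates it as a pure function; `assemble_triage` is a hypothetical name, the sample rows are made up, and the input risk map is assumed to arrive already sorted by descending risk (which the `[:top_n]` slice in the diff implies but does not show).

```python
def assemble_triage(risk_map, test_gaps, stale_tests, top_n=10):
    """Combine three query results into the triage dict shape."""
    top = risk_map[:top_n]
    # Only gaps that belong to one of the top-N risk files are kept.
    top_files = {r["file_path"] for r in top}
    relevant_gaps = [g for g in test_gaps if g["file_path"] in top_files]
    return {
        "top_risk_files": top,
        "test_gaps": relevant_gaps,
        # Stale tests are passed through unfiltered, as in the diff.
        "stale_tests": stale_tests,
        "summary": {
            "files_triaged": len(top),
            "total_test_gaps": len(relevant_gaps),
            "total_stale_tests": len(stale_tests),
        },
    }


# Illustrative data only.
risk_map = [
    {"file_path": "a.py", "risk_score": 0.9},
    {"file_path": "b.py", "risk_score": 0.5},
    {"file_path": "c.py", "risk_score": 0.1},
]
test_gaps = [
    {"file_path": "a.py", "name": "f"},
    {"file_path": "c.py", "name": "g"},
]
stale = [{"test_id": "t1"}]

report = assemble_triage(risk_map, test_gaps, stale, top_n=2)
print(report["summary"])
```

Note that `total_test_gaps` counts only the gaps in the triaged files, while `total_stale_tests` counts every stale test that was detected.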

chisel/impact.py

Lines changed: 11 additions & 0 deletions
@@ -267,6 +267,16 @@ def get_risk_map(self, directory=None):
             co_changes = co_changes_batch.get(fp, [])
             coupling_norm = min(len(co_changes) / 10.0, 1.0)

+            # Top 3 coupling partners (by co-commit count, desc)
+            sorted_cc = sorted(co_changes, key=lambda c: c["co_commit_count"], reverse=True)[:3]
+            coupling_partners = [
+                {
+                    "file": cc["file_b"] if cc["file_a"] == fp else cc["file_a"],
+                    "co_commits": cc["co_commit_count"],
+                }
+                for cc in sorted_cc
+            ]
+
             code_units = code_units_batch.get(fp, [])
             tested_count = 0
             covering_test_ids = set()
@@ -295,6 +305,7 @@ def get_risk_map(self, directory=None):
                 "file_path": fp,
                 "unit_name": None,
                 "risk_score": round(risk, 4),
+                "coupling_partners": coupling_partners,
                 "breakdown": {
                     "churn": round(churn_norm, 4),
                     "coupling": round(coupling_norm, 4),
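
The partner selection added to `get_risk_map` picks "the other file" from each co-change row, whichever side the current file sits on. A small sketch of that logic in isolation, assuming rows with the `file_a`, `file_b`, and `co_commit_count` keys seen in the diff; the function name `top_coupling_partners` and the sample rows are hypothetical.

```python
def top_coupling_partners(fp, co_changes, limit=3):
    """Return the `limit` strongest co-change partners of file `fp`."""
    ranked = sorted(co_changes,
                    key=lambda c: c["co_commit_count"],
                    reverse=True)[:limit]
    return [
        {
            # A row may list fp on either side; report the opposite one.
            "file": cc["file_b"] if cc["file_a"] == fp else cc["file_a"],
            "co_commits": cc["co_commit_count"],
        }
        for cc in ranked
    ]


# Illustrative rows only.
rows = [
    {"file_a": "engine.py", "file_b": "cli.py", "co_commit_count": 4},
    {"file_a": "impact.py", "file_b": "engine.py", "co_commit_count": 9},
    {"file_a": "engine.py", "file_b": "storage.py", "co_commit_count": 7},
    {"file_a": "schemas.py", "file_b": "engine.py", "co_commit_count": 2},
]

partners = top_coupling_partners("engine.py", rows)
print(partners)
```

Since the rows are already in memory from the batch query, this is a sort and a slice per file, consistent with the "no extra DB calls" claim in CLAUDE.md.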

chisel/mcp_server.py

Lines changed: 8 additions & 3 deletions
@@ -14,6 +14,7 @@
 from socketserver import ThreadingMixIn

 from chisel.engine import ChiselEngine
+from chisel.next_steps import compute_next_steps
 from chisel.schemas import _TOOL_DISPATCH, _TOOL_SCHEMAS

 logger = logging.getLogger(__name__)
@@ -36,7 +37,8 @@ def dispatch_tool(engine, tool_name, arguments):
     result = getattr(engine, method_name)(**kwargs)
     if limit is not None and isinstance(result, list):
         result = result[:int(limit)]
-    return result
+    next_steps = compute_next_steps(tool_name, result)
+    return result, next_steps


 # ------------------------------------------------------------------ #
@@ -127,8 +129,11 @@ def _handle_call(self):
             return

         try:
-            result = dispatch_tool(self.server.engine, tool_name, arguments)
-            self._send_json({"result": result})
+            result, next_steps = dispatch_tool(self.server.engine, tool_name, arguments)
+            response = {"result": result}
+            if next_steps:
+                response["next_steps"] = next_steps
+            self._send_json(response)
         except ValueError as exc:
             self._send_error_json(404, str(exc))
         except TypeError as exc:
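
The envelope logic in `_handle_call` adds `next_steps` only when the list is non-empty, so responses for tools without hints keep their previous shape. A minimal sketch of that rule as a standalone helper; `build_response` is a hypothetical name and the sample payloads are made up.

```python
import json


def build_response(result, next_steps):
    """Build the HTTP response body: next_steps is a sibling of result."""
    response = {"result": result}
    # An empty list is falsy, so the key is simply omitted.
    if next_steps:
        response["next_steps"] = next_steps
    return response


with_hints = build_response(
    [{"file_path": "a.py"}],
    ["Run 'test_gaps' to find missing test coverage for high-risk files."],
)
without_hints = build_response({"code_units": 0}, [])

print(json.dumps(with_hints))
print(json.dumps(without_hints))
```

One consequence of the change to `dispatch_tool` is that it now returns a `(result, next_steps)` tuple, so every caller has to unpack two values, as both servers do in this commit.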

chisel/mcp_stdio.py

Lines changed: 5 additions & 2 deletions
@@ -59,14 +59,17 @@ async def call_tool(name: str, arguments: dict):
         """Dispatch an MCP tool call to the appropriate engine method."""
         try:
             loop = asyncio.get_running_loop()
-            result = await loop.run_in_executor(
+            result, next_steps = await loop.run_in_executor(
                 None, lambda: dispatch_tool(engine, name, arguments),
             )
         except Exception as exc:
             logger.exception("Error executing tool %s", name)
             return [TextContent(type="text", text=f"Error: {exc}")]

-        text = json.dumps(result, indent=2, default=str)
+        payload = {"result": result}
+        if next_steps:
+            payload["next_steps"] = next_steps
+        text = json.dumps(payload, indent=2, default=str)
         return [TextContent(type="text", text=text)]

     return server
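
After this change the stdio server always wraps the tool output in a `{"result": ...}` object, so a client that previously parsed the text as the bare result needs to unwrap it. A hedged sketch of the client side, assuming the client already holds the `text` field of the returned `TextContent`; `unwrap_tool_text` is a hypothetical helper, not part of Chisel or the `mcp` package.

```python
import json


def unwrap_tool_text(text):
    """Split a stdio tool payload into (result, next_steps)."""
    # The error path in call_tool returns plain "Error: ..." text, not JSON.
    if text.startswith("Error: "):
        raise RuntimeError(text[len("Error: "):])
    payload = json.loads(text)
    # next_steps is omitted by the server when there are no suggestions.
    return payload["result"], payload.get("next_steps", [])


sample = json.dumps(
    {"result": [{"test_id": "t1"}],
     "next_steps": ["Run 'update' to re-analyze after fixing test files."]},
    indent=2,
)
result, steps = unwrap_tool_text(sample)
print(result)
print(steps)
```

Defaulting `next_steps` to an empty list lets the same helper handle tools that have no registered hint function.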

chisel/next_steps.py

Lines changed: 169 additions & 0 deletions
@@ -0,0 +1,169 @@
+"""Contextual next-step suggestions for MCP tool responses.
+
+Computes follow-up tool suggestions based on what a tool returned,
+so LLM agents know what to invoke next. Only used by MCP servers
+(HTTP and stdio), not the CLI.
+"""
+
+
+def compute_next_steps(tool_name, result):
+    """Return a list of next-step suggestion strings for a tool result.
+
+    Args:
+        tool_name: Name of the tool that produced the result.
+        result: The tool's return value (dict or list).
+
+    Returns:
+        List of strings, each a brief actionable suggestion. Empty list
+        if no suggestions apply.
+    """
+    fn = _TOOL_HINTS.get(tool_name)
+    if fn is None:
+        return []
+    return fn(result)
+
+
+# ------------------------------------------------------------------ #
+# Per-tool hint functions
+# ------------------------------------------------------------------ #
+
+def _hints_analyze(result):
+    if isinstance(result, dict) and "code_files_scanned" in result:
+        return [
+            "Run 'risk_map' to identify high-risk files.",
+            "Run 'test_gaps' to find untested code.",
+            "Run 'triage' for a combined risk + gap + stale overview.",
+        ]
+    return []
+
+
+def _hints_update(result):
+    if isinstance(result, dict) and result.get("files_updated", 0) > 0:
+        return [
+            "Run 'diff_impact' to see which tests are affected by the changes.",
+            "Run 'risk_map' to check updated risk scores.",
+        ]
+    return []
+
+
+def _hints_risk_map(result):
+    if isinstance(result, list) and result:
+        top = result[:3]
+        files = [r["file_path"] for r in top]
+        steps = [
+            "Run 'test_gaps' to find missing test coverage for high-risk files.",
+        ]
+        # Suggest coupling drilldown for files with high coupling scores
+        high_coupling = [
+            r["file_path"] for r in top
+            if r.get("breakdown", {}).get("coupling", 0) > 0.3
+        ]
+        if high_coupling:
+            steps.append(
+                f"Run 'coupling {high_coupling[0]}' to see co-change partners."
+            )
+        # Suggest churn drilldown for high-churn files
+        high_churn = [
+            r["file_path"] for r in top
+            if r.get("breakdown", {}).get("churn", 0) > 0.5
+        ]
+        if high_churn:
+            steps.append(
+                f"Run 'churn {high_churn[0]}' for detailed change history."
+            )
+        steps.append(
+            f"Run 'suggest_tests {files[0]}' for test recommendations on the riskiest file."
+        )
+        return steps
+    if isinstance(result, list):
+        return ["Run 'analyze' to populate risk data."]
+    return []
+
+
+def _hints_diff_impact(result):
+    if isinstance(result, list) and result:
+        return [
+            "Run the listed tests to verify your changes.",
+            "Use 'record_result' to log outcomes for future prioritization.",
+            "Run 'coupling' on changed files to check for hidden dependents.",
+        ]
+    if isinstance(result, list):
+        return [
+            "Run 'test_gaps' to check if new code needs tests.",
+            "Run 'update' if you've made changes since last analysis.",
+        ]
+    return []
+
+
+def _hints_test_gaps(result):
+    if isinstance(result, list) and result:
+        top_file = result[0]["file_path"]
+        return [
+            "Write tests for the highest-churn untested units first.",
+            f"Run 'churn {top_file}' to see change frequency.",
+            f"Run 'ownership {top_file}' to find who can help write tests.",
+        ]
+    if isinstance(result, list):
+        return ["All code units have test coverage."]
+    return []
+
+
+def _hints_stale_tests(result):
+    if isinstance(result, list) and result:
+        return [
+            "Update or remove the stale tests listed above.",
+            "Run 'update' to re-analyze after fixing test files.",
+        ]
+    if isinstance(result, list):
+        return ["All tests reference current code."]
+    return []
+
+
+def _hints_impact(result):
+    if isinstance(result, list) and result:
+        return [
+            "Run the impacted tests to verify correctness.",
+            "Use 'record_result' to log outcomes for future prioritization.",
+        ]
+    return []
+
+
+def _hints_suggest_tests(result):
+    if isinstance(result, list) and result:
+        return [
+            "Run the suggested tests in order of relevance.",
+            "Use 'record_result' to log outcomes for future prioritization.",
+        ]
+    return []
+
+
+def _hints_triage(result):
+    if isinstance(result, dict) and "summary" in result:
+        steps = []
+        if result["summary"].get("total_test_gaps", 0) > 0:
+            steps.append(
+                "Focus on files appearing in both risk and gap sections."
+            )
+        if result["top_risk_files"]:
+            top = result["top_risk_files"][0]["file_path"]
+            steps.append(f"Run 'suggest_tests {top}' on the highest-risk file.")
+            steps.append(f"Run 'ownership {top}' to find who owns the riskiest code.")
+        return steps
+    return []
+
+
+# ------------------------------------------------------------------ #
+# Dispatch table
+# ------------------------------------------------------------------ #
+
+_TOOL_HINTS = {
+    "analyze": _hints_analyze,
+    "update": _hints_update,
+    "risk_map": _hints_risk_map,
+    "diff_impact": _hints_diff_impact,
+    "test_gaps": _hints_test_gaps,
+    "stale_tests": _hints_stale_tests,
+    "impact": _hints_impact,
+    "suggest_tests": _hints_suggest_tests,
+    "triage": _hints_triage,
+}
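
The module follows a dispatch-table pattern: a dict maps tool names to hint functions, and unregistered tools fall through to an empty list. The condensed sketch below keeps only one hint function (a shortened copy of `_hints_stale_tests`) to show the four outcomes a caller can see; the inputs are illustrative.

```python
def _hints_stale_tests(result):
    # Non-empty list: there is something to act on.
    if isinstance(result, list) and result:
        return ["Update or remove the stale tests listed above."]
    # Empty list: the query ran and found nothing.
    if isinstance(result, list):
        return ["All tests reference current code."]
    # Anything else (for example a no_data dict): no suggestions.
    return []


_TOOL_HINTS = {"stale_tests": _hints_stale_tests}


def compute_next_steps(tool_name, result):
    fn = _TOOL_HINTS.get(tool_name)
    if fn is None:
        return []
    return fn(result)


print(compute_next_steps("stale_tests", [{"test_id": "t1"}]))
print(compute_next_steps("stale_tests", []))
print(compute_next_steps("stale_tests", {"status": "no_data"}))
print(compute_next_steps("stats", {"code_units": 0}))
```

The `isinstance` checks matter because query tools return a `no_data` dict instead of a list when the database is empty, and the hint functions treat that case as "nothing to suggest".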
