IronAdamant
diff --git a/CHANGELOG.md b/CHANGELOG.md
Lines changed: 16 additions & 0 deletions
diff --git a/CLAUDE.md b/CLAUDE.md
Lines changed: 4 additions & 1 deletion
diff --git a/COMPLETE_PROJECT_DOCUMENTATION.md b/COMPLETE_PROJECT_DOCUMENTATION.md
Lines changed: 2 additions & 2 deletions
diff --git a/LLM_Development.md b/LLM_Development.md
Lines changed: 37 additions & 0 deletions
diff --git a/chisel/__init__.py b/chisel/__init__.py
Lines changed: 1 addition & 1 deletion
diff --git a/chisel/ast_utils.py b/chisel/ast_utils.py
Lines changed: 47 additions & 3 deletions
diff --git a/chisel/engine.py b/chisel/engine.py
Lines changed: 47 additions & 35 deletions
@@ -5,6 +5,22 @@ All notable changes to Chisel are documented in this file.
 Format follows [Keep a Changelog](https://keepachangelog.com/en/1.1.0/).
 This project uses [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
 
+## [0.6.0] - 2026-03-22
+
+### Added
+
+- **Pluggable AST extractors**: `register_extractor(language, fn)` lets users override built-in regex extractors with tree-sitter, LSP, or other backends. `unregister_extractor()` reverts to the built-in extractor. `get_registered_extractors()` returns a copy of the registry for introspection. Custom extractors are checked before built-ins in `extract_code_units()`. Zero new dependencies.
+- **Batch SQL queries**: 5 new `get_*_batch()` methods in `storage.py` for edges, code units, co-changes, churn stats, and blame. `_chunked()` helper splits large batches to stay under SQLite's variable limit.
+- **Process-level read locks**: All read tool methods in `engine.py` now acquire `_process_lock.shared()` + `lock.read_lock()`. Write tools (`record_result`, `analyze`, `update`) acquire `_process_lock.exclusive()` + `lock.write_lock()`. Concurrent reads from multiple processes are now safe.
+- **Cross-platform ProcessLock**: `project.py` uses `fcntl.flock` on Unix and `LockFileEx`/`UnlockFileEx` via ctypes on Windows. Both support shared and exclusive locks.
+- 18 new tests: extractor registry (6), batch queries (7), process lock (3), engine lock wiring (2)
+
+### Changed
+
+- `impact.get_risk_map()` rewritten to use batch queries — computes all risk scores in ~5 queries instead of N*5 (eliminates N+1 pattern)
+- `ProcessLock._acquire()` takes `exclusive: bool` instead of a platform-specific lock type constant
+- 540 tests pass (up from 522)
+
 ## [0.5.4] - 2026-03-22
 
 ### Fixed
 
@@ -31,13 +31,16 @@ chisel/
 - **Blame caching**: Cached by file content hash, invalidated on change.
 - **Incremental updates**: File content hashes tracked in `file_hashes` table.
 - **Persistent connection**: Storage uses a single SQLite connection (`check_same_thread=False`) with RWLock for thread safety.
-- **Multi-agent safety**: `project.py` provides: (1) `detect_project_root()` canonicalizes via git common dir so worktrees share identity, (2) `normalize_path()` ensures consistent relative paths, (3) `resolve_storage_dir()` defaults to project-local `.chisel/` (priority: explicit > env > project-local > ~/.chisel/), (4) `ProcessLock` uses `fcntl.flock` for cross-process write coordination.
+- **Multi-agent safety**: `project.py` provides: (1) `detect_project_root()` canonicalizes via git common dir so worktrees share identity, (2) `normalize_path()` ensures consistent relative paths, (3) `resolve_storage_dir()` defaults to project-local `.chisel/` (priority: explicit > env > project-local > ~/.chisel/), (4) `ProcessLock` for cross-process coordination — shared locks for reads, exclusive for writes. Cross-platform: `fcntl.flock` on Unix, `LockFileEx` on Windows.
 - **SQLite concurrency**: 30s `busy_timeout` + exponential-backoff retry on `_execute` for cross-process SQLITE_BUSY.
 - **Ownership vs Reviewers**: `ownership` = blame-based (who wrote the code, `role: "original_author"`). `who_reviews` = commit-activity-based (who maintains it, `role: "suggested_reviewer"`).
 - **Shared constants**: `_SKIP_DIRS` and `_EXTENSION_MAP` live in `ast_utils.py`. `_CODE_EXTENSIONS` in `engine.py` is derived from `_EXTENSION_MAP`.
 - **Shared dispatch**: `dispatch_tool()` in `mcp_server.py` is used by both HTTP and stdio servers. Tool schemas and dispatch tables live in `schemas.py`.
 - **Edge weighting**: Test edges carry a weight (0.4-1.0) based on file proximity. Python import-path matching (`from myapp.utils import foo` → `myapp/utils.py:foo`) takes priority over name-only matching. `_compute_proximity_weight()` and `_matches_import_path()` in `test_mapper.py`.
 - **AST regex improvements**: C#/Java support nested generics `<A<B>>` and annotations/attributes `@Override`/`[Test]`. Kotlin supports extension functions `fun String.foo()`. C++ supports template functions and destructors `~Foo()`. Swift supports `@objc`-style attributes. Dart supports factory constructors and getters/setters.
+- **Pluggable extractors**: `register_extractor(lang, fn)` in `ast_utils.py` lets users override built-in regex extractors with tree-sitter or LSP-backed ones. `_custom_extractors` checked before `_EXTRACTORS` in `extract_code_units()`. Zero-dep — the registry is just hooks.
+- **Batch SQL queries**: `storage.py` provides `get_*_batch()` methods for edges, code units, co-changes, churn, and blame. `impact.get_risk_map()` uses these to compute all risk scores in ~5 queries total instead of N*5. `_chunked()` helper splits large batches to stay under SQLite's variable limit.
+- **Process-level read locks**: All read tool methods in `engine.py` acquire `_process_lock.shared()` (outer) + `lock.read_lock()` (inner). Writes acquire `_process_lock.exclusive()` + `lock.write_lock()`. Multiple processes can therefore read concurrently, while a write blocks all readers and other writers until it completes.
 
 ## Dev Commands
 
 
@@ -2,7 +2,7 @@
 
 Test impact analysis and code intelligence for LLM agents. Zero external dependencies.
 
-**Version:** 0.5.4
+**Version:** 0.6.0
 **PyPI:** `chisel-test-impact`
 **License:** MIT
 **Python:** >= 3.9
@@ -35,7 +35,7 @@ Test impact analysis and code intelligence for LLM agents. Zero external depende
 | `chisel/metrics.py` | Pure computation: churn scoring, ownership aggregation, co-change detection | `collections`, `datetime`, `itertools` | [glossary: churn score](wiki-local/glossary.md) |
 | `chisel/test_mapper.py` | Test file discovery, framework detection (pytest/Jest/Go/Rust/Playwright), dependency extraction, test edge building | `ast`, `os`, `re`, `pathlib`, `chisel.ast_utils`, `chisel.project` | [glossary: test edge](wiki-local/glossary.md) |
 | `chisel/impact.py` | Impact analysis, risk scoring, stale test detection, ownership queries, reviewer suggestions | `collections`, `datetime`, `chisel.metrics`, `chisel.storage` (via constructor injection) | [glossary: risk score](wiki-local/glossary.md) |
-| `chisel/project.py` | Multi-agent safety: project root detection (worktree-aware), path normalization, storage dir resolution, cross-process file lock (ProcessLock) | `fcntl`, `os`, `subprocess`, `contextlib` | -- |
+| `chisel/project.py` | Multi-agent safety: project root detection (worktree-aware), path normalization, storage dir resolution, cross-platform file lock (ProcessLock) | `os`, `subprocess`, `sys`, `contextlib`; Unix: `fcntl`; Windows: `ctypes`, `msvcrt` | -- |
 | `chisel/engine.py` | Orchestrator -- owns Storage, GitAnalyzer, TestMapper, ImpactAnalyzer, RWLock, ProcessLock; exposes `tool_*()` methods for all 15 MCP tools | `os`, `chisel.ast_utils`, `chisel.git_analyzer`, `chisel.impact`, `chisel.project`, `chisel.rwlock`, `chisel.storage`, `chisel.test_mapper` | [spec-project](wiki-local/spec-project.md) |
 | `chisel/cli.py` | argparse CLI with 17 subcommands, dispatch table, output formatting | `argparse`, `json`, `os`, `chisel.engine` | [spec-project: CLI](wiki-local/spec-project.md) |
 | `chisel/mcp_server.py` | HTTP MCP server (GET /tools, /health; POST /call), ThreadedHTTPServer, tool schemas and dispatch table | `json`, `logging`, `threading`, `http.server`, `socketserver`, `chisel.engine` | [spec-project: MCP tools](wiki-local/spec-project.md) |
 
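The shared/exclusive semantics that the `chisel/project.py` row describes can be illustrated with a small Unix-only sketch built on `fcntl.flock`. This is not the PR's `ProcessLock`: the class name `FileLock`, the `blocking` parameter, and the lock-file handling are assumptions made for illustration, and the Windows `LockFileEx` path is omitted entirely.

```python
import fcntl
import os
from contextlib import contextmanager


class FileLock:
    """Unix-only sketch of a shared/exclusive advisory lock (hypothetical class)."""

    def __init__(self, path: str) -> None:
        self.path = path

    @contextmanager
    def _acquire(self, exclusive: bool, blocking: bool = True):
        # One boolean selects the lock mode, as in the PR's _acquire(exclusive: bool).
        flags = fcntl.LOCK_EX if exclusive else fcntl.LOCK_SH
        if not blocking:
            # `blocking` is an assumption of this sketch, not part of the PR.
            flags |= fcntl.LOCK_NB
        fd = os.open(self.path, os.O_RDWR | os.O_CREAT, 0o644)
        try:
            # With LOCK_NB, a contended lock raises BlockingIOError.
            fcntl.flock(fd, flags)
            try:
                yield
            finally:
                fcntl.flock(fd, fcntl.LOCK_UN)
        finally:
            os.close(fd)

    def shared(self, blocking: bool = True):
        """Many holders may hold a shared lock at once (readers)."""
        return self._acquire(exclusive=False, blocking=blocking)

    def exclusive(self, blocking: bool = True):
        """Only one holder, and no shared holders, at a time (writers)."""
        return self._acquire(exclusive=True, blocking=blocking)
```

Because each call opens its own file descriptor, two `shared()` holders coexist while an `exclusive()` request waits (or fails immediately when `blocking=False`) until every shared holder has released.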
@@ -4,6 +4,43 @@ Chronological record of development activity on the Chisel project.
 
 ---
 
+## v0.6.0 -- 2026-03-22 -- Pluggable Extractors, Batch Queries, Cross-Platform Locks
+
+### Summary
+Four architectural improvements: pluggable AST extraction for tree-sitter/LSP integration, batch SQL to eliminate N+1 in risk_map, process-level shared locks for concurrent reads, cross-platform ProcessLock (Windows support via LockFileEx).
+
+### Pluggable AST Extraction (ast_utils.py)
+- `register_extractor(language, fn)` stores custom extractors in `_custom_extractors` dict
+- `extract_code_units()` checks custom extractors first, falls back to built-in regex
+- `unregister_extractor(language)` reverts to built-in (raises KeyError if not registered)
+- `get_registered_extractors()` returns shallow copy for introspection
+- Zero new dependencies — registry is just callable hooks
+
+### Batch SQL Queries (storage.py, impact.py)
+- 5 new batch methods: `get_edges_for_code_batch`, `get_code_units_by_files_batch`, `get_co_changes_batch`, `get_churn_stats_batch`, `get_blame_batch`
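+- `_chunked()` helper splits lists into chunks of 900 to stay under SQLite's default 999-variable limit (`SQLITE_MAX_VARIABLE_NUMBER`; the default was raised to 32766 in SQLite 3.32.0, but older builds still use 999)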
+- `impact.get_risk_map()` rewritten to use batch queries — ~5 total queries instead of N*5
+- `compute_risk_score()` unchanged for single-file use
+
+### Process-Level Read Locks (engine.py)
+- All 12 read tool methods now acquire `_process_lock.shared()` (outer) + `lock.read_lock()` (inner)
+- `tool_record_result` now acquires `_process_lock.exclusive()` + `lock.write_lock()`
+- `analyze()` and `update()` already used exclusive locks — no change
+- Lock nesting order: process lock (outer) → RWLock (inner) — always consistent
+
+### Cross-Platform ProcessLock (project.py)
+- Module-level `_IS_WINDOWS = sys.platform == "win32"` for platform detection
+- Unix: `fcntl.flock` (unchanged behavior)
+- Windows: `ctypes` calls to `kernel32.LockFileEx`/`UnlockFileEx` — supports both shared and exclusive locks
+- `_flock(fd, exclusive)` and `_funlock(fd)` are platform-neutral module functions
+- `ProcessLock._acquire(exclusive: bool)` replaces platform-specific lock type constants
+
+### Tests
+- 18 new tests: extractor registry (6), batch queries (7), cross-platform lock (3), engine lock wiring (2)
+- 540 tests total, all passing
+
+---
+
 ## v0.5.4 -- 2026-03-22 -- Codebase Audit: Simplify, Modernize, Harden
 
 ### Summary
 
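The chunked batch pattern described in the v0.6.0 notes above can be sketched as follows. This is an illustration under assumptions, not the code in `storage.py`: the `churn_stats` table, its columns, and the function names `chunked` and `get_churn_batch` are hypothetical stand-ins.

```python
import sqlite3
from typing import Dict, Iterator, List, Sequence

CHUNK_SIZE = 900  # below the 999-variable default of older SQLite builds


def chunked(items: Sequence[str], size: int = CHUNK_SIZE) -> Iterator[Sequence[str]]:
    """Yield consecutive slices of at most `size` items."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def get_churn_batch(conn: sqlite3.Connection, file_paths: List[str]) -> Dict[str, int]:
    """Fetch commit counts for many files with one query per chunk.

    The table and column names are hypothetical.
    """
    result: Dict[str, int] = {}
    for chunk in chunked(file_paths):
        # One "?" placeholder per path keeps the query parameterized.
        placeholders = ",".join("?" * len(chunk))
        rows = conn.execute(
            "SELECT file_path, commit_count FROM churn_stats "
            f"WHERE file_path IN ({placeholders})",
            list(chunk),
        )
        for path, count in rows:
            result[path] = count
    return result


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE churn_stats (file_path TEXT PRIMARY KEY, commit_count INTEGER)"
    )
    conn.executemany(
        "INSERT INTO churn_stats VALUES (?, ?)",
        [(f"src/f{i}.py", i) for i in range(2000)],
    )
    stats = get_churn_batch(conn, [f"src/f{i}.py" for i in range(2000)])
    print(len(stats))
```

For N files this issues ceil(N / 900) queries per statistic instead of N, which is the saving the notes attribute to the rewritten `get_risk_map()`.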
@@ -1 +1 @@
-__version__ = "0.5.4"
+__version__ = "0.6.0"
@@ -543,6 +543,48 @@ def _name_kind(m):
 ]
 
 
+# ---------------------------------------------------------------------------
+# Custom extractor registry (plugin system)
+# ---------------------------------------------------------------------------
+
+_custom_extractors: dict[str, object] = {}
+
+
+def register_extractor(language, extractor):
+    """Register a custom code unit extractor for a language.
+
+    Custom extractors override the built-in regex-based ones, allowing
+    tree-sitter, LSP, or other backends without adding dependencies to
+    Chisel itself.
+
+    Args:
+        language: Language string (e.g. "python", "rust"). Must match a
+                  key in ``_EXTENSION_MAP`` or a custom extension mapping.
+        extractor: Callable with signature
+                   ``(file_path: str, content: str) -> list[CodeUnit]``.
+
+    Raises:
+        TypeError: If *extractor* is not callable.
+    """
+    if not callable(extractor):
+        raise TypeError(f"extractor must be callable, got {type(extractor).__name__}")
+    _custom_extractors[language] = extractor
+
+
+def unregister_extractor(language):
+    """Remove a custom extractor, reverting to the built-in one.
+
+    Raises:
+        KeyError: If no custom extractor is registered for *language*.
+    """
+    del _custom_extractors[language]
+
+
+def get_registered_extractors():
+    """Return a shallow copy of the custom extractor registry."""
+    return dict(_custom_extractors)
+
+
 # ---------------------------------------------------------------------------
 # Dispatcher
 # ---------------------------------------------------------------------------
@@ -568,10 +610,12 @@ def _name_kind(m):
 def extract_code_units(file_path: str, content: str) -> list[CodeUnit]:
     """Extract code units from *content* using the appropriate language parser.
 
-    Dispatches to a language-specific extractor based on the file extension.
+    Custom extractors registered via :func:`register_extractor` take
+    priority over built-in ones.  Dispatches based on the file extension.
     Returns an empty list for unsupported languages.
     """
     lang = detect_language(file_path)
-    if lang not in _EXTRACTORS:
+    extractor = _custom_extractors.get(lang) or _EXTRACTORS.get(lang)
+    if extractor is None:
         return []
-    return _EXTRACTORS[lang](file_path, content)
+    return extractor(file_path, content)
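A self-contained sketch of how the registry in this hunk behaves from a caller's point of view. It re-implements the lookup rather than importing Chisel, so `CodeUnit`'s fields, the one-entry extension map, `_builtin_python`, and `line_based_python` are all hypothetical stand-ins; only the register/unregister/dispatch logic follows the diff.

```python
import os
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class CodeUnit:
    """Stand-in for Chisel's CodeUnit; the real fields may differ."""
    file_path: str
    name: str
    unit_type: str
    line_start: int
    line_end: int


Extractor = Callable[[str, str], List[CodeUnit]]


def _builtin_python(file_path: str, content: str) -> List[CodeUnit]:
    """Placeholder for the built-in regex extractor; reports nothing."""
    return []


_EXTENSION_MAP = {".py": "python"}  # hypothetical one-entry map
_EXTRACTORS: Dict[str, Extractor] = {"python": _builtin_python}
_custom_extractors: Dict[str, Extractor] = {}


def register_extractor(language: str, extractor: Extractor) -> None:
    if not callable(extractor):
        raise TypeError(f"extractor must be callable, got {type(extractor).__name__}")
    _custom_extractors[language] = extractor


def unregister_extractor(language: str) -> None:
    del _custom_extractors[language]  # KeyError if nothing is registered


def extract_code_units(file_path: str, content: str) -> List[CodeUnit]:
    lang = _EXTENSION_MAP.get(os.path.splitext(file_path)[1])
    # Custom entries shadow built-ins, as in the diff above.
    extractor = _custom_extractors.get(lang) or _EXTRACTORS.get(lang)
    if extractor is None:
        return []
    return extractor(file_path, content)


def line_based_python(file_path: str, content: str) -> List[CodeUnit]:
    """Hypothetical custom backend: one unit per top-level `def` line."""
    units = []
    for lineno, line in enumerate(content.splitlines(), start=1):
        if line.startswith("def "):
            name = line[4:].split("(", 1)[0].strip()
            units.append(CodeUnit(file_path, name, "function", lineno, lineno))
    return units


if __name__ == "__main__":
    register_extractor("python", line_based_python)
    print(extract_code_units("demo.py", "def a():\n    pass\n"))
    unregister_extractor("python")
```

A real plug-in would swap `line_based_python` for a tree-sitter or LSP-backed callable with the same `(file_path, content)` signature. Since the registry is module-level state, callers that register an extractor temporarily (tests, for example) should unregister it afterwards.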
@@ -151,54 +151,62 @@ def tool_analyze(self, directory=None, force=False):
 
     def tool_impact(self, files, functions=None):
         """MCP tool: get impacted tests for changed files."""
-        with self.lock.read_lock():
-            return self.impact.get_impacted_tests(files, functions)
+        with self._process_lock.shared():
+            with self.lock.read_lock():
+                return self.impact.get_impacted_tests(files, functions)
 
     def tool_suggest_tests(self, file_path):
         """MCP tool: suggest tests for a file."""
-        with self.lock.read_lock():
-            return self.impact.suggest_tests(file_path)
+        with self._process_lock.shared():
+            with self.lock.read_lock():
+                return self.impact.suggest_tests(file_path)
 
     def tool_churn(self, file_path, unit_name=None):
         """MCP tool: get churn stats. Always returns a list."""
-        with self.lock.read_lock():
-            stat = self.storage.get_churn_stat(file_path, unit_name)
-            if stat:
-                return [stat]
-            # Only fall back to all stats when no specific unit was requested
-            if unit_name is None:
-                return self.storage.get_all_churn_stats(file_path)
-            return []
+        with self._process_lock.shared():
+            with self.lock.read_lock():
+                stat = self.storage.get_churn_stat(file_path, unit_name)
+                if stat:
+                    return [stat]
+                if unit_name is None:
+                    return self.storage.get_all_churn_stats(file_path)
+                return []
 
     def tool_ownership(self, file_path):
         """MCP tool: get blame-based code ownership."""
-        with self.lock.read_lock():
-            return self.impact.get_ownership(file_path)
+        with self._process_lock.shared():
+            with self.lock.read_lock():
+                return self.impact.get_ownership(file_path)
 
     def tool_coupling(self, file_path, min_count=3):
         """MCP tool: get co-change coupling partners."""
-        with self.lock.read_lock():
-            return self.storage.get_co_changes(file_path, min_count)
+        with self._process_lock.shared():
+            with self.lock.read_lock():
+                return self.storage.get_co_changes(file_path, min_count)
 
     def tool_risk_map(self, directory=None):
         """MCP tool: risk scores for all files."""
-        with self.lock.read_lock():
-            return self.impact.get_risk_map(directory)
+        with self._process_lock.shared():
+            with self.lock.read_lock():
+                return self.impact.get_risk_map(directory)
 
     def tool_stale_tests(self):
         """MCP tool: detect stale tests."""
-        with self.lock.read_lock():
-            return self.impact.detect_stale_tests()
+        with self._process_lock.shared():
+            with self.lock.read_lock():
+                return self.impact.detect_stale_tests()
 
     def tool_history(self, file_path):
         """MCP tool: commit history for a file."""
-        with self.lock.read_lock():
-            return self.storage.get_commits_for_file(file_path)
+        with self._process_lock.shared():
+            with self.lock.read_lock():
+                return self.storage.get_commits_for_file(file_path)
 
     def tool_who_reviews(self, file_path):
         """MCP tool: suggest reviewers based on recent commit activity."""
-        with self.lock.read_lock():
-            return self.impact.suggest_reviewers(file_path)
+        with self._process_lock.shared():
+            with self.lock.read_lock():
+                return self.impact.suggest_reviewers(file_path)
 
     def tool_diff_impact(self, ref=None):
         """MCP tool: auto-detect changes from git diff and return impacted tests.
@@ -217,30 +225,34 @@ def tool_diff_impact(self, ref=None):
                 functions.extend(self.git.get_changed_functions(fp, ref))
             except RuntimeError:
                 pass
-        with self.lock.read_lock():
-            return self.impact.get_impacted_tests(
-                changed_files, functions or None,
-            )
+        with self._process_lock.shared():
+            with self.lock.read_lock():
+                return self.impact.get_impacted_tests(
+                    changed_files, functions or None,
+                )
 
     def tool_update(self):
         """MCP tool: incremental re-analysis of changed files."""
         return self.update()
 
     def tool_test_gaps(self, file_path=None, directory=None, exclude_tests=True):
         """MCP tool: find code units with no test coverage."""
-        with self.lock.read_lock():
-            return self.impact.get_test_gaps(file_path, directory, exclude_tests)
+        with self._process_lock.shared():
+            with self.lock.read_lock():
+                return self.impact.get_test_gaps(file_path, directory, exclude_tests)
 
     def tool_record_result(self, test_id, passed, duration_ms=None):
         """MCP tool: record a test result (pass/fail) for future prioritization."""
-        with self.lock.write_lock():
-            self.storage.record_test_result(test_id, passed, duration_ms)
-            return {"test_id": test_id, "passed": passed, "recorded": True}
+        with self._process_lock.exclusive():
+            with self.lock.write_lock():
+                self.storage.record_test_result(test_id, passed, duration_ms)
+                return {"test_id": test_id, "passed": passed, "recorded": True}
 
     def tool_stats(self):
         """MCP tool: get summary counts for the Chisel database."""
-        with self.lock.read_lock():
-            return self.storage.get_stats()
+        with self._process_lock.shared():
+            with self.lock.read_lock():
+                return self.storage.get_stats()
 
     # ------------------------------------------------------------------ #
     # Shared internal helpers
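The read methods in this hunk all repeat the same two nested `with` statements. One possible way to keep the nesting order (process lock outside, RWLock inside) in a single place is a small guard helper. The sketch below is a suggestion, not code from the PR: `_read_guard`, `_RecordingLock`, and the `log` list are hypothetical, and the stand-in lock only records acquisition order.

```python
import threading
from contextlib import contextmanager


class _RecordingLock:
    """Stand-in for ProcessLock/RWLock that records enter and exit order."""

    def __init__(self, name, log):
        self.name = name
        self.log = log
        self._lock = threading.Lock()

    @contextmanager
    def _hold(self):
        with self._lock:
            self.log.append(f"enter {self.name}")
            try:
                yield
            finally:
                self.log.append(f"exit {self.name}")

    # Expose the method names used in the diff; both map to the same stand-in.
    shared = read_lock = _hold


class Engine:
    """Minimal stand-in for the engine; only the locking shape is modelled."""

    def __init__(self):
        self.log = []
        self._process_lock = _RecordingLock("process", self.log)
        self.lock = _RecordingLock("rwlock", self.log)

    @contextmanager
    def _read_guard(self):
        # Process lock is always taken first and released last.
        with self._process_lock.shared(), self.lock.read_lock():
            yield

    def tool_stats(self):
        with self._read_guard():
            self.log.append("query")
            return {"files": 0}
```

With a helper like this, a new read tool cannot take the two locks in the wrong order by accident, which is the invariant the development notes call out.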