fix(tui): skip hidden worktrees in discovery walks (#2211)#2273
Conversation
There was a problem hiding this comment.
Code Review
This pull request extracts workspace discovery filters and constants into a new workspace_discovery module to ensure consistent exclusion of large or generated subdirectories (such as .deepseek/snapshots and .claude/worktrees) across both the file picker and working set completions. The reviewer identified two valuable performance optimization opportunities: avoiding redundant heap allocations in path_is_excluded_from_discovery by stripping the root prefix first, and preventing unnecessary disk I/O in the file picker by filtering out excluded subdirectories directly on the directory walker.
| pub(crate) fn path_is_excluded_from_discovery(walk_root: &Path, path: &Path) -> bool { | ||
| DISCOVERY_EXCLUDED_SUBDIRS | ||
| .iter() | ||
| .any(|excluded| path.starts_with(walk_root.join(excluded))) | ||
| } |
There was a problem hiding this comment.
Performance Bottleneck: Unnecessary Heap Allocations
The current implementation of path_is_excluded_from_discovery constructs a new PathBuf via walk_root.join(excluded) for every single directory/file entry checked. Since this function is called for every entry in the workspace walk (which can be up to 50,000 entries), this results in up to 150,000 heap allocations per walk.
We can make this completely allocation-free by stripping the walk_root prefix first and then checking if the relative path starts with any of the excluded subdirs using Path::starts_with.
pub(crate) fn path_is_excluded_from_discovery(walk_root: &Path, path: &Path) -> bool {
if let Ok(rel_path) = path.strip_prefix(walk_root) {
DISCOVERY_EXCLUDED_SUBDIRS
.iter()
.any(|excluded| rel_path.starts_with(excluded))
} else {
false
}
}| for entry in dot_builder.build().flatten() { | ||
| // Exclude machine-generated bulk (e.g. .deepseek/snapshots/). | ||
| if entry.path().starts_with(root.join(".deepseek/snapshots")) { | ||
| if path_is_excluded_from_discovery(root, entry.path()) { | ||
| continue; | ||
| } |
There was a problem hiding this comment.
Performance Optimization: Avoid Traversing Excluded Subdirectories
Currently, dot_builder traverses the entire directory tree of the AI-tool dot-directories (such as .deepseek/ or .claude/) and then filters out the excluded subdirectories (like .deepseek/snapshots/ or .claude/worktrees/) inside the loop. This means WalkBuilder still performs full directory traversal and disk I/O for all files inside those excluded subdirectories, which can contain thousands of files.
By applying .filter_entry() directly to dot_builder, we can prevent WalkBuilder from descending into these excluded subdirectories entirely, saving significant I/O and CPU time.
| for entry in dot_builder.build().flatten() { | |
| // Exclude machine-generated bulk (e.g. .deepseek/snapshots/). | |
| if entry.path().starts_with(root.join(".deepseek/snapshots")) { | |
| if path_is_excluded_from_discovery(root, entry.path()) { | |
| continue; | |
| } | |
| let root_for_filter = root.to_path_buf(); | |
| dot_builder.filter_entry(move |entry| { | |
| !path_is_excluded_from_discovery(&root_for_filter, entry.path()) | |
| }); | |
| for entry in dot_builder.build().flatten() { |
| /// Filter for walks that turn off gitignore to surface explicit hidden paths. | ||
| pub(crate) fn should_skip_unignored_discovery_entry(walk_root: &Path, path: &Path) -> bool { | ||
| if path == walk_root { | ||
| return false; | ||
| } | ||
|
|
||
| if path_is_excluded_from_discovery(walk_root, path) { | ||
| return true; | ||
| } | ||
|
|
||
| path.file_name() | ||
| .and_then(|name| name.to_str()) | ||
| .is_some_and(|name| DISCOVERY_EXCLUDED_DIR_NAMES.contains(&name)) | ||
| } |
There was a problem hiding this comment.
DISCOVERY_EXCLUDED_DIR_NAMES only applied in the full root walk, not dot-dir walks
should_skip_unignored_discovery_entry (which combines path-prefix exclusion with basename exclusion) is used in local_reference_paths, but both walk_always_discoverable_dirs in working_set.rs and the equivalent loop in file_picker::collect_candidates call only path_is_excluded_from_discovery — skipping the DISCOVERY_EXCLUDED_DIR_NAMES check entirely. A target/ or node_modules/ directory inside .claude/ is therefore traversed and indexed by the dot-dir walks even though local_reference_paths would prune it. The new unified function is not uniformly applied.
|
Independent review: merged onto origin/main locally; both new tests pass. Predicate scope: Release-blocker fit: the 155G Test gap not in greptile: new tests use 1–3 files per fixture dir. They prove correctness but never exercise the saturating case (#2211 cites ~18K paths). No assertion against Minor: v0.8.48 (#2256) compatibility: clean — #2256 only touches |
Summary
Refs #2211. This intentionally leaves the broader sub-agent fanout/backpressure UI acceptance criteria open.
Verification
Greptile Summary
This PR extracts shared workspace-discovery predicates into a new
workspace_discoverymodule and expands what gets excluded from gitignore-disabled TUI walks: hidden release worktrees (.worktrees/), Claude agent worktrees (.claude/worktrees/), DeepSeek snapshot repos (.deepseek/snapshots/), and common build/cache directories. Normal AI-tool command folders remain fully discoverable.workspace_discovery.rs(new): definesDISCOVERY_ALWAYS_DIRS,DISCOVERY_EXCLUDED_SUBDIRS,DISCOVERY_EXCLUDED_DIR_NAMES, and two public predicates (path_is_excluded_from_discovery,should_skip_unignored_discovery_entry).local_reference_pathsinworking_set.rs: correctly swapsshould_skip_local_reference_dirforshould_skip_unignored_discovery_entryviafilter_entry, giving proper early-pruning of excluded subtrees.working_set::walk_always_discoverable_dirsandfile_picker::collect_candidates: continue usingpath_is_excluded_from_discoveryas a post-loopif-continueguard rather thanfilter_entry, so the walker still descends into excluded subtrees before discarding their entries; theDISCOVERY_EXCLUDED_DIR_NAMEScheck is also not applied there.Confidence Score: 4/5
The change is safe to merge; it correctly prevents hidden worktree files from appearing in completions and the file picker, and adds well-tested filtering logic.
The refactoring achieves its goal cleanly and all three changed predicates are exercised by the new tests. The two gaps — dot-dir walks not using filter_entry for early pruning, and DISCOVERY_EXCLUDED_DIR_NAMES not being applied in those same walks — are efficiency/consistency concerns rather than correctness bugs, and both are pre-existing patterns.
The dot-dir walk sections in working_set.rs (walk_always_discoverable_dirs) and file_picker.rs (collect_candidates) would benefit from follow-up to adopt filter_entry with should_skip_unignored_discovery_entry for consistency with local_reference_paths.
Important Files Changed
Comments Outside Diff (1)
crates/tui/src/working_set.rs, line 317-326 (link)path_is_excluded_from_discoveryis called as a post-loopif-continueguard, not as afilter_entrypredicate. TheWalkBuilderwill still enumerate every file under.claude/worktrees/or.deepseek/snapshots/before those entries are discarded.local_reference_pathswas correctly updated to usefilter_entrywithshould_skip_unignored_discovery_entry, butwalk_always_discoverable_dirs(and the parallel loop infile_picker::collect_candidates) still lack early pruning. If a hidden worktree contains thousands of files withinmax_depth, all of them are traversed but silently dropped — the same traversal overhead the.deepseek/snapshotscomment in the old code warned about.Reviews (1): Last reviewed commit: "fix(tui): skip hidden worktrees in disco..." | Re-trigger Greptile