Skip to content

fix(tui): skip hidden worktrees in discovery walks (#2211)#2273

Open
Hmbown wants to merge 1 commit into
mainfrom
fix/2211-discovery-skip-worktrees
Open

fix(tui): skip hidden worktrees in discovery walks (#2211)#2273
Hmbown wants to merge 1 commit into
mainfrom
fix/2211-discovery-skip-worktrees

Conversation

@Hmbown
Copy link
Copy Markdown
Owner

@Hmbown Hmbown commented May 27, 2026

Summary

  • add shared discovery filters for gitignore-disabled TUI walks
  • skip hidden release worktrees, Claude worktrees, DeepSeek snapshots, and common build/cache directories during fallback discovery
  • keep normal AI command folders and explicit user hidden folders discoverable

Refs #2211. This intentionally leaves the broader sub-agent fanout/backpressure UI acceptance criteria open.

Verification

  • cargo test -p codewhale-tui workspace_completions_skip_hidden_worktrees_and_build_bulk -- --nocapture
  • cargo test -p codewhale-tui picker_skips_generated_worktree_bulk_inside_unignored_dot_dirs -- --nocapture
  • cargo check -p codewhale-tui --all-features --locked
  • cargo fmt --all -- --check
  • git diff --check

Open in Devin Review

Greptile Summary

This PR extracts shared workspace-discovery predicates into a new workspace_discovery module and expands what gets excluded from gitignore-disabled TUI walks: hidden release worktrees (.worktrees/), Claude agent worktrees (.claude/worktrees/), DeepSeek snapshot repos (.deepseek/snapshots/), and common build/cache directories. Normal AI-tool command folders remain fully discoverable.

  • workspace_discovery.rs (new): defines DISCOVERY_ALWAYS_DIRS, DISCOVERY_EXCLUDED_SUBDIRS, DISCOVERY_EXCLUDED_DIR_NAMES, and two public predicates (path_is_excluded_from_discovery, should_skip_unignored_discovery_entry).
  • local_reference_paths in working_set.rs: correctly swaps should_skip_local_reference_dir for should_skip_unignored_discovery_entry via filter_entry, giving proper early-pruning of excluded subtrees.
  • Dot-dir walks in both working_set::walk_always_discoverable_dirs and file_picker::collect_candidates: continue using path_is_excluded_from_discovery as a post-loop if-continue guard rather than filter_entry, so the walker still descends into excluded subtrees before discarding their entries; the DISCOVERY_EXCLUDED_DIR_NAMES check is also not applied there.

Confidence Score: 4/5

The change is safe to merge; it correctly prevents hidden worktree files from appearing in completions and the file picker, and adds well-tested filtering logic.

The refactoring achieves its goal cleanly and all three changed predicates are exercised by the new tests. The two gaps — dot-dir walks not using filter_entry for early pruning, and DISCOVERY_EXCLUDED_DIR_NAMES not being applied in those same walks — are efficiency/consistency concerns rather than correctness bugs, and both are pre-existing patterns.

The dot-dir walk sections in working_set.rs (walk_always_discoverable_dirs) and file_picker.rs (collect_candidates) would benefit from follow-up to adopt filter_entry with should_skip_unignored_discovery_entry for consistency with local_reference_paths.

Important Files Changed

Filename Overview
crates/tui/src/workspace_discovery.rs New module centralizing discovery filter constants and predicates; logic is correct but should_skip_unignored_discovery_entry is only partially adopted across call sites.
crates/tui/src/working_set.rs Replaces should_skip_local_reference_dir with the unified filter in local_reference_paths (correct, uses filter_entry), but walk_always_discoverable_dirs still uses only path_is_excluded_from_discovery as a post-loop check without early pruning.
crates/tui/src/tui/file_picker.rs Switches from a hardcoded .deepseek/snapshots check to path_is_excluded_from_discovery; same post-entry filter pattern without filter_entry pruning — pre-existing but not improved.
crates/tui/src/main.rs Trivial: adds mod workspace_discovery declaration.

Comments Outside Diff (1)

  1. crates/tui/src/working_set.rs, line 317-326 (link)

    P2 Dot-dir walk still descends into excluded subtrees

    path_is_excluded_from_discovery is called as a post-loop if-continue guard, not as a filter_entry predicate. The WalkBuilder will still enumerate every file under .claude/worktrees/ or .deepseek/snapshots/ before those entries are discarded. local_reference_paths was correctly updated to use filter_entry with should_skip_unignored_discovery_entry, but walk_always_discoverable_dirs (and the parallel loop in file_picker::collect_candidates) still lack early pruning. If a hidden worktree contains thousands of files within max_depth, all of them are traversed but silently dropped — the same traversal overhead the .deepseek/snapshots comment in the old code warned about.

    Fix in Devin

Fix All in Devin

Reviews (1): Last reviewed commit: "fix(tui): skip hidden worktrees in disco..." | Re-trigger Greptile

Greptile also left 1 inline comment on this PR.

Copy link
Copy Markdown
Contributor

@gemini-code-assist gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request extracts workspace discovery filters and constants into a new workspace_discovery module to ensure consistent exclusion of large or generated subdirectories (such as .deepseek/snapshots and .claude/worktrees) across both the file picker and working set completions. The reviewer identified two valuable performance optimization opportunities: avoiding redundant heap allocations in path_is_excluded_from_discovery by stripping the root prefix first, and preventing unnecessary disk I/O in the file picker by filtering out excluded subdirectories directly on the directory walker.

Comment on lines +34 to +38
pub(crate) fn path_is_excluded_from_discovery(walk_root: &Path, path: &Path) -> bool {
DISCOVERY_EXCLUDED_SUBDIRS
.iter()
.any(|excluded| path.starts_with(walk_root.join(excluded)))
}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high

Performance Bottleneck: Unnecessary Heap Allocations

The current implementation of path_is_excluded_from_discovery constructs a new PathBuf via walk_root.join(excluded) for every single directory/file entry checked. Since this function is called for every entry in the workspace walk (which can be up to 50,000 entries), this results in up to 150,000 heap allocations per walk.

We can make this completely allocation-free by stripping the walk_root prefix first and then checking if the relative path starts with any of the excluded subdirs using Path::starts_with.

pub(crate) fn path_is_excluded_from_discovery(walk_root: &Path, path: &Path) -> bool {
    if let Ok(rel_path) = path.strip_prefix(walk_root) {
        DISCOVERY_EXCLUDED_SUBDIRS
            .iter()
            .any(|excluded| rel_path.starts_with(excluded))
    } else {
        false
    }
}

Comment on lines 453 to 457
for entry in dot_builder.build().flatten() {
// Exclude machine-generated bulk (e.g. .deepseek/snapshots/).
if entry.path().starts_with(root.join(".deepseek/snapshots")) {
if path_is_excluded_from_discovery(root, entry.path()) {
continue;
}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

Performance Optimization: Avoid Traversing Excluded Subdirectories

Currently, dot_builder traverses the entire directory tree of the AI-tool dot-directories (such as .deepseek/ or .claude/) and then filters out the excluded subdirectories (like .deepseek/snapshots/ or .claude/worktrees/) inside the loop. This means WalkBuilder still performs full directory traversal and disk I/O for all files inside those excluded subdirectories, which can contain thousands of files.

By applying .filter_entry() directly to dot_builder, we can prevent WalkBuilder from descending into these excluded subdirectories entirely, saving significant I/O and CPU time.

Suggested change
for entry in dot_builder.build().flatten() {
// Exclude machine-generated bulk (e.g. .deepseek/snapshots/).
if entry.path().starts_with(root.join(".deepseek/snapshots")) {
if path_is_excluded_from_discovery(root, entry.path()) {
continue;
}
let root_for_filter = root.to_path_buf();
dot_builder.filter_entry(move |entry| {
!path_is_excluded_from_discovery(&root_for_filter, entry.path())
});
for entry in dot_builder.build().flatten() {

Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copilot encountered an error and was unable to review this pull request. You can try again by re-requesting a review.

Comment on lines +40 to +53
/// Filter for walks that turn off gitignore to surface explicit hidden paths.
pub(crate) fn should_skip_unignored_discovery_entry(walk_root: &Path, path: &Path) -> bool {
if path == walk_root {
return false;
}

if path_is_excluded_from_discovery(walk_root, path) {
return true;
}

path.file_name()
.and_then(|name| name.to_str())
.is_some_and(|name| DISCOVERY_EXCLUDED_DIR_NAMES.contains(&name))
}
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 DISCOVERY_EXCLUDED_DIR_NAMES only applied in the full root walk, not dot-dir walks

should_skip_unignored_discovery_entry (which combines path-prefix exclusion with basename exclusion) is used in local_reference_paths, but both walk_always_discoverable_dirs in working_set.rs and the equivalent loop in file_picker::collect_candidates call only path_is_excluded_from_discovery — skipping the DISCOVERY_EXCLUDED_DIR_NAMES check entirely. A target/ or node_modules/ directory inside .claude/ is therefore traversed and indexed by the dot-dir walks even though local_reference_paths would prune it. The new unified function is not uniformly applied.

Fix in Devin

Copy link
Copy Markdown
Contributor

@devin-ai-integration devin-ai-integration Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

✅ Devin Review: No Issues Found

Devin Review analyzed this PR and found no potential bugs to report.

View in Devin Review to see 3 additional findings.

Open in Devin Review

@Hmbown
Copy link
Copy Markdown
Owner Author

Hmbown commented May 27, 2026

Independent review: merged onto origin/main locally; both new tests pass.

Predicate scope: path_is_excluded_from_discovery is path-prefix only (.worktrees, .claude/worktrees, .deepseek/snapshots joined to walk_root). Matches the #2211 evidence (35 root-level .worktrees/), but won't catch git worktree add registrations under .git/worktrees/ or symlinked worktrees — fine for the stated case, worth noting if the layout ever shifts.

Release-blocker fit: the 155G .worktrees/ from #2211 is already gitignored, so picker's first-pass WalkBuilder (git_ignore=true) prunes it pre-patch. This PR's real lift is the DISCOVERY_ALWAYS_DIRS second pass — exactly where the issue's .claude/worktrees + .deepseek/snapshots would otherwise leak in. So it does close the described leak path, but doesn't address the sub-agent fanout half of #2211 (acknowledged in summary).

Test gap not in greptile: new tests use 1–3 files per fixture dir. They prove correctness but never exercise the saturating case (#2211 cites ~18K paths). No assertion against LOCAL_REFERENCE_SCAN_LIMIT (4096) boundary, and no traversal-cost test — so the early-pruning regression greptile flagged (post-filter vs filter_entry in walk_always_discoverable_dirs + collect_candidates) is not test-covered either way. Worth adding one fixture with 5K+ files under .claude/worktrees/ and asserting both correctness AND wall-time bound to lock in the perf claim.

Minor: should_skip_unignored_discovery_entry's path == walk_root guard (line 42) is dead in the local_reference_paths call site — filter_entry already skips the root entry.

v0.8.48 (#2256) compatibility: clean — #2256 only touches working_set.rs::detect_key_files (adds DEEPSEEK.md) and adds Xiaomi/response_format to main.rs on disjoint lines; merge simulation onto origin/main applied without conflict and new tests pass on the merged tree.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants