Phase 4: per-phase perf campaign vs native git on codex — TTLs, protocol levers (noflush/noopen/io_uring), bulk+streamed clone, POSIX and lifecycle fixes#2

Merged
factory-ain3sh merged 77 commits into main from phase4-north-star-implementation on Jul 3, 2026

@factory-ain3sh

Collaborator

What this is

77 commits closing the VFS performance gap against native git on the canonical openai/codex workload, plus the correctness and lifecycle work the campaign surfaced. Goal metric: per-phase wall-clock ratio <= 1.5x native.

Where we landed (codex, multi n=5)

| phase | before | after | bar (<=1.5x) |
|---|---|---|---|
| read_search | ~4.7x | 1.37-1.41x | met |
| status | ~1.9x | 0.60-0.93x | met |
| diff | ~80ms | 18ms (0.05x) | met |
| checkout | | 0.42x | met |
| fsck | | 0.83x | met |
| clone (agentfs clone) | 9.6x plain | 2.22x (0.754s) | floored: whole-state double write (pack+worktree 2x43MB into SQLite) |
| edit | ~8ms | 6ms codex / 2.5-2.8ms micro floor | micro target met; residual = kernel close-inval + fsync txn floor |
| read-path warm | ~4.7x | ~2.1-2.4x | floored: kernel close-time STATX_BLOCKS invalidation (upstream patch written + VM-validated, see .agents/kernel/) |

Every remaining miss carries a named, measured floor documented in .agents/specs/.
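
As a reading aid for the table above, here is a minimal sketch of how a per-phase ratio could be checked against the bar. The names `phase_ratio`, `judge`, and `Verdict` are hypothetical and are not taken from the benchmark harness.

```rust
/// Outcome of comparing one benchmark phase against the bar.
#[derive(Debug, PartialEq)]
pub enum Verdict {
    Met,
    Missed,
}

/// Ratio of AgentFS wall-clock time to native wall-clock time (lower is better).
pub fn phase_ratio(agentfs_secs: f64, native_secs: f64) -> f64 {
    agentfs_secs / native_secs
}

/// A phase meets the goal when its ratio is at or below the bar (1.5 in this PR).
pub fn judge(ratio: f64, bar: f64) -> Verdict {
    if ratio <= bar {
        Verdict::Met
    } else {
        Verdict::Missed
    }
}
```

With the figures from the table, `judge(1.41, 1.5)` is `Met` (read_search) and `judge(2.22, 1.5)` is `Missed` (clone).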

Highlights

Perf (default-on, each with a kill switch):

  • Kernel entry/attr TTLs 1s -> 10s; keep-cache with fingerprint revalidation; FOPEN_CACHE_DIR
  • ENOSYS-FLUSH (no close-time round trip) and ENOSYS-OPEN (kernel no_open: zero-message opens via shared per-inode file table)
  • FUSE-over-io_uring transport (vendored fuser ABI 7.31 -> 7.42; probe-gated, falls back to legacy channel)
  • Write batcher: cross-inode group commit, no drain on FORGET, self-invalidation suppression
  • agentfs clone: bulk ingest via SDK import_entries -> streamed ImportSession pipeline (cat-file parse overlapped with import), fabricated git index for clean first status

Correctness / lifecycle:

  • POSIX unlink-while-open: OpenInodes registry with deferred reaping, nlink=0 orphan sweep, integrity invariant amended
  • Signal handling: supervise_child + PDEATHSIG in exec, session teardown in mount — no more orphaned processes or stale mounts on TERM/INT
  • noopen-coherence gate (6 scenarios incl. full POSIX unlink-while-open), durability and integrity gates green; 168 SDK + 109 CLI tests

Measurement infrastructure:

  • git-workload benchmark pinned to the codex fixture by default (a synthetic-fixture drift mid-campaign was caught by session audit and voided; measurement contract now pinned in the roadmap spec)
  • Per-op FUSE dispatch counters, profile checkpoints, A/B multi-run harness

Upstream:

  • Root-caused the read-path floor to fuse_flush's unconditional STATX_BLOCKS invalidation; 17-line FUSE_I_BLOCKS_DIRTY kernel patch written and VM-validated (GETATTR storm 1095 -> 70, 2.2x cycle time; du/mmap correctness intact). Patch archived in .agents/kernel/, pending sign-off + submission.

Next

dev branch cut from main after this lands: aggressive cleanup, restructuring (agentfs.rs is 9.3k lines), knob consolidation, de-slopping.

factory-ain3sh and others added 30 commits May 9, 2026 21:53
Add fork governance, workload baseline, corruption torture, snapshot/restore, and replay/POSIX validation harnesses while landing AgentFS Phase 3 durability and concurrency quick wins.

Co-authored-by: factory-droid[bot] <138933559+factory-droid[bot]@users.noreply.github.com>
Tighten v0.5 copy migration and FUSE coalescing after review: preserve overlay config, normalize whiteout parent paths, keep legacy migrate v0.4-only, stream sparse/large file migration and verification, lock/hash the source DB family, and flush FUSE writes across getattr/truncate/cross-handle ordering boundaries.

Profiling coverage now records FUSE flush count/ranges/bytes so coalescer effectiveness is visible in AGENTFS_PROFILE summaries. Validation passed SDK fmt/clippy/tests, CLI fmt/check/clippy/tests, cli/tests/all.sh, phase0 smoke, replay smoke, and diff whitespace checks; pjdfstest skipped with exit 77 because pjdfstest is not installed.

Benchmark results: the local bounded read smoke on /home/ain3sh/factory/factory-mono improved from the earlier Phase 3 baseline of ~125.8x native to 15.17x native with stdout-equivalent output; the synthetic phase0 smoke measured 16.53x native. This is a material profiling/benchmarking improvement but still above the north-star 1.5-2x target.

Co-authored-by: factory-droid[bot] <138933559+factory-droid[bot]@users.noreply.github.com>
Document the v0.5 schema/write-path architecture, copy-only migration, and pjdfstest operating model so Phase 5 starts from an explicit correctness baseline.

Add a phase45-ci pjdfstest profile that passes under the current unprivileged FUSE contract, emits selected-test and known-gap report artifacts, and reserves exit 77 for missing prerequisites only. CI now builds pjdfstest and runs the supported profile, while full pjdfstest remains available for Phase 5 triage.

Validation: bash -n scripts/validation/posix/run-pjdfstest.sh; run-pjdfstest.sh --list-profiles; run-pjdfstest.sh --profile phase45-ci (37 files, 142 tests, PASS); scripts/validation/phase0.sh; git diff --check.

Co-authored-by: factory-droid[bot] <138933559+factory-droid[bot]@users.noreply.github.com>
Resolve follow-up review findings across the Phase 5 prototype: partial-origin opens now resolve persisted base paths instead of volatile HostFS inodes, detect base-size drift, cover remount/readdir_plus/rename/truncate cases, and keep metadata-only regular-file updates on the partial-origin path.

Tighten NFS write-handle semantics with random bounded write handles and SETATTR/truncate authorization tests. Clean up validation docs/manifests so supported chown tests do not overlap known gaps, selected pjdfstest manifests are reported with path/hash, large-edit benchmark reports partial-origin tables, and backend-risk commands reference existing replay tooling.

Validation: SDK fmt/check/clippy plus focused partial-origin tests; CLI fmt/check/clippy plus NFS handler tests; phase45-ci and phase5-ci pjdfstest; phase0 smoke; validation helper syntax/self-tests; large-edit/backend-risk smoke; git diff --check.

Co-authored-by: factory-droid[bot] <138933559+factory-droid[bot]@users.noreply.github.com>
Record Phase 5.5 backend spike results and make the helper capture measured validation outcomes for future upgrade/fallback decisions.

Co-authored-by: factory-droid[bot] <138933559+factory-droid[bot]@users.noreply.github.com>
Ships the accumulated phase 6 / 6.5 / 7 / 8 north-star work that was
staged on phase4-north-star-implementation but uncommitted:

- Phase 6:  partial-origin overlay with portable/inline storage, secure
            read-only passthrough plumbing, materialize/integrity/backup
            production safety commands, encrypted-database key/cipher
            handling, fuse cache invalidation tests
- Phase 6.5: read-path fast path (cached inode attrs, read profiler,
            cache tuning, instrumentation counters, passthrough rules)
- Phase 7:  principle-preserving git workload fast path (write batcher,
            FUSE concurrency lanes, cache plumbing, git workload gates)
- Phase 8:  parallel FUSE dispatch + bounded worker pool with shared
            read lane and exclusive write lane, deferred kernel-cache
            invalidation infrastructure, writeback-cache configuration,
            phase 8 validation gates (concurrent-git-stress,
            writeback-durability, writeback-no-fsync-crash,
            fuse-serialization-stress)

Touches: cli/src/{cmd,fuser,mount,nfs,nfsserve,sandbox}/, cli/src/fuse.rs,
cli/src/opts.rs, cli/src/main.rs, sdk/rust/src/{filesystem,profiling}.rs,
sdk/rust/src/lib.rs, sandbox/src/vfs/sqlite.rs, sandbox/Cargo.lock,
MANUAL.md, README.md, SPEC.md, TESTING.md, validation scripts,
.agents/specs/ phase markers, and .agents/05_* session notes.

NOTE: a small portion of the Tier One delta (MutationAudit struct, the
3 kernel-cache default flips in fuse.rs, the rewritten
fuse_sync_inval_enabled_from_env() body, the rewritten
FuseKernelCacheConfig::from_env body, the 4 reworded warn messages, and
the matching FUSE controls section in MANUAL.md/TESTING.md) is bundled
into this commit because it is textually intermingled with prior phase
6-8 work in the same files. The cleanly-separable Tier One CODE (the
fuse-modern abi-7-* cascade in cli/Cargo.toml, the
FuseDispatchMode::from_env auto default in cli/src/fuser/session.rs,
and the clippy fix in cli/src/sandbox/linux.rs) lands in the next
commit; Tier One artifacts (spec, RCA notes, multi-iter benchmark
wrapper, baseline + post-impl aggregate JSONs) land in the commit
after that. See
.agents/specs/2026-05-24-tier-one-spec-enable-kernel-cache-by-default-37x-8-12x.notes.md
for the full RCA covering both the ABI cascade bug and the sync_inval
deadlock.

Co-authored-by: factory-droid[bot] <138933559+factory-droid[bot]@users.noreply.github.com>
Three small, cleanly-separable code changes that complete the Tier One
default-on kernel cache. The intermingled remainder of the Tier One
delta (default flips and MutationAudit infrastructure in fuse.rs,
deferred-by-default invalidation, FUSE controls section in MANUAL.md
and TESTING.md) is bundled into the preceding backlog commit because
the same files also carry ~7 000 lines of phase 6-8 work; this commit
contains the only Tier One edits that land in files we did not also
modify for the backlog.

cli/Cargo.toml: add fuse-modern umbrella feature enabling abi-7-19
through abi-7-31 and add it to the default feature set. The vendored
fuser dispatcher gates each FUSE opcode behind its abi-7-N cfg, so
without this cascade the kernel sends opcode 44 (FUSE_READDIRPLUS) and
the dispatcher returns ENOSYS, breaking any readdir on the mount once
the kernel cache fast path is enabled.

cli/src/fuser/session.rs: change FuseDispatchMode::from_env()'s unset
branch from Self::Serial to the same auto resolution used by
AGENTFS_FUSE_WORKERS=auto, so the worker pool is on by default. This
is the matching half of the kernel-cache fast-path default flip in
cli/src/fuse.rs (which is in the backlog commit).

cli/src/sandbox/linux.rs: silence clippy::too_many_arguments on
run_cmd() so cargo clippy -D warnings keeps passing after the lint
profile we re-ran during Tier One ship validation.

Co-authored-by: factory-droid[bot] <138933559+factory-droid[bot]@users.noreply.github.com>
…pper

Adds the artifacts that accompany the default-on kernel cache work:

- .agents/specs/2026-05-24-tier-one-spec-*.md: approved spec describing
  the Tier One scope (env-var default flip + invalidation audit +
  abi-7-* feature cascade) and the 8-12x target.
- .agents/specs/2026-05-24-tier-one-spec-*.notes.md: implementation
  notes covering the RCA for both latent bugs surfaced by the default
  flip (FUSE_READDIRPLUS ENOSYS via missing abi-7-21, and the
  sync_invalidation + parallel-workers deadlock on git fork/fsync),
  plus the post-impl benchmark comparison vs the baselines below.
- .agents/benchmarks/baseline-current-default.agg.json: 5-iter median
  baseline of the current branch BEFORE Tier One (overall 4.46x).
- .agents/benchmarks/baseline-main-default.agg.json: 5-iter median
  baseline of origin/main 3a5ed2b AgentFS 0.6.4 (overall 3.85x).
- .agents/benchmarks/post-impl-default.agg.json: 3-iter median after
  Tier One (overall 2.92x; clone 7.21x, checkout 1.55x, status 1.10x,
  read_search 2.19x, edit 9.19x [native sub-ms], diff 0.79x [faster
  than native]). 21% improvement vs current baseline; 24% vs main.
- .agents/benchmarks/{baseline-*,run-*}.json: per-iteration raw JSON
  preserved for reproducibility.
- .agents/benchmarks/fixtures/README.md: reproduction notes; the
  ~63 MiB openai/codex bare clone itself is gitignored.
- scripts/validation/git-workload-benchmark-multi.py: non-invasive
  multi-iteration wrapper around git-workload-benchmark.py that
  reports median + p25/p75 + stdev per phase, used as the canonical
  performance measurement going forward.

Also updates .gitignore for Python __pycache__/ and the large benchmark
fixture directory.
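
To make the reported statistics concrete, the following is a rough sketch of a median / p25 / p75 / stdev summary for one phase. It is not the wrapper's code (the wrapper is a Python script); the linear-interpolation percentile and the sample (n - 1) standard deviation are assumptions, and `PhaseSummary` and `summarize` are hypothetical names.

```rust
/// Summary statistics for one phase across benchmark iterations.
#[derive(Debug)]
pub struct PhaseSummary {
    pub median: f64,
    pub p25: f64,
    pub p75: f64,
    pub stdev: f64,
}

/// Linear-interpolation percentile over a non-empty ascending slice (q in 0.0..=1.0).
fn percentile(sorted: &[f64], q: f64) -> f64 {
    let pos = q * (sorted.len() - 1) as f64;
    let lo = pos.floor() as usize;
    let hi = pos.ceil() as usize;
    sorted[lo] + (sorted[hi] - sorted[lo]) * (pos - lo as f64)
}

/// Returns None when there are no samples.
pub fn summarize(samples: &[f64]) -> Option<PhaseSummary> {
    if samples.is_empty() {
        return None;
    }
    let mut sorted = samples.to_vec();
    sorted.sort_by(|a, b| a.total_cmp(b));
    let n = sorted.len() as f64;
    let mean = sorted.iter().sum::<f64>() / n;
    // Sample standard deviation (n - 1 denominator); 0.0 for a single sample.
    let stdev = if sorted.len() > 1 {
        (sorted.iter().map(|x| (x - mean).powi(2)).sum::<f64>() / (n - 1.0)).sqrt()
    } else {
        0.0
    };
    Some(PhaseSummary {
        median: percentile(&sorted, 0.5),
        p25: percentile(&sorted, 0.25),
        p75: percentile(&sorted, 0.75),
        stdev,
    })
}
```

Reporting the quartiles next to the median shows whether a change in the median exceeds run-to-run spread.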

Co-authored-by: factory-droid[bot] <138933559+factory-droid[bot]@users.noreply.github.com>
Tier Two prep — direct head-to-head measurement of the same three
workloads (read-heavy, copy-on-write, mixed git) against native, the
original AgentFS at origin/main 3a5ed2b, and Tier One AgentFS at
HEAD 9be0da4. Both agentfs binaries built from clean release profiles
on the same machine, no AGENTFS_FUSE_* env vars set.

Headline (ratio of agentfs / native; lower is better):

| Workload                          | Original | Tier One | Delta |
| read-heavy (full run, w/ startup) |    2.51x |    3.03x |  +21% |
| read-heavy (steady-state only)    |    7.76x |    3.79x |  -51% |
| copy-on-write 50 MiB edit         |    8.19x |    5.42x |  -34% |
| mixed git workload (median)       |    5.16x |    3.21x |  -38% |

Bonus: CoW delta DB growth for the single-byte edit dropped from
172.6 MiB to 50.4 MiB (-71%).
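
The Delta column and the DB-growth figure follow from a plain relative change. A small sketch (the helper name `percent_change` is hypothetical):

```rust
/// Relative change from `before` to `after`, in percent (negative means a reduction).
pub fn percent_change(before: f64, after: f64) -> f64 {
    (after - before) / before * 100.0
}
```

For example, `percent_change(7.76, 3.79)` is about -51.2 and `percent_change(172.6, 50.4)` is about -70.8, which round to the -51% and -71% quoted above.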

Tier One regressed read-heavy full-run startup by ~10-15 ms because the
mount now negotiates parallel workers + readdirplus + writeback +
ABI 7.31 at FUSE init; this is amortised on sustained workloads (see
the steady-state row dropping 51%) but matters for short-lived
sandboxes. Captured as a Tier Two focus item.

Files: COMPARISON.md (human-readable tables + Tier Two focus notes)
plus the 6 raw per-run JSONs for reproducibility. Tracks at
.agents/benchmarks/tier-two-prep/.

Co-authored-by: factory-droid[bot] <138933559+factory-droid[bot]@users.noreply.github.com>
Closes the read-path gap with a HostFS passthrough for unmodified
partial-origin delta inodes (Axis C) and cuts clone-phase write
overhead with both cross-inode batched commits (Axis A1) and a
FUSE-layer per-fh write coalescer (Axis A2). Bundles the two Tier One
cleanups (release-first agentfs binary resolver, feature-gated
FUSE_DO_READDIRPLUS capability negotiation) that were noted during the
Tier Two due-diligence pass.

Net mixed-workload effect (codex fixture, 5-iter / 2-warmup median):
agentfs total 2.91s → 2.51s (-14%); ratio 3.21x → 2.97x.
CoW edit agentfs absolute 0.67s → 0.36s (-46%).

Co-authored-by: factory-droid[bot] <138933559+factory-droid[bot]@users.noreply.github.com>
Spec + implementation notes + before/after benchmark JSONs for Tier
Two (HostFS read passthrough, clone batching, FUSE coalescer). Adds
tier-two-post/COMPARISON.md mirroring the tier-two-prep comparison so
the read-heavy / CoW / mixed numbers across origin/main, Tier One,
and Tier Two are all in one place.

Co-authored-by: factory-droid[bot] <138933559+factory-droid[bot]@users.noreply.github.com>
…ad-in-default

Profile-validated findings from Tier 3 due diligence:
- AGENTFS_FUSE_WRITEBACK defaults TRUE in cli (line 130) but FALSE in
  the SDK (env_flag_enabled). The cross-inode batched commit shipped in
  Tier 2 was dead code in the canonical workload.
- Axis C HostFS passthrough never fires (passthrough_attempted=0) even
  with AGENTFS_OVERLAY_PARTIAL_ORIGIN=1 explicitly set: the codex clone
  workload never modifies a base file, so partial-origin mappings are
  never created.
- Tier 2 diff/CoW wins were per-iteration noise, not attributable to A1
  or C; the real Tier 2 deliverables were A2, the lock-fix refactor, and
  the cleanups.

Includes a 5-iter mixed-workload benchmark with AGENTFS_FUSE_WRITEBACK=1
forced to document what Tier 2 would have delivered if the gating had
been correct (agentfs 2.51 s -> 2.29 s).

Co-authored-by: factory-droid[bot] <138933559+factory-droid[bot]@users.noreply.github.com>
…forced

Raw 5-iter / 2-warmup mixed-workload aggregate from the canonical codex
fixture with the SDK batcher actually enabled (the env var the cli
defaults to on but the SDK defaults to off). This is the comparison
artifact for the Tier 2 retroactive correction.

Co-authored-by: factory-droid[bot] <138933559+factory-droid[bot]@users.noreply.github.com>
… worker pool, larger inline tier

Tier 3 ships the three low-risk perf moves whose RCAs were nailed down
by Tier 3 due diligence:

* Axis D — align SDK 'AGENTFS_FUSE_WRITEBACK' default with cli (true
  when unset). The cli has defaulted FUSE writeback ON since Tier 1
  but the SDK gated the cross-inode write batcher behind
  'env_flag_enabled' (default off), making Tier 2's A1 dead in default
  config. Profile counters confirm: enqueues went 0 -> 4759 on the
  canonical workload after the fix.

* Axis F — default AGENTFS_FUSE_CPU_PERCENT 25 -> 50 so 'auto' worker
  resolution yields more parallelism on the typical machine. The
  previous 25% default saturated at 3 workers on a 14-core box with
  570 ms of cumulative dispatch wait during clone.

* Axis I — DEFAULT_INLINE_THRESHOLD 4 KiB -> 16 KiB so the
  (4, 16] KiB tail of codex working-tree files avoids the chunked-
  storage path. fs_config persists per-DB so existing databases keep
  their 4 KiB threshold; only newly-initialised DBs adopt 16 KiB.
  chunk_write_chunks halved on the canonical workload.

* drain_due_timer enhancement — when the per-inode timer fires and the
  inode is ripe, route through drain_pending_batched to commit all
  pending inodes in one txn. Harmless when only one ino is ripe.

Net effect (5-iter / 2-warmup median, codex fixture):
  agentfs total 2.51 s -> 2.28 s (-9%); ratio 2.97x -> 2.73x.

Axis E (defer release/close drain) and Axis H (multi-row VALUES INSERT)
were attempted and reverted; see Tier 3 notes for RCAs. Axis G
(pack-aware streaming writer) deferred to Tier 4 — it depends on the
same 'consistent-without-drain' SDK read path that E needs.
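
The Axis F numbers can be reproduced with a simple percentage-of-cores rule. This is a sketch only: the round-down behaviour and the one-worker minimum are assumptions chosen to match the "3 workers on a 14-core box" figure, and `auto_workers` is a hypothetical name, not the CLI's function.

```rust
/// Hypothetical 'auto' worker resolution: a percentage of logical cores,
/// rounded down, never below one worker.
pub fn auto_workers(logical_cores: usize, cpu_percent: usize) -> usize {
    (logical_cores * cpu_percent / 100).max(1)
}
```

Under this rule a 14-core machine resolves to 3 workers at 25% and 7 workers at 50%.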

Co-authored-by: factory-droid[bot] <138933559+factory-droid[bot]@users.noreply.github.com>
Includes axis-by-axis RCAs for D/F/I (shipped), H/E (attempted +
reverted with profile evidence), G (deferred to Tier 4), and the C
disposition decision. tier-three-post/ has the raw 5-iter JSON for the
final mixed-workload run and the per-axis intermediate runs.

Co-authored-by: factory-droid[bot] <138933559+factory-droid[bot]@users.noreply.github.com>
Removes the synchronous SQLite drain from the SDK read path. AgentFSFile::pread
now consults the in-memory write batcher (peek_pending) and merges over
SQLite-resident bytes. Drains become a pure durability operation, triggered
only by fsync / destroy / timer / bytes-threshold.

SDK changes (sdk/rust/src/filesystem/agentfs.rs):
- AgentFSWriteBatcher gains peek_pending, peek_pending_max_end,
  truncate_pending, and discard_pending. peek* are read-only snapshots that
  acquire state.lock briefly without touching the pool. truncate_pending
  shrinks pending in place for AgentFSFile::truncate. discard_pending drops
  all pending writes for an ino, used at unlink/rename/remove sites so a
  later batched drain doesn't try to INSERT into a missing fs_inode row.
- AgentFSFile::{pread,pwrite,pwrite_ranges,truncate,fsync} no longer call
  drain_writes on every op. pwrite routes through batcher.enqueue when the
  batcher is wired. pread peeks the batcher BEFORE acquiring the pool conn
  and drops the conn BEFORE the splice loop to keep timer-drain tasks
  un-starved on the single-conn ephemeral pool. fsync remains the explicit
  durability barrier.
- AgentFS::{getattr,lookup,lstat,stat} no longer call drain_inode_writes.
  New merge_pending_size helper ORs peek_pending_max_end into the SQLite
  size view. Fixes a 30-second ConnectionPoolTimeout deadlock that surfaces
  once the batcher actually holds pending data (lookup held the only
  permit, then drain_pending_batched waited for the same permit).
- AgentFS::{unlink,rename,remove} (both path-based and trait impls) now
  call batcher.discard_pending(ino) before deleting the inode row. Without
  this, the Explicit drain that bundles ALL pending inodes in one txn
  fails with Fs(NotFound) on the deleted ino.
- AgentFSWriteBatcher::enqueue now calls attr_cache.remove(ino) so
  consumers of cached attrs don't see pre-write state after a successful
  pwrite. getattr re-caches the OR'd size so cached_attr agrees with what
  getattr returned.
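
The read-side overlay described in these SDK changes can be illustrated with a self-contained sketch. It is not the SDK code: `PendingWrite`, `pread_merged`, and `merged_size` are hypothetical stand-ins for the peek_pending / merge_pending_size behaviour, the durable bytes are modelled as a plain slice rather than SQLite chunks, and reading the size merge as a maximum is an assumption.

```rust
/// One buffered write that has not yet been drained to durable storage.
pub struct PendingWrite {
    pub offset: u64,
    pub data: Vec<u8>,
}

/// Visible file size: the larger of the durable size and the furthest pending end.
pub fn merged_size(durable_size: u64, pending: &[PendingWrite]) -> u64 {
    pending
        .iter()
        .map(|w| w.offset + w.data.len() as u64)
        .fold(durable_size, u64::max)
}

/// Read up to `len` bytes at `offset`, overlaying pending writes (oldest first)
/// on top of the durable bytes. Gaps covered by neither source read as zeros.
pub fn pread_merged(durable: &[u8], pending: &[PendingWrite], offset: u64, len: usize) -> Vec<u8> {
    let size = merged_size(durable.len() as u64, pending);
    let end = size.min(offset.saturating_add(len as u64));
    if offset >= end {
        return Vec::new();
    }
    let mut out = vec![0u8; (end - offset) as usize];
    // Start from the durable bytes that fall inside the requested window.
    let d_end = (durable.len() as u64).min(end);
    if offset < d_end {
        let n = (d_end - offset) as usize;
        out[..n].copy_from_slice(&durable[offset as usize..d_end as usize]);
    }
    // Splice each pending write over the window; later writes win.
    for w in pending {
        let w_end = w.offset + w.data.len() as u64;
        let lo = w.offset.max(offset);
        let hi = w_end.min(end);
        if lo < hi {
            let src = &w.data[(lo - w.offset) as usize..(hi - w.offset) as usize];
            out[(lo - offset) as usize..(hi - offset) as usize].copy_from_slice(src);
        }
    }
    out
}
```

Because reads are answered from the merged view, a drain is only needed for durability, which is the property the commit relies on.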

CLI changes:
- cli/src/fuse.rs: flush_pending_inode no longer calls drain_inode_writes;
  the per-fh FUSE WriteBuffer still flushes into the SDK batcher, but the
  batcher's pending writes now serve FUSE reads through the overlay.
- cli/src/cmd/fs.rs: write_filesystem (one-shot CLI op) calls drain_all
  before returning so the next opener (e.g. cat) sees the bytes.

Tests:
- 157 SDK lib tests pass (148 pre-existing + 9 new overlay tests
  covering read-after-write, partial overlap, hole reads, truncate clipping,
  getattr size growth, concurrent writers, unlink-during-pending,
  fsync-drains-to-sqlite).
- 106 CLI tests pass after the FUSE refactor.
- clippy clean; cargo fmt applied.
- Phase 8 smoke: all 7 gates pass.

Benchmark (9-iter median, codex fixture):
- Mixed median ratio 3.24x vs Tier 3's 2.73x; high variance dominates
  (stdev ~1.7x). agentfs absolute 2.47s vs Tier 3 2.28s. Checkout phase
  improved 40% (overlay paying off); diff/read_search regressed ~50%
  (state.lock acquires per pread). Tier 4 is a foundation commit; Tier 5
  is where the perf win actually lands.

Co-authored-by: factory-droid[bot] <138933559+factory-droid[bot]@users.noreply.github.com>
…omparison

Tier 4/5/6 spec lays out the full architectural arc to 1.5x mixed median
with explicit go/no-go gates between tiers:

  - Tier 4 (this commit's code): consistent-without-drain SDK overlay
    foundation. Target ~2.5x. Effort ~3 days, ~500 LOC. Risk: medium.
  - Tier 5: defer release/forget drain (Axis E) + pack-aware streaming
    writer (Axis G), now structurally safe to ship on the Tier 4 foundation.
    Target ~2.0x. Effort ~3-5 days, ~600 LOC.
  - Tier 6: shadow-tree pivot. Working-tree content moves to real HostFS
    files; SQLite keeps overlay metadata only. Reads return shadow fd via
    FOPEN_PASSTHROUGH (Linux 6.9+). Target ~1.5x. Effort ~2-3 weeks,
    ~2000 LOC. Risk: high (architectural break).

Tier 5 -> Tier 6 gate: if mixed median <=1.8x with tight variance, GO Tier 6.
Otherwise re-spec before the shadow-tree pivot.

Honest scope limits called out in the spec:
  - CoW (50 MiB single-byte edit) 1.5x is NOT in this stack; needs Tier 7
    smaller-chunks-for-partial-origin work.
  - Encrypted databases pay a fixed crypto overhead.
  - Cold-mount startup not addressed.

Notes file logs the Tier 4 implementation honestly:
  - 157 SDK tests + 106 CLI tests + 7 Phase 8 gates green.
  - 9-iter benchmark median 3.24x vs Tier 3's 2.73x; high variance
    (stdev 1.72x). Per-phase: checkout -40% (overlay paying off);
    diff/read_search +50% (state.lock acquires per pread, ~50ms absolute).
  - Three latent bugs surfaced and fixed: single-conn pool deadlock in
    lookup, orphan fs_data rows on unlink/rename/remove, and CLI
    write_filesystem durability for fresh openers.
  - Recommendation: GO on Tier 5. Foundation is correct; Tier 5 is where
    perf actually moves.

Co-authored-by: factory-droid[bot] <138933559+factory-droid[bot]@users.noreply.github.com>
factory-ain3sh and others added 28 commits June 11, 2026 11:30
…budget + deviations recorded

Co-authored-by: factory-droid[bot] <138933559+factory-droid[bot]@users.noreply.github.com>
…aching via FOPEN_CACHE_DIR

A FLUSH that drained no writes invalidated the inode anyway, feeding the
drift guard's sticky dropped set: the first close(2) of any file revoked
FOPEN_KEEP_CACHE forever and every warm re-open paid full FUSE READs
(64/1280 opens kept cache; now 1280/1280, READs 1280->64). opendir now
grants FOPEN_CACHE_DIR|FOPEN_KEEP_CACHE (FUSE_NO_OPENDIR_SUPPORT only
advertised when off), halving readdirplus on the git workload, and open()
collapses three block_on hops into one. Read-path warm steady-state
12.7x -> ~4.0x (8/8 A/B pairs, paired wall median 0.744); git workload
dispatches -7.9%, status phase 6.33x -> 1.99x. Kill switches:
AGENTFS_FUSE_FLUSH_INVAL=1, AGENTFS_FUSE_CACHE_DIR=0.
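
The two kill switches above have opposite polarity: one re-enables the old behaviour with `=1`, the other disables the new behaviour with `=0`. A minimal sketch of that convention follows; the helper names are hypothetical, and the real parser may accept other spellings (such as `true`/`false`), which this sketch does not model.

```rust
/// Feature that is on unless the variable is exactly "0" (e.g. AGENTFS_FUSE_CACHE_DIR=0).
pub fn default_on(raw: Option<&str>) -> bool {
    !matches!(raw.map(str::trim), Some("0"))
}

/// Legacy behaviour that is off unless the variable is exactly "1" (e.g. AGENTFS_FUSE_FLUSH_INVAL=1).
pub fn default_off(raw: Option<&str>) -> bool {
    matches!(raw.map(str::trim), Some("1"))
}
```

A caller would pass the environment value, for instance `default_on(std::env::var("AGENTFS_FUSE_CACHE_DIR").ok().as_deref())`.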

Co-authored-by: factory-droid[bot] <138933559+factory-droid[bot]@users.noreply.github.com>
…r = OPEN+FLUSH round trips, next levers logged

Co-authored-by: factory-droid[bot] <138933559+factory-droid[bot]@users.noreply.github.com>
…drop to fingerprint revalidation

Upper/Delta files never got FOPEN_KEEP_CACHE (Layer::Base-only) and the
sticky dropped set revoked eligibility forever after a file's first write,
so every git-created file paid full FUSE READs on each re-open. Drop now
clears the fingerprint and the next read-only open revalidates against
fresh stats; AgentFS grants keep-cache for regular files and the overlay
delegates Delta inodes. Git workload: grants 20->1694, READs 2548->519,
dispatches -5.3%, paired wall 0.906; status 0.71x, diff sub-native,
read_search 2.25x. Kill switches: AGENTFS_KEEPCACHE_DELTA=0,
AGENTFS_FUSE_STICKY_KEEPCACHE_DROP=1.
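
The difference between a sticky dropped set and fingerprint revalidation can be shown with a small model. This is a deliberately conservative sketch, not the adapter's policy: it only grants keep-cache when a previously recorded fingerprint matches the fresh stats, and `Fingerprint`, `KeepCacheTable`, and the choice of size plus mtime as the fingerprint are assumptions.

```rust
use std::collections::HashMap;

/// Cheap identity of a file's contents as seen by the last read-only open.
#[derive(Clone, Copy, PartialEq, Eq, Debug)]
pub struct Fingerprint {
    pub size: u64,
    pub mtime_ns: i128,
}

/// Per-inode fingerprints recorded at open time.
#[derive(Default)]
pub struct KeepCacheTable {
    seen: HashMap<u64, Fingerprint>,
}

impl KeepCacheTable {
    /// Called on a read-only open with fresh stats. Returns true when the
    /// kernel may keep its cached pages for this inode.
    pub fn on_read_open(&mut self, ino: u64, fresh: Fingerprint) -> bool {
        let keep = self.seen.get(&ino) == Some(&fresh);
        // Record what this open saw so the next unchanged open can keep the cache.
        self.seen.insert(ino, fresh);
        keep
    }

    /// Called when a write invalidates the inode: forget the fingerprint
    /// instead of excluding the inode from keep-cache forever.
    pub fn on_drop(&mut self, ino: u64) {
        self.seen.remove(&ino);
    }
}
```

In this model a write costs one uncached re-open, after which the inode is eligible again; a sticky set would have kept it ineligible for the life of the mount.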

Co-authored-by: factory-droid[bot] <138933559+factory-droid[bot]@users.noreply.github.com>
…t+fsck under 1.5x; residual = OPEN+FLUSH round trips

Co-authored-by: factory-droid[bot] <138933559+factory-droid[bot]@users.noreply.github.com>
Per-CPU io_uring queues serve requests via REGISTER/COMMIT_AND_FETCH
uring_cmds (raw SQE128 rings, no new deps), replacing the read/writev
syscall ping-pong on /dev/fuse; the legacy loop keeps running for INIT,
FORGET, INTERRUPT and notifications, and the kernel falls back to it on
any registration failure. Requests are reassembled into the classic
contiguous layout so the existing parse/dispatch/reply stack is reused;
ChannelSender becomes Fd|Uring. INIT advertises FUSE_OVER_IO_URING only
behind the env gate + kernel offer + ring-setup probe, with max_write
clamped to 1MiB to bound ring memory. Requires fuse.enable_uring=1.
Eval: phase8 repeated-read 3.00x -> 1.81x, base-read steady-state -34%,
git workload parity (clone is SQLite-bound), all correctness gates and
equivalence green. Knobs: AGENTFS_FUSE_URING_DEPTH, _SPIN_US.

Co-authored-by: factory-droid[bot] <138933559+factory-droid[bot]@users.noreply.github.com>
…ated-read 1.81x), opt-in pending idle-host A/B

Co-authored-by: factory-droid[bot] <138933559+factory-droid[bot]@users.noreply.github.com>
The first FLUSH performs its normal drain work and replies ENOSYS,
latching the kernel's connection-wide no_flush so every later close()
skips the FLUSH round trip (the kernel pushes dirty writeback pages via
write_inode_now before checking no_flush, so no data bypasses the
adapter). The buffered tail a closed handle leaves behind until the
async RELEASE is sealed by always-on pending-tail guards: a
pending_dirty_handles atomic gives attr-bearing paths a free fast path,
lookup drains and refetches, readdirplus intersects entries with
pending inodes and refetches once, link drains before the SDK call, and
setattr's drain is now unconditional. These guards also close the
pre-existing pre-close staleness window.
New gate scripts/validation/flush-coherence.py races stat / scandir /
link-stat / read against RELEASE under {flush,noflush} x {default TTL,
entry TTL 0}: 4/4 pass, one FLUSH op total (vs 242), zero mismatches.
Eval: open/read/close cycle 61.7us -> 31.2us (-49%), 26.4us compound
with uring (-57%); repeated-read gate 3.00x -> 1.96x; read-path paired
wall 0.823; git workload parity over 7 pairs. Kill switch
AGENTFS_FUSE_NOFLUSH=0; forced off under AGENTFS_DRAIN_ON_RELEASE=1.
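
The latch can be pictured with a toy model of the two sides of the connection. This is a simulation under stated assumptions, not kernel or fuser code: `KernelConn` and `flush_then_enosys` are hypothetical, and the pending-tail guards are not modelled.

```rust
/// errno value for "function not implemented" on Linux.
pub const ENOSYS: i32 = 38;

/// Toy model of the kernel side of one FUSE connection.
#[derive(Default)]
pub struct KernelConn {
    no_flush: bool,
    pub flush_requests_sent: u64,
}

impl KernelConn {
    /// Model of close(2): send FLUSH unless an earlier reply latched no_flush.
    pub fn close_file<F>(&mut self, mut daemon_flush: F)
    where
        F: FnMut() -> Result<(), i32>,
    {
        if self.no_flush {
            return;
        }
        self.flush_requests_sent += 1;
        if daemon_flush() == Err(ENOSYS) {
            // Connection-wide latch: no later close sends FLUSH.
            self.no_flush = true;
        }
    }
}

/// Daemon-side handler: do the normal drain work, then answer ENOSYS.
pub fn flush_then_enosys(drained: &mut u64) -> Result<(), i32> {
    *drained += 1; // stands in for draining buffered writes
    Err(ENOSYS)
}
```

Running 242 closes through this model produces a single FLUSH request, in line with the "one FLUSH op total (vs 242)" figure reported by the gate.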

Co-authored-by: factory-droid[bot] <138933559+factory-droid[bot]@users.noreply.github.com>
…lt on, coherence gate added

Co-authored-by: factory-droid[bot] <138933559+factory-droid[bot]@users.noreply.github.com>
…ies stats

keep_cache_for_read_open now returns the Stats it consulted so the
adapter fingerprints the grant without a second getattr, and the
adapter grants directly from its own epoch-guarded attr cache when the
delta keep-cache gate is on, skipping the SDK probe entirely (SDK
getattrs in the read_search phase: 207 -> 0 per run). Wall-time
neutral: the eliminated calls were mostly SDK-LRU hits; the measured
per-open floor is the two surviving SQLite SELECTs (overlay
partial_origin + AgentFS::open existence check), carried as input to
the ENOSYS-OPEN evaluation.

Co-authored-by: factory-droid[bot] <138933559+factory-droid[bot]@users.noreply.github.com>
…oor = 2 SELECTs; ENOSYS-OPEN is the lever

Co-authored-by: factory-droid[bot] <138933559+factory-droid[bot]@users.noreply.github.com>
…n (AGENTFS_FUSE_NOOPEN=1)

Replying ENOSYS to the first FUSE_OPEN latches the kernel's
connection-wide no_open: every later open(2)/close(2) completes with no
FUSE request (default fuse_file carries fh=0 + FOPEN_KEEP_CACHE) and
FUSE_RELEASE is skipped for every file, including CREATE-opened ones.
All fh=0 traffic resolves through a shared per-inode file table:
read/fsync resolve O_RDONLY, writes resolve O_RDWR (upgrading a
read-resolved entry replaces its file post-copy-up, strictly more
coherent than per-fh stale base handles), CREATE seeds the entry and
echoes fh=0, ftruncate's SETATTR fh path falls through to the same
resolution, and FORGET drains the buffered tail and drops the entry
(soft LRU cap AGENTFS_FUSE_INO_FILES_CAP, clean entries only). The
per-inode WriteBuffer joins the WS7 pending machinery (guards, counter,
flush_all_pending/destroy). Gated on the kernel offering
FUSE_NO_OPEN_SUPPORT; forced off under AGENTFS_DRAIN_ON_RELEASE.
New gate scripts/validation/noopen-coherence.py: close-race loop,
ftruncate via fh=0, O_TRUNC reopen, mmap+msync, eviction-cap and
overlay copy-up upgrade scenarios — 6/6 pass (1 open + 1 release vs
65 + 129 legacy). Light gates green; preliminary micro (loaded host):
open/read/close 64.9 -> 18.8us/cycle (4.72x -> 1.72x). Full A/B and
promotion decision deferred to an idle host.
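
A reduced model of the shared per-inode file table may help when reading the resolution rules above. It is a sketch under assumptions: `InodeFiles` and `Access` are hypothetical names, the backing file is reduced to its access mode, and the LRU cap, write buffering, and copy-up are left out.

```rust
use std::collections::HashMap;

/// Access mode of the backing file held for an inode.
#[derive(Clone, Copy, PartialEq, Eq, Debug)]
pub enum Access {
    ReadOnly,
    ReadWrite,
}

/// One shared entry per inode, used by every fh=0 request.
#[derive(Default)]
pub struct InodeFiles {
    entries: HashMap<u64, Access>,
    /// How many backing-file opens were performed.
    pub opens: u64,
}

impl InodeFiles {
    /// Resolve the backing file for an fh=0 request, opening or upgrading as needed.
    pub fn resolve(&mut self, ino: u64, want: Access) -> Access {
        match self.entries.get(&ino).copied() {
            // An existing entry serves the request unless a write needs an upgrade.
            Some(have) if have == Access::ReadWrite || want == Access::ReadOnly => have,
            _ => {
                self.opens += 1;
                self.entries.insert(ino, want);
                want
            }
        }
    }

    /// FORGET drops the entry; the next request opens a fresh one.
    pub fn forget(&mut self, ino: u64) {
        self.entries.remove(&ino);
    }
}
```

Repeated reads of one inode share a single open here, and the first write replaces a read-resolved entry with a read-write one.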

Co-authored-by: factory-droid[bot] <138933559+factory-droid[bot]@users.noreply.github.com>
…open gap classified, eval pending idle host

Co-authored-by: factory-droid[bot] <138933559+factory-droid[bot]@users.noreply.github.com>
…workloads or dead mount entries

agentfs exec and agentfs mount had no signal handling, so SIGTERM/SIGINT
killed the process without running MountHandle's unmount: the mount
table kept a dead entry (ENOTCONN for every later visitor) and exec's
workload child survived as an orphan still running inside it —
interrupted benchmark harnesses leaked both on every kill.

exec now supervises the child (select over child-exit vs
SIGTERM/SIGINT/SIGHUP; forwards SIGTERM, 5s grace, then SIGKILL), sets
PR_SET_PDEATHSIG=SIGKILL on the child so even SIGKILL on agentfs cannot
orphan it, always unmounts and removes the temp mountpoint, and exits
128+signo. The mount command runs the FUSE session on its own thread
and unmounts on the shared mount::shutdown_signal(); NFS foreground
upgrades from ctrl_c-only to the same three signals. Kill matrix:
TERM/INT fully clean (no procs, mounts, or dirs; exits 143/130), KILL
reaps the child via PDEATHSIG (lazy mount entry is the uncatchable
residual). auto_unmount was a dead end: the vendored fuser forces
allow_other with it, which requires user_allow_other in /etc/fuse.conf.
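
The forward-then-escalate policy and the exit-code convention can be illustrated in a few lines. This is a Python sketch of the pattern only, not the agentfs implementation; `stop_child` and `exit_code_for_signal` are hypothetical names, and the PR_SET_PDEATHSIG step is omitted because it needs a prctl call that the Python standard library does not expose.

```python
import signal
import subprocess
import sys


def exit_code_for_signal(signo: int) -> int:
    """Shell convention: a process ended by signal N exits with 128 + N."""
    return 128 + int(signo)


def stop_child(child: subprocess.Popen, grace_s: float = 5.0) -> int:
    """Forward SIGTERM, wait up to grace_s seconds, then escalate to SIGKILL.

    Returns the child's return code (negative signal number if it was killed).
    """
    child.send_signal(signal.SIGTERM)
    try:
        return child.wait(timeout=grace_s)
    except subprocess.TimeoutExpired:
        child.kill()  # SIGKILL cannot be caught or ignored
        return child.wait()


if __name__ == "__main__":
    # Stand-in workload: a child that would otherwise sleep for a minute.
    workload = subprocess.Popen([sys.executable, "-c", "import time; time.sleep(60)"])
    rc = stop_child(workload)
    print("child return code:", rc)
    print("supervisor would exit with:", exit_code_for_signal(signal.SIGTERM))
```

On Linux, SIGTERM is 15 and SIGINT is 2, which gives the 143 and 130 exit codes quoted in the kill matrix.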

Co-authored-by: factory-droid[bot] <138933559+factory-droid[bot]@users.noreply.github.com>
Idle-host A/B: micro open/read/close 47.3 -> 21.2us/cycle (paired
median 0.469); git workload read_search -56..-83%, diff -57..-62%,
status -20..-47%, checkout -22..-34%, fsck -18..-34%, edit and clone
neutral; read-path benchmark neutral (same-run normalized 2.54x ->
2.25x). Correctness with the new default: noopen-coherence 6/6,
flush-coherence 4/4, metadata-mutation, serialization stress,
writeback durability, no-fsync crash, 275 unit tests. Still requires
kernel FUSE_NO_OPEN_SUPPORT and stays off under
AGENTFS_DRAIN_ON_RELEASE.
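
A note on the "paired median" figure: one common way to compute it is to take the after/before ratio for each paired run and then the median of those ratios. The sketch below assumes that definition (the harness's exact method is not shown here), and the sample numbers in it are purely illustrative, not campaign data.

```python
from statistics import median


def paired_median_ratio(before, after):
    """Median of per-pair after/before ratios; below 1.0 means 'after' is faster."""
    if len(before) != len(after) or not before:
        raise ValueError("need equally sized, non-empty paired samples")
    return median(a / b for b, a in zip(before, after))


if __name__ == "__main__":
    # Illustrative latencies in microseconds per cycle, one pair per run.
    before = [10.0, 20.0, 30.0]
    after = [5.0, 10.0, 30.0]
    print(paired_median_ratio(before, after))
```

Under this definition the result need not equal the ratio of the two summary latencies, which may be why 0.469 differs slightly from 21.2 / 47.3 (about 0.448).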

Co-authored-by: factory-droid[bot] <138933559+factory-droid[bot]@users.noreply.github.com>
…k RCA recorded

Co-authored-by: factory-droid[bot] <138933559+factory-droid[bot]@users.noreply.github.com>
…t handle drops

Unlink and rename-replace reaped fs_inode/fs_data rows the moment nlink
hit 0, so any I/O on an unlinked-but-open file failed — even the
close-time writeback mtime SETATTR — in both the per-fh and noopen fh
paths.

Every user-visible AgentFSFile now carries an RAII guard in a shared
OpenInodes registry (the batcher's ephemeral internal handles opt out).
All four deletion sites (public and trait variants of unlink and of rename overwrite)
skip row deletion while handles are live, leaving nlink = 0 as the
crash-safe orphan marker; the last handle drop queues the ino and
process_deferred_reaps (hooked at trait unlink/rmdir/rename and
finalize, nlink=0-guarded against rowid reuse) deletes the rows in one
transaction. A mount-time sweep collects crash-stranded orphans. The
integrity invariant namespace.non_root_inode_has_dentry now admits the
orphan state (dentry-less iff nlink = 0).
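
The deferred-reap rule is easier to see in a toy model. The sketch below is a Python illustration of the idea, not the SDK code: `ToyFs`, `Handle`, and the dict-backed "rows" are hypothetical stand-ins, and an explicit `close()` plays the role of the RAII guard's drop.

```python
class OpenInodes:
    """Toy registry: ino -> live handle count, plus inos queued for reaping."""

    def __init__(self):
        self.live = {}
        self.deferred = []

    def acquire(self, ino):
        self.live[ino] = self.live.get(ino, 0) + 1
        return Handle(self, ino)

    def _release(self, ino):
        self.live[ino] -= 1
        if self.live[ino] == 0:
            del self.live[ino]
            self.deferred.append(ino)  # last handle gone: queue for a later reap


class Handle:
    """Stand-in for the guard carried by each user-visible file handle."""

    def __init__(self, registry, ino):
        self.registry, self.ino, self.closed = registry, ino, False

    def close(self):
        if not self.closed:  # idempotent, like a drop that runs once
            self.closed = True
            self.registry._release(self.ino)


class ToyFs:
    def __init__(self):
        self.registry = OpenInodes()
        self.nlink = {}  # stands in for fs_inode rows
        self.data = {}   # stands in for fs_data rows

    def create(self, ino, content):
        self.nlink[ino], self.data[ino] = 1, content

    def open(self, ino):
        return self.registry.acquire(ino)

    def _reap_if_orphan(self, ino):
        # Delete rows only for an nlink == 0 inode with no live handles.
        if self.nlink.get(ino) == 0 and ino not in self.registry.live:
            del self.nlink[ino]
            self.data.pop(ino, None)

    def unlink(self, ino):
        self.nlink[ino] -= 1           # nlink == 0 doubles as the orphan marker
        self._reap_if_orphan(ino)      # skipped while handles are live
        self.process_deferred_reaps()

    def process_deferred_reaps(self):
        pending, self.registry.deferred = self.registry.deferred, []
        for ino in pending:
            self._reap_if_orphan(ino)  # nlink guard skips still-linked inodes

    def sweep_orphans(self):
        """Mount-time sweep: no handles can exist yet, so reap every nlink == 0 row."""
        orphans = [ino for ino, n in self.nlink.items() if n == 0]
        for ino in orphans:
            del self.nlink[ino]
            self.data.pop(ino, None)
        return orphans
```

In this model, unlinking an open file leaves its rows readable with `nlink == 0` until the handle closes and the next `process_deferred_reaps()` runs; a crash in between leaves exactly the state `sweep_orphans()` collects.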

noopen-coherence scenario 5 restored to full POSIX assertions
(read-back, write-through, fsync, st_nlink==0, clean close): 6/6 PASS
in both modes. Two new SDK tests cover deferred reap and the mount
sweep; test_delete_file_removes_all_chunks now closes its handle before
remove, per the new contract. Documented residuals: ino_files LRU-cap
eviction under noopen can drop the SDK handle before the kernel fd
closes (>65k simultaneous inodes), and a second mount's sweep cannot
see this process's handles; both are equivalent to or better than the
pre-fix instant reap.

Co-authored-by: factory-droid[bot] <138933559+factory-droid[bot]@users.noreply.github.com>
…ile-open followup closed

Co-authored-by: factory-droid[bot] <138933559+factory-droid[bot]@users.noreply.github.com>
…thetic only via --synthetic

Bare invocations silently generated a 96x1KB synthetic repo, which twice
produced scoreboard-incomparable ratios (most recently the 07-02 WS9 A/B,
mis-attributed to a kernel baseline shift). The canonical fixture is now
the no-flag default with a stderr note, --synthetic is the explicit
opt-out (warning on missing-fixture fallback), and --read-bytes defaults
to the canonical 4096.
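
The resulting flag contract can be sketched with argparse. This is an illustration of the described defaults only; the parser layout, help strings, and the `choose_fixture` helper are hypothetical, not the benchmark script's actual code.

```python
import argparse
import sys


def build_parser() -> argparse.ArgumentParser:
    p = argparse.ArgumentParser(description="read-path benchmark (illustrative sketch)")
    p.add_argument("--synthetic", action="store_true",
                   help="opt out of the canonical fixture and generate a synthetic repo")
    p.add_argument("--read-bytes", type=int, default=4096,
                   help="bytes read per file (canonical default: 4096)")
    return p


def choose_fixture(args: argparse.Namespace, fixture_available: bool) -> str:
    """Canonical by default; synthetic only on request or as a warned fallback."""
    if args.synthetic:
        return "synthetic"
    if fixture_available:
        print("note: using canonical fixture", file=sys.stderr)
        return "canonical"
    print("warning: canonical fixture missing, falling back to synthetic",
          file=sys.stderr)
    return "synthetic"


if __name__ == "__main__":
    args = build_parser().parse_args()
    print(choose_fixture(args, fixture_available=True), args.read_bytes)
```

The point of the design is that a scoreboard-incomparable run can no longer happen silently: the synthetic path is either requested explicitly or announced on stderr.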

Co-authored-by: factory-droid[bot] <138933559+factory-droid[bot]@users.noreply.github.com>
…-fixture; measurement contract pinned; WS9 promotion provisional pending codex re-run

Co-authored-by: factory-droid[bot] <138933559+factory-droid[bot]@users.noreply.github.com>
…motion final; uring equal-or-better on every codex phase

Co-authored-by: factory-droid[bot] <138933559+factory-droid[bot]@users.noreply.github.com>
…es — no more /tmp husks

Co-authored-by: factory-droid[bot] <138933559+factory-droid[bot]@users.noreply.github.com>
…URING=0

Codex A/B (n=5): equal-or-better on every phase — total 3.37x -> 2.92x,
status 0.93x -> 0.60x, read_search 1.41x -> 1.37x, clone -3%; the
synthetic-fixture write-phase regression that kept WS6 opt-in was a
toy-workload artifact. Safe as a default because INIT only advertises
FUSE_OVER_IO_URING after the ring-setup probe succeeds (requires root
sysctl fuse.enable_uring=1); everything else stays on the legacy
/dev/fuse channel. Gates green under the new default: noopen/flush
coherence, serialization, durability, metadata-mutation, 109 CLI tests.

Co-authored-by: factory-droid[bot] <138933559+factory-droid[bot]@users.noreply.github.com>
…TX_BLOCKS invalidation under writeback cache; accepted as floor

Co-authored-by: factory-droid[bot] <138933559+factory-droid[bot]@users.noreply.github.com>
…whole-state double write, edit micro floor already <=3ms; deferred SETATTR third codex parity

Co-authored-by: factory-droid[bot] <138933559+factory-droid[bot]@users.noreply.github.com>
…1s -> 0.754s (2.58x -> 2.22x) on codex

ImportSession holds one pooled connection and the dir-path->ino map across
chunk calls; agentfs clone imports directories up front, then overlaps
blob parsing with bounded-channel import chunks. Also de-flakes
overlay_reads_flag_off test (global counter -> per-inode has_pending).
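
The shape of the streamed pipeline (directories first, then parsing overlapped with import through a bounded channel) can be sketched as follows. This is a Python toy, not the SDK: `ToyImportSession`, `streamed_import`, and the in-memory "import" are hypothetical, and a `queue.Queue` with `maxsize` stands in for the bounded channel.

```python
import queue
import threading

_DONE = object()  # sentinel marking the end of the chunk stream


def ancestor_dirs(path):
    """'a/b/c.txt' -> ['a', 'a/b']; top-level files yield []."""
    parts = path.split("/")[:-1]
    return ["/".join(parts[:i]) for i in range(1, len(parts) + 1)]


class ToyImportSession:
    """Keeps the dir-path -> ino map alive across chunk calls."""

    def __init__(self):
        self.dir_ino = {"": 1}  # root directory
        self.next_ino = 2
        self.files = {}

    def import_dirs(self, dirs):
        for d in sorted(dirs):  # a parent sorts before its children
            self.dir_ino[d] = self.next_ino
            self.next_ino += 1

    def import_chunk(self, chunk):
        for path, data in chunk:
            parent = path.rpartition("/")[0]
            self.files[path] = (self.dir_ino[parent], data)


def parse_blobs(entries, out, chunk_size):
    """Producer: batch parsed entries; put() blocks while the importer is behind."""
    chunk = []
    for entry in entries:
        chunk.append(entry)
        if len(chunk) == chunk_size:
            out.put(chunk)
            chunk = []
    if chunk:
        out.put(chunk)
    out.put(_DONE)


def streamed_import(entries, chunk_size=2, depth=2):
    session = ToyImportSession()
    session.import_dirs({d for path, _ in entries for d in ancestor_dirs(path)})
    chunks = queue.Queue(maxsize=depth)  # bounded: caps parsed-but-unimported data
    producer = threading.Thread(target=parse_blobs, args=(entries, chunks, chunk_size))
    producer.start()
    while (chunk := chunks.get()) is not _DONE:
        session.import_chunk(chunk)  # runs while the producer parses the next chunk
    producer.join()
    return session
```

Importing directories up front means every file in a chunk can resolve its parent ino from the session's map without another lookup, and the bounded queue keeps memory flat regardless of repository size.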

Co-authored-by: factory-droid[bot] <138933559+factory-droid[bot]@users.noreply.github.com>
…VM-validated — GETATTR storm 1095 -> 70, storm cycle 2.2x faster, du/mmap correctness intact

Co-authored-by: factory-droid[bot] <138933559+factory-droid[bot]@users.noreply.github.com>
…ks, and kernel artifacts are the durable record

Co-authored-by: factory-droid[bot] <138933559+factory-droid[bot]@users.noreply.github.com>
@factory-ain3sh factory-ain3sh merged commit 059bd52 into main Jul 3, 2026
24 of 34 checks passed
@factory-ain3sh factory-ain3sh deleted the phase4-north-star-implementation branch July 3, 2026 03:41