feat: encrypted replication for private subtrees (B1/B2/B3) for #18#36
feat: encrypted replication for private subtrees (B1/B2/B3) for #18#36beardthelion wants to merge 57 commits into
Conversation
…al clone upload_pack_excluding emitted a v2 packfile section, but info_refs advertises v0, so real clients negotiated v0 and rejected the response with 'expected ACK/NAK, got packfile'. Frame the v0 stateless-rpc shape instead (NAK, then the pack via side-band-64k when offered). Add an end-to-end test that stands up info_refs + upload_pack_excluding and runs a real git partial clone, asserting the withheld blob's bytes never reach the client while its tree entry and SHA stay visible. A stock full clone cannot consume the pack (it is not closed under reachability, so fetch fails the connectivity check); a partial clone is required.
…tion choice Add a real-git test that partial-clones, pushes a new commit server-side, then fetches: the new object arrives and the withheld blob stays absent. This pins down that ignoring have/want negotiation (always sending a self-contained pack of all refs minus withheld, with NAK) is correct for both clone and fetch; the only cost is a fetch re-sends the full object set. Refactor the real-git tests onto a shared server harness and document the negotiation decision in code and in the plan's follow-ups.
Move the two blocking git shell-outs in the filtered upload-pack path off the async worker thread, matching the tokio::process / spawn_blocking usage already in this file: build_filtered_pack (rev-list + pack-objects) and withheld_blob_oids (per-ref ls-tree) now run inside spawn_blocking so a large repo cannot stall the tokio runtime. Behavior is unchanged. Also fix the Task 0 findings block in the Phase 3 plan: it still recorded v2 packfile framing, which is the exact path that failed against a real client and was corrected to v0. The block now documents the shipped v0 contract. Drop a stray trailing code fence flagged by markdownlint (MD040). The speculative ls-tree timeout and the public/no-rules fast-path from the review are intentionally left out: the timeout guards against adversarial repos we do not yet host, and the fast-path is a micro-optimization not worth the extra branch right now.
kevincodex1 asked to keep the superpowers planning docs out of the repo. The Phase 3 plan was scaffolding for this change, not something the project needs to carry. Removing it leaves only the code and tests in the PR.
Three fixes from the PR Gitlawb#33 review: - withheld_paths now applies the whole-repo "/" read gate (returns repo-not-found when the caller cannot read the root), matching the git read endpoints. Without it the endpoint disclosed a private repo's existence and path layout to unauthorized callers. The withheld_globs doc already assumed this gate existed; now it does. - A nested allow under a denied parent (e.g. "/secret/public/**" allowed, "/secret/**" denied) was over-withheld: the client sparse-excluded the whole parent and hid paths the caller may read. The endpoint now also returns a "reinclude" list (allowed globs strictly under a denied one) and gl clone re-includes them in the sparse spec after the excludes. - Wildcard-free globs like "/docs/private" match both the exact path and a subtree (per glob_matches), but the client only emitted the subtree exclude. sparse_patterns now emits both "/docs/private" and "/docs/private/". Verified the exclude-then-reinclude sparse ordering checks out cleanly with real git, plus unit tests for reincluded_globs, the nested re-include, the exact-path exclude, and sparse_patterns.
…eld-path errors
split_once('/') accepted owner/name/extra, smuggling a path segment into
the repo name that then flowed into the API path and clone URL; reject it.
fetch_withheld swallowed every network/auth/5xx/JSON error into an empty
result, dropping to a stock clone that the node refuses once blobs are
withheld. Now only 404/501 (endpoint unsupported) fall back to empty; the
rest propagate so the real cause surfaces.
The receive-pack replication chokepoint called withheld_blob_oids directly on the tokio worker, where its blocking git ls-tree walk can stall the runtime for repos with many refs. Wrap it in spawn_blocking to match the upload-pack serve path.
Add a regression test for deny /secret, allow /secret/public, deny /secret/public/admin and clarify the sparse-checkout comment. git does not re-traverse an explicitly excluded directory, so emitting all excludes before re-includes keeps the deepest deny in force; the broader parent re-include does not resurrect it.
…hema Read the default branch from the local origin/HEAD symref clone already set, instead of parsing the localized, network-dependent output of git remote show origin. Deserialize the withheld-paths body into a typed struct so a missing or mistyped withheld/reinclude field is a hard error rather than silently becoming an empty list, which would mask a server regression behind a later clone failure.
Give the sync worker's HTTP client a 30s timeout so a stalled peer withheld-paths lookup cannot hang the worker loop. When a repo that was mirrored as a promisor (mode B) becomes fully public, fetch_repo now clears remote.origin.promisor and partialclonefilter and refetches, so the once-withheld blobs are backfilled instead of the mirror staying permanently partial.
At the push chokepoint, after pinning withheld objects, resolve each withheld blob's recipient DIDs and seal it to their Ed25519 keys with seal_blob, pinning the ciphertext to IPFS and recording it in encrypted_blobs. Best-effort per blob: failures are logged and skipped, never pinned in plaintext. Pinata replication is unchanged; B1 encrypts to IPFS only. Adds ed25519-dalek as a direct dependency of gitlawb-node (it was only declared in the workspace Cargo.toml).
Skip re-pinning only when an existing envelope already covers exactly the current recipients. A reader added to a rule after the first pin now gets a re-seal on the next push instead of being permanently locked out. Reader removal stays non-retroactive (the old envelope is already public).
…or present-check Add two hermetic integration tests for recover_from_arweave that drive the full read path over mocked Arweave GraphQL + IPFS gateways: discover the manifest, fetch it, fetch the envelope, decrypt, and install the withheld blob. One covers an authorized recipient (blob installed), the other a non-recipient (nothing recovered). Both simulate origin death by removing the promisor remote and enable uploadpack.allowFilter so the blob is truly withheld over file://. Also harden the local presence check in recover_from_arweave with GIT_NO_LAZY_FETCH=1 and .output() so the expected 'missing object' case does not trigger a wasted promisor fetch or leak git stderr to the user.
|
No actionable comments were generated in the recent review. 🎉 ℹ️ Recent review info⚙️ Run configurationConfiguration used: defaults Review profile: CHILL Plan: Pro Plus Run ID: 📒 Files selected for processing (2)
🚧 Files skipped from review as they are similar to previous changes (2)
📝 WalkthroughWalkthroughAdds envelope encryption for withheld blobs, visibility-driven blob withholding and filtered upload-pack serving, DB persistence and encrypted-pin/replication, Arweave manifest anchoring, sync worker promisor-aware mirroring, and a client ChangesEncrypted Blob End-to-End Support
Estimated code review effort🎯 4 (Complex) | ⏱️ ~60 minutes Possibly related PRs
Suggested reviewers
Poem
🚥 Pre-merge checks | ✅ 5✅ Passed checks (5 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Comment |
There was a problem hiding this comment.
Actionable comments posted: 3
🧹 Nitpick comments (6)
crates/gitlawb-core/src/identity.rs (1)
55-59: 💤 Low valueConsider using
Zeroizingfor consistency withto_seed.The new
seed_bytesreturns the raw seed withoutZeroizing, while the existingto_seedwraps it. Since the seed is sensitive key material, consider returningZeroizing<[u8; 32]>here as well for consistent memory hygiene. The caller inencrypt.rscan dereference it.♻️ Suggested change
- pub fn seed_bytes(&self) -> [u8; 32] { - self.signing_key.to_bytes() + pub fn seed_bytes(&self) -> Zeroizing<[u8; 32]> { + Zeroizing::new(self.signing_key.to_bytes()) }Then in
encrypt.rs:- let my_x = XSecret::from(x25519_secret_from_seed(&keypair.seed_bytes())); + let my_x = XSecret::from(x25519_secret_from_seed(&keypair.seed_bytes()));(No change needed in encrypt.rs since
Zeroizing<[u8; 32]>derefs to[u8; 32].)🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@crates/gitlawb-core/src/identity.rs` around lines 55 - 59, seed_bytes currently returns a plain [u8; 32] exposing sensitive seed material; change its signature to pub fn seed_bytes(&self) -> Zeroizing<[u8; 32]> and return Zeroizing::new(self.signing_key.to_bytes()) and add use zeroize::Zeroizing at the top; keep to_seed as-is and note callers (e.g., encrypt.rs) need no changes because Zeroizing<[u8; 32]> derefs to [u8; 32].crates/gitlawb-node/src/encrypted_pin.rs (1)
49-63: 💤 Low valueConsider logging warnings for silent failures.
The
read_objectandpin_git_objectfailures silently continue without logging, unlikeseal_blobandrecord_encrypted_blobfailures which log warnings. For operational visibility, consider adding warnings here as well.♻️ Suggested change
let data = match crate::git::store::read_object(repo_path, oid) { Ok(Some((_t, bytes))) => bytes, - _ => continue, + Ok(None) => { + tracing::warn!(oid = %oid, "blob not found in git store; skipping"); + continue; + } + Err(e) => { + tracing::warn!(oid = %oid, err = %e, "read_object failed; skipping"); + continue; + } }; // ... seal_blob ... let cid = match crate::ipfs_pin::pin_git_object(ipfs_api, oid, &envelope).await { Ok(c) if !c.is_empty() => c, - _ => continue, + Ok(_) => { + tracing::warn!(oid = %oid, "IPFS pin returned empty CID; skipping"); + continue; + } + Err(e) => { + tracing::warn!(oid = %oid, err = %e, "IPFS pin failed; skipping"); + continue; + } };🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@crates/gitlawb-node/src/encrypted_pin.rs` around lines 49 - 63, Add warning logs for the branches that currently silently continue so operational failures are visible: when crate::git::store::read_object(repo_path, oid) returns Err or Ok(None), emit a tracing::warn! including oid and repo_path before continuing; likewise, when crate::ipfs_pin::pin_git_object(ipfs_api, oid, &envelope).await returns Err or an empty cid, emit a tracing::warn! including oid and ipfs_api (or identifying info) and the result before continuing; keep the existing tracing style used for seal_blob and record_encrypted_blob to stay consistent.crates/gl/src/clone.rs (2)
241-336: ⚡ Quick winSilent failure on encrypted-blobs list fetch hinders troubleshooting.
Lines 255-261 return an empty
Vecwhen the/encrypted-blobsrequest fails, without logging the failure reason. Users won't know whether recovery was skipped because:
- No encrypted blobs exist (legitimate),
- Network error reaching the node,
- Authorization failure, or
- Node internal error.
Adding a debug or info log on non-success responses would help users diagnose recovery issues without changing the best-effort semantics.
♻️ Suggested improvement: log fetch failures
let resp = match client .get_signed(&format!("/api/v1/repos/{owner}/{name}/encrypted-blobs")) .await { Ok(r) if r.status().is_success() => r, - _ => return Ok(vec![]), + Ok(r) => { + tracing::debug!(status = %r.status(), "encrypted-blobs fetch failed; skipping node-based recovery"); + return Ok(vec![]); + } + Err(e) => { + tracing::debug!(err = %e, "encrypted-blobs request error; skipping node-based recovery"); + return Ok(vec![]); + } };🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@crates/gl/src/clone.rs` around lines 241 - 336, The initial fetch of "/api/v1/repos/{owner}/{name}/encrypted-blobs" in recover_encrypted_blobs silently returns Ok(vec![]) on any non-success result; update the match to log the failure details (status code and/or response body/error) before returning so operators can distinguish "no blobs" from network/auth/node errors — capture the Err or non-success branch from client.get_signed(...) (and the response when present) and emit a processLogger/info/debug or eprintln! with context including node, owner, name and the response status/body or error, then continue returning Ok(vec![]) to preserve best-effort behavior.
204-225: 💤 Low value404 response ambiguity: endpoint-not-implemented vs repo-not-found.
Line 216 treats HTTP 404 and 501 as "no withheld paths" for backward compatibility with older nodes that don't implement the endpoint. However, a 404 could also mean the repository
{owner}/{name}doesn't exist.When the repo is missing, the function returns
Ok((Vec::new(), Vec::new()))and proceeds to clone, only to fail later at the git fetch step with a less clear error. This indirection makes typos in repo names harder to diagnose.Consider logging a warning when receiving 404 to help users distinguish between "old node" and "typo in repo name."
♻️ Optional improvement: log 404 for easier debugging
let resp = match resp { Ok(r) if r.status().is_success() => r, - Ok(r) if matches!(r.status().as_u16(), 404 | 501) => return Ok((Vec::new(), Vec::new())), + Ok(r) if matches!(r.status().as_u16(), 404 | 501) => { + tracing::debug!( + status = r.status().as_u16(), + "withheld-paths endpoint returned {}: assuming no withheld paths (old node or repo not found)", + r.status() + ); + return Ok((Vec::new(), Vec::new())); + } Ok(r) => bail!("withheld-paths lookup failed: {}", r.status()), Err(err) => return Err(err).context("fetching withheld paths"), };🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@crates/gl/src/clone.rs` around lines 204 - 225, In fetch_withheld, when treating HTTP 404/501 as "no withheld paths" broadened for backward compatibility, add a warning log specifically for 404 responses so repo-not-found vs unimplemented endpoint is visible; update the match that currently returns Ok((Vec::new(), Vec::new())) for status 404 to call the logger (e.g., process_logger.warn or crate logger) with context including node, owner, name and the 404 status before returning empty vectors, while keeping 501 silent for compatibility; reference the fetch_withheld function and the NodeClient.get/NodeClient.get_signed response handling to locate where to insert the warning.crates/gitlawb-node/src/api/encrypted.rs (1)
12-32: 💤 Low valueConsider returning 401 for unauthenticated callers instead of empty list.
Line 17 defaults to
""when no authentication is present. Unauthenticated callers will receive{"blobs": []}(since no blob is authorized to DID""), which is indistinguishable from "authenticated but authorized for zero blobs."Returning a 401 Unauthorized for missing auth would make the security boundary more explicit and help clients distinguish authentication failures from empty result sets.
♻️ Optional improvement: explicit auth requirement
pub async fn list_encrypted_blobs( State(state): State<AppState>, auth: Option<Extension<AuthenticatedDid>>, Path((owner, repo)): Path<(String, String)>, ) -> Result<Json<serde_json::Value>> { - let caller = auth.as_ref().map(|e| e.0 .0.as_str()).unwrap_or(""); + let caller = auth + .as_ref() + .map(|e| e.0 .0.as_str()) + .ok_or_else(|| AppError::Unauthorized("authentication required".into()))?; let record = state .db .get_repo(&owner, &repo) .await? .ok_or_else(|| AppError::RepoNotFound(format!("{owner}/{repo}")))?; let rows = state .db .list_encrypted_blobs_for(&record.id, caller) .await?; let blobs: Vec<_> = rows .into_iter() .map(|(oid, cid)| serde_json::json!({ "oid": oid, "cid": cid })) .collect(); Ok(Json(serde_json::json!({ "blobs": blobs }))) }🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@crates/gitlawb-node/src/api/encrypted.rs` around lines 12 - 32, The handler list_encrypted_blobs currently treats missing auth as caller = "" causing unauthenticated requests to get an empty blob list; instead detect when auth is None and return a 401 Unauthorized error. Update list_encrypted_blobs: replace the caller defaulting logic with an early check on auth (the Extension<AuthenticatedDid> argument) and return Err(AppError::Unauthorized(...)) or the project’s equivalent unauthorized error when auth.is_none(); keep the rest of the function (db lookups and blob mapping) unchanged so only authenticated requests proceed.crates/gitlawb-node/src/ipfs_pin.rs (1)
76-86: ⚡ Quick winIPFS error response body is discarded, hindering diagnostics.
Line 83 constructs the error message using only the HTTP status code. IPFS often returns helpful error details in the response body (e.g., "block not found", "timeout reading from datastore") that would aid debugging when recovery fails.
♻️ Suggested improvement: include response body in error
pub async fn cat(ipfs_api: &str, cid: &str) -> Result<Vec<u8>> { if ipfs_api.is_empty() { return Err(anyhow::anyhow!("IPFS not configured")); } let url = format!("{}/api/v0/cat?arg={}", ipfs_api.trim_end_matches('/'), cid); let resp = reqwest::Client::new().post(&url).send().await?; if !resp.status().is_success() { - return Err(anyhow::anyhow!("ipfs cat {cid}: {}", resp.status())); + let status = resp.status(); + let body = resp.text().await.unwrap_or_default(); + return Err(anyhow::anyhow!("ipfs cat {cid}: {status}: {body}")); } Ok(resp.bytes().await?.to_vec()) }🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@crates/gitlawb-node/src/ipfs_pin.rs` around lines 76 - 86, The cat function currently returns an error using only resp.status(); update the failure path in fn cat(ipfs_pin::cat / cat) to read the response body (e.g., let body = resp.text().await.unwrap_or_default()) when resp.status().is_success() is false and include that body (or a truncated version) in the anyhow::anyhow! error message so IPFS error details (like "block not found") are preserved; ensure you await resp.text() only on the error branch (so you don't consume the body prematurely) and handle any text read errors gracefully (fallback to empty or placeholder) before constructing the final error.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@crates/gitlawb-node/src/api/encrypted.rs`:
- Around line 36-56: The handler get_encrypted_blob currently maps a None from
state.db.encrypted_blob_cid(...) to AppError::RepoNotFound, which is incorrect;
change that error to a more specific variant such as
AppError::BlobNotFound(format!("{owner}/{repo}/{oid}")) or
AppError::AccessDenied when you can detect the caller is unauthorized. Locate
the call in get_encrypted_blob and replace the ok_or_else(||
AppError::RepoNotFound(...)) with ok_or_else(|| AppError::BlobNotFound(...)) or,
if you can distinguish authorization from missing blob within encrypted_blob_cid
(or by adding a helper), return AppError::AccessDenied for unauthorized callers
and AppError::BlobNotFound for genuinely missing OIDs. Ensure you update any
error variant names you use to match existing AppError variants (or add them to
AppError if missing).
In `@crates/gitlawb-node/src/db/mod.rs`:
- Around line 1663-1684: Replace the current silent JSON fallback for recipients
(serde_json::from_str(&recipients).unwrap_or_default()) with explicit
deserialization error handling: in list_encrypted_blobs_for,
list_all_encrypted_blobs, encrypted_blob_cid, and encrypted_blob_recipients,
call serde_json::from_str(&recipients) and on Err log a clear corruption message
(including repo_id and oid/cid where available) and return a propagated error
instead of an empty Vec so callers see the failure; use map_err / context to
convert the serde_json error into the function's Result error type and ensure
the log distinguishes malformed recipients JSON from other DB errors.
In `@crates/gl/src/clone.rs`:
- Around line 563-576: The code in clone.rs silently drops I/O and git errors
when updating the sparse-checkout and running checkout (the read_to_string call,
write to .git/info/sparse-checkout, and git(&dest, &["checkout", "--", "."])
invocations), which can leave recovered files unmaterialized; update this block
to handle the Results instead of using let _ = — e.g., check and log (or return)
errors from std::fs::read_to_string(&spec), std::fs::write(&spec, s), and the
git(...) call, and emit a clear warning mentioning the affected paths/spec so
users know sparse-checkout or checkout failed and recovered files may be missing
(use the local variables spec and paths to construct the message).
---
Nitpick comments:
In `@crates/gitlawb-core/src/identity.rs`:
- Around line 55-59: seed_bytes currently returns a plain [u8; 32] exposing
sensitive seed material; change its signature to pub fn seed_bytes(&self) ->
Zeroizing<[u8; 32]> and return Zeroizing::new(self.signing_key.to_bytes()) and
add use zeroize::Zeroizing at the top; keep to_seed as-is and note callers
(e.g., encrypt.rs) need no changes because Zeroizing<[u8; 32]> derefs to [u8;
32].
In `@crates/gitlawb-node/src/api/encrypted.rs`:
- Around line 12-32: The handler list_encrypted_blobs currently treats missing
auth as caller = "" causing unauthenticated requests to get an empty blob list;
instead detect when auth is None and return a 401 Unauthorized error. Update
list_encrypted_blobs: replace the caller defaulting logic with an early check on
auth (the Extension<AuthenticatedDid> argument) and return
Err(AppError::Unauthorized(...)) or the project’s equivalent unauthorized error
when auth.is_none(); keep the rest of the function (db lookups and blob mapping)
unchanged so only authenticated requests proceed.
In `@crates/gitlawb-node/src/encrypted_pin.rs`:
- Around line 49-63: Add warning logs for the branches that currently silently
continue so operational failures are visible: when
crate::git::store::read_object(repo_path, oid) returns Err or Ok(None), emit a
tracing::warn! including oid and repo_path before continuing; likewise, when
crate::ipfs_pin::pin_git_object(ipfs_api, oid, &envelope).await returns Err or
an empty cid, emit a tracing::warn! including oid and ipfs_api (or identifying
info) and the result before continuing; keep the existing tracing style used for
seal_blob and record_encrypted_blob to stay consistent.
In `@crates/gitlawb-node/src/ipfs_pin.rs`:
- Around line 76-86: The cat function currently returns an error using only
resp.status(); update the failure path in fn cat(ipfs_pin::cat / cat) to read
the response body (e.g., let body = resp.text().await.unwrap_or_default()) when
resp.status().is_success() is false and include that body (or a truncated
version) in the anyhow::anyhow! error message so IPFS error details (like "block
not found") are preserved; ensure you await resp.text() only on the error branch
(so you don't consume the body prematurely) and handle any text read errors
gracefully (fallback to empty or placeholder) before constructing the final
error.
In `@crates/gl/src/clone.rs`:
- Around line 241-336: The initial fetch of
"/api/v1/repos/{owner}/{name}/encrypted-blobs" in recover_encrypted_blobs
silently returns Ok(vec![]) on any non-success result; update the match to log
the failure details (status code and/or response body/error) before returning so
operators can distinguish "no blobs" from network/auth/node errors — capture the
Err or non-success branch from client.get_signed(...) (and the response when
present) and emit a processLogger/info/debug or eprintln! with context including
node, owner, name and the response status/body or error, then continue returning
Ok(vec![]) to preserve best-effort behavior.
- Around line 204-225: In fetch_withheld, when treating HTTP 404/501 as "no
withheld paths" broadened for backward compatibility, add a warning log
specifically for 404 responses so repo-not-found vs unimplemented endpoint is
visible; update the match that currently returns Ok((Vec::new(), Vec::new()))
for status 404 to call the logger (e.g., process_logger.warn or crate logger)
with context including node, owner, name and the 404 status before returning
empty vectors, while keeping 501 silent for compatibility; reference the
fetch_withheld function and the NodeClient.get/NodeClient.get_signed response
handling to locate where to insert the warning.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro Plus
Run ID: b291abdf-f2b1-478d-aa85-b571c76c9cb1
⛔ Files ignored due to path filters (1)
Cargo.lockis excluded by!**/*.lock
📒 Files selected for processing (24)
.gitignorecrates/gitlawb-core/Cargo.tomlcrates/gitlawb-core/src/encrypt.rscrates/gitlawb-core/src/identity.rscrates/gitlawb-core/src/lib.rscrates/gitlawb-node/Cargo.tomlcrates/gitlawb-node/src/api/encrypted.rscrates/gitlawb-node/src/api/mod.rscrates/gitlawb-node/src/api/repos.rscrates/gitlawb-node/src/api/visibility.rscrates/gitlawb-node/src/arweave.rscrates/gitlawb-node/src/db/mod.rscrates/gitlawb-node/src/encrypted_pin.rscrates/gitlawb-node/src/git/mod.rscrates/gitlawb-node/src/git/smart_http.rscrates/gitlawb-node/src/git/visibility_pack.rscrates/gitlawb-node/src/ipfs_pin.rscrates/gitlawb-node/src/main.rscrates/gitlawb-node/src/pinata.rscrates/gitlawb-node/src/server.rscrates/gitlawb-node/src/sync.rscrates/gitlawb-node/src/visibility.rscrates/gl/src/clone.rscrates/gl/src/main.rs
db: parse_recipients now propagates a descriptive error instead of defaulting corrupt recipients JSON to an empty list, which would have denied authorized readers and handed peers incomplete metadata. gl: clone recovery now warns when the sparse-checkout file cannot be read or written, or when the post-recovery checkout fails, instead of silently discarding those errors and claiming files were recovered.
Adds encrypted replication for path-scoped private subtrees (issue #18), so a repo can be public while a withheld subtree stays durable and recoverable by authorized readers without ever exposing plaintext to the network.
What this does
Three layers, each building on the last:
gl clonetransparently recovers and installs the blobs a reader is authorized to decrypt; everyone else just sees the public tree.glreconstructs the recovery index from public Arweave + IPFS gateways and still restores the authorized blobs. Recovery is bounded by a 30s timeout per gateway.Plaintext for a withheld path is never replicated. Non-recipients can reach the envelope and the manifest but cannot decrypt anything.
Dependencies
This builds on the path-scoped visibility stack and does not stand alone. It expects these to land first:
Because none of those are merged yet, the diff here carries their changes too. The new work in this PR is the encrypted-replication commits (core envelope crypto,
encrypted_pin, theencrypted-blobsendpoints,arweavemanifest anchoring, and theglrecovery path). It will shrink to just that once the prerequisites merge.Tests
Hermetic coverage for the new paths: envelope seal/open round-trips, recipient resolution, manifest anchoring and parsing (mocked Irys/GraphQL), and read-path clone recovery over mocked Arweave + IPFS gateways for both an authorized recipient and a non-recipient. Full suites green.
Live end-to-end across two nodes with Kubo + Irys is the remaining manual step before this is production-ready.
Summary by CodeRabbit