Doghouse: avoid repo-path collisions in JSONL snapshot storage #7

@flyingrobots

Follow-up to #4.

Why

JSONLStorageAdapter._get_path() currently derives the storage directory with repo.replace("/", "_").
That aliases distinct repositories onto the same path. For example:

  • a/b_c -> a_b_c
  • a_b/c -> a_b_c
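The collision can be reproduced in isolation. This is a minimal sketch: legacy_storage_key is a hypothetical stand-in that only mirrors the replace("/", "_") step described above, not the adapter's actual method.

```python
def legacy_storage_key(repo: str) -> str:
    """Hypothetical stand-in mirroring the repo.replace("/", "_") step."""
    return repo.replace("/", "_")


# Two distinct repos collapse onto one key because "_" is already
# a legal character in repo names.
first = legacy_storage_key("a/b_c")
second = legacy_storage_key("a_b/c")
print(first, second, first == second)
```

Both calls return the same key, so snapshots for either repo would be appended to the same ledger.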

Doghouse is a flight recorder. Cross-repo history contamination is a trust failure.

Current behavior

File:

  • src/doghouse/adapters/storage/jsonl_adapter.py

Today, the mapping from repo name to storage path is not injective once / is rewritten to _, because _ is itself a valid character in repo names.
Different repos can append snapshots into the same JSONL ledger.

Desired outcome

Use a storage identity scheme that cannot alias valid repo names.

Acceptable approaches include:

  • nested directories (owner/name/pr-123.jsonl)
  • URL-safe encoding
  • a stable hash with enough context to stay inspectable
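The three approaches above could look roughly like the following. This is a sketch under assumptions, not a proposed implementation: the function names, the base directory argument, and the pr-<number>.jsonl file name are illustrative, and the repo string is assumed to be a valid owner/name pair.

```python
import hashlib
from pathlib import Path
from urllib.parse import quote


def nested_path(base: Path, repo: str, pr: int) -> Path:
    """Nested directories: owner and name stay separate path segments."""
    owner, name = repo.split("/", 1)  # assumes exactly "owner/name"
    return base / owner / name / f"pr-{pr}.jsonl"


def encoded_path(base: Path, repo: str, pr: int) -> Path:
    """URL-safe encoding: "/" becomes "%2F", "_" is left untouched."""
    return base / quote(repo, safe="") / f"pr-{pr}.jsonl"


def hashed_path(base: Path, repo: str, pr: int) -> Path:
    """Readable prefix plus a truncated SHA-256 of the exact repo string."""
    digest = hashlib.sha256(repo.encode("utf-8")).hexdigest()[:12]
    readable = repo.replace("/", "_")  # kept only for human inspection
    return base / f"{readable}-{digest}" / f"pr-{pr}.jsonl"
```

The first two mappings are reversible, so the repo name can be recovered from the path. The hashed variant relies on the digest to disambiguate; with a truncated digest a collision is improbable rather than impossible.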

Acceptance criteria

  • Two distinct repos cannot resolve to the same storage path.
  • Add a regression test proving a/b_c and a_b/c land in different paths.
  • Existing read/write behavior remains simple and local-first.
  • Any migration behavior for previously written histories is documented if needed.
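A regression test for the second criterion might be shaped like this. The storage_path helper is a hypothetical stand-in using the nested layout; the real test would call the adapter's own path logic, whose signature is not shown in this issue.

```python
from pathlib import Path


def storage_path(base: Path, repo: str, pr: int) -> Path:
    """Hypothetical stand-in for the fixed path logic (nested layout)."""
    owner, name = repo.split("/", 1)
    return base / owner / name / f"pr-{pr}.jsonl"


def test_distinct_repos_get_distinct_paths() -> None:
    base = Path("snapshots")
    first = storage_path(base, "a/b_c", 123)
    second = storage_path(base, "a_b/c", 123)
    # Under the old replace("/", "_") scheme both resolved to a_b_c.
    assert first != second
    assert first.parent != second.parent
```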

Metadata

Labels: bug (Something isn't working)