Skip to content

fix(core): preserve external_id during entity upsert on re-index#724

Merged
groksrc merged 1 commit intomainfrom
fix-preserve-external-id-on-reindex
Apr 7, 2026
Merged

fix(core): preserve external_id during entity upsert on re-index#724
groksrc merged 1 commit intomainfrom
fix-preserve-external-id-on-reindex

Conversation

@groksrc
Copy link
Copy Markdown
Member

@groksrc groksrc commented Apr 7, 2026

Summary

  • Force full re-index was generating new external_id UUIDs for every entity, breaking public share links in Basic Memory Cloud
  • Root cause: upsert_entity() merged a freshly-created entity model (with a new UUID) over the existing database record, overwriting the stable external_id
  • Fix: preserve existing_entity.external_id before session.merge() in the file_path conflict resolution path

Details

Reproduction path:

  1. Create a public share link for a note (stores note_external_id in public_share table)
  2. Trigger force full re-index (POST /admin/tenants/{id}/index?force_full=true)
  3. Share link returns 404 — entity exists but has a different external_id

Code flow:

  1. entity_indexer calls create_entity_from_markdown() without an existing entity
  2. entity_model_from_markdown() creates Entity() with external_id = uuid4()
  3. upsert_entity() detects file_path conflict, sets entity.id = existing.id
  4. session.merge() overwrites all fields including external_idbug

Fix: One line — entity.external_id = existing_entity.external_id before merge.

Test plan

  • New test test_upsert_entity_preserves_external_id — creates entity, upserts with different external_id, asserts original is preserved
  • All 234 repository tests pass
  • Lint and typecheck clean
  • fast-check passes (2 pre-existing failures unrelated to this change)

🤖 Generated with Claude Code

Force full re-index was generating new external_id UUIDs for every
entity, breaking public share links that reference entities by their
stable external_id. The upsert logic now preserves the existing
external_id when resolving file_path conflicts.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Signed-off-by: Drew Cain <groksrc@gmail.com>
@github-actions
Copy link
Copy Markdown
Contributor

github-actions bot commented Apr 7, 2026

Claude finished @groksrc's task in 1m 1s —— View job


PR Review: fix(core): preserve external_id during entity upsert on re-index

  • Read changed files
  • Verify fix correctness and logic
  • Check test coverage and quality
  • Review against team checklist

Summary

This is a minimal, targeted, and correct fix. The root cause analysis is accurate, the fix is in exactly the right place, and the test reproduces the bug faithfully.

The fix (entity_repository.py:393):

entity.external_id = existing_entity.external_id

This mirrors the existing entity.id = existing_entity.id pattern immediately above it — a natural and consistent extension of the conflict resolution logic.


Team Checklist

Code Quality & Standards

  • Follows Basic Memory's coding conventions
  • Python 3.12+ async/SQLAlchemy 2.0 patterns
  • Comment follows the project's "Decision Point Comments" style (Trigger / Why / Outcome)
  • 100-character line length maintained
  • No commented-out code

Testing & Documentation

  • New test test_upsert_entity_preserves_external_id covers the exact bug reproduction path
  • Test docstring follows project literate programming conventions
  • No integration tests required (this is a repository-layer fix, not an MCP tool)
  • No schema changes — no Alembic migration needed

Basic Memory Architecture

  • No architectural concerns — minimal change, preserves all existing behaviors
  • external_id is only preserved in the file_path conflict path, which is correct. The permalink conflict path in _handle_permalink_conflict correctly generates a new external_id since it creates a genuinely new entity.

Security & Performance

  • No security concerns
  • No performance impact

Compatibility

  • No file path comparisons
  • No emoji/unicode in code or logs

One Minor Observation (Non-blocking)

existing_entity.external_id could be None for older entities created before external_id was introduced. In that case, entity.external_id = None would discard the freshly-generated UUID. However, this is benign: if the existing entity has no external_id, there are no share links that reference it, so nothing is lost. The entity would just remain without an external_id until the next explicit assignment. No action needed, just noting it for awareness.


Verdict: LGTM. Clean fix, good test, no issues. Ready to merge.

@groksrc groksrc requested a review from phernandez April 7, 2026 15:00
@groksrc groksrc merged commit b35d594 into main Apr 7, 2026
41 checks passed
@groksrc groksrc deleted the fix-preserve-external-id-on-reindex branch April 7, 2026 15:26
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants