Add OCR fallback for scanned PDF memories #18

Merged

brianmeyer merged 1 commit into master from codex/rec-72-scanned-pdf-ocr

Mar 22, 2026

Owner

brianmeyer commented Mar 22, 2026

## Summary
- add OCR text fallback for scanned/image-only PDF pages while preserving rendered page images
- index OCR text as sibling document children so scanned pages are searchable and document-as-query can use them
- extend tests for document extraction, storage ingest, and file-path query routing

## Testing
- pytest -q tests/test_documents.py tests/test_storage.py tests/test_config_tools.py
- pytest -x -m 'not live' --tb=short
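The fallback described in the summary could look roughly like the sketch below. It is not the code from this PR: `PageInput`, `Child`, `extract_children`, the `min_chars` threshold, and the `kind` labels are all hypothetical names chosen for illustration, and the OCR engine is passed in as a plain callable so the example carries no dependency on a specific OCR library.

```python
from dataclasses import dataclass
from typing import Callable, List, Union


@dataclass
class PageInput:
    """Hypothetical per-page input: embedded text layer plus rendered image."""
    number: int
    embedded_text: str  # text extracted from the PDF text layer; empty for scans
    image_bytes: bytes  # rendered page image


@dataclass
class Child:
    """Hypothetical document child that would be indexed for search."""
    kind: str  # "page_image", "page_text", or "ocr_text"
    page: int
    content: Union[str, bytes]


def extract_children(
    pages: List[PageInput],
    ocr: Callable[[bytes], str],
    min_chars: int = 20,  # assumed threshold for "this page has a usable text layer"
) -> List[Child]:
    children: List[Child] = []
    for page in pages:
        # The rendered page image is always kept, whether or not OCR runs.
        children.append(Child("page_image", page.number, page.image_bytes))

        text = page.embedded_text.strip()
        if len(text) >= min_chars:
            # A real text layer exists, so OCR is not needed for this page.
            children.append(Child("page_text", page.number, text))
            continue

        # Scanned or image-only page: fall back to OCR on the rendered image
        # and store the result as a sibling of the page image.
        ocr_text = ocr(page.image_bytes).strip()
        if ocr_text:
            children.append(Child("ocr_text", page.number, ocr_text))
    return children
```

Keeping the OCR text as a sibling of the page image, rather than replacing it, means the image stays available for display while the text makes the page searchable and usable in document-as-query lookups.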


Add OCR fallback for scanned PDF memories

9fcee4f

brianmeyer merged commit 0ebe474 into master

4 checks passed

brianmeyer deleted the codex/rec-72-scanned-pdf-ocr branch

March 22, 2026 19:02
