
Add OCR fallback for scanned PDF memories #18

Merged
brianmeyer merged 1 commit into master from codex/rec-72-scanned-pdf-ocr
Mar 22, 2026

Conversation

@brianmeyer
Owner

## Summary
- add OCR text fallback for scanned/image-only PDF pages while preserving rendered page images
- index OCR text as sibling document children so scanned pages are searchable and document-as-query can use them
- extend tests for document extraction, storage ingest, and file-path query routing

## Testing
- pytest -q tests/test_documents.py tests/test_storage.py tests/test_config_tools.py
- pytest -x -m 'not live' --tb=short
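
For readers without the diff open, the fallback described in the summary can be sketched roughly as follows. This is a minimal illustration, not the code merged in this PR: `Child`, `page_children`, the `ocr` callable, and the `min_chars` threshold are all hypothetical names, and the sketch assumes a page counts as scanned when its embedded text layer is empty after stripping whitespace.

```python
from dataclasses import dataclass
from typing import Callable, List, Union


@dataclass
class Child:
    """Hypothetical document child; not the project's real data model."""
    kind: str                    # "image" or "text"
    page: int
    content: Union[bytes, str]   # rendered image bytes or extracted text
    source: str = ""             # "embedded" or "ocr" for text children


def page_children(
    page: int,
    embedded_text: str,
    image: bytes,
    ocr: Callable[[bytes], str],
    min_chars: int = 1,
) -> List[Child]:
    """Return the children to index for one PDF page.

    Assumption: a page with fewer than `min_chars` characters of embedded
    text is treated as scanned/image-only.
    """
    text = embedded_text.strip()
    if len(text) >= min_chars:
        # The page has a usable text layer, so no OCR is needed.
        return [Child("text", page, text, "embedded")]

    # Scanned page: keep the rendered image and add OCR text as a sibling.
    children = [Child("image", page, image)]
    ocr_text = ocr(image).strip()
    if ocr_text:
        children.append(Child("text", page, ocr_text, "ocr"))
    return children
```

Passing the OCR engine in as a callable keeps the sketch independent of any particular OCR library and makes it easy to stub in tests, for example `page_children(1, "", b"...", lambda img: "invoice total")`.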

@brianmeyer brianmeyer merged commit 0ebe474 into master Mar 22, 2026
4 checks passed
@brianmeyer brianmeyer deleted the codex/rec-72-scanned-pdf-ocr branch March 22, 2026 19:02
