
[Feat] Add CLI support for OCR #2058

Open
simont2k wants to merge 1 commit into mindee:main from simont2k:feat/CLI-support

simont2k commented May 5, 2026

This PR:

  • Adds a CLI to perform OCR
  • Options:
      --input_path INPUT_PATH
                            path to input image or PDF file (default: None)
      --det_arch DET_ARCH   name of the detection architecture or the model itself to use (default: db_resnet50)
      --reco_arch RECO_ARCH
                            name of the recognition architecture or the model itself to use (default: crnn_vgg16_bn)
      --assume_straight_pages, --no-assume_straight_pages
                            assume only straight pages without rotated textual elements (default: True)
      --straighten_pages    attempt to straighten skewed pages before analysis (default: False)
      --preserve_aspect_ratio, --no-preserve_aspect_ratio
                            preserve aspect ratio when resizing pages (default: True)
      --symmetric_pad       apply symmetric padding (default: False)
      --det_bs DET_BS       batch size for detection (default: 2)
      --reco_bs RECO_BS     batch size for recognition (default: 128)
      --detect_orientation  automatically detect page orientation (default: False)
      --detect_language     detect language of the text (default: False)
      --output OUTPUT       path to output results in JSON format (default: results.json)
  • Usage example: doctr-cli --input_path sample.png
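
For reviewers who want to reason about the option semantics, here is a minimal sketch of an argparse parser that would produce help text of the shape listed above. It is not the code from this PR: the function names `build_parser` and `run`, the choice to make `--input_path` required, the `.pdf` suffix check, and the use of `pretrained=True` are all assumptions made for illustration. The paired `--flag` / `--no-flag` options are modeled with `argparse.BooleanOptionalAction` (Python 3.9+), and the wiring into docTR assumes `ocr_predictor` accepts these keyword arguments.

```python
import argparse
import json
from pathlib import Path


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser mirroring the options listed in the PR description."""
    parser = argparse.ArgumentParser(
        prog="doctr-cli",
        description="Run OCR on an image or PDF file",
        # Appends "(default: ...)" to every help string
        formatter_class=argparse.ArgumentDefaultsHelpFormatter,
    )
    # Assumption: the input path is mandatory (the PR lists its default as None)
    parser.add_argument("--input_path", type=str, required=True,
                        help="path to input image or PDF file")
    parser.add_argument("--det_arch", type=str, default="db_resnet50",
                        help="name of the detection architecture to use")
    parser.add_argument("--reco_arch", type=str, default="crnn_vgg16_bn",
                        help="name of the recognition architecture to use")
    # BooleanOptionalAction generates both --x and --no-x
    parser.add_argument("--assume_straight_pages", default=True,
                        action=argparse.BooleanOptionalAction,
                        help="assume only straight pages without rotated textual elements")
    parser.add_argument("--straighten_pages", action="store_true",
                        help="attempt to straighten skewed pages before analysis")
    parser.add_argument("--preserve_aspect_ratio", default=True,
                        action=argparse.BooleanOptionalAction,
                        help="preserve aspect ratio when resizing pages")
    parser.add_argument("--symmetric_pad", action="store_true",
                        help="apply symmetric padding")
    parser.add_argument("--det_bs", type=int, default=2,
                        help="batch size for detection")
    parser.add_argument("--reco_bs", type=int, default=128,
                        help="batch size for recognition")
    parser.add_argument("--detect_orientation", action="store_true",
                        help="automatically detect page orientation")
    parser.add_argument("--detect_language", action="store_true",
                        help="detect language of the text")
    parser.add_argument("--output", type=str, default="results.json",
                        help="path to output results in JSON format")
    return parser


def run(args: argparse.Namespace) -> None:
    """Hypothetical wiring of the parsed options into docTR."""
    # Imported lazily so the parser can be built without docTR installed
    from doctr.io import DocumentFile
    from doctr.models import ocr_predictor

    # Assumption: PDFs are recognized by file suffix only
    if Path(args.input_path).suffix.lower() == ".pdf":
        doc = DocumentFile.from_pdf(args.input_path)
    else:
        doc = DocumentFile.from_images(args.input_path)

    predictor = ocr_predictor(
        det_arch=args.det_arch,
        reco_arch=args.reco_arch,
        pretrained=True,  # assumption: the CLI always loads pretrained weights
        assume_straight_pages=args.assume_straight_pages,
        straighten_pages=args.straighten_pages,
        preserve_aspect_ratio=args.preserve_aspect_ratio,
        symmetric_pad=args.symmetric_pad,
        det_bs=args.det_bs,
        reco_bs=args.reco_bs,
        detect_orientation=args.detect_orientation,
        detect_language=args.detect_language,
    )
    result = predictor(doc)

    # default=float guards against NumPy scalar types, in case export() contains any
    with open(args.output, "w", encoding="utf-8") as f:
        json.dump(result.export(), f, indent=2, default=float)
```

Under these assumptions, `run(build_parser().parse_args())` would serve as the entry point, and `doctr-cli --input_path sample.png --no-assume_straight_pages` would disable the straight-page assumption while leaving every other option at its default.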
