get_textpage_ocr() should be able to identify image / page rotation and read the content accordingly

**Is your feature request related to a problem? Please describe.**
There are PDFs in which some pages are rotated 90 degrees to the left or right, or are upside down. get_textpage_ocr() cannot read these pages; I only get garbage as output.

**Describe the solution you'd like**
I'd like get_textpage_ocr() to detect whether a page / image is rotated 90 degrees to the left or right (or upside down), rotate it back, and read the content of the upright page / image. As of pymupdf==1.27.2.3 it just tries to read the page as is, so rotated pages produce only garbage.
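
To illustrate the requested behaviour, here is a rough sketch of a brute-force workaround. It is not how PyMuPDF works today and not a proposed implementation. It renders the page at 0, 90, 180 and 270 degrees, OCRs each rendering, and keeps the result that looks most like real text. The helper names (`score_text`, `best_rotation`, `ocr_page_at`, `ocr_upright`) and the word-count heuristic are my own assumptions. It also assumes Tesseract and its language data are installed so that `Pixmap.pdfocr_tobytes()` works.

```python
import re
from typing import Callable, Tuple

ANGLES = (0, 90, 180, 270)


def score_text(text: str) -> int:
    """Crude plausibility score (assumed heuristic): count purely
    alphabetic words of three or more letters."""
    return len(re.findall(r"\b[^\W\d_]{3,}\b", text))


def best_rotation(ocr_at: Callable[[int], str]) -> Tuple[int, str]:
    """Run ocr_at(angle) for each candidate angle and return the
    (angle, text) pair with the highest score. Ties favour 0 degrees."""
    results = {angle: ocr_at(angle) for angle in ANGLES}
    best = max(ANGLES, key=lambda a: score_text(results[a]))
    return best, results[best]


def ocr_page_at(page, angle: int, dpi: int = 300, language: str = "eng") -> str:
    """OCR a rendering of `page` produced with an extra rotation of `angle` degrees."""
    import pymupdf  # imported lazily so the helpers above work without it

    zoom = dpi / 72
    matrix = pymupdf.Matrix(zoom, zoom).prerotate(angle)
    pix = page.get_pixmap(matrix=matrix)
    # pdfocr_tobytes() returns a one-page PDF with an OCR text layer.
    ocr_pdf = pymupdf.open(stream=pix.pdfocr_tobytes(language=language), filetype="pdf")
    try:
        return ocr_pdf[0].get_text()
    finally:
        ocr_pdf.close()


def ocr_upright(page, dpi: int = 300, language: str = "eng") -> Tuple[int, str]:
    """Try all four orientations of `page` and return the best (angle, text)."""
    return best_rotation(lambda angle: ocr_page_at(page, angle, dpi, language))
```

Calling `ocr_upright(doc[0])` on a page of an opened document would return the chosen angle and the recognised text. This costs four OCR passes per page and the score is easy to fool, so a built-in orientation detection would be far preferable.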

**Describe alternatives you've considered**
Textract on AWS does this automatically.

**Additional context**


get_textpage_ocr() should be able to identify image / page rotation and read the content accordingly #5009
