A Unified Toolkit for Deep Learning Based Document Image Analysis
Updated Aug 15, 2024 - Python
A Repo For Document AI
A lightweight OCR system rebuilt from PaddleOCR and decoupled from the PaddlePaddle deep learning training framework, with very fast inference.
A curated list of resources on the Document Understanding (DU) topic
📚 Process PDFs, Word documents and more with spaCy
Document Layout Analysis resources and repos for development with PdfPig.
Document Layout Analysis
Page to PAGE Layout Analysis Tool
Detectron2 for Document Layout Analysis
ICDAR 2019: MaskRCNN on PubLayNet datasets. Paragraph detection, table detection, figure detection,...
Complex data extraction and orchestration framework designed for processing unstructured documents. It integrates AI-powered document pipelines (GenAI, LLM, VLLM) into your applications, supporting various tasks such as document cleanup, optical character recognition (OCR), classification, splitting, named entity recognition, and form processing
A Bottom-Up Instance Segmentation Strategy for segmenting document instances using Transformers
Trained Detectron2 object detection models for document layout analysis based on PubLayNet dataset
Tools for extracting figures, tables, text, ... from a PDF document.
Proof of concept of training a simple Region Classifier using PdfPig and ML.NET (LightGBM). The objective is to classify each text block in a PDF document page as one of title, text, list, table, or image.
BoundaryNet - A Semi-Automatic Layout Annotation Tool
Simple Docker deployment of document layout analysis using Detectron2
Using a MaskRCNN model trained on the PubLayNet dataset with ML.NET in C# / .NET for document layout analysis and page segmentation tasks.