Skip to content

rahuldongre-us/idp-bedrock

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

4 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

Intelligent Document Processing

Intelligent Document Processing with Amazon Bedrock and Anthropic Claude

Amazon Bedrock is a fully managed service that provides access to high-performing foundation models (FMs) from leading AI companies, including Anthropic's Claude 3 model family. Anthropic's Claude 3 models, such as Opus, Sonnet, and Haiku, excel at understanding complex enterprise content, including charts, graphs, technical diagrams and reports.

By integrating Claude 3 Sonnet with Amazon Bedrock, organizations can automate intelligent document processing (IDP) workflows at scale. This integration enables the extraction of valuable insights from unstructured content—such as documents, images, video, and audio—transforming them into structured formats for further analysis and decision-making.

This serverless architecture leverages the scalability and cost-effectiveness of AWS services while harnessing the cutting-edge intelligence of Anthropic Claude 3 Sonnet. By combining the robust infrastructure of AWS with Anthropic’s foundation models, this solution enables organizations to streamline their document processing workflows, extract valuable insights and enhance overall operational efficiency.

This project will run serverless lambda which access S3 bucket created by Terraform, invoke Bedrock model to generate data from image.

🚀 Features

  • 🧾 Document Understanding via Claude 3 Sonnet (Amazon Bedrock)

  • ☁️ Fully Serverless architecture (S3, Lambda)

  • 📄 Input: Scanned or uploaded documents

  • 🧠 Output: Structured JSON with key-value data

  • 🔁 Real-time, event-driven data flow

  • 🧩 Plug-and-play for document-heavy workflows (finance, legal, healthcare)

💼 Use Cases

  • Invoice automation
  • Contract analytics
  • Healthcare forms processing
  • KYC & compliance document workflows
  • Back-office document digitization

Run Locally

Clone the project

  git clone https://github.com/rahuldongre-us/idp-bedrock.git

Go to the project directory

  cd idp-bedrock

Installation

Ensure both Python and Terraform are installed.

  python --version
  python -m pip install boto3
  unzip terraform_*.zip
  sudo mv terraform /usr/local/bin/
  terraform -v

VS code steps.

Click on the search window.

Search

Select Python:Create Environment

Python Selection

Select Environment Type

Virtual Environment

Python Interpreter version selection.

Python Version

Project Structure.

Project Structure

AWS Bedrock Model Access

AWS Bedrock Model Access

Deployment

Should have aws configure to use with CLI

To deploy this project run

Run following command on the project root.

This should create AWS resources.

  chmod +x aws-infra.sh 
  ./aws-infra.sh

To test locally, should generate response.json file with results.

 aws lambda invoke --function-name <function-name> response.json
  {
      "StatusCode": 200,
      "ExecutedVersion": "$LATEST"
  }

Documentation

AWS Bedrock - AWS Bedrock

Terraform - Terraform

AWS Blogs - AWS Blogs

About

An end-to-end serverless pipeline for Intelligent Document Processing (IDP) using Amazon Bedrock and Anthropic Claude 3 Sonnet. This project extracts structured data from scanned documents (e.g., PDFs, forms, invoices) using GenAI models, and stores results in a scalable cloud-native architecture.

Topics

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

 
 
 

Contributors