OpenFDA BigData Pipeline

OpenFDA BigData Pipeline collects, processes, and presents in real time data on adverse drug events from the openFDA database.

The solution uses Apache Kafka as the message broker, MongoDB as the document store, and Spring Boot for the services. All components are Dockerized.

Architecture

Configuration

The project runs with the default configuration defined in each of the services and in pipeline.yml. For more details, refer directly to those files.

Running solution locally in Docker

If you want to run the project yourself, I have put together a pipeline.yml configuration to help you get started.

Calling the following command

docker-compose -f pipeline.yml up

will:

Start openfda-producer container
Start zookeeper container
Start kafka container
Start mongodb container
Start openfda-consumer container
Start openfda-live-dashboard container which will expose port 8050
Start jupyter-notebook container which will expose port 8888

Accessing the application

Once all your Docker containers are up and running, you can access the openfda-live-dashboard web dashboard in a browser at the following URL:

http://localhost:8050

In addition, you can access the Jupyter Notebook served by the jupyter-notebook container in a browser at the following URL:

http://localhost:8888
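
The containers can take a while to start, so the two URLs above may not respond right away. The following is a minimal sketch, not part of the repository, that polls both ports until they accept TCP connections. The function names is_port_open and wait_for_ports are hypothetical, and the retry count and delay are arbitrary assumptions.

```python
import socket
import time


def is_port_open(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Connection refused, timed out, or host unreachable
        return False


def wait_for_ports(host, ports, attempts=30, delay=2.0):
    """Poll until every port accepts connections; return {port: reachable}."""
    status = {port: False for port in ports}
    for attempt in range(attempts):
        for port in ports:
            if not status[port]:
                status[port] = is_port_open(host, port)
        if all(status.values()):
            break
        # Sleep between rounds, but not after the last one
        if attempt < attempts - 1:
            time.sleep(delay)
    return status
```

Calling wait_for_ports("localhost", [8050, 8888]) after docker-compose has started returns a dictionary that maps each port to True once the dashboard (8050) and Jupyter Notebook (8888) are reachable. An open port only shows that the container is listening, not that the application inside has finished loading.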

Example graphs

Top 20 patient reactions reported between 2020-01-01 and 2022-01-01

Top 20 patient medical products reported between 2020-01-01 and 2022-01-01
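
To cross-check graphs like these without running the pipeline, comparable top-20 counts can be requested directly from the public openFDA drug event endpoint. The sketch below only builds the query URLs and is not how the dashboard computes its results. The helper build_count_url is hypothetical, and filtering on the receivedate field is an assumption; the pipeline may use a different date field.

```python
from urllib.parse import urlencode

# Public openFDA endpoint for adverse drug event reports
OPENFDA_EVENT_URL = "https://api.fda.gov/drug/event.json"


def build_count_url(count_field, start, end, limit=20):
    """Build a count query for reports received between start and end (YYYYMMDD)."""
    params = {
        # Assumption: the date range applies to receivedate
        "search": f"receivedate:[{start} TO {end}]",
        "count": count_field,
        "limit": limit,
    }
    # Keep ':' and '[' ']' readable; spaces are encoded as '+'
    return f"{OPENFDA_EVENT_URL}?{urlencode(params, safe=':[]')}"


# Top 20 reported reactions and top 20 reported medicinal products
reactions_url = build_count_url(
    "patient.reaction.reactionmeddrapt.exact", "20200101", "20220101"
)
products_url = build_count_url(
    "patient.drug.medicinalproduct.exact", "20200101", "20220101"
)
```

Opening either URL in a browser or fetching it with any HTTP client returns a JSON document whose results array lists terms with their counts.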

Issues and contribution

Bug reports and pull requests are welcome on GitHub at https://github.com/koziolk/openfda-bigdata-pipeline
