End-to-end data engineering pipeline using Medallion Architecture (Bronze→Silver→Validation→Gold) on Databricks with PySpark, Apache Airflow orchestration and Power BI dashboard
Updated May 31, 2026 - Python
This project performs an end-to-end analysis of IPL data using Apache Spark on Databricks. It begins with setting up a Databricks environment, followed by ingesting and exploring the IPL dataset.
AI-Powered Movie Recommendation System on Databricks.
Uses Google Colab to clean the data and train models, comparing candidate algorithms and choosing the most suitable one. The resulting app integrates machine learning and generative AI to recommend the top 3 trees most suitable to plant. The project was developed with support from AI tools (ChatGPT, Copilot).
Movie market trend analysis using Apache PySpark and Databricks — revenue patterns, genre performance, and ratings across 1000 IMDB films (2006–2016)
Azure End-to-End Data Engineering Project
End-to-end medallion architecture pipeline on Databricks processing 19.4M NYC Yellow Taxi records through Bronze, Silver, and Gold Delta Lake layers. Covers Unity Catalog, schema evolution handling, OPTIMIZE/ZORDER, and 4 Gold aggregation tables.