An end-to-end data analytics project analyzing 100,000+ real e-commerce orders from Olist, a Brazilian marketplace. Built to answer real business questions around delivery performance, freight costs, and seller reliability.
- Which states have the highest late delivery rates?
- Which product categories have the highest freight costs relative to price?
- Which sellers have the worst cancellation and late delivery rates?
- How have order volume and delivery performance trended month over month?
- Python - data pipeline to load 9 CSV files into a SQLite database
- SQL - 4 queries to analyze delivery, freight, and seller performance
- Tableau Public - interactive dashboard to visualize findings
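
The loading step can be sketched roughly as follows. This is a minimal illustration using only the standard library, not the actual contents of main.py: the function name `load_csvs_to_sqlite`, the one-table-per-file naming, and the untyped columns are all assumptions made for the example.

```python
import csv
import sqlite3
from pathlib import Path


def load_csvs_to_sqlite(data_dir, db_path):
    """Load every CSV in data_dir into its own SQLite table.

    Hypothetical helper: each table is named after its file (without the
    .csv extension) and columns are created without declared types.
    Returns a dict mapping table name to the number of rows loaded.
    """
    counts = {}
    conn = sqlite3.connect(db_path)
    try:
        for csv_path in sorted(Path(data_dir).glob("*.csv")):
            table = csv_path.stem
            # utf-8-sig drops a leading byte-order mark if the file has one
            with open(csv_path, newline="", encoding="utf-8-sig") as f:
                reader = csv.reader(f)
                header = next(reader)
                cols = ", ".join(f'"{c}"' for c in header)
                marks = ", ".join("?" for _ in header)
                conn.execute(f'DROP TABLE IF EXISTS "{table}"')
                conn.execute(f'CREATE TABLE "{table}" ({cols})')
                conn.executemany(f'INSERT INTO "{table}" VALUES ({marks})', reader)
            counts[table] = conn.execute(
                f'SELECT COUNT(*) FROM "{table}"'
            ).fetchone()[0]
        conn.commit()
    finally:
        conn.close()
    return counts
```

Calling `load_csvs_to_sqlite("data", "olist.db")` would then create one table per CSV; the database filename here is only a placeholder. Note that with this approach empty CSV fields are stored as empty strings rather than NULL.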
- The state of Maranhão (MA) had the highest late delivery rate at 23.93%
- Christmas/seasonal items (artigos_de_natal) had the highest freight-to-price ratio at 36%, meaning shipping adds more than a third of the product price
- The worst performing seller had a 34% late delivery rate
- Order volume peaked in late 2017 and remained high through mid-2018
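
To make the late delivery rate concrete, here is a hedged sketch of how such a metric could be computed in SQLite from Python. It is not the project's late_delivery_by_state.sql: the query assumes the Olist column names `customer_state`, `order_delivered_customer_date`, and `order_estimated_delivery_date`, counts an order as late when it arrives after its estimated date, and runs on a tiny made-up table with placeholder state codes (`XX`, `YY`) rather than real data.

```python
import sqlite3

# Assumed definition: late = delivered after the estimated delivery date.
# ISO-formatted timestamps compare correctly as plain strings.
LATE_BY_STATE_SQL = """
SELECT c.customer_state AS state,
       COUNT(*) AS delivered_orders,
       ROUND(100.0 * SUM(o.order_delivered_customer_date
                         > o.order_estimated_delivery_date) / COUNT(*), 2)
           AS late_rate_pct
FROM orders o
JOIN customers c ON c.customer_id = o.customer_id
WHERE o.order_delivered_customer_date IS NOT NULL
  AND o.order_delivered_customer_date <> ''
GROUP BY c.customer_state
ORDER BY late_rate_pct DESC
"""


def build_toy_db():
    """Create an in-memory database with illustrative (not real) rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE customers (customer_id, customer_state)")
    conn.execute(
        "CREATE TABLE orders (order_id, customer_id, "
        "order_delivered_customer_date, order_estimated_delivery_date)"
    )
    conn.executemany(
        "INSERT INTO customers VALUES (?, ?)",
        [("c1", "XX"), ("c2", "XX"), ("c3", "YY")],
    )
    est = "2018-01-15 00:00:00"
    conn.executemany(
        "INSERT INTO orders VALUES (?, ?, ?, ?)",
        [
            ("o1", "c1", "2018-01-20 10:00:00", est),  # late
            ("o2", "c1", "2018-01-10 10:00:00", est),
            ("o3", "c2", "2018-01-12 10:00:00", est),
            ("o4", "c2", "2018-01-14 10:00:00", est),
            ("o5", "c2", None, est),  # never delivered, excluded
            ("o6", "c3", "2018-01-11 10:00:00", est),
            ("o7", "c3", "2018-01-13 10:00:00", est),
        ],
    )
    return conn


def late_rate_by_state(conn):
    """Return (state, delivered_orders, late_rate_pct) rows, worst first."""
    return conn.execute(LATE_BY_STATE_SQL).fetchall()


if __name__ == "__main__":
    for row in late_rate_by_state(build_toy_db()):
        print(row)
```

In the toy data, one of the four delivered `XX` orders is late, so its rate is 25%, while `YY` has no late orders. Undelivered orders are filtered out before the rate is computed, which is one possible choice; counting them differently would change the percentages.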
logistics-analytics/
├── queries/                          # SQL query files
│   ├── late_delivery_by_state.sql
│   ├── freight_cost_by_category.sql
│   ├── seller_performance.sql
│   └── monthly_trends.sql
├── main.py                           # Python pipeline to load data and run queries
└── README.md
- Download the Olist dataset from Kaggle
- Place all CSV files in a 'data/' folder
- Run 'python main.py' to load data and generate output CSVs
- Open Tableau Public and connect to the CSVs in the 'outputs/' folder
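
For readers curious how the query-to-CSV step might work, the sketch below runs each .sql file in a folder against a SQLite database and writes one CSV per query. It is an illustration only, not the code in main.py: the function name `run_queries_to_csv`, the output naming scheme, and the assumption that each file holds a single SELECT statement are all hypothetical.

```python
import csv
import sqlite3
from pathlib import Path


def run_queries_to_csv(db_path, queries_dir="queries", outputs_dir="outputs"):
    """Run every .sql file in queries_dir and write the results as CSVs.

    Hypothetical helper: assumes one SELECT statement per file and names
    each output after its query file. Returns the list of written paths.
    """
    out_dir = Path(outputs_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    written = []
    conn = sqlite3.connect(db_path)
    try:
        for sql_path in sorted(Path(queries_dir).glob("*.sql")):
            cur = conn.execute(sql_path.read_text(encoding="utf-8"))
            out_path = out_dir / f"{sql_path.stem}.csv"
            with open(out_path, "w", newline="", encoding="utf-8") as f:
                writer = csv.writer(f)
                # First row: column names taken from the cursor metadata
                writer.writerow([col[0] for col in cur.description])
                writer.writerows(cur)
            written.append(out_path)
    finally:
        conn.close()
    return written
```

Each output file starts with a header row, which lets Tableau Public pick up the column names when connecting to the CSVs.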