Analysing Netflix Data Cleaning

A data cleaning and preprocessing project focused on preparing the Netflix dataset for analysis using Python, Pandas, and Jupyter Notebook.

This project demonstrates real-world data cleaning workflows including handling missing values, fixing mixed data types, and transforming raw data into analysis-ready datasets.

Project Overview

Raw datasets are rarely clean.
This project focuses on transforming messy Netflix content data into a structured and usable format suitable for analytics and visualization.

Key objectives:

Identify and handle missing values
Fix inconsistent/mixed-type columns
Convert date columns into datetime format
Create derived analytical features
Prepare a clean dataset for further analysis

Tech Stack

Python
Pandas
Matplotlib
Jupyter Notebook

Project Structure

Analysing-Netflix-Data-Cleaning/
│
├── data/
│ ├── netflix_titles.csv
│ └── cleaned-data.csv
│
├── notebook/
│ └── netflix_data_cleaning.ipynb
│
└── README.md

Data Cleaning Workflow

flowchart LR
    A[Raw Netflix Dataset] --> B[Data Inspection]
    B --> C[Handle Missing Values]
    C --> D[Fix Mixed-Type Columns]
    D --> E[Convert Date Columns]
    E --> F[Feature Engineering]
    F --> G[Cleaned Dataset Ready]

    style A fill:#1f77b4,color:#fff
    style B fill:#9467bd,color:#fff
    style C fill:#2ca02c,color:#fff
    style D fill:#ff7f0e,color:#fff
    style E fill:#17becf,color:#fff
    style F fill:#e377c2,color:#fff
    style G fill:#d62728,color:#fff

Cleaning Steps Performed

Missing Values Handling

Filled categorical columns using "Unknown" or mode values
Verified null counts column-wise
Ensured dataset consistency after imputation
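
As a rough illustration of the steps above, the sketch below fills one column with "Unknown" and another with its mode, then re-checks the null counts. The tiny DataFrame is made-up sample data standing in for netflix_titles.csv, and the choice of which column gets which strategy is an assumption, not necessarily what the notebook does.

```python
import pandas as pd

# Made-up sample rows standing in for netflix_titles.csv (assumption)
df = pd.DataFrame({
    "title": ["A", "B", "C", "D"],
    "director": ["Jane Doe", None, None, "John Roe"],
    "rating": ["TV-MA", None, "TV-MA", "PG"],
})

# Column-wise null counts before imputation
nulls_before = df.isna().sum()

# Free-text column with no sensible default: use a placeholder label
df["director"] = df["director"].fillna("Unknown")

# Low-cardinality column: fill with the most frequent value (mode)
rating_mode = df["rating"].mode()[0]
df["rating"] = df["rating"].fillna(rating_mode)

# Column-wise null counts after imputation, to confirm nothing is left
nulls_after = df.isna().sum()
print(nulls_before, nulls_after, sep="\n\n")
```

Using a placeholder keeps the row count intact, while the mode is only a reasonable default when one value clearly dominates the column.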

Mixed-Type Column Fix

Cleaned the duration column by splitting into:
- Numeric duration value
- Duration type (Minutes / Seasons)
Standardized data types for analysis readiness
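
One possible way to do this split is with a regular expression, sketched below. The output column names duration_value and duration_type are hypothetical, as is the sample data; the notebook may use different names or a different splitting method.

```python
import pandas as pd

# Made-up sample values in the "<number> <unit>" shape (assumption)
df = pd.DataFrame({"duration": ["90 min", "2 Seasons", "1 Season", None]})

# Capture the leading number and the trailing unit as two columns (0 and 1)
parts = df["duration"].str.extract(r"^(\d+)\s*(\w+)$")

# Numeric part: nullable integer so missing durations stay missing
df["duration_value"] = pd.to_numeric(parts[0]).astype("Int64")

# Unit part: normalise the raw unit to one of two labels
unit_labels = {"min": "Minutes", "Season": "Seasons", "Seasons": "Seasons"}
df["duration_type"] = parts[1].map(unit_labels)

print(df)
```

The nullable Int64 dtype avoids turning the whole column into floats just because one duration is missing.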

Datetime Conversion

Converted date_added into proper datetime format
Extracted new analytical features:
- year_added
- month_added
- month_name
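
The conversion and the three derived features could look roughly like the sketch below. It assumes date_added is stored as text such as "September 25, 2021", sometimes with stray leading whitespace; the sample rows are made up for illustration.

```python
import pandas as pd

# Made-up sample values; the "Month D, YYYY" text format is an assumption
df = pd.DataFrame({"date_added": ["September 25, 2021", " August 4, 2017", None]})

# Strip stray whitespace, then parse; anything unparseable becomes NaT
df["date_added"] = pd.to_datetime(
    df["date_added"].str.strip(), format="%B %d, %Y", errors="coerce"
)

# Derived features; nullable integers keep rows with missing dates as <NA>
df["year_added"] = df["date_added"].dt.year.astype("Int64")
df["month_added"] = df["date_added"].dt.month.astype("Int64")
df["month_name"] = df["date_added"].dt.month_name()

print(df)
```

Passing errors="coerce" means bad values are silently turned into NaT, so it is worth counting the NaT rows afterwards.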

Data Validation

Verified column datatypes
Removed inconsistencies and formatting issues
Saved a fully cleaned dataset for downstream analysis
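
A minimal sketch of what such a validation pass might check before saving is shown below. The validate helper, the specific checks, and the duration_value column name are all hypothetical; the sample DataFrame stands in for the cleaned data.

```python
import pandas as pd
from pandas.api import types as ptypes


def validate(frame: pd.DataFrame) -> list[str]:
    """Return the names of the checks that failed (hypothetical helper)."""
    checks = {
        "date_added is datetime": ptypes.is_datetime64_any_dtype(frame["date_added"]),
        "duration_value is integer": ptypes.is_integer_dtype(frame["duration_value"]),
        "no missing values": not frame.isna().any().any(),
        "no duplicate rows": not frame.duplicated().any(),
    }
    return [name for name, ok in checks.items() if not ok]


# Made-up rows standing in for the cleaned dataset (assumption)
df = pd.DataFrame({
    "title": ["A", "B"],
    "date_added": pd.to_datetime(["2021-09-25", "2017-08-04"]),
    "duration_value": pd.array([90, 2], dtype="Int64"),
})

failed = validate(df)
print(df.dtypes)
print("Failed checks:", failed)

# Save only when every check passes; the path follows the project structure
if not failed:
    pass  # e.g. df.to_csv("data/cleaned-data.csv", index=False)
```

Returning the failed check names, rather than raising on the first problem, makes it easier to see every issue in one notebook run.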

Output

A cleaned and structured dataset ready for:

Exploratory Data Analysis (EDA)
Data Visualization
Dashboard Creation
Business Insights

Project Source

This project is inspired by the learning project from roadmap.sh:

https://roadmap.sh/projects/cleaning-netflix-dataset

Implementation and analysis were completed independently as part of learning real-world data analytics workflows.

Future Improvements

Content trend analysis
Genre popularity insights
Dashboard using Power BI / Tableau
Time-series visualization

Learning Outcome

This project strengthened understanding of:

Real-world data preprocessing
Pandas data transformation
Analytical thinking
Structuring analytics projects for GitHub portfolios

Connect

If you have feedback or suggestions, feel free to connect or open an issue in this repository!

⭐ If you found this project helpful, consider giving it a star!
