HitanDubey/machine_learning_models-and-questions-practice

Machine Learning Practice Repository

Version License: MIT Python 3.8+ ML Practice

A comprehensive collection of implemented machine learning algorithms with practical datasets and coding exercises. Designed for hands-on learning, interview preparation, and professional development.

Quick Overview

This repository contains 15+ production-ready machine learning implementations with working examples on real datasets. Each algorithm is implemented using scikit-learn with clear explanations and practical applications.
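The general scikit-learn pattern that the scripts build on (split the data, fit a model, score it on held-out samples) can be sketched as follows. This is a minimal illustration rather than one of the repository's own scripts: it assumes scikit-learn's bundled iris dataset instead of the included CSV files, and the split size and `random_state` are arbitrary choices.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Assumption: the bundled iris data stands in for the repository's CSV files.
X, y = load_iris(return_X_y=True)

# Hold out 20% of the samples for evaluation, keeping class proportions equal.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)

# Fit a multi-class logistic regression; max_iter is raised so the solver converges.
model = LogisticRegression(max_iter=1000)
model.fit(X_train, y_train)

# Score the model on data it has not seen during training.
accuracy = accuracy_score(y_test, model.predict(X_test))
print(f"Test accuracy: {accuracy:.3f}")
```

Swapping `LogisticRegression` for another estimator such as `DecisionTreeClassifier` or `KNeighborsClassifier` leaves the rest of the flow unchanged, which is what makes this pattern a convenient starting point for the other algorithms listed.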

Key Highlights

  • ✓ Production-ready code for all major ML algorithms
  • ✓ Real datasets included for immediate practice
  • ✓ Hands-on exercises to test your understanding
  • ✓ Well-documented with clear comments and explanations
  • ✓ Beginner to advanced progressive learning path

Implemented Algorithms

| Category | Algorithms | Status | Directory/File |
|---|---|---|---|
| Supervised Learning | | | |
| Regression | Multi-linear Regression | ✓ | multilinear/ |
| Classification | Logistic Regression | ✓ | logisticregression.py |
| Multi-class | Multi-class Classification | ✓ | multiclassclassification.py, logistic_multiclass.py |
| Tree-based | Decision Tree, Random Forest | ✓ | DecisionTree/, RandomForest/ |
| Instance-based | K-Nearest Neighbors | ✓ | K-NN,Classification/ |
| Regularization | L1 & L2 Regularization | ✓ | L1_L2_regularization/ |
| Support Vectors | Support Vector Machine | ✓ | SupportVectorMachine/ |
| Probabilistic | Naive Bayes | ✓ | native_bayes/ |
| Unsupervised Learning | | | |
| Clustering | K-Means Clustering | ✓ | k_means_cluster/ |
| Data Processing | | | |
| Preprocessing | One-Hot Encoding | ✓ | onehotencoding/ |
| Feature Engineering | Simple Imputer | ✓ | simpleimputer/ |
| Validation | Train-Test Split, K-Fold CV | ✓ | TrainTestSplit.py, K-Fold-Cross-Validation/ |
| Utilities | Data Generators | ✓ | simple Uneta regression/AgeneratenewCV.py |
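The data-processing entries above (imputation, one-hot encoding, K-fold cross-validation) are often combined into a single pipeline. The sketch below shows one way to do that with scikit-learn. It is not taken from the repository: the synthetic data, the column names `town` and `area`, and the price formula are all hypothetical and exist only to make the example self-contained.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.impute import SimpleImputer
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import KFold, cross_val_score
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder

# Hypothetical synthetic housing data (not one of the repository's CSV files).
rng = np.random.default_rng(0)
n = 60
towns = rng.choice(["north", "south", "west"], size=n)
area = rng.uniform(50, 200, size=n)
town_premium = pd.Series(towns).map({"north": 20.0, "south": 0.0, "west": 10.0}).to_numpy()
price = 3.0 * area + town_premium + rng.normal(0, 1, size=n)

# Blank out every tenth area value so the imputer has something to fill.
area_missing = area.copy()
area_missing[::10] = np.nan
df = pd.DataFrame({"town": towns, "area": area_missing})

# Impute the numeric column with its mean and one-hot encode the categorical one.
preprocess = ColumnTransformer([
    ("num", SimpleImputer(strategy="mean"), ["area"]),
    ("cat", OneHotEncoder(handle_unknown="ignore"), ["town"]),
])
model = Pipeline([("preprocess", preprocess), ("regress", LinearRegression())])

# 5-fold cross-validation: every row is used for validation exactly once.
cv = KFold(n_splits=5, shuffle=True, random_state=0)
scores = cross_val_score(model, df, price, cv=cv, scoring="r2")
print("R^2 per fold:", np.round(scores, 3))
print("Mean R^2:", round(scores.mean(), 3))
```

Keeping the preprocessing inside the `Pipeline` means the imputer and encoder are refit on each training fold, so no statistics from the validation fold leak into training.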

Technology Stack

| Component | Technology | Purpose |
|---|---|---|
| Core Framework | Python 3.8+ | Primary programming language |
| ML Library | Scikit-learn 1.3+ | Machine learning implementations |
| Data Processing | Pandas, NumPy | Data manipulation and analysis |
| Visualization | Matplotlib, Seaborn | Results visualization |
| Development | Python scripts | Implementation and testing |

Installation & Setup

Prerequisites

  • Python 3.8 or higher
  • pip package manager
  • Git (for cloning)

Quick Start

# Clone the repository
git clone https://github.com/HitanDubey/machine_learning_models-and-questions-practice.git
cd machine_learning_models-and-questions-practice

# Install dependencies
pip install numpy pandas scikit-learn matplotlib seaborn

# Run your first algorithm
python logisticregression.py

Repository Structure

ml-practice/
├── README.md                            # Project documentation
├── requirements.txt                     # Python dependencies (to be created)
│
├── Core ML Models/
│   ├── DecisionTree/                    # Decision Tree classifier
│   │   ├── salaries.csv
│   │   ├── salary.py
│   │   ├── titanic.csv
│   │   └── titanic.py
│   │
│   ├── RandomForest/                    # Random Forest ensemble
│   │   ├── digitrecopy.py
│   │   └── int.py
│   │
│   ├── K-NN,Classification/             # K-Nearest Neighbors
│   │   └── knn.py
│   │
│   ├── SupportVectorMachine/            # SVM implementation
│   │   ├── digits.py
│   │   └── petech.py
│   │
│   ├── native_bayes/                    # Naive Bayes classifier
│   │   ├── spam.csv
│   │   └── spam.py
│   │
│   └── k_means_cluster/                 # K-Means clustering
│       ├── elbow_income_F(K).py
│       ├── income.csv
│       └── income.py
│
├── Regression Models/
│   ├── multilinear/                     # Multiple linear regression
│   │   ├── exercise.py
│   │   ├── hiring.csv
│   │   ├── home.py
│   │   └── homepicnic.csv
│   │
│   ├── logisticregression.py            # Binary classification
│   ├── multiclassclassification.py      # Multi-class classification
│   ├── logistic_multiclass.py           # Alternative multi-class
│   ├── insurance.py                     # Insurance data example
│   ├── petclinicinfo.py                 # Pet clinic example
│   │
│   └── L1_L2_regularization/            # Regularization techniques
│       ├── Melbourne_housing_FUL
│       └── regu.py
│
├── simple Uneta regression/             # Simple regression examples
│   ├── Afinit.py
│   ├── AgeneratenewCV.py
│   ├── aren_with_pricen.csv
│   ├── aren.csv
│   ├── canada_per_capita_income.csv
│   ├── exercise.py
│   ├── ml.csv
│   └── student_performance.py
│
├── Data Processing/
│   ├── onehotencoding/                  # Categorical encoding
│   │   ├── caprice.py
│   │   ├── capricorn.csv
│   │   ├── capricornteam.py
│   │   ├── homepicnic.csv
│   │   └── homeprice.py
│   │
│   ├── simpleimputer/                   # Missing value handling
│   │   └── missing_value_fill.py
│   │
│   └── TrainTestSplit.py                # Train-test splitting
│
├── Validation & Testing/
│   ├── K-Fold-Cross-Validation/         # Cross-validation techniques
│   │   └── digits.py
│   ├── test/                            # Testing directory
│   │   └── test.py
│   └── unsupervised/                    # Unsupervised learning test
│       ├── income.csv
│       └── k_means_cluster.py
│
├── Practice & Exercises/
│   ├── practice_md/                     # Practice exercises
│   │   ├── social_media_viral_cont...
│   │   └── socialmedialist.py
│   │
│   └── practice_md/                     # Additional practice
│       └── (practice files)
│
└── Datasets/                            # All dataset files
    ├── StudentPerformance.csv
    ├── titanic.csv
    ├── insurance_data.csv
    ├── Hr_comma_sup.csv
    ├── int_petal_sapal.png
    └── other datasets

About

📚 A practical repository for machine learning. Features model implementations from scratch and a curated set of Q&A to test and solidify core concepts through hands-on coding.
