Shashank911/Train-Test-Split-Evaluation-Metrics

Task 5 – Train-Test Split & Model Evaluation

📌 Overview

This repository contains the implementation of model training and evaluation on the Heart Disease dataset as part of an AI & ML Internship task. The goal of this task is to understand how machine learning models are evaluated using proper data splitting and performance metrics.

❤️ Dataset Information

Dataset: Heart Disease Dataset

Problem Type: Binary Classification

Target Variable: Indicates presence (1) or absence (0) of heart disease

Features: Medical attributes such as age, cholesterol, blood pressure, etc.

🎯 Objective

The objective of this task is to:

Split the dataset into training and testing sets

Train a classification model

Evaluate performance using accuracy, precision, recall, and confusion matrix

🛠 Tools & Libraries Used

Python

Pandas

NumPy

Scikit-learn

βš™οΈ Steps Performed Loaded the dataset using Pandas

Separated features (X) and target (y)

Split data into 80% training and 20% testing

Trained a Logistic Regression model

Made predictions on test data

Evaluated model using:

Accuracy

Precision

Recall

Confusion Matrix

Classification Report
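The steps above can be sketched roughly as follows. This is a minimal illustration, not the notebook's actual code: the function name `train_and_evaluate`, the `stratify=y` option, `max_iter=1000`, and the random seeds are assumptions, and synthetic data from `make_classification` stands in for `heart.csv` so the sketch runs on its own.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import (
    accuracy_score,
    classification_report,
    confusion_matrix,
    precision_score,
    recall_score,
)
from sklearn.model_selection import train_test_split


def train_and_evaluate(X, y, test_size=0.2, seed=42):
    """Split 80/20, fit Logistic Regression, and score it on the test set."""
    # stratify=y keeps the class ratio similar in both splits (an assumption,
    # the README only states an 80/20 split).
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=test_size, random_state=seed, stratify=y
    )
    # max_iter is raised so the solver converges on unscaled features.
    model = LogisticRegression(max_iter=1000)
    model.fit(X_train, y_train)
    y_pred = model.predict(X_test)
    return {
        "accuracy": accuracy_score(y_test, y_pred),
        "precision": precision_score(y_test, y_pred),
        "recall": recall_score(y_test, y_pred),
        "confusion_matrix": confusion_matrix(y_test, y_pred),
        "report": classification_report(y_test, y_pred),
        "n_test": len(y_test),
    }


# Synthetic stand-in data so the sketch runs without heart.csv.
# With the real file, something like this would replace it
# (the column name "target" is an assumption):
#   df = pd.read_csv("heart.csv")
#   X, y = df.drop(columns="target"), df["target"]
X, y = make_classification(n_samples=300, n_features=13, random_state=0)
results = train_and_evaluate(X, y)
print(results["confusion_matrix"])
print(results["report"])
```

Fixing `random_state` makes the split reproducible, so repeated runs report the same metrics.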

📊 Evaluation Metrics

| Metric | Meaning |
| --- | --- |
| Accuracy | Overall correctness of predictions |
| Precision | How many predicted positive cases were actually positive |
| Recall | How many actual positive cases were correctly identified |
| Confusion Matrix | Shows TP, TN, FP, FN values |
| F1-score | Balance between precision and recall |
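Each of these metrics can be computed directly from the four confusion-matrix counts. The sketch below shows the standard formulas; the helper name `metrics_from_counts` and the example counts are hypothetical and are not results from this repository.

```python
def metrics_from_counts(tp, tn, fp, fn):
    """Compute accuracy, precision, recall, and F1 from confusion-matrix counts."""
    total = tp + tn + fp + fn
    accuracy = (tp + tn) / total
    # Guard against division by zero when a class is never predicted or absent.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1 is the harmonic mean of precision and recall.
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Hypothetical counts chosen for illustration only.
m = metrics_from_counts(tp=40, tn=45, fp=5, fn=10)
print(m)
```

With these counts, accuracy is 85/100 = 0.85, precision is 40/45 (about 0.889), recall is 40/50 = 0.80, and F1 is 80/95 (about 0.842).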

📈 Key Insights

Logistic Regression performed well on the dataset

Model shows balanced precision and recall

Confusion matrix helps understand prediction errors

Evaluating on a held-out test set gives an estimate of how well the model generalizes to unseen data

πŸ“ Repository Structure arduino Copy code Task-5-Model-Evaluation/ β”‚ β”œβ”€β”€ heart.csv β”œβ”€β”€ Heart_Model.ipynb └── README.md 🧠 Concepts Learned Importance of train-test split

Model evaluation techniques

Understanding classification metrics

Avoiding overfitting
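To make the train-test split idea concrete, here is a hand-rolled sketch of what an 80/20 holdout split does: shuffle the row indices once, then set aside a fixed fraction that the model never sees during training. The function name `holdout_split` and the seed are assumptions for illustration; in practice scikit-learn's `train_test_split` handles this.

```python
import random


def holdout_split(rows, test_fraction=0.2, seed=42):
    """Shuffle row indices and split them into train and test portions."""
    indices = list(range(len(rows)))
    # A seeded generator makes the shuffle, and so the split, reproducible.
    random.Random(seed).shuffle(indices)
    n_test = round(len(rows) * test_fraction)
    test_idx = indices[:n_test]
    train_idx = indices[n_test:]
    train = [rows[i] for i in train_idx]
    test = [rows[i] for i in test_idx]
    return train, test


# Toy data: 100 row identifiers standing in for patient records.
rows = list(range(100))
train, test = holdout_split(rows)
print(len(train), len(test))
```

Because no row appears in both portions, the test score reflects performance on data the model was not fitted to, which is what makes it useful for spotting overfitting.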

✅ Conclusion

The Logistic Regression model was successfully trained and evaluated on the Heart Disease dataset. The evaluation metrics indicate that the model can reliably predict heart disease presence based on patient data.

About

The objective of this task is to train a machine learning model on the Heart Disease dataset and evaluate its performance using standard evaluation metrics. The dataset contains medical attributes of patients and a target variable indicating the presence or absence of heart disease.
