SBanditaDas/Heart-Disease-Prediction-Using-ML-Classification-Models

❤️ Heart Disease Prediction

Overview :

This project trains a Random Forest classifier to predict the presence of heart disease from clinical and lifestyle features. The model is intended to support early diagnosis and risk stratification, helping healthcare providers make informed decisions.


Dataset Description :

The dataset includes patient-level data with various diagnostic and demographic attributes.

Key columns:

  • Age: Age of the patient
  • Sex: Gender (1 = male, 0 = female)
  • ChestPainType: Type of chest pain (e.g., typical angina, asymptomatic)
  • RestingBP: Resting blood pressure (mm Hg)
  • Cholesterol: Serum cholesterol (mg/dl)
  • FastingBS: Fasting blood sugar > 120 mg/dl (1 = true, 0 = false)
  • RestingECG: Resting electrocardiographic results
  • MaxHR: Maximum heart rate achieved
  • ExerciseAngina: Exercise-induced angina (1 = yes, 0 = no)
  • Oldpeak: ST depression induced by exercise
  • ST_Slope: Slope of the peak exercise ST segment
  • HeartDisease: Target variable (1 = disease present, 0 = no disease)

Workflow Summary :

1. Data Loading

import pandas as pd
df = pd.read_csv('/kaggle/input/heart-disease-dataset/heart.csv')

2. Preprocessing

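# Drop rows with missing values
df = df.dropna()
# One-hot encode categorical columns; drop_first removes one redundant dummy per column
df_encoded = pd.get_dummies(df, drop_first=True)
# Separate the features from the target
X = df_encoded.drop('HeartDisease', axis=1)
y = df_encoded['HeartDisease']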
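To make the encoding step concrete, here is a minimal sketch of what `pd.get_dummies(..., drop_first=True)` does to text-valued columns. The three rows and the category labels ("typical", "asymptomatic", "non-anginal", "up", "flat") are hypothetical values chosen for illustration, not taken from the dataset; only the column names come from the list above.

```python
import pandas as pd

# Hypothetical toy rows; the category labels are assumptions for illustration only.
toy = pd.DataFrame({
    "Age": [54, 61, 47],
    "ChestPainType": ["typical", "asymptomatic", "non-anginal"],
    "ST_Slope": ["up", "flat", "flat"],
    "HeartDisease": [0, 1, 0],
})

# Text columns become 0/1 indicator columns. With drop_first=True the first
# category (in sorted order) of each column is dropped and acts as the baseline.
encoded = pd.get_dummies(toy, drop_first=True)
encoded_columns = list(encoded.columns)
print(encoded_columns)
```

Numeric columns such as `Age` pass through unchanged, so columns already stored as 0/1 are not re-encoded. The dropped category is implied whenever all of its sibling indicator columns are 0.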

3. Train-Test Split

from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(X, y, stratify=y, test_size=0.2, random_state=42)
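The `stratify=y` argument keeps the share of positive cases the same in the training and test sets. The sketch below checks this on synthetic data; the 60/40 class balance and the random features are assumptions made for the example, not properties of the heart dataset.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Synthetic stand-in data: 100 rows, 3 random features, 60% positive labels (assumed).
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(100, 3))
y_demo = np.array([1] * 60 + [0] * 40)

X_tr, X_te, y_tr, y_te = train_test_split(
    X_demo, y_demo, stratify=y_demo, test_size=0.2, random_state=42
)

# With stratification, both splits keep the original positive rate.
train_rate = y_tr.mean()
test_rate = y_te.mean()
print(train_rate, test_rate)
```

Without `stratify`, a small test set can end up with a noticeably different class balance, which makes accuracy figures harder to compare across runs.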

4. Model Training

from sklearn.ensemble import RandomForestClassifier
model = RandomForestClassifier(n_estimators=100, random_state=42)
model.fit(X_train, y_train)

5. Evaluation

from sklearn.metrics import accuracy_score, classification_report, confusion_matrix
y_pred = model.predict(X_test)
print(accuracy_score(y_test, y_pred))
print(classification_report(y_test, y_pred))
print(confusion_matrix(y_test, y_pred))
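As a reading aid for the evaluation output, the sketch below shows how accuracy, precision, and recall follow from the four cells of a binary confusion matrix. The label vectors are made up for the example and are not results from this project.

```python
from sklearn.metrics import accuracy_score, confusion_matrix

# Hypothetical true labels and predictions (1 = disease present, 0 = no disease).
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_hat = [1, 1, 0, 0, 0, 1, 1, 0]

# For binary labels, ravel() yields the cells in the order tn, fp, fn, tp.
tn, fp, fn, tp = confusion_matrix(y_true, y_hat).ravel()

accuracy = (tp + tn) / (tp + tn + fp + fn)
precision = tp / (tp + fp)  # share of predicted positives that are correct
recall = tp / (tp + fn)     # share of actual positives that were detected
print(accuracy, precision, recall)
```

In a screening context, recall for the positive class is often watched closely, because a false negative means a patient with disease is missed.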

6. Prediction on New Patient

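import numpy as np
# Start from an all-zero row with the same columns, in the same order, as the training data
input_df = pd.DataFrame(np.zeros((1, len(X_train.columns))), columns=X_train.columns)
# Set the new patient's values here, using the exact column names in X_train.columns
model.predict(input_df)  # returns 1 (disease present) or 0 (no disease)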
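One possible way to fill the row without typing each dummy column by hand is to encode the raw patient record and align it to the training columns with `reindex`. This is a sketch under assumptions: `train_columns` stands in for `list(X_train.columns)`, and the patient values and category labels are hypothetical.

```python
import pandas as pd

# Stand-in for list(X_train.columns); these names are assumed for illustration.
train_columns = [
    "Age", "MaxHR",
    "ChestPainType_non-anginal", "ChestPainType_typical",
    "ST_Slope_up",
]

# Hypothetical raw record for one new patient.
patient = pd.DataFrame([
    {"Age": 58, "MaxHR": 140, "ChestPainType": "typical", "ST_Slope": "up"}
])

# No drop_first here: a single row has one category per column, and dropping it
# would erase the patient's value. reindex then adds missing dummies as 0 and
# discards any column the model never saw during training.
patient_encoded = pd.get_dummies(patient, dtype=int)
aligned_df = patient_encoded.reindex(columns=train_columns, fill_value=0)
print(aligned_df)
```

The aligned frame can then be passed to `model.predict` in place of the manually filled row, since its columns match the training layout.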

Performance Metrics :

  • Accuracy: approximately 90–95% on the held-out test split (20% of the data)
  • Robust classification across age, cholesterol, and ECG features
  • Top predictors: Chest pain type, ST slope, MaxHR, and ExerciseAngina
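A ranking of predictors like the one above can be read from a fitted Random Forest through its `feature_importances_` attribute. The sketch below uses synthetic data in which only one column (named `signal` here, a made-up name) determines the label, so it illustrates the mechanism rather than reproducing the project's ranking.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestClassifier

# Synthetic data: the label depends only on "signal"; the other columns are noise.
rng = np.random.default_rng(0)
X_syn = pd.DataFrame(
    rng.normal(size=(200, 3)), columns=["signal", "noise_a", "noise_b"]
)
y_syn = (X_syn["signal"] > 0).astype(int)

forest = RandomForestClassifier(n_estimators=100, random_state=42)
forest.fit(X_syn, y_syn)

# Impurity-based importances sum to 1; sort to get the ranking.
importances = pd.Series(
    forest.feature_importances_, index=X_syn.columns
).sort_values(ascending=False)
top_feature = importances.index[0]
print(importances)
```

With one-hot encoded data, importance is reported per dummy column, so the scores for a categorical feature such as chest pain type are spread across its indicator columns.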

Dependencies :

numpy
pandas
scikit-learn

Author: Sushree Bandita Das

S_Bandita_Das sushree-bandita-das-160651309 SBanditaDas dasbanditasushree
