pmcelroy4/Linear-Regression---House-Price-Prediction

🏡 Housing Price Prediction

This project explores a housing dataset and builds a machine learning model to predict median house values based on demographic and geographic features.

The focus is on combining exploratory data analysis (EDA) with a clean modeling pipeline to understand what drives housing prices and establish a strong baseline model.


🚀 Project Overview

This notebook walks through an end-to-end workflow:

  • Data loading and exploration
  • Visualization of feature distributions and relationships
  • Handling missing data
  • Encoding categorical variables
  • Building a preprocessing + modeling pipeline
  • Training and evaluating a regression model

The final output is a Linear Regression model evaluated using RMSE.


📊 Key Insights

Some high-level takeaways from the analysis:

  • Median income shows a strong positive relationship with house value
  • Location-based features (latitude/longitude, ocean proximity) play a significant role
  • Several features are right-skewed, suggesting potential for transformation in future iterations
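
The observations above can be checked numerically before relying on plots alone. The sketch below is illustrative only: it uses synthetic data rather than housing.csv, and the column names `median_income` and `median_house_value` are assumptions about the dataset's schema.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for the real dataset (values are made up for illustration).
rng = np.random.default_rng(0)
n = 1000
income = rng.lognormal(mean=1.0, sigma=0.5, size=n)  # right-skewed by construction

df = pd.DataFrame({
    "median_income": income,  # assumed column name
    "median_house_value": 40_000 * income + rng.normal(0, 30_000, n),  # assumed column name
})

# Sample skewness per column: values well above 0 indicate a long right tail.
skewness = df.skew()

# Pearson correlation between income and house value.
corr = df["median_income"].corr(df["median_house_value"])

print(skewness.round(2))
print(f"Correlation with house value: {corr:.2f}")
```

On the real data, the same two calls (`DataFrame.skew` and `Series.corr`) would be applied to the frame loaded from housing.csv.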

📁 Project Structure

housing-price-prediction/
├── housing.csv                  # Dataset
├── housing_model.ipynb          # Main notebook (EDA + modeling)
└── README.md                    # Project documentation

🧠 Modeling Approach

  • Model: Linear Regression

  • Preprocessing:

    • Missing value imputation (median)
    • One-hot encoding for categorical features
  • Pipeline: ColumnTransformer + Pipeline (scikit-learn)

  • Train/Test Split: 80/20
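
A minimal sketch of how these pieces might fit together is shown below. It is not the notebook's actual code: the data is synthetic so the example runs without housing.csv, and the column names (`median_income`, `total_bedrooms`, `ocean_proximity`, `median_house_value`), the category labels, and the `random_state` values are assumptions.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.impute import SimpleImputer
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder

# Synthetic stand-in for housing.csv; column names and values are assumed.
rng = np.random.default_rng(0)
n = 500
df = pd.DataFrame({
    "median_income": rng.uniform(1.0, 10.0, n),
    "total_bedrooms": rng.uniform(100.0, 1000.0, n),
    "ocean_proximity": rng.choice(["INLAND", "NEAR OCEAN", "NEAR BAY"], n),
})
df["median_house_value"] = 40_000 * df["median_income"] + rng.normal(0, 20_000, n)
df.loc[df.index[::25], "total_bedrooms"] = np.nan  # inject some missing values

X = df.drop(columns="median_house_value")
y = df["median_house_value"]

numeric_features = ["median_income", "total_bedrooms"]
categorical_features = ["ocean_proximity"]

# Median imputation for numeric columns, one-hot encoding for categorical ones.
preprocessor = ColumnTransformer([
    ("num", SimpleImputer(strategy="median"), numeric_features),
    ("cat", OneHotEncoder(handle_unknown="ignore"), categorical_features),
])

# Preprocessing and the regressor are chained so both are fit on training data only.
model = Pipeline([
    ("preprocess", preprocessor),
    ("regressor", LinearRegression()),
])

# 80/20 train/test split.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

model.fit(X_train, y_train)
rmse = float(np.sqrt(mean_squared_error(y_test, model.predict(X_test))))
print(f"Test RMSE: {rmse:,.0f}")
```

Wrapping the imputer and encoder in the same `Pipeline` as the model means the medians and categories are learned from the training split alone, which avoids leaking information from the test split.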


📈 Evaluation

Model performance is evaluated using:

  • Root Mean Squared Error (RMSE)

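RMSE is expressed in the same units as the target (median house value), so it can be read as a typical prediction error. Because errors are squared before averaging, large misses are penalized more heavily than small ones.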
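
For reference, RMSE is the square root of the mean of the squared residuals. The sketch below computes it by hand next to the mean absolute error (MAE) for comparison; the function names and the three example values are hypothetical and are not results from this project.

```python
import math


def rmse(y_true, y_pred):
    """Root mean squared error of two equal-length sequences."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    squared_errors = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def mae(y_true, y_pred):
    """Mean absolute error, shown only for comparison with RMSE."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


# Made-up house values and predictions: two small misses and one larger one.
actual = [200_000, 300_000, 400_000]
predicted = [210_000, 290_000, 430_000]

print(f"RMSE: {rmse(actual, predicted):,.0f}")
print(f"MAE:  {mae(actual, predicted):,.0f}")
```

In this example the RMSE comes out higher than the MAE because the single 30,000 miss dominates once the errors are squared.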


🛠️ Tech Stack

  • Python
  • pandas, numpy
  • matplotlib, seaborn
  • scikit-learn

▶️ How to Run

  1. Clone the repository:
git clone https://github.com/pmcelroy4/house_price_model.git
cd house_price_model
  2. Install dependencies (if the repository includes a requirements.txt):
pip install -r requirements.txt
  3. Launch the notebook:
jupyter notebook housing_model.ipynb

🔮 Future Improvements

  • Try more advanced models (Random Forest, Gradient Boosting)
  • Add feature engineering (e.g., rooms per household, bedrooms per room)
  • Perform cross-validation for more robust evaluation
  • Apply scaling or transformations to skewed features
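
As a rough illustration of the feature engineering and transformation ideas above, the sketch below derives two ratio features and log-transforms a skewed column. The column names (`total_rooms`, `total_bedrooms`, `households`, `population`) and the values are assumptions made for the example, not taken from housing.csv.

```python
import numpy as np
import pandas as pd

# Tiny made-up frame; column names are assumed to resemble the dataset's schema.
df = pd.DataFrame({
    "total_rooms": [1000.0, 6000.0, 1500.0],
    "total_bedrooms": [200.0, 1500.0, 330.0],
    "households": [250.0, 1000.0, 300.0],
    "population": [600.0, 9000.0, 800.0],
})

# Ratio features normalize raw counts by the size of the district.
df["rooms_per_household"] = df["total_rooms"] / df["households"]
df["bedrooms_per_room"] = df["total_bedrooms"] / df["total_rooms"]

# log1p compresses a long right tail and is defined at zero.
df["log_population"] = np.log1p(df["population"])

print(df[["rooms_per_household", "bedrooms_per_room", "log_population"]])
```

If these steps were adopted, they would most naturally live inside the preprocessing pipeline (for example via scikit-learn's `FunctionTransformer`) so that training and test data are transformed identically.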

💡 Why This Project

This project demonstrates:

  • A structured approach to EDA and feature understanding
  • Practical use of scikit-learn pipelines
  • The ability to move from raw data → insights → model

About

Machine learning model with exploratory data analysis and housing price prediction based on various features.
