This repository implements an IoT-based occupancy estimation system. It preprocesses environmental sensor data and trains machine learning models, primarily XGBoost, to estimate room occupancy in real time. The focus is on prediction accuracy and efficiency, with smart building applications as the target use case.
- Preprocessing of environmental sensor data (e.g., temperature, humidity, light).
- Implementation of advanced machine learning algorithms, including XGBoost.
- Tools for feature engineering and model evaluation.
- Real-time prediction capabilities.
- Modular and extensible codebase.
- Clone the repository:
git clone https://github.com/pepperumo/IoT_Occupancy_Estimation.git
- Navigate to the project directory:
cd IoT_Occupancy_Estimation
- Create and activate a new conda environment:
conda create --name iot_occupancy_env python=3.10
conda activate iot_occupancy_env
- Install the required dependencies:
conda env update --file conda_environment_requirements.yml
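After installation, a quick import check can confirm that the environment is usable. The snippet below is a minimal sketch and not part of the repository: the helper name `missing_packages` is hypothetical, and the package list is an assumption based on the tools this README mentions (XGBoost and Jupyter Notebook). The authoritative list is in conda_environment_requirements.yml.

```python
from importlib.util import find_spec


def missing_packages(names):
    """Return the top-level module names that cannot be found in this environment."""
    return [name for name in names if find_spec(name) is None]


# Assumed package list; adjust it to match conda_environment_requirements.yml.
REQUIRED = ["xgboost", "notebook"]

if __name__ == "__main__":
    missing = missing_packages(REQUIRED)
    print("Missing packages:", ", ".join(missing) if missing else "none")
```

Run it inside the activated `iot_occupancy_env` environment. Any names it lists are not importable there.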
- Prepare the dataset by placing it in the data/ directory.
- Open the Jupyter notebook for preprocessing and training:
jupyter notebook notebooks/Preprocessing_and_XGBoost.ipynb
- Follow the steps in the notebook to preprocess the data, train the model, and evaluate it.
- The trained model is saved as models/xgboost_model.pkl and can be used for predictions.
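Reusing the saved model outside the notebook could look roughly like the sketch below. It assumes the file was written with Python's `pickle` module (suggested by the .pkl extension but not confirmed here) and that the unpickled object exposes a scikit-learn style `predict` method. The helper names `load_model` and `predict_occupancy` are hypothetical. The input rows must contain the same features, in the same order, as the training data in the notebook.

```python
import pickle
from pathlib import Path

# Path taken from the project layout; change it if your model lives elsewhere.
MODEL_PATH = Path("models/xgboost_model.pkl")


def load_model(path=MODEL_PATH):
    """Unpickle a saved model. Only load files you trust: pickle can run code."""
    with open(path, "rb") as fh:
        return pickle.load(fh)


def predict_occupancy(model, rows):
    """Return integer labels for a 2-D array-like of feature rows.

    Assumes `model.predict` follows the scikit-learn convention.
    """
    return [int(label) for label in model.predict(rows)]


# Example usage (requires xgboost to be installed so the model can be unpickled):
# model = load_model()
# labels = predict_occupancy(model, feature_rows)  # feature_rows: your own data
```

If the notebook saved the model with a different serializer, such as joblib, load it with that library instead.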
IoT_Occupancy_Estimation/
├── data/
│ └── Occupancy_Estimation.csv # Dataset used for training and evaluation
├── models/
│ └── xgboost_model.pkl # Trained XGBoost model
├── notebooks/
│ └── Preprocessing_and_XGBoost.ipynb # Jupyter notebook for preprocessing and modeling
├── conda_environment_requirements.yml # Conda environment configuration file
├── requirements.txt # Additional Python dependencies
├── README.md # Project documentation
└── LICENSE # License information
The dataset used for this project includes:
- Sensor features such as temperature, PIR (passive infrared) motion, and light intensity.
- Occupancy labels indicating whether a room is occupied or not.
Ensure the dataset is properly formatted and placed in the data/ directory.
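One way to check the formatting before opening the notebook is to confirm that the CSV has a header row with the columns you expect. This is a minimal sketch using only the standard library. The helper name `check_dataset` and the column names in `EXPECTED` are hypothetical placeholders, since this README does not list the actual header.

```python
import csv


def check_dataset(path, required_columns):
    """Return (missing_columns, row_count) for a CSV file with a header row."""
    with open(path, newline="") as fh:
        reader = csv.DictReader(fh)
        header = reader.fieldnames or []  # None for an empty file
        missing = [col for col in required_columns if col not in header]
        row_count = sum(1 for _ in reader)  # counts data rows, not the header
    return missing, row_count


# Hypothetical column names; replace them with the real header of your CSV.
EXPECTED = ["Temperature", "Light", "PIR", "Occupancy"]

# Example usage:
# missing, n_rows = check_dataset("data/Occupancy_Estimation.csv", EXPECTED)
# print("Missing columns:", missing, "| rows:", n_rows)
```

An empty `missing` list and a non-zero row count suggest the file is in the expected place and shape.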
The primary model used for occupancy estimation is XGBoost. The pipeline includes:
- Feature engineering
- Model training
- Hyperparameter optimization
- Evaluation metrics (accuracy, precision, recall, F1-score)
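For reference, the four evaluation metrics listed above can be computed from the confusion counts as in the sketch below. It assumes binary labels (occupied = 1, unoccupied = 0), and the function name `binary_metrics` is hypothetical. The notebook may compute these with a library such as scikit-learn instead.

```python
def binary_metrics(y_true, y_pred, positive=1):
    """Compute accuracy, precision, recall, and F1 for binary labels."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)

    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}
```

Precision is the share of predicted-occupied samples that were actually occupied, and recall is the share of occupied samples the model caught. F1 is their harmonic mean.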
Contributions are welcome! To contribute:
- Fork the repository.
- Create a new branch:
git checkout -b feature-name
- Make your changes and commit them:
git commit -m "Add new feature"
- Push your changes:
git push origin feature-name
- Create a pull request.
This project is licensed under the MIT License. See the LICENSE file for details.