👶 Stuntify: Intelligent Stunting Prediction System

⚠️ IMPORTANT CLINICAL DISCLAIMER: The underlying machine learning model is trained entirely on synthetic, machine-generated data, not real-world patient records. This application is built as a proof-of-concept for educational and portfolio demonstration purposes only. It should NOT be used for actual medical diagnosis, screening, or clinical decision-making. Always consult a certified healthcare professional for medical advice.

DATASET: https://www.kaggle.com/datasets/jabirmuktabir/stunting-wasting-dataset

📌 Overview

Stuntify Web App is an end-to-end machine learning application that demonstrates how early stunting detection could be made more accessible.

It isn't just a dashboard; it's a prototype decision support system. By bridging the gap between complex anthropometric data and a user-friendly interface, Stuntify lets users enter a few simple measurements and receive an instant classification. Under the hood, it runs an inference pipeline that applies to every user input the same preprocessing steps that were applied to the training data.

✨ Key Features

🧠 Multi-Artifact Orchestration (The "Invisible" Brain)

The system acts as a synchronized inference unit. Rather than predicting from raw input, it reconstructs the training-time preprocessing by loading four frozen artifacts:

Gender Encoder: Translates gender categories (e.g., Laki-laki, Indonesian for "male") into a machine-readable encoding.
Standard Scaler: Scales the numeric inputs (Age, Height, Weight) using the statistics learned during training.
Classifier Model: The core logic engine (Random Forest/Decision Tree) trained for high precision.
Target Decoder: Translates the numeric prediction back into a human-readable label (e.g., Severely Stunted).
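
As a rough illustration of the artifact loading described above, here is a minimal sketch. The file names come from the models folder listed under Project Structure; the function names, the dictionary keys, and the use of `joblib.load` for every artifact are assumptions made for this example and may differ from what model.py actually does.

```python
from pathlib import Path

# File names as listed in the models/ folder of this repository.
# The dictionary keys are hypothetical names chosen for this sketch.
ARTIFACT_FILES = {
    "gender_encoder": "gender_encoder.joblib",
    "scaler": "scaler.joblib",
    "model": "best_model.joblib",
    "target_decoder": "stunting_encoder.joblib",
}


def artifact_paths(models_dir="models"):
    """Map each artifact name to its expected path on disk."""
    base = Path(models_dir)
    return {name: base / filename for name, filename in ARTIFACT_FILES.items()}


def load_artifacts(models_dir="models"):
    """Load all four artifacts, failing early if any file is missing."""
    paths = artifact_paths(models_dir)
    missing = [str(path) for path in paths.values() if not path.exists()]
    if missing:
        raise FileNotFoundError(f"Missing artifacts: {missing}")

    # Imported here so that artifact_paths() stays usable without joblib.
    import joblib

    return {name: joblib.load(path) for name, path in paths.items()}
```

Checking for missing files before loading means a broken deployment fails with one clear message instead of a partial set of artifacts.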

🛡️ Defensive & Modular Architecture

Decoupled Logic: Separation of concerns via preprocess.py (Schema), model.py (Inference), and app.py (UI).
Input Sanity Checks: The UI enforces strict min/max value constraints to prevent biological impossibilities (e.g., negative height).
Production Simulation: Includes a comprehensive simulation pipeline to verify data integrity before inference.
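
The input sanity checks could be mirrored outside the UI with a small validation helper. The sketch below is illustrative only: the numeric limits, the unit choices (months, cm, kg), the second gender label, and the function name are all assumptions, not values taken from app.py.

```python
# Hypothetical limits for this sketch; the real app may use different bounds.
# Units are assumed: age in months, height in cm, weight in kg.
LIMITS = {
    "age": (0, 60),
    "height": (30.0, 130.0),
    "weight": (1.0, 40.0),
}

# "Laki-laki" appears in this README; "Perempuan" is an assumed second label.
ALLOWED_GENDERS = ("Laki-laki", "Perempuan")


def validate_input(gender, age, height, weight):
    """Return a list of error messages; an empty list means the input is valid."""
    errors = []
    if gender not in ALLOWED_GENDERS:
        errors.append(f"gender must be one of {ALLOWED_GENDERS}")

    values = {"age": age, "height": height, "weight": weight}
    for name, value in values.items():
        low, high = LIMITS[name]
        if not (low <= value <= high):
            errors.append(f"{name} must be between {low} and {high}, got {value}")
    return errors
```

Returning a list of messages rather than raising on the first problem lets the UI show every invalid field at once.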

🛠️ Tech Stack

Core: Python 3.9+
Frontend: Streamlit (Interactive Web Framework)
Computation: NumPy, Pandas, Scikit-Learn, Joblib
Handling Imbalance: SMOTE-NC (Synthetic Minority Over-sampling Technique for Nominal and Continuous features)

📂 Project Structure

The repository is organized to simulate a real-world production environment:

📂 app                                    # 🧠 Core application logic
│   ├── 🐍 app.py                         # 🚀 Streamlit UI (main frontend entry point)
│   ├── 🐍 model.py                       # ⚙️ Model inference logic & artifact loader
│   ├── 🐍 preprocess.py                  # 🛠️ Data preprocessing utilities & feature encoding
│   └── 📂 __pycache__                    # 🔒 Auto-generated Python bytecode cache
│
📂 assets                                 # 🎨 Visual assets for documentation & UI preview
│   ├── 📄 decision_tree.pdf              # 📑 Decision tree visualization (PDF format)
│   ├── 🖼️ decision_tree.png              # 🌳 Decision tree visualization (image preview)
│   └── 🖼️ app_interface.png              # 📱 Screenshot of the Streamlit application interface
│
📂 models                                 # 📦 Serialized ML artifacts (model & encoders)
│   ├── 📦 gender_encoder.joblib          # 🔤 Encoder for gender feature
│   ├── 📦 stunting_encoder.joblib        # 🔤 Encoder for target label categories
│   ├── 📦 best_model.joblib              # 🧠 Final trained machine learning model
│   └── 📦 scaler.joblib                  # 📊 Feature scaling model
│
📂 notebooks                              # 🔬 Research & experimentation workspace
│   └── 📓 stunting-prediction.ipynb      # 📈 EDA, SMOTE, model training & evaluation notebook

🚀 The Inference Pipeline (How It Works)

Unlike basic notebooks, this project implements a strict lifecycle for every user interaction:

Ingestion: The user enters Gender, Age, Height, and Weight in the Streamlit form.
Schema Alignment: preprocess.py transforms the raw inputs into a structured DataFrame matching the training schema.
Context Reconstruction: model.py loads the serialized artifacts.
Processing: The data flows through the pipeline: Input Validated -> Encoded -> Scaled -> Predicted -> Decoded
Visualization: The result is presented instantly with clear, actionable context.
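
The Encoded -> Scaled -> Predicted -> Decoded order above can be sketched as a single function. This is not the code in model.py: the `Artifacts` container, the feature column order, and the assumption that the artifacts expose scikit-learn style `transform`, `predict`, and `inverse_transform` methods are all hypothetical. The stand-in classes at the bottom exist only so the sketch runs without the real model files, and their rule is arbitrary.

```python
from dataclasses import dataclass
from typing import Any


@dataclass
class Artifacts:
    """Hypothetical container for the four loaded artifacts."""
    gender_encoder: Any   # expected to expose .transform([...])
    scaler: Any           # expected to expose .transform([[age, height, weight]])
    model: Any            # expected to expose .predict([[...]])
    target_decoder: Any   # expected to expose .inverse_transform([...])


def run_inference(artifacts, gender, age, height, weight):
    """Encode -> scale -> predict -> decode for one validated record."""
    gender_code = artifacts.gender_encoder.transform([gender])[0]
    scaled = artifacts.scaler.transform([[age, height, weight]])[0]
    # Assumed column order: gender first, then the scaled numeric features.
    features = [[gender_code, *scaled]]
    class_id = artifacts.model.predict(features)[0]
    return artifacts.target_decoder.inverse_transform([class_id])[0]


# --- Stand-ins so the sketch runs without the real artifacts ---
class _LabelStub:
    def __init__(self, classes):
        self.classes = list(classes)

    def transform(self, values):
        return [self.classes.index(v) for v in values]

    def inverse_transform(self, codes):
        return [self.classes[c] for c in codes]


class _IdentityScaler:
    def transform(self, rows):
        return [list(row) for row in rows]


class _ToyModel:
    # Arbitrary rule for demonstration only, not the trained classifier:
    # predicts class 1 when the height column (index 2) is below 80.
    def predict(self, rows):
        return [1 if row[2] < 80 else 0 for row in rows]


demo = Artifacts(
    gender_encoder=_LabelStub(["Laki-laki", "Perempuan"]),
    scaler=_IdentityScaler(),
    model=_ToyModel(),
    target_decoder=_LabelStub(["Normal", "Stunted"]),
)
label = run_inference(demo, "Laki-laki", 24, 75.0, 9.0)
print(label)
```

Keeping the four steps in one function makes it harder to skip a step or apply them in a different order than during training.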

⚖️ Performance & Disclaimer

| Metric | Score | Note |
|---|---|---|
| Accuracy | 100% | All predictions are correct based on the confusion matrix |
| Recall | 100% | No stunting cases were missed (perfect sensitivity) |
| Precision | 100% | No false positives across all classes |

⚠️ Why is Accuracy near 100%?

You might notice the model achieves near-perfect accuracy. This is not a sign of overfitting, but rather a reflection of the deterministic nature of the dataset.

Clinical Logic: Stunting is defined by a strict rule on Height-for-Age. Under the WHO Child Growth Standards, a child is stunted when the height-for-age z-score falls below -2, and severely stunted when it falls below -3.
Synthetic Dataset: The data is synthetic and machine-generated. Because it was built from clean, algorithmic rules without the unpredictable noise of real-world measurements, a machine learning model can recognize the underlying patterns almost perfectly.
Model Behavior: The model has effectively "reverse-engineered" these rules, which are derived from the WHO Child Growth Standards.
Conclusion: The model functions correctly as a Rule-Approximation System.
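
To make the "strict formula" concrete, here is a minimal sketch of a height-for-age rule of the kind the model approximates. The -2 and -3 cut-offs follow the WHO definition of stunting and severe stunting. Everything else is an assumption for illustration: the function names, the label set (the dataset may contain additional classes), and the reference median and standard deviation in the example, which are placeholder numbers and not values from the WHO tables.

```python
def height_for_age_z(height_cm, median_cm, sd_cm):
    """Z-score: how many standard deviations the height lies from the reference median."""
    return (height_cm - median_cm) / sd_cm


def classify_stunting(z_score):
    """Apply the WHO cut-offs: below -3 is severe, below -2 is stunted."""
    if z_score < -3:
        return "Severely Stunted"
    if z_score < -2:
        return "Stunted"
    return "Normal"  # hypothetical label for everything at or above -2


# Placeholder reference values, NOT taken from the WHO tables.
REFERENCE_MEDIAN_CM = 100.0
REFERENCE_SD_CM = 4.0

for height in (86.0, 90.0, 100.0):
    z = height_for_age_z(height, REFERENCE_MEDIAN_CM, REFERENCE_SD_CM)
    print(height, z, classify_stunting(z))
```

Because the labels follow from thresholds like these, a tree-based classifier can recover them from the features with very few errors, which is consistent with the scores reported above.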

📦 Installation & Usage

Clone the Repository

git clone https://github.com/viochris/Stunting-prediction-project.git
cd Stunting-prediction-project

Install Dependencies
```
pip install -r requirements.txt
```
Run the Web App

Execute the Streamlit application:
```
streamlit run app/app.py
```
Output: The app will open in your browser at http://localhost:8501

📷 Model & Application Screenshots

Decision Tree Visualization

A preview of the decision tree structure used in the stunting prediction model (see assets/decision_tree.png):

Streamlit Application Interface

Example interface when entering input data for stunting prediction (see assets/app_interface.png):

Author: Silvio Christian, Joe "Code that speaks Data, Logic that saves lives."
