An end-to-end machine learning pipeline that replaces static firewall rules with real-time traffic classification, trained on industry-standard datasets and served through an interactive web dashboard.
Traditional firewalls block traffic by matching it against fixed rules, so they offer little protection against zero-day attacks and behavioural anomalies that no rule anticipates. This system trains ML models on real network traffic to classify connections as normal or malicious and to make dynamic ALLOW / LOG / BLOCK decisions.
Live demo: Upload any network traffic CSV → models train in real time → inspect any packet in the browser.
| Model | Accuracy | F1-Score | ROC-AUC | Train Time (50k rows) |
|---|---|---|---|---|
| Logistic Regression | 99.58% | 0.9950 | 0.9999 | ~9s |
| SVM (SGD) | 99.51% | 0.9945 | 0.9998 | <1s |
| Random Forest | 99.50% | 0.9937 | 0.9996 | ~8s |
| Ensemble | 99.59% | 0.9949 | 0.9997 | ~18s |
Firewall simulation (5,000 packets): 1,962 attacks blocked · 2 false positives · 1 missed attack
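The simulation counts above can be turned into precision, recall, and F1 with a little arithmetic. The sketch below assumes the 1,962 blocked attacks are true positives, the 2 false positives are benign packets that were blocked, and the 1 missed attack is a false negative. How LOG-tier packets were counted is not stated, so the derived true-negative count is an assumption.

```python
# Counts reported for the 5,000-packet firewall simulation.
total_packets = 5000
true_positives = 1962   # attacks blocked (assumed to be true positives)
false_positives = 2     # benign packets flagged as attacks
false_negatives = 1     # attacks that were missed

# Assumption: every remaining packet is benign traffic handled correctly.
true_negatives = total_packets - true_positives - false_positives - false_negatives

precision = true_positives / (true_positives + false_positives)
recall = true_positives / (true_positives + false_negatives)
f1 = 2 * precision * recall / (precision + recall)
false_positive_rate = false_positives / (false_positives + true_negatives)

print(f"precision={precision:.4f} recall={recall:.4f} f1={f1:.4f}")
print(f"false positive rate={false_positive_rate:.5f}")
```

Under these assumptions the simulation's error rates are dominated by a handful of packets, so small changes in the counts would move the derived metrics noticeably.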
Network Traffic CSV
↓
Data Ingestion ← Auto-detects CIC-IDS-2017 / UNSW-NB15 / NSL-KDD
↓
Feature Engineering ← LabelEncoder + StandardScaler + NaN imputation
↓
Model Training ← LR · SGD-SVM · Random Forest · Ensemble
↓
Evaluation ← Accuracy · Precision · Recall · F1 · ROC-AUC
↓
Firewall Engine ← P(attack) → ALLOW / LOG / BLOCK
↓
Web Dashboard ← Flask + real-time SSE log streaming
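The final pipeline stage maps a predicted attack probability to one of three actions. A minimal sketch of that mapping follows. The threshold values (0.5 and 0.9), the `Action` enum, and the `decide` function are hypothetical names chosen for illustration, since the project's actual cut-offs and code are not shown here.

```python
from enum import Enum


class Action(Enum):
    ALLOW = "ALLOW"
    LOG = "LOG"
    BLOCK = "BLOCK"


# Hypothetical thresholds; the project's real cut-offs are not stated.
LOG_THRESHOLD = 0.5
BLOCK_THRESHOLD = 0.9


def decide(p_attack: float,
           log_threshold: float = LOG_THRESHOLD,
           block_threshold: float = BLOCK_THRESHOLD) -> Action:
    """Map a predicted attack probability to a firewall action."""
    if not 0.0 <= p_attack <= 1.0:
        raise ValueError(f"p_attack must be in [0, 1], got {p_attack}")
    if p_attack >= block_threshold:
        return Action.BLOCK   # confident attack
    if p_attack >= log_threshold:
        return Action.LOG     # uncertain: record for review instead of blocking
    return Action.ALLOW       # confident benign


if __name__ == "__main__":
    for p in (0.03, 0.62, 0.97):
        print(f"P(attack)={p:.2f} -> {decide(p).value}")
```

Widening the gap between the two thresholds sends more traffic to the LOG tier, which trades analyst workload for fewer wrongly blocked connections.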
| Dataset | Source | Rows | Attack Types |
|---|---|---|---|
| CIC-IDS-2017 | Univ. of New Brunswick | ~2.8M | DDoS, DoS, Brute Force, PortScan, Botnet |
| UNSW-NB15 | UNSW Canberra | 2.54M | Exploits, Backdoors, Fuzzers, Worms, Recon |
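Because each dataset ships with a distinctive set of column names, the format of an uploaded CSV can be guessed from its header row alone. The sketch below shows one way such a column-signature check might look. The `SIGNATURES` table and both function names are illustrative assumptions, not the project's actual lists, and the chosen columns are only a small subset of each dataset's schema.

```python
import csv
import io

# Illustrative column signatures (assumptions, not the project's real lists).
SIGNATURES = {
    "CIC-IDS-2017": {"flow duration", "total fwd packets", "label"},
    "UNSW-NB15": {"sbytes", "dbytes", "attack_cat"},
    "NSL-KDD": {"protocol_type", "src_bytes", "dst_bytes"},
}


def detect_dataset(header):
    """Return the dataset whose signature columns all appear in header, else None."""
    # Normalise names: some CSV exports carry stray spaces and mixed case.
    columns = {name.strip().lower() for name in header}
    for dataset, signature in SIGNATURES.items():
        if signature <= columns:   # subset test: every signature column present
            return dataset
    return None


def detect_from_csv(text_stream):
    """Read only the first row of a CSV text stream and classify it."""
    header = next(csv.reader(text_stream), [])
    return detect_dataset(header)


if __name__ == "__main__":
    sample = io.StringIO(" Flow Duration, Total Fwd Packets, Label\n10,2,BENIGN\n")
    print(detect_from_csv(sample))
```

Reading only the header keeps detection cheap even for multi-gigabyte files; a header that matches no signature returns `None`, so the caller can fall back to a generic loader or reject the upload.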
# 1. Clone
git clone https://github.com/YOUR_USERNAME/intelligent-firewall-ml.git
cd intelligent-firewall-ml
# 2. Install
pip install -r requirements.txt
# 3. Launch dashboard
python app.py
# Open http://localhost:5000

In the dashboard:
- Upload your dataset CSV (or generate synthetic data)
- Select models and click Start Training
- Watch live metrics appear in the Results tab
- Try the Firewall Inspector with presets (DoS, Port Scan, Brute Force)
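Live updates in the dashboard arrive over Server-Sent Events, which is a plain-text format: each message is one or more `field: value` lines followed by a blank line. The sketch below formats training log records that way. The record shape and the function names are hypothetical; in a Flask app a generator like this would typically be returned as `Response(generator, mimetype="text/event-stream")` and consumed in the browser with `EventSource`.

```python
import json
from typing import Iterable, Iterator


def sse_event(payload: dict, event: str = "") -> str:
    """Serialise one message in the text/event-stream wire format."""
    lines = []
    if event:
        lines.append(f"event: {event}")
    # json.dumps without indent emits a single line, so one data field suffices.
    lines.append(f"data: {json.dumps(payload)}")
    # A blank line terminates each event.
    return "\n".join(lines) + "\n\n"


def stream_training_log(records: Iterable[dict]) -> Iterator[str]:
    """Yield one SSE message per log record, then a final 'done' event."""
    for record in records:
        yield sse_event(record, event="log")
    yield sse_event({"status": "done"}, event="done")


if __name__ == "__main__":
    # Hypothetical log records for illustration only.
    demo = [{"model": "rf", "step": "fit"}, {"model": "rf", "step": "evaluate"}]
    for message in stream_training_log(demo):
        print(message, end="")
```

Because the stream is one-directional and runs over ordinary HTTP, the server needs no WebSocket handshake or extra dependency.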
intelligent_firewall/
├── src/
│ ├── preprocessing.py # Multi-format data loader + feature pipeline
│ ├── models.py # LR, SGD-SVM, Random Forest, Ensemble
│ ├── evaluator.py # Metrics, confusion matrix, ROC, feature importance
│ └── firewall.py # Real-time packet classification engine
├── data/
│ ├── raw/ # Place dataset CSVs here (not in repo — see below)
│ └── generate_synthetic.py
├── templates/
│ └── index.html # Full single-page dashboard
├── app.py # Flask web application
├── train.py # CLI training entry point
└── requirements.txt
Datasets are not included in this repository due to their size. Download them from the sources listed in the dataset table above and place the CSVs in data/raw/.
- SGD-SVM instead of kernel SVM: a linear SVM trained with stochastic gradient descent cuts training from 45 minutes to <1 second on 50k samples (roughly O(n) per epoch vs. at least O(n²) for a kernel SVM)
- Auto dataset detection: column-signature heuristics identify CIC-IDS-2017, UNSW-NB15, and NSL-KDD automatically
- Three-tier firewall decisions: a LOG tier for uncertain predictions reduces false positives (inspired by Sommer & Paxson, IEEE S&P 2010)
- Cross-platform path handling: pathlib.Path with .as_posix() keeps Windows backslashes from being misread as escape sequences when paths are embedded in JavaScript
- Server-Sent Events: real-time log streaming without WebSockets
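The path-handling point above can be seen in isolation. The sketch below uses `PureWindowsPath` so that it behaves the same on any operating system; the file path itself is hypothetical.

```python
from pathlib import PureWindowsPath

# Hypothetical path, written the way it would appear on a Windows host.
windows_path = PureWindowsPath(r"data\raw\traffic_sample.csv")

raw = str(windows_path)          # keeps backslashes
safe = windows_path.as_posix()   # forward slashes only

# Pasted into a JavaScript string literal, the raw form would turn "\r" and
# "\t" into a carriage return and a tab; the POSIX form has nothing to escape.
print("raw: ", raw)
print("safe:", safe)
```

Forward slashes are accepted in Windows paths as well, so the converted string stays usable on both sides.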
- Sharafaldin et al., "Toward Generating a New Intrusion Detection Dataset and Intrusion Traffic Characterization," ICISSP 2018
- Moustafa & Slay, "UNSW-NB15: A Comprehensive Data Set for Network Intrusion Detection Systems (UNSW-NB15 Network Data Set)," MilCIS 2015
- Sommer & Paxson, "Outside the Closed World: On Using Machine Learning for Network Intrusion Detection," IEEE S&P 2010
- Breiman, "Random Forests," Machine Learning 45(1), 2001
| Name | Roll Number |
|---|---|
| Pratham Sorte | 1032240024 |
| Tushar Gitte | 1032240020 |
| Sarthak Parashetti | 1032240067 |
| Abhineet Chowdhury | 1032240036 |
MIT World Peace University, Pune — B.E. Computer Engineering, 2024-25
MIT License — free to use, modify, and distribute with attribution.