Spam Email Classifier

This project uses a multi-layer neural network to classify emails as spam or legitimate. The classifier achieves 98.6% accuracy and a 97% F1 score.

Key Features

Text preprocessing with NLTK for cleaning and normalization
TF-IDF vectorization for text representation
Multi-layer neural network using PyTorch
REST API for email classification
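
As a rough illustration of the first two features, the sketch below cleans text and computes TF-IDF weights using only the Python standard library. It is a stand-in, not the project's code: the project uses NLTK for cleaning and this README does not name the vectorizer, so the regex tokenizer, the tiny stop-word set, the smoothed IDF formula, and the function names `clean` and `tfidf` are all assumptions.

```python
import math
import re
from collections import Counter

# Hypothetical stop-word set; the project uses NLTK for cleaning instead.
STOP_WORDS = {"a", "an", "the", "to", "of", "and", "is", "you", "your"}


def clean(text):
    """Lowercase, keep alphabetic tokens, and drop stop words."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return [t for t in tokens if t not in STOP_WORDS]


def tfidf(documents):
    """Return one {term: weight} dict per document.

    Weight = term frequency * smoothed inverse document frequency, where
    idf = ln((1 + N) / (1 + df)) + 1. This variant is an assumption.
    """
    tokenized = [clean(doc) for doc in documents]
    n_docs = len(tokenized)
    # Document frequency: the number of documents each term appears in.
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens) or 1  # avoid division by zero on empty documents
        vectors.append({
            term: (count / total)
            * (math.log((1 + n_docs) / (1 + doc_freq[term])) + 1)
            for term, count in counts.items()
        })
    return vectors
```

Terms that appear in fewer documents receive a larger weight, which is what lets rare, spam-specific words stand out from common vocabulary.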

Model Architecture

The model is a multi-layer neural network with:

Input layer matching feature dimensions
Three hidden layers with ReLU activation and Batch Normalization
Dropout regularization to prevent overfitting
Output layer with sigmoid activation for binary classification
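
The layers listed above could be assembled in PyTorch roughly as follows. This is a minimal sketch, not the code in train.py: the class name `SpamClassifier`, the hidden widths (256, 128, 64), the dropout rate of 0.3, and the ordering of Batch Normalization before ReLU are all assumptions, since this README does not specify them.

```python
import torch
from torch import nn


class SpamClassifier(nn.Module):
    """Three hidden blocks (Linear, BatchNorm, ReLU, Dropout) and a sigmoid output."""

    def __init__(self, input_dim, hidden_dims=(256, 128, 64), dropout=0.3):
        super().__init__()
        layers = []
        prev = input_dim  # input layer matches the feature dimension
        for width in hidden_dims:  # hypothetical widths
            layers += [
                nn.Linear(prev, width),
                nn.BatchNorm1d(width),
                nn.ReLU(),
                nn.Dropout(dropout),
            ]
            prev = width
        # Single output unit squashed to a probability in [0, 1].
        layers += [nn.Linear(prev, 1), nn.Sigmoid()]
        self.net = nn.Sequential(*layers)

    def forward(self, x):
        # Returns one spam probability per row, shape (batch,).
        return self.net(x).squeeze(-1)
```

Call `model.eval()` before inference so that dropout is disabled and Batch Normalization uses its running statistics instead of per-batch statistics.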

Evaluation Results

The model's performance can be seen through the following visualizations:

Confusion Matrix

The confusion matrix shows:

True Negatives (top left): Correctly identified non-spam emails
False Positives (top right): Non-spam emails incorrectly flagged as spam
False Negatives (bottom left): Spam emails that were missed
True Positives (bottom right): Correctly identified spam emails
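
To make the four cells concrete, the sketch below counts them from true and predicted labels and derives accuracy and F1 from the counts. It assumes labels are encoded as 1 for spam and 0 for non-spam; the function names are illustrative and are not taken from evaluate.py.

```python
def confusion_counts(y_true, y_pred):
    """Count TN, FP, FN, TP for binary labels (1 = spam, 0 = non-spam)."""
    tn = fp = fn = tp = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == 0 and pred == 0:
            tn += 1  # non-spam correctly passed
        elif truth == 0 and pred == 1:
            fp += 1  # non-spam wrongly flagged
        elif truth == 1 and pred == 0:
            fn += 1  # spam missed
        else:
            tp += 1  # spam correctly flagged (labels assumed binary)
    return tn, fp, fn, tp


def metrics(tn, fp, fn, tp):
    """Derive accuracy, precision, recall, and F1 from the four counts."""
    total = tn + fp + fn + tp
    accuracy = (tn + tp) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}
```

Accuracy counts both kinds of correct prediction, while F1 depends only on how spam is handled, so the two can differ when the classes are imbalanced.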

Probability Distribution

This histogram shows how the model distributes probability scores for spam and non-spam emails.
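
The data behind such a histogram can be produced by binning the predicted probabilities separately for each class. The sketch below is illustrative only: the function name `score_histogram`, the choice of 10 bins, and the 1 = spam label encoding are assumptions.

```python
def score_histogram(scores, labels, bins=10):
    """Bin scores in [0, 1] separately for non-spam (0) and spam (1)."""
    hist = {0: [0] * bins, 1: [0] * bins}
    for score, label in zip(scores, labels):
        # Clamp so that a score of exactly 1.0 falls in the last bin.
        index = min(int(score * bins), bins - 1)
        hist[label][index] += 1
    return hist
```

When the classes separate well, non-spam counts concentrate in the lowest bins and spam counts in the highest, with few emails near the decision threshold.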

API

To run the project's REST API, use the following command:

python app/api.py

To classify an email, send a POST request to the /predict endpoint with the email in the request body.
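
A client call might look like the sketch below, which uses only the Python standard library. The host and port (`http://localhost:5000`), the JSON body with an `email` field, and the function names are assumptions, because this README does not document the request schema; check app/api.py for the actual format.

```python
import json
import urllib.request


def build_predict_request(email_text, base_url="http://localhost:5000"):
    """Build (but do not send) a POST request for the /predict endpoint.

    Assumptions: the server listens at base_url and accepts a JSON body
    with an "email" field.
    """
    body = json.dumps({"email": email_text}).encode("utf-8")
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/predict",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def classify(email_text, base_url="http://localhost:5000"):
    """Send the request and return the raw response text.

    Requires the API to be running; the response format is not
    documented in this README, so it is returned undecoded.
    """
    request = build_predict_request(email_text, base_url)
    with urllib.request.urlopen(request) as response:
        return response.read().decode("utf-8")
```

With the API running, `classify("Congratulations, you won a free cruise!")` returns whatever the server sends back for that email.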
