This project implements a neural network-based classifier to distinguish between benign and malignant breast cancer cases using a public dataset.
- Overview
- Problem Statement
- Dataset
- Preprocessing
- Model Architecture
- Evaluation Metrics
- SDG Alignment
- Requirements
- Installation
- Usage
- Results
- Contributing
- Acknowledgements
Early and accurate diagnosis of breast cancer significantly increases survival rates. This project uses a neural network model to classify breast cancer cases based on diagnostic features. The model is trained and evaluated using a standard machine learning workflow: preprocessing, a train/test split, and metric-based evaluation.
Breast cancer remains one of the leading causes of cancer-related deaths among women worldwide. The challenge is to classify cancer diagnoses accurately using predictive modelling to support early intervention.
The dataset used is derived from the Breast Cancer Wisconsin Diagnostic Dataset, containing features computed from digitised images of a fine needle aspirate (FNA) of a breast mass.
- Target Labels:
  - 0 → Malignant
  - 1 → Benign
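
As a quick way to inspect the data, the sketch below loads the copy of the Breast Cancer Wisconsin (Diagnostic) dataset that ships with scikit-learn. This is an assumption made for illustration: the notebook may load the data from a different source, such as a CSV file. The bundled copy uses the same label encoding as listed above.

```python
from sklearn.datasets import load_breast_cancer

# Load the bundled copy of the dataset (assumption: the notebook may use another source).
data = load_breast_cancer()
X, y = data.data, data.target

print(X.shape)                   # (569, 30): 569 samples, 30 features
print(list(data.target_names))   # ['malignant', 'benign'], i.e. 0 = malignant, 1 = benign
print(int((y == 0).sum()), int((y == 1).sum()))  # 212 malignant, 357 benign
```
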
- Feature scaling using standardisation (zero mean, unit variance)
- Splitting the dataset into training and testing sets
- Encoding labels for binary classification
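
The preprocessing steps above can be sketched with scikit-learn as follows. The 80/20 split, the stratification, and the `random_state` value are illustrative assumptions, not settings taken from the notebook. The scikit-learn copy of the dataset already has integer labels, so no separate encoding step appears here.

```python
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

X, y = load_breast_cancer(return_X_y=True)

# Hold out 20% for testing; stratify to keep the class ratio similar in both sets.
# (Split ratio and seed are assumptions for this sketch.)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42
)

# Fit the scaler on the training set only, then apply it to both sets,
# so that no statistics from the test set leak into training.
scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)
X_test_scaled = scaler.transform(X_test)

# Sanity check: training features now have roughly zero mean and unit variance.
print(np.abs(X_train_scaled.mean(axis=0)).max())
print(X_train_scaled.std(axis=0).round(3))
```

Fitting the scaler on the training split alone is the usual choice: the test set then stands in for unseen data, and its scaled features will not have exactly zero mean.
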
The neural network is built using a feedforward design and includes:
- Input layer corresponding to the number of features
- One or more hidden layers with ReLU activation
- Output layer with sigmoid activation for binary classification
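
A minimal Keras sketch of such a network is shown below. The number of hidden layers, their widths (16 and 8), the optimizer, and the loss are assumptions chosen for illustration; the notebook's actual configuration may differ.

```python
import numpy as np
import tensorflow as tf

n_features = 30  # assumption: the 30 diagnostic features of the dataset

model = tf.keras.Sequential([
    tf.keras.Input(shape=(n_features,)),             # input layer: one unit per feature
    tf.keras.layers.Dense(16, activation="relu"),    # hidden layer (width is hypothetical)
    tf.keras.layers.Dense(8, activation="relu"),     # hidden layer (width is hypothetical)
    tf.keras.layers.Dense(1, activation="sigmoid"),  # output: probability of class 1 (benign)
])

# Binary cross-entropy is the usual loss for a single sigmoid output.
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])

# Forward pass on random inputs, only to check the output shape and range.
dummy = np.random.default_rng(0).normal(size=(4, n_features)).astype("float32")
probs = model.predict(dummy, verbose=0)
print(probs.shape)  # (4, 1)
```

Training would then be a call such as `model.fit(X_train_scaled, y_train, epochs=..., validation_split=...)`, with the hyperparameters chosen in the notebook.
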
- Accuracy
- Precision
- Recall
- F1-score
- Confusion Matrix
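
These metrics can be computed with scikit-learn, as in the sketch below. The labels and predicted probabilities are made-up toy values, not results from this project, and the 0.5 decision threshold is an assumption.

```python
import numpy as np
from sklearn.metrics import (
    accuracy_score,
    confusion_matrix,
    f1_score,
    precision_score,
    recall_score,
)

# Toy data for illustration only (0 = malignant, 1 = benign).
y_true = np.array([0, 0, 0, 1, 1, 1, 1, 1])
y_prob = np.array([0.1, 0.4, 0.7, 0.8, 0.9, 0.3, 0.6, 0.95])

# Turn sigmoid outputs into class labels with a 0.5 threshold (assumption).
y_pred = (y_prob >= 0.5).astype(int)

print(accuracy_score(y_true, y_pred))    # 0.75
print(precision_score(y_true, y_pred))   # 0.8
print(recall_score(y_true, y_pred))      # 0.8
print(f1_score(y_true, y_pred))
print(confusion_matrix(y_true, y_pred))  # rows: true class, columns: predicted class
# [[2 1]
#  [1 4]]
```

By default, scikit-learn treats label 1 (benign here) as the positive class. To report precision and recall for the malignant class instead, pass `pos_label=0` to `precision_score`, `recall_score`, and `f1_score`.
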
| Goal | Relevance |
|---|---|
| SDG 3: Good Health and Well-being | Enhances early detection and diagnosis through AI, aiding in reducing mortality and promoting well-being. |
| SDG 9: Industry, Innovation and Infrastructure | Utilises innovative neural network algorithms and contributes to healthcare infrastructure through data-driven insights. |
- Python 3.8+
- scikit-learn
- pandas
- numpy
- tensorflow / keras
- matplotlib
- seaborn
Clone the repository and install dependencies:

    git clone https://github.com/your-username/breast-cancer-classification-nn.git
    cd breast-cancer-classification-nn
    pip install -r requirements.txt

To run the model and replicate results:

    jupyter notebook BreastCancerClassificationNN.ipynb

Follow the notebook steps for data loading, preprocessing, model training, and evaluation.

The model demonstrates high classification performance on the test set, achieving:
- Accuracy: >95%
- Low false positives and false negatives
Feel free to fork the repository and submit pull requests. Please ensure contributions are well-documented and tested.
- Dataset provided by UCI Machine Learning Repository
- Thanks to all open-source contributors