This project focuses on modeling a binary image classification task using Convolutional Neural Networks (CNNs) in TensorFlow.
The goal is to detect whether a given image contains pizza 🍕 or steak 🥩 by building a deep learning classifier.
The data comes from the Food-101 dataset, which consists of 101,000 real-world images categorized into 101 different food types.
For this project, I selected two categories: pizza and steak, making this a binary classification problem.
One of the most crucial steps in any machine learning project is preparing the dataset. In this case:
- The dataset is already split into training and test sets.
- The training dataset contains 1,500 images (750 per class).
- The test dataset contains 500 images (250 per class).
- The images are resized using the `target_size` parameter to match the input size required by our model.
- Since this is a binary classification problem, we set `class_mode='binary'`.
- The `batch_size` is set to 32 (default value).
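The loading step described above could look roughly like the sketch below. It assumes the images are read with Keras' `ImageDataGenerator.flow_from_directory`, which is where `target_size`, `class_mode`, and `batch_size` come from. The 224x224 input size, the `rescale` factor, and the `pizza_steak/train` and `pizza_steak/test` directory names are placeholders of mine, not values confirmed by this project.

```python
# Settings from the list above. IMG_SIZE and the directory names are assumptions.
IMG_SIZE = (224, 224)    # assumed model input size (height, width)
BATCH_SIZE = 32          # default batch size of flow_from_directory
CLASS_MODE = "binary"    # two classes -> one 0/1 label per image

FLOW_KWARGS = {
    "target_size": IMG_SIZE,
    "class_mode": CLASS_MODE,
    "batch_size": BATCH_SIZE,
}


def make_generators(train_dir="pizza_steak/train", test_dir="pizza_steak/test"):
    """Return (train_data, test_data) iterators that read images from disk.

    Each directory is expected to contain one sub-folder per class,
    for example `pizza/` and `steak/`.
    """
    # Imported here so the settings above can be inspected without loading TensorFlow.
    from tensorflow.keras.preprocessing.image import ImageDataGenerator

    # Scale pixel values from [0, 255] to [0, 1] (a common choice, assumed here).
    train_gen = ImageDataGenerator(rescale=1.0 / 255)
    test_gen = ImageDataGenerator(rescale=1.0 / 255)

    train_data = train_gen.flow_from_directory(train_dir, **FLOW_KWARGS)
    test_data = test_gen.flow_from_directory(test_dir, **FLOW_KWARGS)
    return train_data, test_data
```

Calling `make_generators()` in Colab, with the paths adjusted to wherever the data lives, returns two iterators that yield batches of up to 32 images with their labels.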
This project is designed to run on Google Colab for easy access to GPU acceleration. Follow these steps to get started:
- Open Google Colab.
- Upload the notebook or open a new one.
- Install required dependencies:
- Mount Google Drive (if using external data):

      from google.colab import drive
      drive.mount('/content/drive')

- Run the notebook cells to preprocess data, train the model, and evaluate performance.
The CNN model is compiled using:
- Loss function: `binary_crossentropy` (since it's a binary classification task).
- Optimizer: `Adam` optimizer with default settings.
- Evaluation metric: `accuracy`.
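As a rough illustration of these compile settings, here is a minimal sketch. The loss, optimizer, and metric follow the list above. The layer stack (two small convolution blocks and a sigmoid output) and the 224x224x3 input shape are assumptions of mine and may differ from the architecture used in the notebook.

```python
# Compile settings from the list above.
COMPILE_KWARGS = {
    "loss": "binary_crossentropy",
    "optimizer": "adam",        # string alias: Adam with its default settings
    "metrics": ["accuracy"],
}


def build_model(input_shape=(224, 224, 3)):
    """Build and compile a small CNN (hypothetical architecture)."""
    # Imported here so COMPILE_KWARGS can be inspected without loading TensorFlow.
    import tensorflow as tf

    model = tf.keras.Sequential([
        tf.keras.layers.Input(shape=input_shape),
        tf.keras.layers.Conv2D(10, 3, activation="relu"),
        tf.keras.layers.MaxPool2D(),
        tf.keras.layers.Conv2D(10, 3, activation="relu"),
        tf.keras.layers.MaxPool2D(),
        tf.keras.layers.Flatten(),
        # One sigmoid unit outputs the probability of the positive class.
        tf.keras.layers.Dense(1, activation="sigmoid"),
    ])
    model.compile(**COMPILE_KWARGS)
    return model
```

A single sigmoid output pairs naturally with `binary_crossentropy` and `class_mode='binary'`, since each label is a single 0 or 1.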
After compiling, the model is trained on the dataset.
- The `steps_per_epoch` is set to 1500/32 ≈ 47, meaning the model will go through approximately 47 batches per epoch.
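The step count above can be reproduced with a few lines of arithmetic. This sketch rounds up so that the final, smaller batch is still counted. The helper name `steps_per_epoch` is mine, and applying the same rule to the 500 test images is an assumption about how validation steps would be chosen.

```python
import math


def steps_per_epoch(num_images, batch_size=32):
    """Number of batches needed to see every image once, rounding up."""
    return math.ceil(num_images / batch_size)


train_steps = steps_per_epoch(1500)  # 1500 / 32 = 46.875 -> 47
test_steps = steps_per_epoch(500)    # 500 / 32 = 15.625 -> 16
print(train_steps, test_steps)
```

These values could then be passed to `model.fit` as `steps_per_epoch=train_steps` and `validation_steps=test_steps`.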
Once training is complete, I evaluate the model by:
- Plotting the training curves to visualize loss and accuracy trends.
- Using pandas to analyze predictions and model performance.
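One possible way to plot the training curves is sketched below. It assumes the input is a Keras-style history dictionary, such as the `history.history` attribute of the object returned by `model.fit`, with keys like `loss`, `accuracy`, `val_loss`, and `val_accuracy`. The function names are hypothetical.

```python
def split_curves(history):
    """Separate loss series from accuracy series in a Keras-style history dict."""
    loss = {name: values for name, values in history.items() if "loss" in name}
    accuracy = {name: values for name, values in history.items() if "accuracy" in name}
    return loss, accuracy


def plot_training_curves(history):
    """Plot loss and accuracy side by side, one line per recorded series."""
    # Imported here so split_curves works without the plotting libraries.
    import matplotlib.pyplot as plt
    import pandas as pd

    loss, accuracy = split_curves(history)
    fig, (ax_loss, ax_acc) = plt.subplots(1, 2, figsize=(12, 4))

    # Each DataFrame column becomes one curve; the index is the epoch number.
    pd.DataFrame(loss).plot(ax=ax_loss, title="Loss")
    pd.DataFrame(accuracy).plot(ax=ax_acc, title="Accuracy")
    ax_loss.set_xlabel("epoch")
    ax_acc.set_xlabel("epoch")
    return fig
```

Plotting loss and accuracy on separate axes keeps their different scales from flattening one of the curves. A widening gap between the training and validation lines is a common sign of overfitting.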
The model's performance can be further improved by:
- Using data augmentation to generate more diverse training samples.
- Fine-tuning a pre-trained model like MobileNetV2 or ResNet50.
- Adding more layers or different architectures to improve feature extraction.
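As an example of the first idea, data augmentation can be configured directly on Keras' `ImageDataGenerator`. The sketch below is only a starting point: the specific ranges are illustrative guesses of mine, not tuned or tested values from this project.

```python
# Illustrative augmentation settings (assumed values, not tuned for this dataset).
AUGMENT_KWARGS = {
    "rescale": 1.0 / 255,        # scale pixels to [0, 1]
    "rotation_range": 20,        # random rotations of up to 20 degrees
    "width_shift_range": 0.2,    # horizontal shift, as a fraction of image width
    "height_shift_range": 0.2,   # vertical shift, as a fraction of image height
    "zoom_range": 0.2,           # random zoom in or out by up to 20%
    "horizontal_flip": True,     # randomly mirror images left to right
}


def make_augmented_generator():
    """Return an ImageDataGenerator that randomly transforms each training image."""
    # Imported here so AUGMENT_KWARGS can be inspected without loading TensorFlow.
    from tensorflow.keras.preprocessing.image import ImageDataGenerator

    return ImageDataGenerator(**AUGMENT_KWARGS)
```

Augmentation would normally be applied to the training generator only, so that the test set keeps measuring performance on unmodified images.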
This project is open-source and available under the MIT License.
