
Detecting and Mitigating Algorithmic Bias in Face Recognition Algorithms: A Research Study

Python TensorFlow AIF360 License: GPL v3

This repository contains the source code and documentation for my Master's project, which investigates the presence of gender bias in facial recognition systems and evaluates technical methods for its mitigation.

Project Description

Facial recognition technology is increasingly utilized across various sectors, including security and recruitment. However, studies have shown that these systems can demonstrate significant performance disparities across different demographic groups. This study specifically examines gender bias within a Convolutional Neural Network (CNN) framework.

The project involves building a robust gender classification model and using the IBM AI Fairness 360 (AIF360) toolkit to audit and mitigate bias. The objective is to achieve equitable model performance without compromising overall classification accuracy.

Methodology

Data Preprocessing

The model training process utilizes the FERET dataset along with other facial image sets. Preprocessing steps are implemented to ensure the model focuses on structural features rather than noise:

  • Resizing images to 64x64 pixels for computational efficiency.
  • Converting images to grayscale to eliminate potential color-based bias.
  • Applying histogram equalization to standardize lighting and contrast across the dataset.
  • Normalizing pixel values to support reliable convergence during training.
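The steps above can be sketched end to end. The repository uses OpenCV for these operations (e.g. cv2.resize, cv2.cvtColor, cv2.equalizeHist); the sketch below reproduces the same pipeline with numpy only, and the function name and exact resize strategy are illustrative assumptions, not the repo's code.

```python
import numpy as np

def preprocess(img_rgb, size=64):
    """Illustrative pipeline: grayscale -> resize -> equalize -> normalize."""
    # 1. Grayscale via standard luminance weights (cv2.cvtColor equivalent).
    gray = (img_rgb[..., :3] @ np.array([0.299, 0.587, 0.114])).astype(np.uint8)
    # 2. Nearest-neighbour downsample to size x size (cv2.resize equivalent).
    h, w = gray.shape
    rows = np.arange(size) * h // size
    cols = np.arange(size) * w // size
    small = gray[rows[:, None], cols]
    # 3. Histogram equalization: remap intensities through the normalized CDF
    #    so lighting and contrast are comparable across images.
    hist = np.bincount(small.ravel(), minlength=256)
    cdf = hist.cumsum()
    cdf_min = cdf[cdf > 0].min()
    lut = np.round((cdf - cdf_min) / (cdf[-1] - cdf_min) * 255).astype(np.uint8)
    equalized = lut[small]
    # 4. Normalize to [0, 1] for stable gradient-based training.
    return equalized.astype(np.float32) / 255.0
```

The output is a 64x64 float array in [0, 1], ready to be stacked into a training batch.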

Bias Detection

Bias is quantified using several fairness metrics provided by the AIF360 library:

  • Disparate Impact: The ratio of favorable-outcome rates between the unprivileged and privileged groups; a value of 1.0 indicates parity.
  • Statistical Parity Difference: The difference in favorable-outcome rates between the two groups; 0 indicates parity.
  • Equal Opportunity Difference: The difference in true positive rates between the two groups; 0 indicates parity.
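In the project these values come from AIF360's metric classes, but the underlying formulas are simple enough to sketch directly. The function below is an illustrative stand-in (its name and the 0/1 group encoding are assumptions, not AIF360's API):

```python
import numpy as np

def fairness_report(y_true, y_pred, group):
    """group: 1 = privileged, 0 = unprivileged; favorable outcome = label 1."""
    y_true, y_pred, group = map(np.asarray, (y_true, y_pred, group))
    priv, unpriv = group == 1, group == 0
    rate_priv = y_pred[priv].mean()      # P(prediction=1 | privileged)
    rate_unpriv = y_pred[unpriv].mean()  # P(prediction=1 | unprivileged)
    tpr = lambda mask: y_pred[mask & (y_true == 1)].mean()  # true positive rate
    return {
        "disparate_impact": rate_unpriv / rate_priv,               # parity at 1.0
        "statistical_parity_difference": rate_unpriv - rate_priv,  # parity at 0.0
        "equal_opportunity_difference": tpr(unpriv) - tpr(priv),   # parity at 0.0
    }
```

For example, a model that predicts the favorable label for 75% of the privileged group but only 25% of the unprivileged group has a disparate impact of 1/3 and a statistical parity difference of -0.5.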

Bias Mitigation

To reduce the identified disparities, the Reweighing algorithm is applied. This pre-processing technique assigns a weight to each training instance so that the effective training distribution is balanced across combinations of the protected attribute (gender) and the class label before the model begins the learning phase.
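The project uses AIF360's Reweighing implementation; the weight formula itself (from Kamiran and Calders' Reweighing scheme) can be sketched in a few lines. Each instance with group g and label l receives weight P(G=g)·P(Y=l) / P(G=g, Y=l), which equals 1 everywhere when group and label are independent:

```python
from collections import Counter

def reweighing_weights(groups, labels):
    """Per-instance weights w(g, l) = P(G=g) * P(Y=l) / P(G=g, Y=l)."""
    n = len(labels)
    n_g = Counter(groups)               # counts per protected group
    n_l = Counter(labels)               # counts per class label
    n_gl = Counter(zip(groups, labels)) # counts per (group, label) pair
    return [n_g[g] * n_l[l] / (n * n_gl[(g, l)])
            for g, l in zip(groups, labels)]
```

Over-represented (group, label) combinations are down-weighted and under-represented ones are up-weighted, so the weighted favorable-outcome rate is equalized across groups before training.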

Technical Implementation

The project is implemented in Python using the following primary libraries:

  • TensorFlow and Keras: For designing and training the CNN architecture.
  • AI Fairness 360 (AIF360): Used for both the auditing and mitigation phases of the study.
  • OpenCV: Employed for image processing and manipulation.
  • Scikit-learn: Used for evaluating traditional performance metrics such as precision and recall.
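The README does not specify the exact network, so the following is only a minimal sketch of a Keras CNN of the kind described, taking the 64x64 grayscale inputs produced by the preprocessing step; the layer sizes and dropout rate are illustrative assumptions.

```python
import tensorflow as tf
from tensorflow.keras import layers, models

def build_model():
    """A small binary-classification CNN for 64x64 grayscale face images."""
    model = models.Sequential([
        layers.Input(shape=(64, 64, 1)),        # grayscale 64x64 input
        layers.Conv2D(32, 3, activation="relu"),
        layers.MaxPooling2D(),
        layers.Conv2D(64, 3, activation="relu"),
        layers.MaxPooling2D(),
        layers.Flatten(),
        layers.Dense(128, activation="relu"),
        layers.Dropout(0.5),                    # regularization
        layers.Dense(1, activation="sigmoid"),  # probability of one gender class
    ])
    model.compile(optimizer="adam",
                  loss="binary_crossentropy",
                  metrics=["accuracy"])
    return model
```

Instance weights from Reweighing would be passed to training via the sample_weight argument of model.fit.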

Experimental Results

The effectiveness of the bias mitigation strategy was evaluated across multiple dataset configurations using standard classification metrics and fairness indicators.

Key Performance Metrics

The following table summarizes the comparative results between the baseline model and the de-biased model using the Reweighing technique:

| Configuration | Dataset                       | Test Accuracy | Bias Mitigation |
|---------------|-------------------------------|---------------|-----------------|
| Baseline      | Dataset 1                     | 95.77%        | None            |
| Mitigated     | Dataset 1                     | 95.91%        | Reweighing      |
| Baseline      | Dataset 3 (Female Privileged) | 84.69%        | None            |
| Mitigated     | Dataset 3 (Female Privileged) | 86.36%        | Reweighing      |
| Baseline      | Dataset 3 (Male Privileged)   | 84.69%        | None            |
| Mitigated     | Dataset 3 (Male Privileged)   | 83.97%        | Reweighing      |

Observations

  • Bias Reduction: The application of the Reweighing algorithm significantly improved parity in classification errors. For instance, in Dataset 1, the mitigation process reduced false positive disparities while maintaining an exceptionally high overall accuracy.
  • Metric Trade-offs: The results highlight a common challenge in algorithmic fairness: the trade-off between raw accuracy and demographic parity. In some dataset configurations, achieving a more equitable outcome required a slight reduction in overall performance metrics.
  • Reliability: The training curves demonstrate that the de-biased models achieve stable convergence, indicating that the reweighing process does not introduce training instabilities.

Performance Visualization

The following training curves illustrate model convergence across the different experimental setups:

Dataset 1: Baseline Performance

Figure 1: Accuracy and loss curves for the initial gender classification model on Dataset 1.

Dataset 3: Female Privileged Dataset Distribution

Figure 2: Performance metrics for a dataset configuration where female subjects were the privileged group.

Dataset 3: Male Privileged Dataset Distribution

Figure 3: Performance metrics for a dataset configuration where male subjects were the privileged group.

Repository Structure

  • datasets/: Contains the image data subsets used for training and validation.
  • images/: Training performance visualizations and graphs.
  • models/: Stores the saved weights and architectures of trained models.
  • gc-ds1.ipynb: Notebook containing the investigation and mitigation steps for Dataset 1.
  • gc-ds3-femalepriv.ipynb: Analysis focused on datasets with a female-privileged distribution.
  • gc-ds3-malepriv.ipynb: Analysis focused on datasets with a male-privileged distribution.
  • notesandinstructions.py: Documentation of utility functions and preprocessing logic.
  • codeanalysis.txt: Detailed comparative results and metric summaries.
  • requirements.txt: List of dependencies required to run the notebooks.

Setup and Requirements

The project requires Python 3.8 or higher. To install the necessary dependencies, run:

pip install -r requirements.txt

To view the experiments, launch Jupyter Notebook and open the relevant .ipynb file:

jupyter notebook gc-ds1.ipynb

How to Cite

If you use this work in your research, please cite it as follows:

Patel, R. (2026). Detecting and Mitigating Algorithmic Bias in Face Recognition Algorithms: A Research Study. MSc Dissertation, University of Hertfordshire.

References and Acknowledgments

  • Datasets: The facial images are drawn from the FERET database and various synthetic demographic splits.
  • Fairness Framework: IBM AI Fairness 360 (AIF360) - https://aif360.mybluemix.net/
  • Literature:
    • Buolamwini, J., & Gebru, T. (2018). "Gender Shades: Intersectional Accuracy Disparities in Commercial Gender Classification."
    • Bellamy, R. K. E., et al. (2018). "AI Fairness 360: An extensible toolkit for detecting, understanding, and mitigating AI bias."

Author: Rishabh Patel
Course: MSc Computer Science
Date: March 2026
Project: Detecting and Mitigating Algorithmic Bias in Face Recognition Algorithms

