ResNet: Custom ResNet-50 & MLOps Pipeline

Python PyTorch Kaggle

ResNet is an end-to-end deep learning pipeline built around a custom ResNet-50 architecture implemented entirely from scratch. Designed for ImageNet-100, the project covers multi-GPU training, stochastic regularization (CutMix and MixUp), and model interpretability (Grad-CAM), all without relying on pre-trained ImageNet weights.

graph LR
    %% Input and Stem
    A["Input Image<br>3x224x224"] --> B("Stem<br>Conv 7x7, Stride 2")
    B --> C("BatchNorm + ReLU")
    C --> D("MaxPool 3x3")
    
    %% Main Architecture Blocks (Bottlenecks)
    D --> E["Layer 1<br>3x Bottleneck Blocks<br>Output Depth: 256"]
    E --> F["Layer 2<br>4x Bottleneck Blocks<br>Output Depth: 512"]
    F --> G["Layer 3<br>6x Bottleneck Blocks<br>Output Depth: 1024"]
    G --> H["Layer 4<br>3x Bottleneck Blocks<br>Output Depth: 2048"]
    
    %% Classification Head
    H --> I("AdaptiveAvgPool2d")
    I --> J("Flatten")
    J --> K["Linear Projection<br>100 Classes"]
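As a quick sanity check of the diagram above, the sketch below traces tensor shapes through the stem and the four stages using plain arithmetic. The stem width of 64 channels, the paddings, the stride-2 max pool, and the per-stage strides are assumptions taken from the standard ResNet-50 layout, since the diagram only states kernel sizes, block counts, and output depths. `trace_shapes` and `needs_projection` are hypothetical helper names, not functions from this repository.

```python
# Stage layout from the diagram: (name, number of blocks, output channels).
STAGES = [("layer1", 3, 256), ("layer2", 4, 512), ("layer3", 6, 1024), ("layer4", 3, 2048)]


def conv_out(size, kernel, stride, padding):
    """Spatial size after a convolution or pooling window."""
    return (size + 2 * padding - kernel) // stride + 1


def needs_projection(in_channels, out_channels, stride):
    """A 1x1 projection shortcut is required when the identity no longer fits."""
    return stride != 1 or in_channels != out_channels


def trace_shapes(height=224, width=224):
    """Return (stage, channels, height, width) tuples for a 3xHxW input."""
    shapes = [("input", 3, height, width)]
    # Stem: 7x7 conv, stride 2 (padding 3 and 64 output channels are assumed).
    h, w = conv_out(height, 7, 2, 3), conv_out(width, 7, 2, 3)
    shapes.append(("stem", 64, h, w))
    # 3x3 max pool (stride 2 and padding 1 are assumed).
    h, w = conv_out(h, 3, 2, 1), conv_out(w, 3, 2, 1)
    shapes.append(("maxpool", 64, h, w))
    for index, (name, _blocks, channels) in enumerate(STAGES):
        # Assumed: layer1 keeps the resolution, later stages halve it once.
        stride = 1 if index == 0 else 2
        h, w = conv_out(h, 3, stride, 1), conv_out(w, 3, stride, 1)
        shapes.append((name, channels, h, w))
    return shapes


if __name__ == "__main__":
    for stage, channels, h, w in trace_shapes():
        print(f"{stage:8s} {channels:5d} x {h:3d} x {w:3d}")
```

Under these assumptions the first block of every stage needs a projection shortcut (for example, `needs_projection(64, 256, 1)` is true for layer 1), while the remaining blocks in a stage can use the identity.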

Results

  • Accuracy: Achieved 86.9% Top-1 and 96.2% Top-5 validation accuracy on the custom 100-class dataset.
  • Ablation: Outperformed ImageNet-pretrained baselines (pre-trained ResNet-50, VGG-16, MobileNetV2), each given a classification-head warmup, in a controlled benchmark.
  • Performance: Trained with nn.DataParallel on dual NVIDIA T4 GPUs using Automatic Mixed Precision (AMP), which reduced VRAM consumption by ~40%.
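For readers unfamiliar with the Top-1 / Top-5 metrics quoted above, here is a minimal sketch of how top-k accuracy is computed. The scores and labels are toy values for illustration only, not project data, and `topk_accuracy` is a hypothetical helper name.

```python
def topk_accuracy(scores, labels, k=1):
    """Fraction of samples whose true label is among the k highest-scoring classes."""
    hits = 0
    for row, label in zip(scores, labels):
        # Class indices sorted from highest to lowest score.
        ranked = sorted(range(len(row)), key=lambda c: row[c], reverse=True)
        hits += label in ranked[:k]
    return hits / len(labels)


# Toy scores for 3 samples over 4 classes (illustrative values only).
scores = [
    [0.1, 0.6, 0.2, 0.1],
    [0.5, 0.1, 0.3, 0.1],
    [0.2, 0.3, 0.4, 0.1],
]
labels = [1, 2, 0]

if __name__ == "__main__":
    for k in (1, 2, 3):
        print(f"top-{k} accuracy: {topk_accuracy(scores, labels, k):.3f}")
```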

Visual Diagnostics

1. Model Convergence

The 90-epoch learning curve demonstrates highly stable validation accuracy despite the aggressive noise introduced by CutMix and MixUp regularization.

Figure: simple_best (learning curve over 90 epochs)

2. Architecture Ablation Study

This study benchmarks the custom ResNet against ImageNet-pretrained models, each given a controlled 5-epoch classification-head warmup.

Figure: portfolio_ablation_chart (architecture ablation comparison)

3. Interpretability (Grad-CAM)

Diagnostic heatmaps are generated via forward/backward hooks on the final convolutional bottleneck (layer4[-1].conv3). They are used to check whether the network attends to target morphology rather than background artifacts.

Visual Interpretability & Bias Diagnostics (Grad-CAM)

"Black box" models are unacceptable in production. I engineered a custom Gradient-weighted Class Activation Mapping (Grad-CAM) pipeline attached to the final convolutional bottleneck (layer4[-1].conv3) to actively debug model behavior and verify morphological feature extraction across varied target classes.

Grad-CAM example panels:

  • Best of Class: Water Ouzel (Dipper)
  • Best of Class: Rock Crab
  • Success: Tench (100% Confidence)
  • Success: Tench (Alternate View)
  • Diagnostic: Texture Bias (Predicted 84)
  • Best of Class: Wombat

Core Technical Features & Engineering

Architectural Design

  • Native Topology Implementation: Bypassed torchvision.models and built the AlgoReasoner ResNet-50 pipeline by hand, including the bottleneck triplets (1x1, 3x3, 1x1) and dynamically computed projection shortcuts.

  • Kaiming (He) Initialization: Initialized weights with the Kaiming (He) scheme, which is designed for deep networks with ReLU non-linearities, to mitigate vanishing/exploding gradients during the volatile early epochs of from-scratch training.
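To make the Kaiming (He) rule concrete, the sketch below computes the standard deviation it prescribes for a ReLU network, std = sqrt(2 / fan), and draws weights from that distribution. Using fan_out mode (output channels times kernel area) is an assumption here; the README does not say which mode the project uses. In PyTorch the same rule is available as `torch.nn.init.kaiming_normal_`; the helper names below are hypothetical.

```python
import math
import random


def kaiming_normal_std(fan, gain=math.sqrt(2.0)):
    """He et al. standard deviation: gain / sqrt(fan), with gain sqrt(2) for ReLU."""
    return gain / math.sqrt(fan)


def conv_fan_out(out_channels, kernel_size):
    """fan_out of a conv layer: output channels times kernel area (assumed mode)."""
    return out_channels * kernel_size * kernel_size


def sample_conv_weights(in_channels, out_channels, kernel_size, rng):
    """Draw a flat list of conv weights from N(0, std^2)."""
    std = kaiming_normal_std(conv_fan_out(out_channels, kernel_size))
    count = in_channels * out_channels * kernel_size * kernel_size
    return [rng.gauss(0.0, std) for _ in range(count)]


if __name__ == "__main__":
    # The 3x3 conv in a layer1 bottleneck maps 64 -> 64 channels.
    target = kaiming_normal_std(conv_fan_out(64, 3))
    weights = sample_conv_weights(64, 64, 3, random.Random(0))
    empirical = math.sqrt(sum(w * w for w in weights) / len(weights))
    print(f"target std {target:.4f}, empirical std {empirical:.4f}")
```

The factor of 2 compensates for ReLU zeroing roughly half of the pre-activations, which keeps the activation variance approximately constant from layer to layer.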

Advanced Regularization

  • Probabilistic CutMix & MixUp: Built a custom data collator that applies CutMix (replacing a spatial patch with one from another image) or MixUp (interpolating images and labels) with a fixed probability per batch. This pushes the network to learn global structural cues rather than memorizing local pixel noise.

  • Optimization Schedule: Used a gradual learning rate warmup to stabilize the noisy early gradients, followed by a cosine annealing (CosineAnnealingLR) decay that lowers the learning rate smoothly so the optimizer can settle into a minimum.
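The two items above can be sketched in plain Python. The first three functions show the CutMix box computation, the label interpolation shared by CutMix and MixUp, and a per-batch draw that decides which augmentation to apply. The last function combines a linear warmup with the closed-form cosine decay used by CosineAnnealingLR. Every numeric default except the 90-epoch length (the 0.25 probabilities, the 5 warmup epochs, the base learning rate of 0.1) is an assumption for illustration, and all function names are hypothetical.

```python
import math
import random


def cutmix_box(height, width, lam, rng):
    """Pick a patch covering roughly (1 - lam) of the image; return box and adjusted lam."""
    cut_ratio = math.sqrt(1.0 - lam)
    cut_h, cut_w = int(height * cut_ratio), int(width * cut_ratio)
    cy, cx = rng.randrange(height), rng.randrange(width)
    top, bottom = max(cy - cut_h // 2, 0), min(cy + cut_h // 2, height)
    left, right = max(cx - cut_w // 2, 0), min(cx + cut_w // 2, width)
    # Clipping at the border changes the patch area, so recompute the label weight.
    lam_adjusted = 1.0 - (bottom - top) * (right - left) / (height * width)
    return (top, bottom, left, right), lam_adjusted


def mix_labels(label_a, label_b, lam):
    """Interpolate two one-hot (or soft) label vectors: lam * a + (1 - lam) * b."""
    return [lam * a + (1.0 - lam) * b for a, b in zip(label_a, label_b)]


def choose_augmentation(rng, p_cutmix=0.25, p_mixup=0.25):
    """Per-batch draw: 'cutmix', 'mixup', or 'none' (probabilities are assumed)."""
    draw = rng.random()
    if draw < p_cutmix:
        return "cutmix"
    if draw < p_cutmix + p_mixup:
        return "mixup"
    return "none"


def lr_at_epoch(epoch, base_lr=0.1, warmup_epochs=5, total_epochs=90, min_lr=0.0):
    """Linear warmup followed by cosine decay toward min_lr."""
    if epoch < warmup_epochs:
        return base_lr * (epoch + 1) / warmup_epochs
    progress = (epoch - warmup_epochs) / (total_epochs - warmup_epochs)
    return min_lr + 0.5 * (base_lr - min_lr) * (1.0 + math.cos(math.pi * progress))


if __name__ == "__main__":
    rng = random.Random(0)
    lam = rng.betavariate(1.0, 1.0)  # mixing weight, commonly drawn from a Beta distribution
    print(choose_augmentation(rng), cutmix_box(224, 224, lam, rng))
    print([round(lr_at_epoch(e), 4) for e in (0, 4, 5, 45, 89)])
```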

MLOps, Profiling & Compute

  • Inference Profiling: Added custom timing decorators that measure and log model throughput (images/sec) and inference latency (ms/image), to check that the architecture remains viable for real-time deployment.

  • Hardware Scaling & AMP: Ran multi-GPU training via PyTorch nn.DataParallel (dual NVIDIA T4s) with Automatic Mixed Precision (torch.amp.autocast), reducing VRAM overhead by ~40% while accelerating tensor math.

  • Cloud-Native Fault Tolerance: Automated state-dictionary checkpointing (_latest.pth and _best.pth) inside the training loop, so training can resume from the last saved state after a cloud server preemption or disconnect.
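A minimal sketch of the profiling and checkpointing ideas above, using only the standard library so it runs anywhere. `profile_batch` and `save_checkpoint` are hypothetical names, and `pickle.dump` stands in for `torch.save` purely for illustration. When timing real GPU inference, `torch.cuda.synchronize()` would also need to be called before reading the clock, because CUDA kernels run asynchronously.

```python
import functools
import os
import pickle
import time


def profile_batch(batch_size):
    """Decorator that records latency (ms/image) and throughput (images/sec) per call."""
    def decorator(fn):
        @functools.wraps(fn)
        def wrapper(*args, **kwargs):
            start = time.perf_counter()
            result = fn(*args, **kwargs)
            elapsed = time.perf_counter() - start
            wrapper.stats = {
                "ms_per_image": 1000.0 * elapsed / batch_size,
                "images_per_sec": batch_size / elapsed if elapsed > 0 else float("inf"),
            }
            return result
        wrapper.stats = None
        return wrapper
    return decorator


def save_checkpoint(state, directory, prefix, val_acc, best_acc):
    """Write <prefix>_latest.pth on every call and <prefix>_best.pth on improvement."""
    def atomic_write(path):
        # Write to a temp file, then rename, so a disconnect cannot leave a partial file.
        tmp_path = path + ".tmp"
        with open(tmp_path, "wb") as handle:
            pickle.dump(state, handle)  # stand-in for torch.save(state, tmp_path)
        os.replace(tmp_path, path)

    atomic_write(os.path.join(directory, prefix + "_latest.pth"))
    if val_acc > best_acc:
        atomic_write(os.path.join(directory, prefix + "_best.pth"))
        return val_acc
    return best_acc
```

Calling `save_checkpoint` once per epoch and feeding its return value back in as `best_acc` keeps the "best" file tied to the highest validation accuracy seen so far.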

Visual Interpretability

  • "White-Box" Diagnostics (Grad-CAM): Rejected the "black-box" AI paradigm by building a custom Gradient-weighted Class Activation Mapping pipeline. Forward and backward hooks attached to the final spatial feature map (layer4[-1].conv3) capture activations and gradients, from which the pipeline renders diagnostic heatmaps showing whether predictions are grounded in actual target morphology.
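The core Grad-CAM computation is small enough to sketch without a framework. The function below takes the activations and gradients of one layer as nested lists shaped [channels][height][width]. In a PyTorch pipeline these would typically be captured with `register_forward_hook` and `register_full_backward_hook` on the target layer, but how this project wires its hooks is not shown here, so treat `grad_cam` as a hypothetical illustration of the math only.

```python
def grad_cam(activations, gradients):
    """Grad-CAM heatmap from one layer's activations and class-score gradients."""
    height, width = len(activations[0]), len(activations[0][0])
    # Channel weight = spatial mean of the gradient for that channel.
    weights = [sum(map(sum, grad)) / (height * width) for grad in gradients]
    cam = [[0.0] * width for _ in range(height)]
    for weight, channel in zip(weights, activations):
        for y in range(height):
            for x in range(width):
                cam[y][x] += weight * channel[y][x]
    # ReLU keeps only the regions that push the class score up.
    cam = [[max(value, 0.0) for value in row] for row in cam]
    # Normalize to [0, 1] so the map can be overlaid on the image.
    peak = max(max(row) for row in cam)
    if peak > 0:
        cam = [[value / peak for value in row] for row in cam]
    return cam


if __name__ == "__main__":
    # Two 2x2 channels: the first supports the class, the second opposes it.
    activations = [[[1, 0], [0, 0]], [[0, 0], [0, 1]]]
    gradients = [[[1, 1], [1, 1]], [[-1, -1], [-1, -1]]]
    for row in grad_cam(activations, gradients):
        print(row)
```

For layer4[-1].conv3 on a 224x224 input, the resulting map is coarse (7x7 under the standard ResNet-50 strides) and is upsampled to the image size before being overlaid.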

💻 Installation & Usage

1. Clone the repository

git clone https://github.com/Asmit159/ResNet-50.git
cd ResNet-50

2. Install dependencies

pip install -r requirements.txt

3. Run the training pipeline

python train.py --epochs 90 --batch_size 128 --mixed_precision True

4. Generate diagnostics (Grad-CAM & Ablation)

python evaluate.py --checkpoint weights/resnet50_best.pth
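The training command in step 3 passes `--mixed_precision True` as a string. The sketch below shows one way such flags could be parsed with argparse; it is a hypothetical parser, not the actual contents of train.py. The point it illustrates is that `type=bool` would treat any non-empty string (including "False") as true, so an explicit converter is needed. The defaults are assumptions copied from the example command.

```python
import argparse


def str_to_bool(value):
    """Convert 'True'/'False'-style strings to bool; argparse's type=bool would not."""
    lowered = value.lower()
    if lowered in ("true", "1", "yes"):
        return True
    if lowered in ("false", "0", "no"):
        return False
    raise argparse.ArgumentTypeError(f"expected a boolean, got {value!r}")


def build_parser():
    """Hypothetical parser mirroring the flags used in the training command."""
    parser = argparse.ArgumentParser(description="Train the custom ResNet-50")
    parser.add_argument("--epochs", type=int, default=90)
    parser.add_argument("--batch_size", type=int, default=128)
    parser.add_argument("--mixed_precision", type=str_to_bool, default=True)
    return parser


if __name__ == "__main__":
    print(build_parser().parse_args())
```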

Author: Asmit Mandal
