Weight-Initialization

This repository is an implementation of the paper "An Effective Weight Initialization Method for Deep Learning: Application to Satellite Image Classification". Comparative analyses with existing weight initialization techniques on various CNN models reveal that the proposed weight initialization method outperforms previous competitive techniques in terms of classification accuracy.

Abstract

Significantly increased interest in satellite images has triggered the need for efficient mechanisms for extracting useful information from massive satellite images to provide better insight into them. Although deep learning has shown significant progress in image classification, only a few results on weight initialization techniques can be found in the literature. These techniques train the networks' weights on massive datasets and fine-tune the weights of pre-trained networks. In this study, a novel weight initialization method is proposed in the context of satellite image classification. The proposed weight initialization method is mathematically detailed during the forward and backward passes of the CNN model. Extensive experiments are carried out using six real-world datasets. Comparative analyses with existing weight initialization techniques made on various pre-trained CNN models reveal that the proposed weight initialization technique outperforms the previous competitive techniques in classification accuracy.

Results

The proposed weight initialization method is applied to three pre-trained models, namely ResNet152V2, VGG19, and MobileNetV2. The models were trained for 100 epochs with a batch size of 32. The Xavier, He, and proposed weight initialization methods are applied to each of the three CNN models, all trained with the Adam optimizer at a learning rate of 1e-4.

| Model       | Init     | CIFAR-100 | UC-Merced | AID    | KSA    | PatternNet |
|-------------|----------|-----------|-----------|--------|--------|------------|
| ResNet152   | He       | 0.5507    | 0.5381    | 0.3915 | 0.7108 | 0.7298     |
| ResNet152   | Xavier   | 0.4975    | 0.5095    | 0.4140 | 0.7308 | 0.7451     |
| ResNet152   | Proposed | 0.5514    | 0.5452    | 0.4300 | 0.7338 | 0.7896     |
| VGG19       | He       | 0.6690    | 0.6786    | 0.5030 | 0.8292 | 0.8461     |
| VGG19       | Xavier   | 0.6658    | 0.6762    | 0.5070 | 0.8308 | 0.8362     |
| VGG19       | Proposed | 0.6737    | 0.6833    | 0.5120 | 0.8400 | 0.8462     |
| MobileNetV2 | He       | 0.5682    | 0.4500    | 0.3510 | 0.6831 | 0.7298     |
| MobileNetV2 | Xavier   | 0.5652    | 0.4333    | 0.3435 | 0.7031 | 0.7451     |
| MobileNetV2 | Proposed | 0.5683    | 0.4690    | 0.3575 | 0.7246 | 0.7896     |

The figure below details the performance of the proposed weight initialization method on four public remote sensing datasets, namely, UC-Merced, AID, KSA, and PatternNet.

The training progress plots in the figures below illustrate the performance of the proposed weight initialization method, as well as the Xavier, He, and zerO methods, on the CIFAR-100 dataset. The first figure displays the training progress of validation accuracy, while the second figure focuses on validation loss.

The analysis of the plots shows that the proposed weight initialization method outperforms the three other weight initialization techniques in terms of both accuracy and loss, as shown in both the overall training progress and the zoomed-in subplots. The performance advantage of the proposed method is visually apparent, with consistently higher accuracy values and lower loss values throughout the training process.

The comparison with He, Xavier, and zerO initialization methods further confirms the superior performance of the proposed approach. Notably, the zoomed-in subplots highlight the enhanced accuracy and reduced loss achieved by our proposed method in the final ten iterations. These findings highlight the effectiveness of the proposed weight initialization method in improving accuracy and minimizing the discrepancy between predicted and actual values.

Dataset Setup

Download the datasets and place them as zipped files in the `compressed` directory before running the code. The code will unzip them and split them randomly:
 .
 └── data 
     └── compressed 
         ├── UCMerced_LandUse.zip 
         ├── KSA.zip 
         ├── AID.zip
         └── PatternNet.zip 
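The unzip-and-split step described above can be sketched as follows. This is a simplified illustration of the preprocessing, not the repository's own code; the function name, directory arguments, and 80/20 split ratio are assumptions.

```python
import random
import zipfile
from pathlib import Path

def extract_and_split(compressed_dir="data/compressed", out_dir="data",
                      train_ratio=0.8, seed=42):
    """Unzip every archive in compressed_dir, then randomly split the
    extracted image paths into train and validation lists."""
    out = Path(out_dir)
    for archive in Path(compressed_dir).glob("*.zip"):
        with zipfile.ZipFile(archive) as zf:
            zf.extractall(out / archive.stem)
    # Collect images (UC-Merced ships .tif files, others .jpg).
    images = sorted(out.rglob("*.jpg")) + sorted(out.rglob("*.tif"))
    random.Random(seed).shuffle(images)  # fixed seed for a reproducible split
    cut = int(len(images) * train_ratio)
    return images[:cut], images[cut:]
```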

Help

Running parameters

| Parameter                    | Description                                                                                                              | Default          |
|------------------------------|--------------------------------------------------------------------------------------------------------------------------|------------------|
| `-ds`, `--dataset_name`      | Dataset name; one of `ucmerced`, `aid`, `ksa`, `patternnet`, or `wadii`                                                  | `ucmerced`       |
| `-mn`, `--model_name`        | Model name; any VGG, ResNet, or MobileNet model                                                                          | `mobilenet_v2`   |
| `-wi`, `--weight_init`       | Weight initialization method; `xavier` or `he` for the classic methods, any other name uses our proposed method          | `custom`         |
| `-is`, `--image_size`        | Image size for data transforms                                                                                           | `224`            |
| `-tr`, `--train`             | Training option                                                                                                          | `False`          |
| `-ev`, `--eval`              | Evaluation option                                                                                                        | `False`          |
| `-evs`, `--eval_summary`     | Evaluation summary option                                                                                                | `False`          |
| `-ep`, `--epochs`            | Number of training epochs                                                                                                | `100`            |
| `-bs`, `--batch_size`        | Training/evaluation batch size                                                                                           | `16`             |
| `-lr`, `--learning_rate`     | Training learning rate                                                                                                   | `0.0001`         |
| `-sv`, `--save`              | Save the model and training history                                                                                      | `True`           |
| `-ow`, `--overwrite`         | Overwrite the existing model with the same `dataset_name`, `model_name`, and `init_name`                                 | `True`           |
| `-pr`, `--printing`          | Print the hyperparameters, dataset details, and model training progress                                                  | `True`           |
| `-an`, `--avg_num`           | Number of evaluation runs to average                                                                                     | `10`             |
| `-sp`, `--summary_save_path` | Save path for the summary file                                                                                           | `./results/log/` |

Usage

To run the training and evaluation with the default values, run the following command:

python run.py --train --eval

To run the code with your own parameters, run the following command:

python run.py                 \
--dataset_name ucmerced       \
--model_name mobilenet_v2     \
--weight_init proposed        \
--train                       \
--eval                        \
--eval_summary                \
--epochs 1                    \
--batch_size 32               \
--learning_rate 0.0001        \
--save                        \
--overwrite                   \
--printing                    \
--avg_num 3                   \
--summary_save_path summary

or run the following command for short:

python run.py -ds ucmerced -mn mobilenet_v2 -wi proposed -tr -ev -evs -ep 1 -bs 32 -lr 0.0001 -sv -ow -pr -an 3 -sp summary

Notes

  • Setting the printing option to false runs the code in silent mode.
  • The eval_summary option evaluates all saved checkpoints and generates an organized CSV file containing all the evaluations.
  • With eval_summary, the evaluations use the validation set avg_num times. For example, running with eval_summary and avg_num = 3 evaluates the validation set 3 times and computes the average of the results.
  • Setting the overwrite option to false stops the training if a saved checkpoint and history already exist with the same parameters (dataset name, model name, weight init, and epochs).
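The averaging behind avg_num amounts to the following sketch. The function name is hypothetical and evaluate_fn stands in for a single validation pass; repeating it smooths out run-to-run nondeterminism (e.g. data-loader shuffling).

```python
def summarize_eval(evaluate_fn, avg_num=3):
    """Run the evaluation avg_num times and return the mean accuracy."""
    accs = [evaluate_fn() for _ in range(avg_num)]
    return sum(accs) / len(accs)
```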

Citation

If you use any part of this work, please cite it using the following BibTeX entry:

@article{BOULILA2024124344,
title = {An effective weight initialization method for deep learning: Application to satellite image classification},
journal = {Expert Systems with Applications},
volume = {254},
pages = {124344},
year = {2024},
issn = {0957-4174},
doi = {10.1016/j.eswa.2024.124344},
url = {https://www.sciencedirect.com/science/article/pii/S0957417424012107},
author = {Wadii Boulila and Eman Alshanqiti and Ayyub Alzahem and Anis Koubaa and Nabil Mlaiki}
}
