Weight-Initialization

This repository implements the paper "An Effective Weight Initialization Method for Deep Learning: Application to Satellite Image Classification". Comparative analyses against existing weight initialization techniques on various CNN models show that the proposed method achieves higher classification accuracy than the competing techniques.

Abstract

Significantly increased interest in satellite images has triggered the need for efficient mechanisms for extracting useful information from massive satellite images to provide better insight into them. Although deep learning has shown significant progress in image classification, only a few results on weight initialization techniques can be found in the literature. These techniques train the networks' weights on massive datasets and fine-tune the weights of pre-trained networks. In this study, a novel weight initialization method is proposed in the context of satellite image classification. The proposed weight initialization method is mathematically detailed during the forward and backward passes of the CNN model. Extensive experiments are carried out using six real-world datasets. Comparative analyses with existing weight initialization techniques on various pre-trained CNN models reveal that the proposed weight initialization technique outperforms the previous competitive techniques in classification accuracy.

Results

Xavier, He, and the proposed weight initialization method are each applied to three pre-trained CNN models: ResNet152V2, VGG19, and MobileNetV2. The models are trained for 100 epochs, each consisting of 32 batches, using the Adam optimizer with a learning rate of 1e-4.
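For reference, the two baseline methods scale the initial weights from the layer's fan-in and fan-out. The sketch below computes the standard Xavier (Glorot) uniform bound and the He normal standard deviation for a convolutional layer. The function names and the example layer shape are illustrative assumptions, not code from this repository, and the proposed method is deliberately not reproduced here because this README does not state its formula.

```python
import math


def conv_fans(in_channels: int, out_channels: int, kernel_size: int) -> tuple:
    """Fan-in and fan-out of a square 2D convolution (hypothetical helper)."""
    receptive_field = kernel_size * kernel_size
    return in_channels * receptive_field, out_channels * receptive_field


def xavier_uniform_bound(fan_in: int, fan_out: int) -> float:
    """Xavier/Glorot uniform: weights ~ U(-b, b) with b = sqrt(6 / (fan_in + fan_out))."""
    return math.sqrt(6.0 / (fan_in + fan_out))


def he_normal_std(fan_in: int) -> float:
    """He normal (ReLU networks): weights ~ N(0, std^2) with std = sqrt(2 / fan_in)."""
    return math.sqrt(2.0 / fan_in)


# Example layer shape chosen only for illustration: 64 -> 128 channels, 3x3 kernel.
fan_in, fan_out = conv_fans(64, 128, 3)
print("Xavier uniform bound:", xavier_uniform_bound(fan_in, fan_out))
print("He normal std:", he_normal_std(fan_in))
```

Both quantities shrink as the layer gets wider, which is what keeps activation and gradient variances roughly stable across depth.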

model	init	cifar100	ucmerced	aid	ksa	patternnet
ResNet152	He	0.5507	0.5381	0.3915	0.7108	0.7298
	Xavier	0.4975	0.5095	0.4140	0.7308	0.7451
	Proposed	0.5514	0.5452	0.4300	0.7338	0.7896
VGG19	He	0.6690	0.6786	0.503	0.8292	0.8461
	Xavier	0.6658	0.6762	0.507	0.8308	0.8362
	Proposed	0.6737	0.6833	0.5120	0.8400	0.8462
MobileNetV2	He	0.5682	0.4500	0.3510	0.6831	0.7298
	Xavier	0.5652	0.4333	0.3435	0.7031	0.7451
	Proposed	0.5683	0.4690	0.3575	0.7246	0.7896
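To see how large the reported gains are, the sketch below computes, for each model and dataset, the proposed method's score minus the better of the two baselines. The numbers are transcribed from the table above; treating them as classification accuracies is an assumption based on the introduction, and the variable and function names are illustrative.

```python
# Scores transcribed from the results table; column order follows its header.
DATASETS = ["cifar100", "ucmerced", "aid", "ksa", "patternnet"]
RESULTS = {
    "ResNet152": {
        "He": [0.5507, 0.5381, 0.3915, 0.7108, 0.7298],
        "Xavier": [0.4975, 0.5095, 0.4140, 0.7308, 0.7451],
        "Proposed": [0.5514, 0.5452, 0.4300, 0.7338, 0.7896],
    },
    "VGG19": {
        "He": [0.6690, 0.6786, 0.503, 0.8292, 0.8461],
        "Xavier": [0.6658, 0.6762, 0.507, 0.8308, 0.8362],
        "Proposed": [0.6737, 0.6833, 0.5120, 0.8400, 0.8462],
    },
    "MobileNetV2": {
        "He": [0.5682, 0.4500, 0.3510, 0.6831, 0.7298],
        "Xavier": [0.5652, 0.4333, 0.3435, 0.7031, 0.7451],
        "Proposed": [0.5683, 0.4690, 0.3575, 0.7246, 0.7896],
    },
}


def margins(rows):
    """Proposed score minus the better of He and Xavier, per dataset."""
    best_baseline = [max(h, x) for h, x in zip(rows["He"], rows["Xavier"])]
    return [round(p - b, 4) for p, b in zip(rows["Proposed"], best_baseline)]


for model, rows in RESULTS.items():
    print(model, dict(zip(DATASETS, margins(rows))))
```

On these numbers every margin is positive, ranging from 0.0001 (for example VGG19 on PatternNet) to 0.0445 (PatternNet with ResNet152 and MobileNetV2).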

The figure below details the performance of the proposed weight initialization method on four public remote sensing datasets, namely UC-Merced, AID, KSA, and PatternNet.

The training progress plots in the figures below compare the proposed weight initialization method with the Xavier, He, and ZerO methods on the CIFAR-100 dataset. The first figure shows validation accuracy over training, while the second shows validation loss.

The analysis of the plots shows that the proposed weight initialization method outperforms the three other weight initialization techniques in terms of both accuracy and loss, as shown in both the overall training progress and the zoomed-in subplots. The performance advantage of the proposed method is visually apparent, with consistently higher accuracy values and lower loss values throughout the training process.

The comparison with the He, Xavier, and ZerO initialization methods further confirms the superior performance of the proposed approach. Notably, the zoomed-in subplots show the higher accuracy and lower loss achieved by the proposed method in the final ten iterations. These findings indicate that the proposed weight initialization method improves accuracy and reduces the discrepancy between predicted and actual values.

Dataset Setup

To download the dataset:

Before running the code, place the datasets as zipped files in the data/compressed directory, as shown below. The code will unzip them and split them randomly.

 .
 └── data 
     └── compressed 
         ├── UCMerced_LandUse.zip 
         ├── KSA.zip 
         ├── AID.zip
         └── PatternNet.zip
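The unzip-and-split step described above can be sketched with the standard library alone. This is a minimal illustration, not the repository's actual implementation: the function names, the 80/20 split ratio, the fixed seed, and the output directory are all assumptions.

```python
import random
import zipfile
from pathlib import Path


def extract_archives(compressed_dir, output_dir):
    """Unzip every .zip in compressed_dir into output_dir/<archive name>."""
    extracted = []
    for archive in sorted(Path(compressed_dir).glob("*.zip")):
        target = Path(output_dir) / archive.stem
        target.mkdir(parents=True, exist_ok=True)
        with zipfile.ZipFile(archive) as zf:
            zf.extractall(target)
        extracted.append(target)
    return extracted


def random_split(items, train_fraction=0.8, seed=0):
    """Shuffle a copy of items and cut it into (train, validation) lists.

    The 0.8 fraction and the seed are placeholder values, not the
    repository's settings.
    """
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]
```

A call such as `extract_archives("data/compressed", "data/uncompressed")` followed by `random_split(sorted(folder.rglob("*.jpg")))` on each extracted folder would give a reproducible train/validation partition; the file extension is again an assumption about the archive contents.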

Help

Running parameters

Parameter Name	Description	Default
[-ds][--dataset_name]	dataset name; one of ucmerced, aid, ksa, patternnet, or wadii	ucmerced
[-mn][--model_name]	model name; any VGG, ResNet, or MobileNet model	mobilenet_v2
[-wi][--weight_init]	weight initialization method; use xavier or he for the well-known methods, or any other name to use our proposed method	custom
[-is][--image_size]	image size for data transforms	224
[-tr][--train]	training option	False
[-ev][--eval]	evaluation option	False
[-evs][--eval_summary]	evaluation summary option	False
[-ep][--epochs]	number of training epochs	100
[-bs][--batch_size]	training/evaluation batch size	16
[-lr][--learning_rate]	training learning rate	0.0001
[-sv][--save]	save the model and training history	True
[-ow][--overwrite]	overwrite the current model with the same dataset_name, model_name, and init_name	True
[-pr][--printing]	print the used hyperparameters, dataset details, and model training progress	True
[-an][--avg_num]	number of the evaluation for taking the average	10
[-sp][--summary_save_path]	save path of the summary file	./results/log/
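For readers who want to see how such a command-line interface is typically wired up, the sketch below mirrors a subset of the parameters in the table with `argparse`. It is an assumption about structure, not a copy of `run.py`: the boolean options whose listed default is True (save, overwrite, printing) are left out because the table does not say how they are switched off, and `uses_proposed_init` is a hypothetical helper that encodes the rule stated for weight_init.

```python
import argparse


def build_parser():
    """Argument parser mirroring part of the parameter table (illustrative only)."""
    parser = argparse.ArgumentParser(description="Weight initialization experiments (sketch)")
    parser.add_argument("-ds", "--dataset_name", default="ucmerced",
                        choices=["ucmerced", "aid", "ksa", "patternnet", "wadii"])
    parser.add_argument("-mn", "--model_name", default="mobilenet_v2")
    parser.add_argument("-wi", "--weight_init", default="custom")
    parser.add_argument("-is", "--image_size", type=int, default=224)
    parser.add_argument("-tr", "--train", action="store_true")
    parser.add_argument("-ev", "--eval", action="store_true")
    parser.add_argument("-ep", "--epochs", type=int, default=100)
    parser.add_argument("-bs", "--batch_size", type=int, default=16)
    parser.add_argument("-lr", "--learning_rate", type=float, default=1e-4)
    return parser


def uses_proposed_init(weight_init):
    """Per the table: any name other than xavier or he selects the proposed method."""
    return weight_init.lower() not in {"xavier", "he"}


# Parse an explicit argument list so the sketch runs without a real command line.
args = build_parser().parse_args(["-ds", "aid", "-wi", "proposed", "--train"])
print(args.dataset_name, args.model_name, uses_proposed_init(args.weight_init))
```

Unspecified options fall back to the defaults from the table, so `args.epochs` is 100 and `args.batch_size` is 16 in this example.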

Usage

To run the training and evaluation with the default values, run the following command:

python run.py --train --eval

To run the code with your own parameters, run the following command (the leading ! is only needed inside a Jupyter or Colab notebook cell; omit it in a regular shell):

!python run.py                \
--dataset_name ucmerced       \
--model_name mobilenet_v2     \
--weight_init proposed        \
--train                       \
--eval                        \
--eval_summary                \
--epochs 1                    \
--batch_size 32               \
--learning_rate 0.0001        \
--save                        \
--overwrite                   \
--printing                    \
--avg_num 3                   \
--summary_save_path summary

or run the following command for short:

!python run.py -ds ucmerced -mn mobilenet_v2 -wi proposed -tr -ev -evs -ep 1 -bs 32 -lr 0.0001 -sv -ow -pr -an 3 -sp summary
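Reproducing a comparison across datasets and initialization methods means repeating the command above with different values. The sketch below only builds those command lines from the documented short flags; the helper name and the chosen value lists are illustrative assumptions, and nothing is executed.

```python
import itertools
import shlex


def build_commands(datasets, inits, model_name="mobilenet_v2", epochs=100):
    """Return one run.py argument list per (dataset, init) combination."""
    commands = []
    for dataset, init in itertools.product(datasets, inits):
        commands.append([
            "python", "run.py",
            "-ds", dataset, "-mn", model_name, "-wi", init,
            "-tr", "-ev", "-ep", str(epochs),
        ])
    return commands


# Print the commands so they can be reviewed before running them.
for command in build_commands(["ucmerced", "aid"], ["xavier", "he", "proposed"]):
    print(shlex.join(command))
```

Each list can be passed to `subprocess.run(command, check=True)` from the repository root once the datasets are in place.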

Notes

Setting the printing option to false runs the code in silent mode.
The eval_summary option evaluates all the saved checkpoints and generates an organized CSV file that contains all the evaluations.
With the eval_summary option, each evaluation runs on the validation set avg_num times. For example, running the code with eval_summary and avg_num = 3 evaluates the validation set 3 times and computes the average of the results.
Setting the overwrite option to false stops the training if a saved checkpoint and history already exist with the same parameters (dataset name, model name, weight init, and epochs).
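The averaging behaviour of eval_summary can be illustrated as follows. This is a sketch under stated assumptions: `average_evaluations` and `write_summary` are hypothetical helpers, the CSV column names are invented for the example, and the scores fed in are dummy placeholders rather than results from this repository.

```python
import csv
import io
import statistics


def average_evaluations(evaluate, avg_num=10):
    """Call evaluate() avg_num times and return the mean of the returned scores."""
    scores = [evaluate() for _ in range(avg_num)]
    return statistics.fmean(scores)


def write_summary(rows, stream):
    """Write one CSV row per evaluated checkpoint (column names are assumptions)."""
    writer = csv.DictWriter(stream, fieldnames=["dataset", "model", "init", "score"])
    writer.writeheader()
    writer.writerows(rows)


# Dummy placeholder scores standing in for three validation passes (avg_num = 3).
dummy_scores = iter([0.5, 0.6, 0.7])
mean_score = average_evaluations(lambda: next(dummy_scores), avg_num=3)

buffer = io.StringIO()
write_summary(
    [{"dataset": "example_dataset", "model": "example_model",
      "init": "example_init", "score": mean_score}],
    buffer,
)
print(buffer.getvalue())
```

Averaging over several passes is useful when evaluation involves randomness, such as a random validation split, since a single pass can over- or under-state the score.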

Citation

If you use any part of this work, please cite it using the following BibTeX entry:

@article{BOULILA2024124344,
title = {An effective weight initialization method for deep learning: Application to satellite image classification},
journal = {Expert Systems with Applications},
volume = {254},
pages = {124344},
year = {2024},
issn = {0957-4174},
doi = {https://doi.org/10.1016/j.eswa.2024.124344},
url = {https://www.sciencedirect.com/science/article/pii/S0957417424012107},
author = {Wadii Boulila and Eman Alshanqiti and Ayyub Alzahem and Anis Koubaa and Nabil Mlaiki}
}
