This repository is an implementation of the paper "An Effective Weight Initialization Method for Deep Learning: Application to Satellite Image Classification". Comparative analyses against existing weight initialization techniques on various CNN models show that the proposed weight initialization method outperforms previous competitive techniques in terms of classification accuracy.
The significantly increased interest in satellite images has triggered the need for efficient mechanisms that extract useful information from massive satellite image collections and provide better insight into them. Although deep learning has shown significant progress in image classification, only a few results on weight initialization techniques can be found in the literature. These techniques train the networks' weights on massive datasets and fine-tune the weights of pre-trained networks. In this study, a novel weight initialization method is proposed in the context of satellite image classification. The proposed method is detailed mathematically for the forward and backward passes of the CNN model. Extensive experiments are carried out using six real-world datasets. Comparative analyses against existing weight initialization techniques on various pre-trained CNN models show that the proposed technique outperforms previous competitive techniques in classification accuracy.
Xavier, He, and the proposed weight initialization method are each applied to three pre-trained CNN models, namely ResNet152V2, VGG19, and MobileNetV2. The models were trained for 100 epochs, each consisting of 32 batches. All models are trained with the Adam optimizer at a learning rate of 1e-4.
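For reference, the two baseline schemes can be sketched with the standard library alone. Xavier (Glorot) uniform draws weights from U(-a, a) with a = sqrt(6 / (fan_in + fan_out)), and He normal draws from a zero-mean Gaussian with standard deviation sqrt(2 / fan_in). The function names below are illustrative and are not taken from this repository, and the proposed method itself is detailed in the paper and is not reproduced here.

```python
import math
import random
from typing import List


def xavier_uniform_bound(fan_in: int, fan_out: int) -> float:
    """Bound a of the Xavier (Glorot) uniform distribution U(-a, a)."""
    return math.sqrt(6.0 / (fan_in + fan_out))


def he_normal_std(fan_in: int) -> float:
    """Standard deviation of the He (Kaiming) normal distribution."""
    return math.sqrt(2.0 / fan_in)


def init_weights(fan_in: int, fan_out: int, scheme: str, seed: int = 0) -> List[List[float]]:
    """Return a fan_out x fan_in weight matrix (hypothetical helper for illustration)."""
    rng = random.Random(seed)
    if scheme == "xavier":
        a = xavier_uniform_bound(fan_in, fan_out)
        return [[rng.uniform(-a, a) for _ in range(fan_in)] for _ in range(fan_out)]
    if scheme == "he":
        std = he_normal_std(fan_in)
        return [[rng.gauss(0.0, std) for _ in range(fan_in)] for _ in range(fan_out)]
    raise ValueError(f"unknown scheme: {scheme}")
```

For a convolutional layer, fan_in is usually taken as in_channels times the kernel height times the kernel width, and fan_out as out_channels times the same kernel area.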
| Model | Init | CIFAR-100 | UC-Merced | AID | KSA | PatternNet |
|---|---|---|---|---|---|---|
| ResNet152 | He | 0.5507 | 0.5381 | 0.3915 | 0.7108 | 0.7298 |
| ResNet152 | Xavier | 0.4975 | 0.5095 | 0.4140 | 0.7308 | 0.7451 |
| ResNet152 | Proposed | 0.5514 | 0.5452 | 0.4300 | 0.7338 | 0.7896 |
| VGG19 | He | 0.6690 | 0.6786 | 0.503 | 0.8292 | 0.8461 |
| VGG19 | Xavier | 0.6658 | 0.6762 | 0.507 | 0.8308 | 0.8362 |
| VGG19 | Proposed | 0.6737 | 0.6833 | 0.5120 | 0.8400 | 0.8462 |
| MobileNetV2 | He | 0.5682 | 0.4500 | 0.3510 | 0.6831 | 0.7298 |
| MobileNetV2 | Xavier | 0.5652 | 0.4333 | 0.3435 | 0.7031 | 0.7451 |
| MobileNetV2 | Proposed | 0.5683 | 0.4690 | 0.3575 | 0.7246 | 0.7896 |
The figure below details the performance of the proposed weight initialization method on four public remote sensing datasets, namely UC-Merced, AID, KSA, and PatternNet.
The training progress plots in the figures below illustrate the performance of the proposed weight initialization method, as well as the Xavier, He, and zerO methods, on the CIFAR-100 dataset. The first figure displays the training progress of validation accuracy, while the second figure focuses on validation loss.
The analysis of the plots shows that the proposed weight initialization method outperforms the three other weight initialization techniques in terms of both accuracy and loss, as shown in both the overall training progress and the zoomed-in subplots. The performance advantage of the proposed method is visually apparent, with consistently higher accuracy values and lower loss values throughout the training process.
The comparison with He, Xavier, and zerO initialization methods further confirms the superior performance of the proposed approach. Notably, the zoomed-in subplots highlight the enhanced accuracy and reduced loss achieved by our proposed method in the final ten iterations. These findings highlight the effectiveness of the proposed weight initialization method in improving accuracy and minimizing the discrepancy between predicted and actual values.
To download the dataset:
Before running the code, place the datasets as zipped files in the compressed directory. The code will unzip them and split them randomly.
.
└── data
    └── compressed
        ├── UCMerced_LandUse.zip
        ├── KSA.zip
        ├── AID.zip
        └── PatternNet.zip
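The unzip-and-split step can be pictured roughly as follows. This is only a sketch under assumptions: the helper names, the output directory, the 80/20 ratio, and the fixed seed are hypothetical and may differ from what run.py actually does.

```python
import random
import zipfile
from pathlib import Path
from typing import List, Tuple


def extract_archives(compressed_dir: Path, output_dir: Path) -> List[Path]:
    """Unzip every .zip in compressed_dir into output_dir/<archive name without .zip>."""
    extracted = []
    for archive in sorted(compressed_dir.glob("*.zip")):
        target = output_dir / archive.stem
        with zipfile.ZipFile(archive) as zf:
            zf.extractall(target)
        extracted.append(target)
    return extracted


def random_split(items: List[Path], train_fraction: float = 0.8,
                 seed: int = 0) -> Tuple[List[Path], List[Path]]:
    """Shuffle a copy of items and cut it into train and validation parts.

    The 0.8 fraction and the seed are assumptions, not values from the repository.
    """
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]
```

A typical call would be `extract_archives(Path("data/compressed"), Path("data/extracted"))`, followed by `random_split` on the list of image paths found under each extracted folder.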
Running parameters
| Parameter Name | Description | Default |
|---|---|---|
| [-ds][--dataset_name] | dataset name: one of ucmerced, aid, ksa, patternnet, or wadii | ucmerced |
| [-mn][--model_name] | model name: any vgg, resnet, or mobilenet model | mobilenet_v2 |
| [-wi][--weight_init] | weight initialization method: either xavier or he for the well-known methods; any other name selects our proposed method | custom |
| [-is][--image_size] | image size for data transforms | 224 |
| [-tr][--train] | training option | False |
| [-ev][--eval] | evaluation option | False |
| [-evs][--eval_summary] | evaluation summary option | False |
| [-ep][--epochs] | number of training epochs | 100 |
| [-bs][--batch_size] | training/evaluation batch size | 16 |
| [-lr][--learning_rate] | training learning rate | 0.0001 |
| [-sv][--save] | save the model and training history | True |
| [-ow][--overwrite] | overwrite the current model with the same dataset_name, model_name, and init_name | True |
| [-pr][--printing] | print the used hyperparameters, dataset details, and model training progress | True |
| [-an][--avg_num] | number of evaluation runs to average over | 10 |
| [-sp][--summary_save_path] | save path of the summary file | ./results/log/ |
To run the training and evaluation with the default values, run the following command:
python run.py --train --eval
To run the code with your own parameters, run the following command:
!python run.py \
--dataset_name ucmerced \
--model_name mobilenet_v2 \
--weight_init proposed \
--train \
--eval \
--eval_summary \
--epochs 1 \
--batch_size 32 \
--learning_rate 0.0001 \
--save \
--overwrite \
--printing \
--avg_num 3 \
--summary_save_path summary
or run the following command for short:
!python run.py -ds ucmerced -mn mobilenet_v2 -wi proposed -tr -ev -evs -ep 1 -bs 32 -lr 0.0001 -sv -ow -pr -an 3 -sp summary
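To compare several initializations or datasets in one go, the commands above can be generated programmatically. The sketch below only assembles argument lists from the long flags listed in the parameter table; `build_command` and `build_sweep` are hypothetical helpers, not part of this repository.

```python
import itertools
import shlex
import sys
from typing import Iterable, List


def build_command(dataset: str, model: str, weight_init: str, epochs: int = 100) -> List[str]:
    """Assemble one run.py invocation using flags from the parameter table."""
    return [
        sys.executable, "run.py",
        "--dataset_name", dataset,
        "--model_name", model,
        "--weight_init", weight_init,
        "--epochs", str(epochs),
        "--train",
        "--eval",
    ]


def build_sweep(datasets: Iterable[str], models: Iterable[str],
                inits: Iterable[str], epochs: int = 100) -> List[List[str]]:
    """One command per (dataset, model, weight_init) combination."""
    return [build_command(d, m, w, epochs)
            for d, m, w in itertools.product(datasets, models, inits)]


# Print the commands instead of executing them, so they can be inspected first.
for cmd in build_sweep(["ucmerced", "aid"], ["mobilenet_v2"],
                       ["xavier", "he", "proposed"], epochs=1):
    print(shlex.join(cmd))
```

Each argument list can then be passed to `subprocess.run(cmd, check=True)` from the repository root to launch the runs one after another.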
- Setting the printing option to false runs the code in silent mode.
- The eval_summary option evaluates all the saved checkpoints and generates an organized CSV file that contains all the evaluations.
- With the eval_summary option, the evaluations use the validation set and are repeated avg_num times. For example, running the code with eval_summary and avg_num = 3 evaluates the validation set 3 times and computes the average of all the results.
- Setting the overwrite option to false stops the training if a saved checkpoint and history already exist with the same parameters (dataset name, model name, weight init, and epochs).
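The averaging and CSV summary described in the notes above can be sketched as follows. This is an illustration, not the repository's code: the function names and the CSV column names are assumptions, and `evaluate` stands in for whatever routine returns one validation accuracy.

```python
import csv
import statistics
from typing import Callable, Dict, List


def averaged_evaluation(evaluate: Callable[[], float], avg_num: int) -> float:
    """Call evaluate() avg_num times and return the mean of the results."""
    if avg_num < 1:
        raise ValueError("avg_num must be at least 1")
    return statistics.fmean(evaluate() for _ in range(avg_num))


def write_summary(rows: List[Dict[str, object]], path: str) -> None:
    """Write one CSV row per evaluated checkpoint (column names are assumed)."""
    fieldnames = ["dataset_name", "model_name", "weight_init", "accuracy"]
    with open(path, "w", newline="") as handle:
        writer = csv.DictWriter(handle, fieldnames=fieldnames)
        writer.writeheader()
        writer.writerows(rows)
```

With avg_num = 3, `averaged_evaluation` calls the evaluation routine three times and the resulting mean is what would be stored in the accuracy column for that checkpoint.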
If you use any part of this work, please cite it using the following BibTeX entry:
@article{BOULILA2024124344,
title = {An effective weight initialization method for deep learning: Application to satellite image classification},
journal = {Expert Systems with Applications},
volume = {254},
pages = {124344},
year = {2024},
issn = {0957-4174},
doi = {https://doi.org/10.1016/j.eswa.2024.124344},
url = {https://www.sciencedirect.com/science/article/pii/S0957417424012107},
author = {Wadii Boulila and Eman Alshanqiti and Ayyub Alzahem and Anis Koubaa and Nabil Mlaiki}
}


