USP Electronic Research Repository

A strategic weight refinement maneuver for convolutional neural networks

Sharma, Patrick and Sharma, Adarsh K. and Kumar, Dinesh and Sharma, Anuraganand (2021) A strategic weight refinement maneuver for convolutional neural networks. [Conference Proceedings]


Abstract

Stochastic Gradient Descent (SGD) remains a popular optimizer for deep learning networks and has been increasingly used in applications involving large datasets, producing promising results. SGD approximates the gradient on a small subset of training examples, randomly selected in every iteration during network training. This randomness leads to an inconsistent order of training examples, which in turn yields ambiguous values when solving the cost function. This paper applies Guided Stochastic Gradient Descent (GSGD), a variant of SGD, to deep learning neural networks. GSGD minimizes the training loss and maximizes the classification accuracy by overcoming the inconsistent order of data examples in SGD. It temporarily bypasses the inconsistent data instances during gradient computation and weight update, leading to better convergence at the rate of $O(\frac{1}{\rho T})$. Previously, GSGD has only been used in shallow learning models such as logistic regression. We incorporate GSGD into deep learning neural networks such as Convolutional Neural Networks (CNNs) and evaluate the classification accuracy in comparison with the same networks trained with SGD. We test our approach on benchmark image datasets. Our baseline results show that GSGD leads to a better convergence rate and improves classification accuracy by up to 3% over standard CNNs.
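The bypass-and-revisit idea described in the abstract can be illustrated with a minimal NumPy sketch. The consistency test used here (keeping the rho mini-batches with the lowest loss under the current weights and deferring the rest to a later epoch) is an illustrative assumption; the exact criterion, the role of rho, and the CNN setting are defined in the paper itself. The synthetic logistic-regression task is a hypothetical stand-in for the image datasets used in the experiments.

    # Minimal GSGD-style training loop (sketch; criterion assumed, not the authors' exact rule)
    import numpy as np

    rng = np.random.default_rng(0)

    # Synthetic binary classification data (hypothetical stand-in for an image dataset).
    X = rng.normal(size=(1000, 20))
    true_w = rng.normal(size=20)
    y = (X @ true_w + 0.5 * rng.normal(size=1000) > 0).astype(float)

    def sigmoid(z):
        return 1.0 / (1.0 + np.exp(-z))

    def batch_loss(w, Xb, yb):
        p = sigmoid(Xb @ w)
        eps = 1e-12  # guard against log(0)
        return -np.mean(yb * np.log(p + eps) + (1 - yb) * np.log(1 - p + eps))

    def batch_grad(w, Xb, yb):
        return Xb.T @ (sigmoid(Xb @ w) - yb) / len(yb)

    w = np.zeros(20)
    lr, batch_size, rho = 0.5, 50, 10  # rho: number of consistent batches kept per epoch (assumed role)

    for epoch in range(20):
        idx = rng.permutation(len(X))
        batches = [idx[i:i + batch_size] for i in range(0, len(idx), batch_size)]
        # Score every mini-batch under the current weights, then update only on
        # the rho most consistent ones; the rest are temporarily bypassed, not
        # discarded, and are reshuffled into the next epoch.
        losses = [batch_loss(w, X[b], y[b]) for b in batches]
        consistent = np.argsort(losses)[:rho]
        for j in consistent:
            b = batches[j]
            w -= lr * batch_grad(w, X[b], y[b])

    print("final training loss:", batch_loss(w, X, y))

Because bypassed batches are reshuffled rather than dropped, every example still contributes over the course of training; only its influence within a given epoch is deferred, which is what distinguishes this scheme from simply filtering out hard examples.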

Item Type: Conference Proceedings
Additional Information: DOI: 10.1109/IJCNN52387.2021.9533359
Uncontrolled Keywords: Benchmark testing, Convolutional Neural Networks, Cost function, Deep Learning, Recurrent neural networks, Stochastic Gradient Descent, Training, Usability
Subjects: Q Science > QA Mathematics > QA76 Computer software
Divisions: School of Information Technology, Engineering, Mathematics and Physics (STEMP)
Depositing User: Anuraganand Sharma
Date Deposited: 04 Oct 2021 00:19
Last Modified: 05 Apr 2022 03:19
URI: https://repository.usp.ac.fj/id/eprint/13054
