USP Electronic Research Repository

A strategic weight refinement maneuver for convolutional neural networks

Sharma, Patrick and Sharma, Adarsh K. and Kumar, Dinesh and Sharma, Anuraganand (2021) A strategic weight refinement maneuver for convolutional neural networks. [Conference Proceedings]

PDF - Accepted Version
Restricted to Repository staff only

Download (4085 KB)

    Abstract

    Stochastic Gradient Descent (SGD) remains a popular optimizer for deep learning networks and is increasingly used in applications involving large datasets, producing promising results. SGD approximates the gradient on a small subset of training examples, randomly selected in every iteration during network training. This randomness leads to an inconsistent order of training examples, resulting in ambiguous values when solving the cost function. This paper applies Guided Stochastic Gradient Descent (GSGD), a variant of SGD, to deep learning neural networks. GSGD minimizes the training loss and maximizes classification accuracy by overcoming the inconsistent order of data examples in SGD. It temporarily bypasses the inconsistent data instances during gradient computation and weight update, leading to better convergence at the rate of $O(\frac{1}{\rho T})$. Previously, GSGD has only been used in shallow learning networks such as logistic regression. We incorporate GSGD into deep learning neural networks such as Convolutional Neural Networks (CNNs) and evaluate the classification accuracy in comparison with the same networks trained with SGD. We test our approach on benchmark image datasets. Our baseline results show that GSGD leads to a better convergence rate and improves classification accuracy by up to 3% over standard CNNs.
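    The core idea described in the abstract — temporarily bypassing inconsistent training examples during gradient computation and weight update — can be sketched in a shallow logistic-regression setting, the context where GSGD was originally applied. This is a minimal illustrative sketch only, not the authors' implementation: the consistency rule here (skip examples whose loss sits more than `rho` standard deviations above the epoch's mean loss) and all function and parameter names are assumptions introduced for illustration.

    ```python
    import numpy as np

    def sigmoid(z):
        return 1.0 / (1.0 + np.exp(-z))

    def guided_sgd(X, y, lr=0.1, epochs=50, rho=0.5, seed=0):
        """GSGD-style training sketch: examples whose current loss is far
        above the epoch's average are treated as inconsistent and bypassed
        for this epoch's weight updates (illustrative rule, not the paper's)."""
        rng = np.random.default_rng(seed)
        w = np.zeros(X.shape[1])
        for _ in range(epochs):
            # per-example logistic losses under the current weights
            p = sigmoid(X @ w)
            eps = 1e-12
            losses = -(y * np.log(p + eps) + (1 - y) * np.log(1 - p + eps))
            # hypothetical consistency threshold controlled by rho
            threshold = losses.mean() + rho * losses.std()
            for i in rng.permutation(len(y)):
                if losses[i] > threshold:
                    continue  # temporarily bypass the inconsistent example
                grad = (sigmoid(X[i] @ w) - y[i]) * X[i]
                w -= lr * grad
        return w
    ```

    In a full GSGD treatment the bypassed examples are revisited once the weights stabilize, rather than discarded; the sketch above omits that step for brevity.
    
    
    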

    Item Type: Conference Proceedings
    Additional Information: DOI: 10.1109/IJCNN52387.2021.9533359
    Uncontrolled Keywords: Benchmark testing, Convolutional Neural Networks, Cost function, Deep Learning, Recurrent neural networks, Stochastic Gradient Descent, Training, Usability
    Subjects: Q Science > QA Mathematics > QA76 Computer software
    Divisions: School of Information Technology, Engineering, Mathematics and Physics (STEMP)
    Depositing User: Anuraganand Sharma
    Date Deposited: 04 Oct 2021 12:19
    Last Modified: 05 Apr 2022 15:19
    URI: http://repository.usp.ac.fj/id/eprint/13054
