Abstract
As an optimization technique, gradient descent is widely adopted in the training of deep learning models. In traditional gradient descent methods, the gradient of each dimension carries the same weight in the update direction, which leads to poor performance when many dimensions have small gradients (e.g., near a saddle point). To improve the accuracy and convergence speed of neural network training, we propose a novel multi-dimensional adaptive learning rate gradient descent optimization algorithm (M-AdaGrad) in this paper. Specifically, in M-AdaGrad the learning rate is updated according to a newly designed weight function of the current gradient. Experiments on a set of sigmoid-based functions verify that, compared with traditional gradient descent methods such as AdaGrad and Adam, M-AdaGrad places more confidence in the directions with larger gradients and is more likely to reach a better optimum at a faster speed. Owing to its strong performance in network training, M-AdaGrad is successfully applied to magneto-optical nondestructive crack detection based on a generative adversarial network.
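The abstract does not specify the exact weight function, so the sketch below is only an assumed illustration of the general idea: an AdaGrad-style per-dimension update whose learning rate is additionally scaled by a hypothetical softmax-style weight over the current gradient magnitudes, so that larger-gradient directions receive more trust. The function name `m_adagrad_step`, the parameter `tau`, and the toy saddle-shaped objective are all illustrative choices, not the paper's method.

```python
import numpy as np

def m_adagrad_step(params, grads, accum, base_lr=0.01, eps=1e-8, tau=1.0):
    """One illustrative M-AdaGrad-style update step (assumed weight function).

    As in AdaGrad, each dimension accumulates its squared gradients; on top of
    that, each dimension's step is scaled by a weight that grows with the
    current gradient magnitude, giving more confidence to larger-gradient
    directions. The softmax weighting below is a hypothetical stand-in for
    the weight function designed in the paper.
    """
    accum += grads ** 2                                  # AdaGrad accumulator
    mags = np.abs(grads)
    w = np.exp(mags / tau)                               # assumed weight function
    w = w / w.sum() * grads.size                         # rescale so weights average to 1
    params = params - base_lr * w * grads / (np.sqrt(accum) + eps)
    return params, accum

# Toy usage: descending f(x, y) = x**2 - y**2 / 10 near its saddle point
params = np.array([1.0, 1e-3])
accum = np.zeros_like(params)
for _ in range(100):
    grads = np.array([2.0 * params[0], -params[1] / 5.0])
    params, accum = m_adagrad_step(params, grads, accum)
print(params)
```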

This work is licensed under a Creative Commons Attribution 4.0 International License.