Abstract
As an optimization technique, gradient descent is widely adopted in the training of deep learning models. In traditional gradient descent methods, the gradient of each dimension carries the same weight in the update direction, which leads to poor performance when many dimensions have small gradients (e.g., near a saddle point). To improve the accuracy and convergence speed of neural network training, we propose a novel multi-dimensional adaptive learning rate gradient descent optimization algorithm (M-AdaGrad) in this paper. Specifically, in M-AdaGrad the learning rate is updated according to a newly designed weight function of the current gradient. Experiments on a set of sigmoid-based functions verify that, compared with traditional gradient descent methods such as AdaGrad and Adam, M-AdaGrad places more confidence in the directions with larger gradients and is more likely to reach a better optimum at a faster speed. Owing to its strong performance in network training, M-AdaGrad is successfully applied to magneto-optical nondestructive crack detection based on a generative adversarial network.
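The abstract does not specify the exact weight function, so the sketch below is only an assumed illustration of the general idea: an AdaGrad-style per-dimension update whose learning rate is additionally scaled by a hypothetical softmax-style weight over the current gradient magnitudes, so that larger-gradient directions receive more trust. The function name `m_adagrad_step`, the parameter `tau`, and the toy saddle-shaped objective are all illustrative choices, not the paper's method.

```python
import numpy as np

def m_adagrad_step(params, grads, accum, base_lr=0.01, eps=1e-8, tau=1.0):
    """One illustrative M-AdaGrad-style update step (assumed weight function).

    As in AdaGrad, each dimension accumulates its squared gradients; on top of
    that, each dimension's step is scaled by a weight that grows with the
    current gradient magnitude, giving more confidence to larger-gradient
    directions. The softmax weighting below is a hypothetical stand-in for
    the weight function designed in the paper.
    """
    accum += grads ** 2                                  # AdaGrad accumulator
    mags = np.abs(grads)
    w = np.exp(mags / tau)                               # assumed weight function
    w = w / w.sum() * grads.size                         # rescale so weights average to 1
    params = params - base_lr * w * grads / (np.sqrt(accum) + eps)
    return params, accum

# Toy usage: descending f(x, y) = x**2 - y**2 / 10 near its saddle point
params = np.array([1.0, 1e-3])
accum = np.zeros_like(params)
for _ in range(100):
    grads = np.array([2.0 * params[0], -params[1] / 5.0])
    params, accum = m_adagrad_step(params, grads, accum)
print(params)
```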

This work is licensed under a Creative Commons Attribution 4.0 International License.