  • Open Access
  • Review

A Survey of Learning in Optimal Control and Differential Game

  • Yuxuan Zhu,   
  • Chuandong Li *,   
  • Runtian Zeng

Received: 10 Dec 2025 | Revised: 12 Feb 2026 | Accepted: 27 Feb 2026 | Published: 13 Mar 2026

Abstract

With the widespread application of large-scale multi-agent systems, optimal control and differential games have become essential components of modern control theory. However, traditional solution methods often struggle with the curse of dimensionality when addressing high-dimensional problems. The rapid development of deep learning has provided new ideas and methods for addressing this challenge. This paper reviews the research status and progress of solution methods for optimal control and differential games. First, this review elaborates on the fundamental theoretical frameworks of optimal control for continuous-time systems and of differential games. Second, it introduces in detail two main deep learning methods: Deep Reinforcement Learning (DRL) and Physics-Informed Deep Learning (PIDL). On this basis, it analyzes the specific applications of these two methods to the aforementioned problems. Finally, it summarizes the main open problems and limitations of existing research and points out future research directions.
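To make the continuous-time optimal-control setting mentioned above concrete, the sketch below works through the simplest possible case: a scalar linear system dx/dt = a·x + b·u with quadratic running cost q·x² + r·u². Here the Hamilton–Jacobi–Bellman (HJB) equation admits a quadratic value function V(x) = p·x², so the Riccati solution, the optimal feedback, and the HJB residual can all be checked in closed form. All coefficient values (a, b, q, r) are illustrative assumptions chosen for this sketch, not taken from the survey.

```python
import math

# Scalar continuous-time LQR: dx/dt = a*x + b*u, cost = integral of q*x^2 + r*u^2.
# Illustrative coefficients (assumptions, not from the survey):
a, b, q, r = 1.0, 1.0, 1.0, 1.0

# With the quadratic ansatz V(x) = p*x^2, the HJB equation reduces to the
# scalar algebraic Riccati equation (b^2/r)*p^2 - 2*a*p - q = 0.
# Take the positive root so that V is positive definite.
p = r * (a + math.sqrt(a * a + (b * b) * q / r)) / (b * b)

def u_opt(x):
    """Optimal feedback u* = -(b*p/r)*x, from minimizing the HJB Hamiltonian."""
    return -(b * p / r) * x

def hjb_residual(x):
    """min_u [q*x^2 + r*u^2 + V'(x)*(a*x + b*u)]; vanishes for the true V."""
    u = u_opt(x)
    return q * x * x + r * u * u + (2 * p * x) * (a * x + b * u)

# Simulate the closed loop with forward Euler and accumulate the running cost.
# As the state decays to the origin, the cost should approach V(x0) = p*x0^2.
x, dt, cost = 1.0, 1e-4, 0.0
for _ in range(200_000):  # horizon T = 20, long enough for x to decay
    u = u_opt(x)
    cost += (q * x * x + r * u * u) * dt
    x += (a * x + b * u) * dt

print(f"p = {p:.4f}")  # equals 1 + sqrt(2) for these coefficients
print(f"HJB residual at x = 0.7: {hjb_residual(0.7):.2e}")
print(f"simulated cost {cost:.4f} vs V(x0) = {p:.4f}")
```

Checking that a candidate value function drives the HJB residual to zero, as `hjb_residual` does here at a single point, is exactly the idea that PIDL methods scale up: a neural network replaces the quadratic ansatz, and the residual, evaluated at sampled collocation points, becomes the training loss.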

How to Cite
Zhu, Y.; Li, C.; Zeng, R. A Survey of Learning in Optimal Control and Differential Game. Journal of Machine Learning and Information Security 2026, 2 (1), 4. https://doi.org/10.53941/jmlis.2026.100004.
Copyright (c) 2026 by the authors.