The rapid proliferation of intelligent mobile applications has made deploying Deep Neural Networks (DNNs) on Mobile Edge Devices (EDs) a pressing necessity. However, the limited computational resources of EDs often lead to high energy consumption and degraded inference accuracy. To address these challenges, we propose a Distributed Collaborative Inference (DCI) framework that reduces on-device overhead by distributing inference workloads across a network of EDs and Mobile Edge Computing (MEC) nodes. To cope with the system's dynamic conditions, we develop an evolutionary reinforcement learning algorithm based on the Cross-Entropy Method (CEM). A key feature of the algorithm is its use of negative Temporal Difference (TD) error as the fitness criterion for selecting elite members of the population. Learning from the high-quality samples these elites provide accelerates convergence and facilitates optimal decision-making in dynamic environments. Simulation results show that our method outperforms existing baselines, improving the task completion rate by 57.5% and reducing total system cost by 65.7%.
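To make the elite-selection idea concrete, the sketch below shows a generic CEM loop over policy parameters in which fitness is the negative (absolute) TD error, so samples with smaller TD error are kept as elites and used to refit the sampling distribution. This is a minimal illustration under assumptions, not the paper's implementation: the linear policy, the toy environment `env_step`, the fixed value estimate `value_fn`, and all constants (`POP_SIZE`, `ELITE_FRAC`, etc.) are hypothetical stand-ins.

```python
import numpy as np

rng = np.random.default_rng(0)

STATE_DIM, ACTION_DIM = 4, 2          # assumed toy problem sizes
POP_SIZE, ELITE_FRAC, GENERATIONS = 32, 0.25, 50
GAMMA = 0.99

def policy_action(theta, state):
    """Linear policy: pick the action with the highest score."""
    logits = theta.reshape(ACTION_DIM, STATE_DIM) @ state
    return int(np.argmax(logits))

def rollout_td_error(theta, value_fn, env_step, horizon=20):
    """Mean absolute TD error of a short rollout under policy theta."""
    state = rng.normal(size=STATE_DIM)
    total = 0.0
    for _ in range(horizon):
        action = policy_action(theta, state)
        next_state, reward = env_step(state, action)
        td = reward + GAMMA * value_fn(next_state) - value_fn(state)
        total += abs(td)
        state = next_state
    return total / horizon

# Stand-in environment and value estimate (assumptions for this sketch).
w_value = rng.normal(size=STATE_DIM)
value_fn = lambda s: float(w_value @ s)

def env_step(state, action):
    next_state = 0.9 * state + 0.1 * rng.normal(size=STATE_DIM)
    reward = -float(np.linalg.norm(next_state)) + 0.1 * action
    return next_state, reward

# CEM loop: maintain a Gaussian over policy parameters, score each sample
# by *negative* TD error (lower TD error => higher fitness), keep the
# elites, and refit the Gaussian to them.
dim = STATE_DIM * ACTION_DIM
mu, sigma = np.zeros(dim), np.ones(dim)
n_elite = max(1, int(POP_SIZE * ELITE_FRAC))

for gen in range(GENERATIONS):
    population = mu + sigma * rng.normal(size=(POP_SIZE, dim))
    fitness = np.array([-rollout_td_error(p, value_fn, env_step)
                        for p in population])
    elites = population[np.argsort(fitness)[-n_elite:]]
    mu, sigma = elites.mean(axis=0), elites.std(axis=0) + 1e-3  # keep variance alive

print("final mean parameter norm:", np.linalg.norm(mu))
```

The key design choice mirrored here is that elites are ranked by how well their transitions agree with the current value estimate (small TD error) rather than by raw episodic return, which is what lets the population supply high-quality samples for faster learning.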



