  • Open Access
  • Article

CERL: Evolutionary Reinforcement Learning for Partitioned Collaborative Inference on On-Device Models

  • Lin Tan 1,   
  • Songtao Guo 1, *,   
  • Pengzhan Zhou 1,   
  • Zhufang Kuang 2

Received: 24 Aug 2025 | Revised: 21 Oct 2025 | Accepted: 24 Oct 2025 | Published: 29 Oct 2025

Abstract

With the proliferation of intelligent mobile applications, the ability to deploy and operate Deep Neural Networks (DNNs) on mobile Edge Devices (EDs) has become fundamentally important. The primary challenge, however, stems from the constrained computational power of EDs, which often leads to excessive energy use and degraded inference accuracy. To mitigate these issues, we introduce a Distributed Collaborative Inference (DCI) system designed to lower on-device inference costs. The system achieves this by distributing the inference workload across multiple EDs and Mobile Edge Computing (MEC) servers. To dynamically model the complex relationships within the system and make optimal decisions, we develop an Evolutionary Reinforcement Learning algorithm based on the Cross-Entropy Method (CEM). This algorithm uses the negative temporal difference (TD) error as a fitness metric to identify and select elite individuals from the population. These elite individuals generate high-quality samples that accelerate learning, enabling the system to determine optimal decisions for distributed collaborative inference and resource management in complex, dynamic search spaces. Extensive simulation results show that this approach markedly outperforms existing methods and benchmarks, yielding a 57.5% increase in the successful completion rate of inference tasks and a 65.7% reduction in total system costs.
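The selection step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a linear value estimate, a diagonal Gaussian search distribution, and a synthetic batch of transitions, and every name in it (`td_error`, `fitness`, `cem_generation`, `make_batch`, `run_demo`, `GAMMA`) is hypothetical. It shows only the general idea of ranking a CEM population by negative TD error and refitting the sampling distribution to the elites.

```python
import random
import statistics

GAMMA = 0.9  # discount factor; an assumed value for this toy example


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def td_error(weights, transition):
    """Absolute one-step TD error for a linear estimate V(s) = weights . s."""
    state, reward, next_state = transition
    return abs(reward + GAMMA * dot(weights, next_state) - dot(weights, state))


def fitness(weights, batch):
    """Negative mean TD error: a higher value means a more consistent estimate."""
    return -statistics.fmean(td_error(weights, t) for t in batch)


def cem_generation(mean, std, batch, rng, pop_size=50, n_elite=10, min_std=1e-3):
    """One CEM generation: sample, rank by fitness, refit the Gaussian to the elites."""
    population = [
        [rng.gauss(m, s) for m, s in zip(mean, std)] for _ in range(pop_size)
    ]
    population.sort(key=lambda w: fitness(w, batch), reverse=True)
    elites = population[:n_elite]
    new_mean = [statistics.fmean(col) for col in zip(*elites)]
    # Floor the spread so the search distribution never collapses to a point.
    new_std = [max(statistics.pstdev(col), min_std) for col in zip(*elites)]
    return new_mean, new_std


def make_batch(true_weights, rng, size=64):
    """Synthetic transitions whose rewards are exactly consistent with true_weights."""
    batch = []
    for _ in range(size):
        state = [rng.uniform(-1, 1) for _ in true_weights]
        next_state = [rng.uniform(-1, 1) for _ in true_weights]
        reward = dot(true_weights, state) - GAMMA * dot(true_weights, next_state)
        batch.append((state, reward, next_state))
    return batch


def run_demo(seed=0, generations=40):
    rng = random.Random(seed)
    batch = make_batch([1.0, -2.0, 0.5], rng)
    mean, std = [0.0, 0.0, 0.0], [2.0, 2.0, 2.0]
    start = fitness(mean, batch)
    for _ in range(generations):
        mean, std = cem_generation(mean, std, batch, rng)
    return start, fitness(mean, batch), mean


if __name__ == "__main__":
    start, end, mean = run_demo()
    print("fitness before:", start)
    print("fitness after: ", end)
    print("learned weights:", mean)
```

In this toy setting the fitness can never exceed zero, and it reaches zero only when the estimate satisfies the one-step Bellman relation on every sampled transition, which is why a higher (less negative) value marks a better individual.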

How to Cite
Tan, L.; Guo, S.; Zhou, P.; Kuang, Z. CERL: Evolutionary Reinforcement Learning for Partitioned Collaborative Inference on On-Device Models. Journal of Machine Learning and Information Security 2025, 1 (1), 5.
Copyright & License
Copyright (c) 2025 by the authors.