  • Open Access
  • Article
LLM-Prompting Driven AutoML: From Sleep Disorder Classification to Beyond
  • Yutong Zhao 1, †,   
  • Jianye Pang 2, †,   
  • Xinjie Zhu 2, †,   
  • Wenhua Shao 3, *

Received: 28 Mar 2025 | Revised: 06 May 2025 | Accepted: 09 May 2025 | Published: 12 May 2025

Abstract

Traditional automated machine learning (AutoML) often demands substantial manual effort, struggles to manage complexity, and rests on subjective design choices. This paper introduces a novel LLM-driven AutoML framework centered on the innovation of decomposed prompting. We hypothesize that by strategically breaking down complex AutoML tasks into sequential, guided sub-prompts, Large Language Models (LLMs) operating within a code sandbox on standard PCs can autonomously design, implement, evaluate, and select high-performing machine learning models. To validate this, we primarily applied our decomposed prompting approach to sleep disorder classification (illustrating potential benefits in healthcare). To assess the generalizability and robustness of our method across different data types, we subsequently evaluated it on the established 20 Newsgroups text classification benchmark. We rigorously compared decomposed prompting against zero-shot and few-shot prompting strategies, as well as a manually engineered baseline. Our results demonstrate that decomposed prompting significantly outperforms these alternatives, enabling the LLM to autonomously achieve superior classifier design and performance, with particularly strong results in the primary sleep disorder domain and robust performance on the benchmark task. These findings underscore the transformative potential of decomposed prompting as a key technique for advancing LLM-driven AutoML across diverse application areas beyond the specific examples explored here, paving the way for more automated and accessible problem-solving in scientific and engineering disciplines.
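To make the decomposition concrete, the Python sketch below illustrates the general pattern the abstract describes: a complex AutoML task is split into sequential, guided sub-prompts, each conditioned on the accumulated context, with LLM-generated code executed in a sandbox and its output fed back into the next step. This is a minimal sketch under stated assumptions, not the authors' actual prompts or pipeline; query_llm, run_in_sandbox, and the sub-prompt texts are hypothetical placeholders.

    # Minimal sketch of decomposed prompting for AutoML (illustrative only;
    # not the paper's actual prompts). query_llm and run_in_sandbox are
    # hypothetical placeholders for an LLM API call and a code sandbox.

    SUB_PROMPTS = [
        "Profile the dataset: feature types, missing values, class balance.",
        "Given the profile above, propose a preprocessing plan as Python code.",
        "Propose three candidate classifiers suited to this data.",
        "Write code that trains each candidate with cross-validation and "
        "reports macro-F1 for every model.",
        "Select the best model from the results above; emit final training code.",
    ]

    def query_llm(prompt: str, context: str) -> str:
        """Placeholder: send the accumulated context plus the next sub-prompt
        to an LLM and return its response (prose and/or code)."""
        raise NotImplementedError

    def run_in_sandbox(code: str) -> str:
        """Placeholder: execute LLM-generated code in an isolated environment
        and capture its output for feedback into the next step."""
        raise NotImplementedError

    def decomposed_automl(dataset_description: str) -> str:
        """Run the sub-prompts sequentially, threading each step's response
        and sandbox output into the context of the next step."""
        context = "Task: build a classifier.\nDataset: " + dataset_description + "\n"
        for step, sub_prompt in enumerate(SUB_PROMPTS, start=1):
            response = query_llm(sub_prompt, context)
            # Execute any generated code and feed the results back.
            output = run_in_sandbox(response) if "import" in response else ""
            context += ("\n--- Step " + str(step) + " ---\nPrompt: " + sub_prompt
                        + "\nResponse: " + response + "\nOutput: " + output + "\n")
        return context  # the final step's response holds the selected model's code

The essential design choice in this pattern is that each sub-prompt sees the full trace of earlier steps, so later stages (e.g., model selection) are grounded in concrete earlier outputs (the data profile, the cross-validation scores) rather than in a single monolithic prompt.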

How to Cite
Zhao, Y.; Pang, J.; Zhu, X.; Shao, W. LLM-Prompting Driven AutoML: From Sleep Disorder Classification to Beyond. Transactions on Artificial Intelligence 2025, 1 (1), 59–82. https://doi.org/10.53941/tai.2025.100004.
Copyright & License
Copyright (c) 2025 by the authors.