  • Open Access
  • Article

Verse-in-Wine: A Generative AI Framework for Chinese Calligraphy Painting with Drinking Culture

  • Ronghua Cai
  • James She *

Submitted: 29 Aug 2025 | Revised: 17 Sep 2025 | Accepted: 09 Oct 2025 | Published: 21 Oct 2025

Abstract

This paper presents Verse-in-Wine, a generative framework that integrates classical Chinese poetry, traditional wine culture, and calligraphy painting through large language models (LLMs) and visual generation. Given intention keywords that the user selects from culturally grounded categories, the system recommends poetic lines, maps them to symbolic wines and historical calligraphy styles, and synthesizes visually coherent outputs. A fully functional prototype was developed and evaluated through both an automated assessment and a user study. The LLM-based evaluation across 300 samples achieved an overall score of 0.9165, while the user study with 100 samples yielded a comparable human rating of 0.8900, supporting both the system’s cultural fidelity and its usability. The framework demonstrates how generative AI can meaningfully engage with heritage aesthetics by linking related cultural traditions for artistic expression.

How to Cite
Cai, R.; She, J. Verse-in-Wine: A Generative AI Framework for Chinese Calligraphy Painting with Drinking Culture. Transactions on Artificial Intelligence 2025, 1 (1), 248–264. https://doi.org/10.53941/tai.2025.100017.
Copyright & License
Copyright (c) 2025 by the authors.