Open Access
Article

Federated Bimodal Graph Neural Networks for Text-Image Retrieval

Xueming Yan 1,2
Chuyue Wang 1
Yaochu Jin 3,*
Submitted: 24 Dec 2024 | Accepted: 20 Mar 2025 | Published: 27 Jun 2025

Abstract

Text-image retrieval, a key challenge in computer vision and natural language processing, aims to retrieve the most semantically relevant image or text for a query given in the other modality. However, growing privacy and security concerns make traditional centralized learning increasingly unsuitable for sensitive multimodal data. In this paper, we propose FedBi-GNNs, a federated learning framework for bimodal graph neural networks that enables collaborative training across decentralized clients without sharing private data. Each client independently constructs heterogeneous graphs from its local text and image data and learns cross-modal correspondences via bimodal graph matching. The resulting local representations are then aggregated at a central server using a heterogeneous federated aggregation scheme. Empirical results on the MSCOCO benchmark demonstrate that FedBi-GNNs significantly outperform existing state-of-the-art methods, offering improved retrieval accuracy, enhanced privacy preservation, and greater robustness to data heterogeneity across clients.
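
To make the server-side aggregation step more concrete, the following is a minimal sketch of plain sample-size-weighted parameter averaging (the standard FedAvg rule). It is not the heterogeneous aggregation scheme of FedBi-GNNs, whose details the abstract does not give. The function name `weighted_average`, the dictionary-of-lists parameter layout, and the use of local sample counts as weights are all assumptions made for illustration.

```python
from typing import Dict, List

# Hypothetical layout: each client sends a mapping from parameter name
# to a flat list of floats. Real models would use tensors instead.
Params = Dict[str, List[float]]


def weighted_average(client_params: List[Params],
                     client_sizes: List[int]) -> Params:
    """Average client parameters, weighting each client by its sample count.

    This is plain FedAvg-style averaging, used here only as a stand-in
    for the (unspecified) heterogeneous aggregation in the paper.
    """
    if not client_params or len(client_params) != len(client_sizes):
        raise ValueError("need one sample count per client")
    total = sum(client_sizes)
    if total <= 0:
        raise ValueError("total sample count must be positive")

    aggregated: Params = {}
    for name, reference in client_params[0].items():
        acc = [0.0] * len(reference)
        for params, size in zip(client_params, client_sizes):
            weight = size / total  # assumed weighting: share of all samples
            for i, value in enumerate(params[name]):
                acc[i] += weight * value
        aggregated[name] = acc
    return aggregated


if __name__ == "__main__":
    # Two toy clients holding 1 and 3 local samples respectively.
    clients = [{"w": [1.0, 2.0]}, {"w": [3.0, 4.0]}]
    print(weighted_average(clients, [1, 3]))
```

Only parameters leave each client in this sketch; the raw text and image data stay local, which is the property the federated setting relies on.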

How to Cite
Yan, X., Wang, C., & Jin, Y. (2025). Federated Bimodal Graph Neural Networks for Text-Image Retrieval. International Journal of Network Dynamics and Intelligence, 4(2), 100009. https://doi.org/10.53941/ijndi.2025.100009
Copyright & License
Copyright (c) 2025 by the authors.
© 2025 Scilight Press Pty Ltd All rights reserved.