  • Open Access
  • Article

Prediction of Synthetic Lethality in Escherichia coli Based on Feature Engineering through Graph Embedding

  • Qian Xu 1,2,†
  • YiMiao Feng 3,†
  • Haixia Guo 1
  • Yawei Su 1
  • Xiaoru Chen 4
  • Haoran Sun 5
  • Jing Feng 4
  • Fengbiao Guo 1,2,*

Received: 19 Oct 2025 | Revised: 08 Jan 2026 | Accepted: 12 Jan 2026 | Published: 22 Jan 2026

Abstract

Synthetic lethality (SL) is a genetic interaction in which the simultaneous inactivation of two individually non-lethal genes causes cell death. Because experimental screening is costly and time-consuming, computational prediction has become the main research tool. Machine learning methods are now widely used in SL research, and discovering effective features that improve prediction accuracy remains the key challenge. We propose an SL prediction method based on graph embedding. First, we transformed five types of raw omics data into graph structures to capture the complex associations among genes. Then, using graph embedding, we extracted feature information for each gene and constructed the feature representation of each SL pair by applying mathematical operations to the gene features. Finally, unlike graph neural network (GNN) methods, which perform inference on a single graph, we used machine learning classifiers to discriminate positive from negative samples. Our method achieved a higher AUC than GNN-based baseline methods. Overall, this study is the first to propose a prediction model for Escherichia coli (E. coli) SLs that integrates the advantages of graph embedding techniques and classifier ensembles; it significantly improves the accuracy and reliability of prediction and provides new perspectives and methods for the field.
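The step of turning per-gene embeddings into a feature vector for a gene pair can be illustrated with a minimal sketch. The abstract does not say which mathematical operations were used, so the four symmetric operators below (Hadamard product, average, and element-wise L1 and L2 differences, as commonly used for edge features in node-embedding work) are an assumption, as are the gene names, the embedding dimension, and the helper names `pair_features` and `build_matrix`. The random vectors stand in for embeddings that would in practice come from the omics-derived graphs.

```python
import numpy as np

# Hypothetical toy embeddings; real ones would be learned from each gene graph.
rng = np.random.default_rng(0)
genes = ["geneA", "geneB", "geneC", "geneD"]  # placeholder gene names
embeddings = {g: rng.normal(size=8) for g in genes}


def pair_features(u, v, op="hadamard"):
    """Combine two gene embeddings into one order-independent pair vector."""
    if op == "hadamard":
        return u * v
    if op == "average":
        return (u + v) / 2.0
    if op == "l1":
        return np.abs(u - v)
    if op == "l2":
        return (u - v) ** 2
    raise ValueError(f"unknown operator: {op}")


def build_matrix(pairs, op="hadamard"):
    """Stack pair vectors into an (n_pairs, dim) feature matrix."""
    rows = [pair_features(embeddings[a], embeddings[b], op) for a, b in pairs]
    return np.vstack(rows)


# Two illustrative candidate pairs; labels would come from known SL data.
pairs = [("geneA", "geneB"), ("geneC", "geneD")]
X = build_matrix(pairs)
```

Each operator is symmetric in its two arguments, so a pair gets the same feature vector regardless of gene order, which suits an undirected interaction such as SL. The resulting matrix `X` could then be passed to any standard classifier together with positive and negative labels.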

How to Cite
Xu, Q.; Feng, Y.; Guo, H.; Su, Y.; Chen, X.; Sun, H.; Feng, J.; Guo, F. Prediction of Synthetic Lethality in Escherichia coli Based on Feature Engineering through Graph Embedding. eMicrobe 2026, 2 (1), 6. https://doi.org/10.53941/emicrobe.2026.100006.
Copyright & License
Copyright (c) 2026 by the authors.