  • Open Access
  • Review

Chemoinformatic Tools Used in Untargeted Metabolomic Approaches by Mass Spectrometry Applied for Natural Product Analyses

  • Augustin Bildstein 1,   
  • Sylvie Chollet 2,   
  • Céline Rivière 1,*

Received: 01 Dec 2025 | Revised: 14 Dec 2025 | Accepted: 14 Dec 2025 | Published: 22 Dec 2025

Abstract

The chemistry of natural products has undergone a major transformation over the last twenty years, largely owing to the development of powerful hyphenated techniques such as LC-HRMS/MS. Combined with supervised and unsupervised multivariate statistical analyses, these techniques support untargeted metabolomic studies across a wide range of applications. They have also enabled dereplication approaches, which accelerate often lengthy purification processes by focusing effort on biologically active metabolites of unknown structure. In recent years, dereplication has been further strengthened by molecular networks, which group compounds according to their fragmentation profiles in mass spectrometry. One of the current challenges remains the annotation of a large number of variables with a high degree of confidence. This will require enriching existing databases and, as a more recent development, leveraging artificial intelligence. The latter, by integrating in silico virtual screening and chemoinformatic approaches, is now emerging as a powerful tool for predicting biological activity.


How to Cite
Bildstein, A.; Chollet, S.; Rivière, C. Chemoinformatic Tools Used in Untargeted Metabolomic Approaches by Mass Spectrometry Applied for Natural Product Analyses. Natural Products Analysis 2025, 1 (1), 100010. https://doi.org/10.53941/npa.2025.100010.
Copyright & License
Copyright (c) 2025 by the authors.