Tutorial: guidelines for the use of machine learning methods to mine genomes and proteomes for antibiotic discovery
