A review of machine learning methods for cancer characterization from microbiome data
Ferlay, J. et al. Estimating the global cancer incidence and mortality in 2018: GLOBOCAN sources and methods. Int. J. Cancer 144, 1941–1953 (2019).
Google Scholar
WHO. WHO Methods and Data Sources for Country-Level Causes of Death: 2000-2019 (World Health Organization, 2020).
Hanahan, D. Hallmarks of cancer: new dimensions. Cancer Discov. 12, 31–46 (2022).
Google Scholar
Gilbert, J. A. et al. Current understanding of the human microbiome. Nat. Med. 24, 392–400 (2018).
Google Scholar
Behjati, S. & Tarpey, P. S. What is next generation sequencing? Arch. Dis. Child. Educ. Pract. Ed. 98, 236–238 (2013).
Google Scholar
Jiang, D. et al. Microbiome multi-omics network analysis: statistical considerations, limitations, and opportunities. Front. Genet. 10, 995 (2019).
Google Scholar
Jovel, J. et al. Characterization of the gut microbiome using 16S or shotgun metagenomics. Front. Microbiol. 7, 459 (2016).
Google Scholar
Turnbaugh, P. J. et al. The Human Microbiome Project. Nature 449, 804–810 (2007).
Google Scholar
Zeller, G. et al. Potential of fecal microbiota for early-stage detection of colorectal cancer. Mol. Syst. Biol. 10, 766 (2014).
Google Scholar
Glassner, K. L., Abraham, B. P. & Quigley, E. M. M. The microbiome and inflammatory bowel disease. J. Allergy Clin. Immunol. 145, 16–27 (2020).
Google Scholar
Chen, W., Liu, F., Ling, Z., Tong, X. & Xiang, C. Human intestinal lumen and mucosa-associated microbiota in patients with colorectal cancer. PLoS ONE 7, e39743 (2012).
Google Scholar
Carabotti, M., Scirocco, A., Maselli, M. A. & Severi, C. The gut-brain axis: interactions between enteric microbiota, central and enteric nervous systems. Ann. Gastroenterol. Hepatol. 28, 203–209 (2015).
Helmink, B. A., Khan, M. A. W., Hermann, A., Gopalakrishnan, V. & Wargo, J. A. The microbiome, cancer, and cancer therapy. Nat. Med. 25, 377–388 (2019).
Google Scholar
Ferreira, R. M. et al. Gastric microbial community profiling reveals a dysbiotic cancer-associated microbiota. Gut 67, 226–236 (2018).
Google Scholar
Flemer, B. et al. The oral microbiota in colorectal cancer is distinctive and predictive. Gut 67, 1454–1463 (2018).
Google Scholar
Kartal, E. et al. A faecal microbiota signature with high specificity for pancreatic cancer. Gut 71, 1359–1372 (2022).
Google Scholar
Cancer Genome Atlas Research Network. The Cancer Genome Atlas Pan-Cancer analysis project. Nat. Genet. 45, 1113–1120 (2013).
Google Scholar
Poore, G. D. et al. Microbiome analyses of blood and tissues suggest cancer diagnostic approach. Nature 579, 567–574 (2020).
Google Scholar
Rodriguez, R. M., Hernandez, B. Y., Menor, M., Deng, Y. & Khadka, V. S. The landscape of bacterial presence in tumor and adjacent normal tissue across 9 major cancer types using TCGA exome sequencing. Comput. Struct. Biotechnol. J. 18, 631–641 (2020).
Google Scholar
Geller, L. T. et al. Potential role of intratumor bacteria in mediating tumor resistance to the chemotherapeutic drug gemcitabine. Science 357, 1156–1160 (2017).
Google Scholar
Matson, V. et al. The commensal microbiome is associated with anti–PD-1 efficacy in metastatic melanoma patients. Science 359, 104–108 (2018).
Google Scholar
Routy, B. et al. Gut microbiome influences efficacy of PD-1–based immunotherapy against epithelial tumors. Science 359, 91–97 (2018).
Google Scholar
Nichols, J. A., Herbert Chan, H. W. & Baker, M. A. B. Machine learning: applications of artificial intelligence to imaging and diagnosis. Biophys. Rev. 11, 111–118 (2019).
Google Scholar
Miotto, R., Wang, F., Wang, S., Jiang, X. & Dudley, J. T. Deep learning for healthcare: review, opportunities and challenges. Brief. Bioinf. 19, 1236–1246 (2018).
Google Scholar
Liu, W., Fang, X., Zhou, Y., Dou, L. & Dou, T. Machine learning-based investigation of the relationship between gut microbiome and obesity status. Microbes Infect. 24, 104892 (2022).
Google Scholar
Radjabzadeh, D. et al. Gut microbiome-wide association study of depressive symptoms. Nat. Commun. 13, 7128 (2022).
Google Scholar
Konishi, Y. et al. Development and evaluation of a colorectal cancer screening method using machine learning-based gut microbiota analysis. Cancer Med. 11, 3194–3206 (2022).
Google Scholar
Shah, M. S. et al. Leveraging sequence-based faecal microbial community survey data to identify a composite biomarker for colorectal cancer. Gut 67, 882–891 (2018).
Google Scholar
Zhou, Z. et al. Human gut microbiome-based knowledgebase as a biomarker screening tool to improve the predicted probability for colorectal cancer. Front. Microbiol. 11, 596027 (2020).
Google Scholar
Hogan, G. et al. Biopsy bacterial signature can predict patient tissue malignancy. Sci. Rep. 11, 18535 (2021).
Google Scholar
Li, X.et al. The machine-learning-mediated interface of microbiome and genetic risk stratification in neuroblastoma reveals molecular pathways related to patient survival. Cancers 14, 2874 (2022).
Liang, H. et al. Predicting cancer immunotherapy response from gut microbiomes using machine learning models. Oncotarget 13, 876–889 (2022).
Google Scholar
Ma, Y. et al. Distinct tumor bacterial microbiome in lung adenocarcinomas manifested as radiological subsolid nodules. Transl. Oncol. 14, 101050 (2021).
Google Scholar
Mao, X.-Y. et al. iCEMIGE: integration of CEll-morphometrics, MIcrobiome, and GEne biomarker signatures for risk stratification in breast cancers. World J. Clin. Oncol. 13, 616–629 (2022).
Google Scholar
Montassier, E. et al. Pretreatment gut microbiome predicts chemotherapy-related bloodstream infection. Genome Med. 8, 49 (2016).
Google Scholar
Zhou, Y.-H. & Gallins, P. A review and tutorial of machine learning methods for microbiome host trait prediction. Front. Genet. 10, 579 (2019).
Google Scholar
Cheung, H. & Yu, J. Machine learning on microbiome research in gastrointestinal cancer. J. Gastroenterol. Hepatol. 36, 817–822 (2021).
Google Scholar
Dohlman, A. B. et al. The cancer microbiome atlas: a pan-cancer comparative analysis to distinguish tissue-resident microbiota from contaminants. Cell Host Microbe 29, 281–298.e5 (2021).
Google Scholar
Davis, N. M., Proctor, D. M., Holmes, S. P., Relman, D. A. & Callahan, B. J. Simple statistical identification and removal of contaminant sequences in marker-gene and metagenomics data. Microbiome 6, 226 (2018).
Noecker, C., McNally, C. P., Eng, A. & Borenstein, E. High-resolution characterization of the human microbiome. Transl. Res. 179, 7–23 (2017).
Google Scholar
Pasolli, E., Truong, D. T., Malik, F., Waldron, L. & Segata, N. Machine learning meta-analysis of large metagenomic datasets: tools and biological insights. PLoS Comput. Biol. 12, e1004977 (2016).
Google Scholar
Woerner, J. et al. Circulating microbial content in myeloid malignancy patients is associated with disease subtypes and patient outcomes. Nat. Commun. 13, 1038 (2022).
Google Scholar
Yang, J. et al. Brain tumor diagnostic model and dietary effect based on extracellular vesicle microbiome data in serum. Exp. Mol. Med. 52, 1602–1613 (2020).
Google Scholar
Miao, R. et al. Assessment of peritoneal microbial features and tumor marker levels as potential diagnostic tools for ovarian cancer. PLoS ONE 15, e0227707 (2020).
Google Scholar
He, Y. et al. Stability of operational taxonomic units: an important but neglected property for analyzing microbial diversity. Microbiome 3, 20 (2015).
Google Scholar
Breitwieser, F. P., Lu, J. & Salzberg, S. L. A review of methods and databases for metagenomic classification and assembly. Brief. Bioinform. 20, 1125–1136 (2019).
Google Scholar
Lee, S. J. & Rho, M. Multimodal deep learning applied to classify healthy and disease states of human microbiome. Sci. Rep. 12, 824 (2022).
Google Scholar
Zhao, D. et al. A reliable method for colorectal cancer prediction based on feature selection and support vector machine. Med. Biol. Eng. Comput. 57, 901–912 (2019).
Google Scholar
Ling, W., Qi, Y., Hua, X. & Wu, M. C. Deep ensemble learning over the microbial phylogenetic tree (DeepEn-Phy). In 2021 IEEE International Conference on Bioinformatics and Biomedicine (IEEE, 2021).
Hastie, T., Tibshirani, R. & Friedman, J. H. The Elements of Statistical Learning: Data Mining, Inference, and Prediction (Springer, 2009).
D’Elia, D.et al. Advancing microbiome research with machine learning: Key findings from the ML4Microbiome COST action. Front. Microbiol. 14, 1257002 (2023).
Corsini, N. & Viroli, C. Dealing with overdispersion in multivariate count data. Comput. Stat. Data Anal. 170, 107447 (2022).
Google Scholar
Greenacre, M., Martínez-Álvaro, M. & Blasco, A. Compositional data analysis of microbiome and any-omics datasets: a validation of the additive logratio transformation. Front. Microbiol. 12, 727398 (2021).
Casimiro-Soriguer, C. S., Loucera, C., Peña-Chilet, M. & Dopazo, J. Towards a metagenomics machine learning interpretable model for understanding the transition from adenoma to colorectal cancer. Sci. Rep. 12, 450 (2022).
Google Scholar
Ni, Y. et al. Distinct composition and metabolic functions of human gut microbiota are associated with cachexia in lung cancer patients. ISME J. 15, 3207–3220 (2021).
Google Scholar
Han, S., Zhuang, J., Pan, Y., Wu, W. & Ding, K. Different characteristics in gut microbiome between advanced adenoma patients and colorectal cancer patients by metagenomic analysis. Microbiol. Spectr. 10, e01593–22 (2022).
Google Scholar
Mulenga, M., Kareem, S. A., Sabri, A. Q. M. & Seera, M. Stacking and chaining of normalization methods in deep learning-based classification of colorectal cancer using gut microbiome data. IEEE Access 9, 97296–97319 (2021).
Google Scholar
De Martin, A. et al. Distinct microbial communities colonize tonsillar squamous cell carcinoma. Oncoimmunology 10, 1945202 (2021).
Google Scholar
Jiang, S. et al. HARMONIES: a hybrid approach for microbiome networks inference via exploiting sparsity. Front. Genet. 11, 445 (2020).
Google Scholar
Singh, D. & Singh, B. Investigating the impact of data normalization on classification performance. Appl. Soft Comput. 97, 105524 (2020).
Google Scholar
Arabameri, A., Asemani, D. & Teymourpour, P. Detection of colorectal carcinoma based on microbiota analysis using generalized regression neural networks and nonlinear feature selection. IEEE/ACM Trans. Comput. Biol. Bioinform. 17, 547–557 (2020).
Google Scholar
Weiss, S. et al. Normalization and microbial differential abundance strategies depend upon data characteristics. Microbiome 5, 27 (2017).
Google Scholar
Mulenga, M. et al. Feature extension of gut microbiome data for deep neural network-based colorectal cancer classification. IEEE Access 9, 23565–23578 (2021).
Google Scholar
Jović, A., Brkić, K. & Bogunović, N. A review of feature selection methods with applications. In 2015 38th International Convention on Information and Communication Technology, Electronics and Microelectronics (MIPRO), 1200–1205 (IEEE, 2015).
Nogales, R. E. & Benalcázar, M. E. Analysis and evaluation of feature selection and feature extraction methods. Int. J. Comput. Intell. Syst. 16, 153 (2023).
Google Scholar
Miao, J. & Niu, L. A survey on feature selection. Proc. Comput. Sci. 91, 919–926 (2016).
Google Scholar
Jaeger, J., Sengupta, R. & Ruzzo, W. L. Improved gene selection for classification of microarrays. In Pacific Symposium on Biocomputing 2003 (Lihue, 2003).
Ding, C. & Peng, H. Minimum redundancy feature selection from microarray gene expression data. J. Bioinform. Comput. Biol. 3, 185–205 (2005).
Google Scholar
Chen, L. et al. Identifying robust microbiota signatures and interpretable rules to distinguish cancer subtypes. Front. Mol. Biosci. 7, 604794 (2020).
Google Scholar
Jabeer, A. et al. Identifying taxonomic biomarkers of colorectal cancer in human intestinal microbiota using multiple feature selection methods. In 2022 Innovations in Intelligent Systems and Applications Conference (IEEE, 2022).
Yuan, B. et al. Fecal bacteria as non-invasive biomarkers for colorectal adenocarcinoma. Front. Oncol. 11, 664321 (2021).
Google Scholar
Segata, N. et al. Metagenomic biomarker discovery and explanation. Genome Biol. 12, R60 (2011).
Google Scholar
Menze, B. H. et al. A comparison of random forest and its Gini importance with standard chemometric methods for the feature selection and classification of spectral data. BMC Bioinform. 10, 213 (2009).
Google Scholar
Venkatesh, B. & Anuradha, J. A review of Feature Selection and its methods. Cybern. Inf. Technol. 19, 3–26 (2019).
Theodoridis, S. Machine Learning: A Bayesian and Optimization Perspective (Academic Press, 2015).
Chen, F. et al. Meta-analysis of fecal viromes demonstrates high diagnostic potential of the gut viral signatures for colorectal cancer and adenoma risk assessment. J. Adv. Res. 49, 103–114 (2022).
Guyon, I., Weston, J., Barnhill, S. & Vapnik, V. Mach. Learn. 46, 389–422 (2002).
Google Scholar
Hermida, L. C., Gertz, E. M. & Ruppin, E. Predicting cancer prognosis and drug response from the tumor microbiome. Nat. Commun. 13, 2896 (2022).
Google Scholar
Senliol, B., Gulgezen, G., Yu, L. & Cataltepe, Z. Fast Correlation Based Filter (FCBF) with a different search strategy. In 2008 23rd International Symposium on Computer and Information Sciences (IEEE, 2008).
Bishop, C. M. Pattern Recognition and Machine Learning (Springer Verlag, 2006).
Zackular, J. P., Baxter, N. T., Chen, G. Y. & Schloss, P. D. Manipulation of the gut microbiota reveals role in colon tumorigenesis. mSphere 1, e00001–15 (2016).
Google Scholar
Cortes, C. & Vapnik, V. Support-vector networks. Mach. Learn. 20, 273–297 (1995).
Google Scholar
Noble, W. S. What is a support vector machine? Nat. Biotechnol. 24, 1565–1567 (2006).
Google Scholar
Schuldt, C., Laptev, I. & Caputo, B. Recognizing human actions: a local SVM approach. In Proceedings of the 17th International Conference on Pattern Recognition, 2004. ICPR 2004 (IEEE, 2004).
Topçuoğlu, B. D., Lesniak, N. A., Ruffin 4th, M. T., Wiens, J. & Schloss, P. D. A framework for effective application of machine learning to microbiome-based classification problems. MBio 11, e00434–20 (2020).
Camps-Valls, G., Gomez-Chova, L., Munoz-Mari, J., Vila-Frances, J. & Calpe-Maravilla, J. Composite kernels for hyperspectral image classification. IEEE Geosci. Remote Sens. Lett. 3, 93–97 (2006).
Google Scholar
Rossi, M. et al. Gut microbial shifts indicate melanoma presence and bacterial interactions in a murine model. Diagnostics 12, 958 (2022).
Google Scholar
Karamizadeh, S., Abdullah, S. M., Halimi, M., Shayan, J. & Rajabi, M. J. Advantage and drawback of support vector machine functionality. In 2014 International Conference on Computer, Communications, and Control Technology (IEEE, 2014).
Kishk, A.et al. A Hybrid Machine Learning Approach for the Phenotypic Classification of Metagenomic Colon Cancer Reads Based on Kmer Frequency and Biomarker Profiling. In 2018 9th Cairo International Biomedical Engineering Conference (IEEE, 2018).
Yang, M. et al. A multi-omics machine learning framework in predicting the survival of colorectal cancer patients. Comput. Biol. Med. 146, 105516 (2022).
Google Scholar
Ashraf, F. B., Shafi, M. S. R. & Kabir, M. R. Host trait prediction from human microbiome data for Colorectal Cancer. In 2020 23rd International Conference on Computer and Information Technology (IEEE, 2020).
Dadkhah, E. et al. Gut microbiome identifies risk for colorectal polyps. BMJ Open Gastroenterol. 6, e000297 (2019).
Google Scholar
Kotsiantis, S. B., Zaharakis, I. D. & Pintelas, P. E. Machine learning: a review of classification and combining techniques. Artif. Intell. Rev. 26, 159–190 (2006).
Warnke-Sommer, J. D. & Ali, H. H. Evaluation of the oral microbiome as a biomarker for early detection of human oral carcinomas. In 2017 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), 2069–2076 (IEEE, 2017).
Kingsford, C. & Salzberg, S. L. What are decision trees? Nat. Biotechnol. 26, 1011–1013 (2008).
Google Scholar
Kotsiantis, S. B. Decision trees: a recent overview. Artif. Intell. Rev. 39, 261–283 (2013).
Google Scholar
Chen, X. & Ishwaran, H. Random forests for genomic data analysis. Genomics 99, 323–329 (2012).
Google Scholar
Zhou, X. et al. The clinical potential of oral microbiota as a screening tool for oral squamous cell carcinomas. Front. Cell. Infect. Microbiol. 11, 728933 (2021).
Google Scholar
Ferreira, A. J. & Figueiredo, M. A. T. Boosting algorithms: a review of methods, theory, and applications. In Ensemble Machine Learning, 35–85 (Springer US, 2012).
Podgorelec, V., Kokol, P., Stiglic, B. & Rozman, I. Decision trees: an overview and their use in medicine. J. Med. Syst. 26, 445–463 (2002).
Google Scholar
Lou, Y., Caruana, R., Gehrke, J. & Hooker, G. Accurate intelligible models with pairwise interactions. In Proceedings of the 19th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (ACM, 2013).
Hastie, T. & Tibshirani, R. Generalized Additive Models; Some Applications. J. Am. Stat. Assoc. 82 371–386 (1985).
Lou, Y., Caruana, R. & Gehrke, J. Intelligible models for classification and regression. In Proceedings of the 18th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (ACM, 2012).
Maxwell, A. E., Sharma, M. & Donaldson, K. A. Explainable boosting machines for slope failure spatial predictive modeling. Remote Sens. 13, 4991 (2021).
Google Scholar
Ranstam, J. & Cook, J. A. LASSO regression. Br. J. Surg. 105, 1348 (2018).
Google Scholar
Ng, A. Y. Feature selection, L1 vs. L2 regularization, and rotational invariance. In Twenty-First International Conference on Machine Learning – ICML ’04 (ACM Press, 2004).
Friedman, J., Hastie, T. & Tibshirani, R. Regularization paths for generalized linear models via coordinate descent. J. Stat. Softw. 33, 1–22 (2010).
Kang, G.-U. et al. Dynamics of fecal microbiota with and without invasive cervical cancer and its application in early diagnosis. Cancers 12, 3800 (2020).
Google Scholar
Goldberg, Y. A primer on neural network models for natural language processing. J. Artif. Intell. Res. 57, 345–420 (2016).
Google Scholar
Goodfellow, I., Bengio, Y. & Courville, A.Deep Learning (MIT Press, 2016).
Mahmud, M., Kaiser, M. S., Hussain, A. & Vassanelli, S. Applications of deep learning and reinforcement learning to biological data. IEEE Trans Neural Netw Learn Syst 29, 2063–2079 (2018).
Google Scholar
Alzubaidi, L. et al. Review of deep learning: Concepts, CNN architectures, challenges, applications, future directions. J Big Data 8, 53 (2021).
Google Scholar
Reiman, D., Metwally, A. A., Sun, J. & Dai, Y. PopPhy-CNN: a phylogenetic tree embedded architecture for convolutional neural networks to predict host phenotype from metagenomic data. IEEE J. Biomed. Health Inf. 24, 2993–3001 (2020).
Google Scholar
Specht, D. F. A general regression neural network. IEEE Trans. Neural Netw. 2, 568–576 (1991).
Google Scholar
Hannan, S. A., Manza, R. R. & Ramteke, R. J. Generalized regression neural network and radial basis function for heart disease diagnosis. Int. J. Comput. Appl. 7, 7–13 (2010).
Al-Mahasneh, A. J., Anavatti, S. G. & Garratt, M. A. Review of applications of Generalized Regression Neural Networks in identification and control of dynamic systems. arXiv (2018).
García-Jiménez, B., Muñoz, J., Cabello, S., Medina, J. & Wilkinson, M. D. Predicting microbiomes through a deep latent space. Bioinformatics 37, 1444–1451 (2021).
Google Scholar
Oh, M. & Zhang, L. DeepMicro: deep representation learning for disease prediction based on microbiome data. Sci. Rep. 10, 1–9 (2020).
Shorten, C. & Khoshgoftaar, T. M. A survey on image data augmentation for deep learning. J. Big Data 6, 60 (2019).
Rosenblatt, M., Tejavibulya, L., Jiang, R., Noble, S. & Scheinost, D. Data leakage inflates prediction performance in connectome-based machine learning models. Nat. Commun. 15, 1829 (2024).
Google Scholar
Refaeilzadeh, P., Tang, L. & Liu, H. Encyclopedia of Database Systems (eds. Liu, L. & Özsu, M. T.) 532–538 (Springer US, 2009).
Gihawi, A. et al. Major data analysis errors invalidate cancer microbiome findings. mBio 14, e01607–23 (2023).
Google Scholar
Gihawi, A., Cooper, C. S. & Brewer, D. S. Caution regarding the specificities of pan-cancer microbial structure. Microb. Genomics 9, 001088 (2023).
Sepich-Poore, G. D.et al. Robustness of cancer microbiome signals over a broad range of methodological variation. Oncogene 43, 1127–1148 (2024).
Sepich-Poore, G. D. et al. Reply to: caution regarding the specificities of pan-cancer microbial structure. Preprint at: (2023).
Gaulke, C. A. & Sharpton, T. J. The influence of ethnicity and geography on human gut microbiome composition. Nature Medicine 24, 1495–1496 (2018).
Google Scholar
Leinonen, R., Sugawara, H., Shumway, M. & on behalf of the International Nucleotide Sequence Database Collaboration. The sequence read archive. Nucleic Acids Res. 39, D19–D21 (2011).
Google Scholar
Yelmen, B. & Jay, F. An overview of deep generative models in functional and evolutionary genomics. Annu. Rev. Biomed. Data Sci. 6 173–189 (2023).
Yelmen, B. et al. Creating artificial human genomes using generative neural networks. PLOS Genet. 17, e1009303 (2021).
Google Scholar
Cavadas, B. et al. Gastric microbiome diversities in gastric cancer patients from europe and asia mimic the human population structure and are partly driven by microbiome quantitative trait loci. Microorganisms 8, 1196 (2020).
Google Scholar
Lauss, M. et al. Monitoring of technical variation in quantitative high-throughput datasets. Cancer Inf. 12, 193–201 (2013).
Rasnic, R., Brandes, N., Zuk, O. & Linial, M. Substantial batch effects in TCGA exome sequences undermine pan-cancer analysis of germline variants. BMC Cancer 19, 783 (2019).
Google Scholar
Ribeiro, M. T., Singh, S. & Guestrin, C. “Why Should I Trust You?”: Explaining the predictions of any classifier. arXiv (2016).
Lundberg, S. & Lee, S.-I. A unified approach to interpreting model predictions. arXiv (2017).
Shrikumar, A., Greenside, P. & Kundaje, A. Learning important features through propagating activation differences. arXiv (2019).
Japkowicz, N. Imbalanced Learning, 187–206 (John Wiley & Sons, Inc., 2013).
Vaswani, A. et al. Attention is all you need. arXiv (2017).
Feng, C. et al. A deep-learning model with the attention mechanism could rigorously predict survivals in neuroblastoma. Front. Oncol. 11, 653863 (2021).
Lin, M. et al. Application of Deep Learning on predicting prognosis of acute myeloid leukemia with cytogenetics, age, and mutations. arXiv (2018).
Larsson, S. C., Orsini, N. & Wolk, A. Diabetes mellitus and risk of colorectal cancer: a meta-analysis. J. Natl. Cancer Inst. 97, 1679–1687 (2005).
Google Scholar
Tsilidis, K. K., Kasimis, J. C., Lopez, D. S., Ntzani, E. E. & Ioannidis, J. P. A. Type 2 diabetes and cancer: Umbrella review of meta-analyses of observational studies. BMJ 350, g7607–g7607 (2015).
Google Scholar
Li, W.-Z., Stirling, K., Yang, J.-J. & Zhang, L. Gut microbiota and diabetes: from correlation to causality and mechanism. World J. Diabetes 11, 293–308 (2020).
Google Scholar
Wensel, C. R., Pluznick, J. L., Salzberg, S. L. & Sears, C. L. Next-generation sequencing: Insights to advance clinical investigations of the microbiome. J. Clin. Investig. 132, e154944 (2022).
Google Scholar
Satam, H. et al. Next-generation sequencing technology: current trends and advancements. Biology 12, 997 (2023).
Google Scholar
Kong, S. et al. Deep hurdle networks for zero-inflated multi-target regression: application to multiple species abundance estimation. In Proceedings of the Twenty-Ninth International Joint Conference on Artificial Intelligence, IJCAI’20 (2021).
Lu, Y. & Liao, Y. STS: A novel deep learning method for zero-inflated crime prediction. In Proceedings of the 2022 4th International Conference on Robotics, Intelligent Control and Artificial Intelligence, RICAI ’22, 1097–1103 (Association for Computing Machinery, 2023).
Wei, M., Liu, R., Wang, Y. J. & Huang, C. SoutheastCon 2023, 901–905 (IEEE, 2023).
Osawa, T., Mitsuhashi, H., Uematsu, Y. & Ushimaru, A. Bagging GLM: improved generalized linear model for the analysis of zero-inflated data. Ecol. Inf. 6, 270–275 (2011).
Google Scholar
Liu, B., Chau, J., Dai, Q., Zhong, C. & Zhang, J. Exploring gut microbiome in predicting the efficacy of immunotherapy in non-small cell lung cancer. Cancers 14, 5401 (2022).
Google Scholar
Heshiki, Y. et al. Predictable modulation of cancer treatment outcomes by the gut microbiota. Microbiome 8, 28 (2020).
Google Scholar
Stein-Thoeringer, C. K. et al. A non-antibiotic-disrupted gut microbiome is associated with clinical responses to CD19-CAR-T cell cancer immunotherapy. Nat. Med. 29, 906–916 (2023).
Google Scholar
Shamszare, H. & Choudhury, A. Clinicians’ perceptions of artificial intelligence: focus on workload, risk, trust, clinical decision making, and clinical integration. Healthcare 11, 2308 (2023).
Google Scholar
Doherty, M., Metcalfe, T., Guardino, E., Peters, E. & Ramage, L. Precision medicine and oncology: an overview of the opportunities presented by next-generation sequencing and big data and the challenges posed to conventional drug development and regulatory approval pathways. Ann. Oncol. 27, 1644–1646 (2016).
Google Scholar
Qu, K., Gao, F., Guo, F. & Zou, Q. Taxonomy dimension reduction for colorectal cancer prediction. Comput. Biol. Chem. 83, 107160 (2019).
Google Scholar
Zheng, Y. et al. Specific gut microbiome signature predicts the early-stage lung cancer. Gut Microbes 11, 1030–1042 (2020).
Google Scholar
Chen, M. et al. Carcinogenesis of male oral submucous fibrosis alters salivary microbiomes. J. Dent. Res. 100, 397–405 (2021).
Google Scholar
Chen, J.-W. et al. Taxonomic and functional dysregulation in salivary microbiomes during oral carcinogenesis. Front. Cell. Infect. Microbiol. 11, 663068 (2021).
Google Scholar
Shrode, R. L. et al. Breast cancer patients from the Midwest region of the United States have reduced levels of short-chain fatty acid-producing gut bacteria. Sci. Rep. 13, 526 (2023).
Google Scholar
Wang, N. et al. Identifying distinctive tissue and fecal microbial signatures and the tumor-promoting effects of deoxycholic acid on breast cancer. Front. Cell. Infect. Microbiol. 12, 1029905 (2022).
Google Scholar
An, J. et al. Prediction of breast cancer using blood microbiome and identification of foods for breast cancer prevention. Sci. Rep. 13, 5110 (2023).
Google Scholar
Uzelac, M., Li, Y., Chakladar, J., Li, W. T. & Ongkeko, W. M. Archaea microbiome dysregulated genes and pathways as molecular targets for lung adenocarcinoma and squamous cell carcinoma. Int. J. Mol. Sci. 23, 11566 (2022).
Google Scholar
Banavar, G. et al. The salivary metatranscriptome as an accurate diagnostic indicator of oral cancer. npj Genom. Med. 6, 105 (2021).
Google Scholar
Bukavina, L. et al. Global meta-analysis of urine microbiome: colonization of polycyclic aromatic hydrocarbon–degrading bacteria among bladder cancer patients. Eur. Urol. Oncol. 6, 190–203 (2023).
Google Scholar
Bang, S. et al. Establishment and evaluation of prediction model for multiple disease classification based on gut microbial data. Sci. Rep. 9, 10189 (2019).
Google Scholar
Su, Q. et al. Faecal microbiome-based machine learning for multi-class disease diagnosis. Nat. Commun. 13, 6818 (2022).
Google Scholar
Wickramaratne, D., Wijesinghe, R. & Weerasinghe, R. Human gut microbiome data analysis for disease likelihood prediction using autoencoders. In 2021 21st International Conference on Advances in ICT for Emerging Regions (ICter), 49–54 (IEEE, 2021).
Jiang, P., Lai, S., Wu, S., Zhao, X.-M. & Chen, W.-H. Host DNA contents in fecal metagenomics as a biomarker for intestinal diseases and effective treatment. BMC Genomics 21, 348 (2020).
Google Scholar
Jiang, P., Wu, S., Luo, Q., Zhao, X.-m & Chen, W.-H. Metagenomic analysis of common intestinal diseases reveals relationships among microbial signatures and powers multidisease diagnostic models. mSystems 6, e00112–21 (2021).
Google Scholar
McDowell, A. et al. Machine-learning algorithms for asthma, COPD, and lung cancer risk assessment using circulating microbial extracellular vesicle data and their application to assess dietary effects. Exp. Mol. Med. 54, 1586–1595 (2022).
Google Scholar
link