Machine learning for predicting liver and/or lung metastasis in colorectal cancer: A retrospective study based on the SEER database

Inchingolo, R.
2024-01-01

Abstract

Objective: This study aimed to establish a machine learning (ML) model for predicting the risk of liver and/or lung metastasis in colorectal cancer (CRC). Methods: From the Surveillance, Epidemiology, and End Results (SEER) database of the National Institutes of Health (NIH), 51,265 patients with a pathological diagnosis of colorectal cancer between 2010 and 2015 were extracted for model development. On this basis, we built seven ML models. The models were evaluated by accuracy and by the area under the receiver operating characteristic (ROC) curve (AUC), and the best model was used to explain the relationship between clinicopathological features and the target variable. We validated the model in 196 colorectal cancer patients from Beijing Electric Power Hospital of Capital Medical University, China, to evaluate its performance and generalizability. Finally, we developed a web-based calculator using the best model to predict the risk of liver and/or lung metastasis in colorectal cancer patients. Results: A total of 51,265 patients were enrolled, of whom 7,864 (15.3%) had distant liver and/or lung metastasis. The random forest (RF) model had the best predictive ability: in the internal test set it achieved an accuracy of 0.895, an AUC of 0.956, and an area under the precision-recall curve (AUPR) of 0.896. In the external validation set, the RF model achieved an accuracy of 0.913, an AUC of 0.912, and an AUPR of 0.611. Conclusion: We constructed an RF model to predict the risk of liver and/or lung metastasis in colorectal cancer, to assist doctors in making clinical decisions.
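The three metrics reported in the abstract (accuracy, AUC, and AUPR) can be illustrated with a minimal, dependency-free sketch. This is not the authors' code: the function names, the 0.5 decision threshold, and the toy labels and scores below are assumptions made purely for illustration, and AUPR is approximated here by average precision. Only the cohort counts used for the prevalence figure come from the abstract.

```python
from typing import Sequence


def accuracy(y_true: Sequence[int], y_score: Sequence[float], threshold: float = 0.5) -> float:
    """Fraction of cases classified correctly at an assumed probability threshold."""
    preds = [1 if s >= threshold else 0 for s in y_score]
    return sum(p == t for p, t in zip(preds, y_true)) / len(y_true)


def roc_auc(y_true: Sequence[int], y_score: Sequence[float]) -> float:
    """Probability that a random positive outscores a random negative (ties count 0.5)."""
    pos = [s for s, t in zip(y_score, y_true) if t == 1]
    neg = [s for s, t in zip(y_score, y_true) if t == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def average_precision(y_true: Sequence[int], y_score: Sequence[float]) -> float:
    """AUPR estimated as the mean precision at the rank of each positive case.

    Simplification: assumes no tied scores.
    """
    ranked = sorted(zip(y_score, y_true), key=lambda pair: -pair[0])
    hits, total = 0, 0.0
    for rank, (_, label) in enumerate(ranked, start=1):
        if label == 1:
            hits += 1
            total += hits / rank
    return total / hits


# Toy data (hypothetical): 1 = metastasis, 0 = no metastasis.
y_true = [1, 0, 1, 0, 0, 1, 0, 0]
y_score = [0.9, 0.8, 0.7, 0.4, 0.3, 0.6, 0.2, 0.1]

print(f"accuracy = {accuracy(y_true, y_score):.3f}")
print(f"AUC      = {roc_auc(y_true, y_score):.3f}")
print(f"AUPR     = {average_precision(y_true, y_score):.3f}")

# Prevalence of liver and/or lung metastasis in the SEER cohort (counts from the abstract).
prevalence = 7864 / 51265
print(f"prevalence = {prevalence:.3f}")
```

Unlike AUC, AUPR depends on how common the positive class is: a classifier that ranks at random has an expected AUPR equal to the prevalence. AUPR values from cohorts with different metastasis rates are therefore not directly comparable.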
Keywords: Colorectal cancer; Liver and/or lung metastasis; Machine learning

Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.12572/29875