Predicting the survivability of colorectal cancer (CRC) patients remains a challenging research problem. Given the importance of predicting CRC patients' survival rates, we compared the performance of three data mining methods, decision trees (DTs), artificial neural networks (ANNs), and support vector machines (SVMs), in predicting the 5-year survival of CRC patients, with the aim of assisting clinicians in making treatment decisions. The CRC dataset used to build the prediction models comes from the Surveillance, Epidemiology, and End Results (SEER) program. Five-fold cross-validation was used to measure the predictive accuracy of the models, and a random forest algorithm was used to assess the importance of features. Experimental results show that the predictive accuracies of ANNs (0.73) and SVMs (0.75) were higher than that of DTs, and that these two methods also achieved the best area under the receiver operating characteristic (ROC) curve (AUC = 0.82). These results suggest that ANNs and SVMs have high predictive power for the 5-year survival of CRC patients.
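As an illustration of the evaluation protocol described above, the following is a minimal sketch of 5-fold cross-validation together with the accuracy and AUC metrics, written in plain Python. It is not the authors' implementation: the function names (`k_fold_indices`, `accuracy`, `auc`, `cross_validate`) and the `fit` callable are hypothetical, and any classifier (DT, ANN, or SVM) would have to be supplied by the reader.

```python
import random


def k_fold_indices(n_samples, k=5, seed=0):
    """Shuffle the indices 0..n_samples-1 and split them into k near-equal folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def auc(y_true, scores):
    """AUC as the probability that a random positive case is scored above
    a random negative case (ties count as one half)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def cross_validate(X, y, fit, k=5):
    """Train on k-1 folds, test on the held-out fold, and return mean accuracy.

    `fit` is a hypothetical callable: fit(X_train, y_train) must return a
    function that maps one sample to a predicted label.
    """
    fold_accuracies = []
    for held_out in k_fold_indices(len(y), k):
        held = set(held_out)
        train = [i for i in range(len(y)) if i not in held]
        predict = fit([X[i] for i in train], [y[i] for i in train])
        predictions = [predict(X[i]) for i in held_out]
        fold_accuracies.append(accuracy([y[i] for i in held_out], predictions))
    return sum(fold_accuracies) / k
```

For example, `auc([1, 1, 0, 0], [0.9, 0.4, 0.6, 0.1])` evaluates to 0.75, because three of the four positive/negative pairs are ranked correctly. The pairwise AUC shown here is quadratic in the number of samples, so a rank-based formulation would be preferable for a dataset of SEER's size.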