Predicting classifier performance with a small training set: Applications to computer-aided diagnosis and prognosis


Abstract

Selection of an appropriate classifier for computer-aided diagnosis (CAD) applications has typically been an ad hoc process. It is difficult to know a priori which classifier will yield high accuracy for a specific application, especially when well-annotated data for classifier training are scarce. In this study, we use an inverse power-law model of statistical learning to predict classifier performance when only a limited amount of annotated training data is available. The objectives of this study are to (a) predict classifier error for different CAD problems when larger data cohorts become available, and (b) compare classifier performance and trends (both at the sample/patient level and at the pixel level) as additional data are accrued (such as in a clinical trial). We use the power-law model to evaluate and compare three classifiers (Support Vector Machine (SVM), C4.5 decision tree, and k-nearest neighbor) on four distinct CAD problems. The first two datasets involve sample/patient-level classification: distinguishing (1) high-grade from low-grade breast cancers and (2) high from low levels of lymphocytic infiltration in breast cancer specimens. The other two datasets are pixel-level classification problems: discriminating cancerous from non-cancerous regions on prostate (3) MRI and (4) histopathology. Our empirical results suggest that, given sufficient training data, SVMs tend to be the best classifiers. This was true for datasets (1), (2), and (3), while the C4.5 decision tree was the best classifier for dataset (4). Our results also suggest that classifier comparisons made on small data cohorts should not be assumed to hold once large amounts of data become available.
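The abstract does not state the functional form of the inverse power-law model. The sketch below assumes the common three-parameter learning-curve form err(n) = a * n^(-alpha) + b, where n is the training set size and b is the asymptotic error. The function names (`fit_power_law`, `predict_error`), the grid-search fitting procedure, and the data are all illustrative assumptions, not the authors' method. The error values are synthetic, generated from known parameters, and are not results from the paper.

```python
import math

def fit_power_law(sizes, errors, b_steps=200):
    """Fit errors ~ a * n**(-alpha) + b (assumed form, not from the paper).

    Grid-searches the asymptotic error b in [0, min(errors)) and, for each
    candidate b, solves for a and alpha by least squares in log-log space.
    Returns the (a, alpha, b) with the lowest squared error on the data.
    """
    best = None
    e_min = min(errors)
    xs = [math.log(n) for n in sizes]
    mx = sum(xs) / len(xs)
    sxx = sum((x - mx) ** 2 for x in xs)
    for i in range(b_steps):
        b = e_min * i / b_steps  # candidate b, always below every error
        ys = [math.log(e - b) for e in errors]
        my = sum(ys) / len(ys)
        sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
        slope = sxy / sxx
        a, alpha = math.exp(my - slope * mx), -slope
        sse = sum((a * n ** (-alpha) + b - e) ** 2
                  for n, e in zip(sizes, errors))
        if best is None or sse < best[0]:
            best = (sse, a, alpha, b)
    return best[1], best[2], best[3]

def predict_error(n, a, alpha, b):
    """Extrapolate the fitted curve to a training set of size n."""
    return a * n ** (-alpha) + b

# Synthetic example: errors generated from a=0.5, alpha=0.5, b=0.1.
sizes = [10, 20, 40, 80, 160]
errors = [0.5 * n ** -0.5 + 0.1 for n in sizes]

a, alpha, b = fit_power_law(sizes, errors)
pred = predict_error(1000, a, alpha, b)
print(f"a={a:.3f} alpha={alpha:.3f} b={b:.3f} predicted error at n=1000: {pred:.3f}")
```

Fitting one such curve per classifier and comparing the extrapolated errors at a larger n is one way to check whether a ranking observed on a small cohort is likely to persist as more data are accrued.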

Bibliographic record

Source: IEEE International Symposium on Biomedical Imaging: From Nano to Macro, 2010, pp. 229-232 (4 pages)
Authors: Basavanhally Ajay; Doyle Scott; Madabhushi Anant
Original format: PDF
Chinese Library Classification: Pattern recognition and devices

Similar literature

1. Classifier design for computer-aided diagnosis: effects of finite sample size on the mean performance of classical and neural network classifiers [J]. Chan HP, Sahiner B, Wagner RF. Medical Physics, 1999, no. 12.
2. The role of pertinently diversified and balanced training as well as testing data sets in achieving the true performance of classifiers in predicting the antifreeze proteins [J]. Abhigyan Nath, Karthikeyan Subbiah. Neurocomputing, 2018.
3. Classifier ensemble generation and selection with multiple feature representations for classification applications in computer-aided detection and diagnosis on mammography [J]. Choi Jae Young, Kim Dae Hoe, Plataniotis Konstantinos N. Expert Systems with Applications, 2016.
4. Predicting classifier performance with a small training set: applications to computer-aided diagnosis and prognosis [C]. Ajay Basavanhally, Scott Doyle, Anant Madabhushi. IEEE International Symposium on Biomedical Imaging, 2010.
5. Advanced computer-aided diagnosis and prognosis for breast MRI [D]. Bhooshan, Neha. 2010.
6. Predicting Classifier Performance with Limited Training Data: Applications to Computer-Aided Diagnosis in Breast and Prostate Cancer [O]. Ajay Basavanhally, Satish Viswanath, Anant Madabhushi.
7. Predicting classifier performance with limited training data: applications to computer-aided diagnosis in breast and prostate cancer [O]. Ajay Basavanhally, Satish Viswanath, Anant Madabhushi. 2015.
8. Likelihood Ratio Classifier for Computer-Aided Diagnosis in Mammography [R]. Bilska-Wolak, A. 2006.
