Direct estimation of class membership probabilities for multiclass classification using multiple scores

Kazuko Takahashi; Hiroya Takamura; Manabu Okumura


Abstract

Accurate estimation of class membership probabilities is needed in many data mining and decision-making applications, where multiclass classification is often used. Existing estimation methods are designed for binary classification and use only the single score output by a classifier, so applying them to a multiclass problem requires both decomposing the multiclass classifier into binary classifiers and combining the estimates from each binary classifier into a target estimate. We propose a simple and general method that directly estimates the class membership probability for any class in multiclass classification, without decomposition and combination, by using multiple scores: not only the score for the predicted class but also the scores for other proper classes. To make it possible to use multiple scores, we modify or extend representative existing methods. As a non-parametric method, we build on the binning idea proposed by Zadrozny et al. but create an "accuracy table" by a different method. Moreover, we smooth the accuracies in the table with methods such as the moving average to yield reliable probabilities (accuracies). As a parametric method, we extend Platt's method to apply multiple logistic regression. On two different datasets (open-ended data from Japanese social surveys and 20 Newsgroups), with both Support Vector Machines and naive Bayes classifiers, we show empirically that using multiple scores is effective for estimating class membership probabilities in multiclass classification in terms of cross entropy, the reliability diagram, the ROC curve, and AUC (area under the ROC curve), and that the proposed smoothing method for the accuracy table works quite well.
Finally, we show empirically that, in terms of MSE (mean squared error), our best proposed method is superior to a multiclass extension of the PAV method proposed by Zadrozny et al. on both the 20 Newsgroups dataset and the Pendigits dataset, but is slightly worse on the Pendigits dataset than the state-of-the-art method, a multiclass extension of a combination of boosting and the PAV method.
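
The abstract does not spell out how the accuracy table is built, so the following is only a minimal sketch of one possible reading. It assumes that the two scores used are the top and second-highest classifier scores, that both have been rescaled to [0, 1], that cells are equal-width bins, and that smoothing is a count-weighted moving average over neighbouring cells. The function names (to_bin, build_counts, accuracy_table, estimate) and the radius parameter are hypothetical and do not come from the paper.

```python
def to_bin(score, n_bins):
    """Map a score assumed to lie in [0, 1] to a bin index, clamping outliers."""
    return max(0, min(int(score * n_bins), n_bins - 1))


def build_counts(samples, n_bins):
    """Count correct and total predictions per (top score, second score) cell.

    samples: iterable of (top_score, second_score, is_correct) tuples
    taken from held-out predictions (an assumption of this sketch).
    """
    hits = [[0] * n_bins for _ in range(n_bins)]
    totals = [[0] * n_bins for _ in range(n_bins)]
    for top, second, is_correct in samples:
        i, j = to_bin(top, n_bins), to_bin(second, n_bins)
        totals[i][j] += 1
        hits[i][j] += int(is_correct)
    return hits, totals


def accuracy_table(hits, totals, radius=1):
    """Turn counts into accuracies, pooling counts over nearby cells.

    radius=0 gives the raw per-cell accuracy; radius>=1 pools every cell
    within that distance, i.e. a count-weighted moving average.
    Cells with no pooled samples are left as None.
    """
    n = len(hits)
    table = [[None] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            pooled_hits = pooled_total = 0
            for a in range(max(0, i - radius), min(n, i + radius + 1)):
                for b in range(max(0, j - radius), min(n, j + radius + 1)):
                    pooled_hits += hits[a][b]
                    pooled_total += totals[a][b]
            if pooled_total:
                table[i][j] = pooled_hits / pooled_total
    return table


def estimate(table, top, second):
    """Look up the estimated probability that the predicted class is correct."""
    n = len(table)
    return table[to_bin(top, n)][to_bin(second, n)]
```

In this sketch, a new prediction is scored by calling estimate(table, top, second) on a table built from held-out data. Pooling counts rather than averaging per-cell accuracies keeps sparsely populated cells from dominating the smoothed value; that is a design choice of the sketch, not a claim about the paper.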


Bibliographic record

Source: Knowledge and Information Systems, 2009, Issue 2, 26 pages
Authors: Kazuko Takahashi; Hiroya Takamura; Manabu Okumura
Original format: PDF
Language: English
Chinese Library Classification: Automation system theory
Keywords: Multiclass classification; Class membership probabilities; Accuracy table; Logistic regression; Direct estimation; Multiple classification scores


Similar literature

1. Direct estimation of class membership probabilities for multiclass classification using multiple scores [J]. Kazuko Takahashi, Hiroya Takamura, Manabu Okumura. Knowledge and Information Systems. 2009, Issue 2.
2. Head pose estimation using image abstraction and local directional quaternary patterns for multiclass classification [J]. ByungOk Han, Suwon Lee, Hyun S. Yang. Pattern Recognition Letters. 2014, Issue auga1.
3. Multiclass cancer classification by support vector machines with class-wise optimized genes and probability estimates [J]. Anand A, Suganthan PN. Journal of Theoretical Biology. 2009, Issue 3.
4. Estimation of Class Membership Probabilities in the Document Classification [C]. Kazuko Takahashi, Hiroya Takamura, Manabu Okumura. Advances in Knowledge Discovery and Data Mining; Lecture Notes in Artificial Intelligence; 4426. 2007.
5. Signal detection and estimation using classification-directed adaptive modeling [D]. Muir, Robert Angus. 1988.
6. Multiclass Posterior Probability Twin SVM for Motor Imagery EEG Classification [O]. Qingshan She, Yuliang Ma, Ming Meng. 2015.
7. Estimation of Class Membership Probabilities by Using Multiple Classification Scores [O]. Kazuko Takahashi, Hiroya Takamura, Manabu Okumura. 2008.
