Journal: SIAM/ASA Journal on Uncertainty Quantification

Uncertainty Quantification in Graph-Based Classification of High Dimensional Data


Abstract

Classification of high dimensional data finds wide-ranging applications. In many of these applications, equipping the resulting classification with a measure of uncertainty may be as important as the classification itself. In this paper we introduce, develop algorithms for, and investigate the properties of a variety of Bayesian models for the task of binary classification; via the posterior distribution on the classification labels, these methods automatically give measures of uncertainty. The methods are all based on the graph formulation of semisupervised learning. We provide a unified framework which brings together a variety of methods that have been introduced in different communities within the mathematical sciences. We study probit classification [C. K. Williams and C. E. Rasmussen, "Gaussian Processes for Regression," in Advances in Neural Information Processing Systems 8, MIT Press, 1996, pp. 514-520] in the graph-based setting, generalize the level-set method for Bayesian inverse problems [M. A. Iglesias, Y. Lu, and A. M. Stuart, Interfaces Free Bound., 18 (2016), pp. 181-217] to the classification setting, and generalize the Ginzburg-Landau optimization-based classifier [A. L. Bertozzi and A. Flenner, Multiscale Model. Simul., 10 (2012), pp. 1090-1118], [Y. Van Gennip and A. L. Bertozzi, Adv. Differential Equations, 17 (2012), pp. 1115-1180] to a Bayesian setting. We also show that the probit and level-set approaches are natural relaxations of the harmonic function approach introduced in [X. Zhu et al., "Semi-supervised Learning Using Gaussian Fields and Harmonic Functions," in ICML, Vol. 3, 2003, pp. 912-919]. We introduce efficient numerical methods, suited to large datasets, for both MCMC-based sampling and gradient-based MAP estimation. Through numerical experiments we study classification accuracy and uncertainty quantification for our models; these experiments showcase a suite of datasets commonly used to evaluate graph-based semisupervised learning algorithms.
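
For readers unfamiliar with the graph formulation, the harmonic function approach of Zhu et al. cited above can be sketched in a few lines. This is only an illustrative sketch of that baseline, not the Bayesian probit, level-set, or Ginzburg-Landau models studied in the paper. The function name `harmonic_labels`, its arguments, and the use of the unnormalized graph Laplacian are assumptions made for this example.

```python
import numpy as np


def harmonic_labels(W, labeled_idx, labeled_values):
    """Harmonic-function extension of labels on a weighted graph.

    W              : symmetric (n, n) array of nonnegative edge weights
    labeled_idx    : indices of nodes whose labels are observed
    labeled_values : observed labels in {-1, +1}, same order as labeled_idx

    Assumes every connected component contains at least one labeled
    node, so that the linear system below is nonsingular.
    """
    W = np.asarray(W, dtype=float)
    n = W.shape[0]
    # Unnormalized graph Laplacian L = D - W (a choice made for this sketch).
    L = np.diag(W.sum(axis=1)) - W

    labeled_idx = np.asarray(labeled_idx)
    y = np.asarray(labeled_values, dtype=float)
    unlabeled_idx = np.setdiff1d(np.arange(n), labeled_idx)

    # Blocks of L restricted to unlabeled rows.
    L_uu = L[np.ix_(unlabeled_idx, unlabeled_idx)]
    L_ul = L[np.ix_(unlabeled_idx, labeled_idx)]

    # Harmonic condition on unlabeled nodes: L_uu f_u + L_ul y = 0.
    f = np.zeros(n)
    f[labeled_idx] = y
    f[unlabeled_idx] = np.linalg.solve(L_uu, -L_ul @ y)
    return f


# Usage: a path graph on 5 nodes with the two end nodes labeled -1 and +1.
W_path = np.zeros((5, 5))
for i in range(4):
    W_path[i, i + 1] = W_path[i + 1, i] = 1.0

f_path = harmonic_labels(W_path, labeled_idx=[0, 4], labeled_values=[-1, 1])
predicted = np.sign(f_path)  # class prediction; 0 marks an exact tie
```

On this path graph the harmonic extension interpolates linearly between the two labeled end nodes, and the sign of each entry gives the predicted class. The magnitude of an entry is only a heuristic indication of confidence; it is not the posterior uncertainty that the Bayesian models in the abstract are designed to provide.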
