Journal: IFAC PapersOnLine

The Blessing of Dimensionality: Separation Theorems in the Thermodynamic Limit

* The work is partially supported by Innovate UK, Technology Strategy Board, Knowledge Transfer Partnership grant KTP009890.


Abstract: We consider and analyze properties of large sets of randomly selected (i.i.d.) points in high-dimensional spaces. In particular, we consider the problem of whether a single data point that is randomly chosen from a finite set of points can be separated from the rest of the data set by a linear hyperplane. We formulate and prove stochastic separation theorems, including: 1) with probability close to one, a random point may be separated from a finite random set by a linear functional; 2) with probability close to one, for every point in a finite random set there is a linear functional separating this point from the rest of the data. The total number of points in the random sets is allowed to be exponentially large with respect to dimension. Various laws governing distributions of points are considered, and explicit formulae for the probability of separation are provided. These theorems reveal an interesting implication for machine learning and data mining applications that deal with large data sets (big data) and high-dimensional data (many attributes): simple linear decision rules and learning machines are surprisingly efficient tools for separating and filtering out arbitrarily assigned points in high dimensions.
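The separation effect described in the abstract can be checked numerically. The sketch below is only an illustration, not the construction used in the paper: it assumes points drawn uniformly from the unit ball, and it tests each point x against the others with the linear functional y -> <x, y> and the threshold alpha * <x, x>. The function names, the value alpha = 0.8, and the sample sizes are all hypothetical choices made for this example.

```python
import numpy as np


def sample_unit_ball(n_points, dim, rng):
    """Draw i.i.d. points uniformly from the unit ball in R^dim."""
    # Uniform directions: normalized Gaussian vectors.
    directions = rng.standard_normal((n_points, dim))
    directions /= np.linalg.norm(directions, axis=1, keepdims=True)
    # Radii with density proportional to r^(dim - 1).
    radii = rng.random(n_points) ** (1.0 / dim)
    return directions * radii[:, None]


def fraction_separable(points, alpha=0.8):
    """Fraction of points x with <x, y> < alpha * <x, x> for every other y.

    The threshold alpha is an assumption of this sketch, not a value
    taken from the abstract.
    """
    gram = points @ points.T
    thresholds = alpha * np.einsum("ij,ij->i", points, points)
    # A point is never compared with itself.
    np.fill_diagonal(gram, -np.inf)
    separable = (gram < thresholds[:, None]).all(axis=1)
    return separable.mean()


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    for dim in (2, 20, 200):
        pts = sample_unit_ball(1000, dim, rng)
        print(dim, fraction_separable(pts))
```

Under these assumptions, the fraction of separable points should rise towards one as the dimension grows while the number of points stays fixed, which is the qualitative behavior that statement 2) of the abstract describes.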
