首页> 美国政府科技报告 >Fast Nonparametric Machine Learning Algorithms for High-Dimensional Massive Data and Applications
【24h】

Fast Nonparametric Machine Learning Algorithms for High-Dimensional Massive Data and Applications

机译:高维海量数据及其应用的快速非参数机器学习算法

获取原文

摘要

Nonparametric methods have become increasingly popular in statistics and probabilistic AI communities. One well-known nonparametric method is 'nearest-neighbor' . It uses the observations in the training set T closest in input space to a query q to form the prediction of q. Specifically, when k of the observations in T are considered, it is called k-nearest-neighbor (or k- NN). Despite its simplicity, k-NN and its variants have been successful in many machine learning problems, including pattern recognition, text categorization, information retrieval, computational statistics, database and data mining. It is also used for estimating sample distributions and Bayes error. However, k-NN and many related nonparametric methods remain hampered by their computational complexity. Many spatial methods, such as metric-trees, have been proposed to alleviate the computational cost, but the effectiveness of these methods decreases as the number of dimensions of feature vectors increases. From another direction, researchers are trying to develop ways to find approximate answers. The premise of this research is that in many cases it is not necessary to insist on the exact answers; instead, determining an approximate answer should be sufficient. In fact, some approximate methods show good performance in a number of applications, and some methods enjoy very good theoretical soundness. However, when facing hundreds or thousands dimensions, many algorithms do not work well in reality.

著录项

相似文献

  • 外文文献
  • 中文文献
  • 专利
获取原文

客服邮箱:kefu@zhangqiaokeyan.com

京公网安备:11010802029741号 ICP备案号:京ICP备15016152号-6 六维联合信息科技 (北京) 有限公司©版权所有
  • 客服微信

  • 服务号