Journal: 《计算机科学》 (Computer Science)

A K-Nearest Neighbor Algorithm for Large Data Sets Based on Hashing and MapReduce

         

Abstract

K-nearest neighbor (K-NN) is a well-known classification algorithm. Because it is simple and easy to implement, it has been widely applied in many fields, such as face recognition, gene classification, and decision support. In big data environments, however, K-NN becomes very inefficient, or even infeasible. To address this problem, this paper proposes a K-NN classification algorithm for large data sets based on hashing and MapReduce. To verify its effectiveness, experiments were conducted on 4 large data sets. The results show that the proposed algorithm substantially improves the efficiency of K-NN while preserving its classification performance.
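The abstract does not say which hash function is used or how the MapReduce job is organized, so the following is only an illustrative sketch of how hashing and a map/reduce flow can be combined for K-NN, not the paper's algorithm. It assumes random-hyperplane (sign) hashing to assign points to buckets, and it simulates the map, shuffle, and reduce phases in memory. All names (`make_hyperplanes`, `hash_key`, `hashed_knn`, and so on) are hypothetical.

```python
import random
from collections import Counter, defaultdict


def make_hyperplanes(dim, n_bits, seed=0):
    """Draw n_bits random hyperplanes through the origin (an assumed hash family)."""
    rng = random.Random(seed)
    return [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(n_bits)]


def hash_key(x, planes):
    """Bucket key: one bit per hyperplane, set by the sign of the projection."""
    return "".join(
        "1" if sum(p_i * x_i for p_i, x_i in zip(p, x)) >= 0 else "0"
        for p in planes
    )


def map_phase(train, queries, planes):
    """Map: emit (bucket key, tagged record) for every training and query point."""
    for x, label in train:
        yield hash_key(x, planes), ("train", x, label)
    for qid, q in enumerate(queries):
        yield hash_key(q, planes), ("query", qid, q)


def shuffle(pairs):
    """Shuffle: group mapped records by bucket key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(records, k):
    """Reduce: run exact K-NN for each query, but only inside its own bucket."""
    train = [(r[1], r[2]) for r in records if r[0] == "train"]
    for r in records:
        if r[0] != "query":
            continue
        _, qid, q = r
        if not train:
            yield qid, None  # no training point shares this bucket
            continue
        nearest = sorted(
            train, key=lambda t: sum((a - b) ** 2 for a, b in zip(t[0], q))
        )[:k]
        yield qid, Counter(label for _, label in nearest).most_common(1)[0][0]


def hashed_knn(train, queries, k=3, n_bits=2, seed=0):
    """Classify each query by majority vote among neighbors in its hash bucket."""
    planes = make_hyperplanes(len(queries[0]), n_bits, seed)
    groups = shuffle(map_phase(train, queries, planes))
    results = dict(
        pair for records in groups.values() for pair in reduce_phase(records, k)
    )
    return [results[i] for i in range(len(queries))]
```

In this sketch, each query is compared only with the training points that land in the same bucket, which is where the speed-up over a full scan would come from. The price is that a true neighbor hashed to a different bucket is missed, so the result is approximate. A query whose bucket holds no training points gets `None`.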
