Journal: 《计算机技术与发展》 (Computer Technology and Development)

Knowledge Reduction Algorithm for Incomplete Information Systems under Massive Data

         

Abstract

Knowledge reduction for massive datasets has become an active research topic in rough set theory in recent years. Traditional knowledge reduction algorithms for incomplete information systems assume that all of the data to be processed can be loaded into main memory at once. This assumption is clearly unsuitable for large-scale datasets, and even less so for massive datasets with missing information. To address this, the paper analyzes the characteristics of data with missing information and represents each missing attribute value by all possible values of that attribute. Then, exploiting the parallelizability inherent in knowledge reduction algorithms and starting from the discernibility and indiscernibility of attributes (attribute sets), it designs a knowledge reduction algorithm for incomplete information systems under the MapReduce framework. The experimental results show that the algorithm is effective and feasible, and that it can perform knowledge reduction on massive data in incomplete information systems.
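The idea of letting a missing value take all possible values and then grouping objects in a map/reduce style can be illustrated with a small sketch. This is not the paper's algorithm: it is an in-process Python simulation that assumes `*` marks a missing value, takes each attribute's domain to be the values observed in the data, and uses hypothetical helper names (`map_phase`, `reduce_phase`, `tolerance_classes`). It computes only the tolerance (indiscernibility) classes for a given attribute subset, which is one building block a reduction algorithm would need.

```python
from collections import defaultdict
from itertools import product

MISSING = "*"  # assumed marker for a missing attribute value


def attribute_domains(rows, attrs):
    """Collect the observed (non-missing) values of each attribute."""
    domains = {}
    for a in attrs:
        known = sorted({r[a] for r in rows if r[a] != MISSING})
        # If every value is missing, keep a single placeholder value.
        domains[a] = known or [MISSING]
    return domains


def map_phase(obj_id, row, attrs, domains):
    """Emit (key, obj_id) for every completion of the object's missing values."""
    choices = [domains[a] if row[a] == MISSING else [row[a]] for a in attrs]
    for key in product(*choices):
        yield key, obj_id


def reduce_phase(pairs):
    """Group object ids by key, as a shuffle followed by a reducer would."""
    groups = defaultdict(set)
    for key, obj_id in pairs:
        groups[key].add(obj_id)
    return groups


def tolerance_classes(rows, attrs):
    """Return, for each object, the set of objects indiscernible from it on attrs."""
    domains = attribute_domains(rows, attrs)
    pairs = [kv for i, r in enumerate(rows)
             for kv in map_phase(i, r, attrs, domains)]
    classes = {i: set() for i in range(len(rows))}
    # Objects that share at least one completed key are mutually tolerant.
    for members in reduce_phase(pairs).values():
        for i in members:
            classes[i] |= members
    return classes


if __name__ == "__main__":
    table = [
        {"a": "1", "b": "x"},
        {"a": "*", "b": "x"},
        {"a": "2", "b": "y"},
        {"a": "2", "b": "*"},
    ]
    print(tolerance_classes(table, ["a", "b"]))
```

Expanding every missing value multiplies the number of emitted keys, so an object with many missing attributes can produce a very large number of records; a practical implementation would need to bound or avoid this expansion.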
