Feature Selection Based on Mutual Information and Rough Set Theory

Journal: 《计算机工程》 (Computer Engineering)

Abstract

Feature selection is a research hotspot in automatic text categorization. This paper analyzes the Mutual Information (MI) method and, to address its deficiency in precision, introduces Rough Set (RS) theory and proposes an attribute reduction algorithm based on relation union theory (关系积理论 in the Chinese abstract, literally "relation product"). On this basis, a feature selection method suitable for massive text data sets is presented. The method uses MI for the initial selection of features and then applies the proposed attribute reduction algorithm to eliminate redundancy. Experimental results show that the method obtains feature subsets that have low redundancy and are more representative.
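
The two-stage idea in the abstract (MI for initial selection, then a rough-set style reduction to remove redundancy) can be illustrated with a minimal Python sketch. This is not the paper's algorithm: the abstract does not give the attribute reduction procedure, so the second stage below substitutes a generic positive-region check from rough set theory, and the MI formula is the common pointwise form log(P(t,c) / (P(t)P(c))) from the text categorization literature. The function names `mi_scores`, `select_top`, `positive_region_size` and `drop_redundant`, and the toy data, are all hypothetical.

```python
import math
from collections import Counter

def mi_scores(docs, labels):
    """Score each term by the maximum over classes of log(P(t,c) / (P(t) P(c))).
    Probabilities are estimated from document counts (assumed estimator)."""
    n = len(docs)
    class_count = Counter(labels)
    term_count = Counter()
    joint = Counter()
    for terms, label in zip(docs, labels):
        for t in set(terms):          # count each term once per document
            term_count[t] += 1
            joint[(t, label)] += 1
    scores = {}
    for (t, c), n_tc in joint.items():
        mi = math.log(n_tc * n / (term_count[t] * class_count[c]))
        scores[t] = max(scores.get(t, float("-inf")), mi)
    return scores

def select_top(scores, k):
    """Stage 1: keep the k highest-scoring terms (ties broken alphabetically)."""
    return sorted(scores, key=lambda t: (-scores[t], t))[:k]

def positive_region_size(docs, labels, features):
    """Number of documents whose presence/absence pattern on `features`
    is shared only by documents of the same class."""
    groups = {}
    for terms, label in zip(docs, labels):
        present = set(terms)
        key = tuple(f in present for f in features)
        groups.setdefault(key, []).append(label)
    return sum(len(g) for g in groups.values() if len(set(g)) == 1)

def drop_redundant(docs, labels, features):
    """Stage 2 (assumed stand-in for the paper's reduction algorithm):
    drop a feature if the positive region is unchanged without it."""
    kept = list(features)
    target = positive_region_size(docs, labels, kept)
    for f in reversed(features):      # try lowest-ranked features first
        trial = [k for k in kept if k != f]
        if positive_region_size(docs, labels, trial) == target:
            kept = trial
    return kept

# Toy corpus (invented for illustration only).
docs = [["ball", "goal", "team", "game"],
        ["ball", "team", "match"],
        ["vote", "party", "game"],
        ["vote", "election", "party"]]
labels = ["sport", "sport", "politics", "politics"]

scores = mi_scores(docs, labels)
candidates = select_top(scores, 4)
reduced = drop_redundant(docs, labels, candidates)
print(candidates, reduced)
```

In this sketch a term that occurs equally often in both classes (here "game") receives an MI score of zero, while the second stage discards terms that add no discriminating power beyond those already kept.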
