首页> 中文期刊> 《上海师范大学学报(自然科学版)》 >基于信息增益的中文网页SVM分类研究

基于信息增益的中文网页SVM分类研究

         

摘要

In order to improve the efficiency and accuracy of text classification,optimization and improvement are made for defects and deficiencies of the feature dimensionality reduction method and traditional information gain method in text classification of Chinese web pages.At first,part-of-speech filtering and synonyms merging processes are taken for the first feature dimension reduction of feature items.Then,an improved information gain method is proposed for feature weighting computation of feature items.Finally,the classification algorithm of Support Vector Machine (SVM) is used for text classification of Chinese web pages.Both theoretical analysis and experimental results show that this method has better performance and classification results than traditional method.%针对中文网页文本分类中特征降维方法和传统信息增益方法的缺陷和不足做出优化改进,旨在有效提高文本分类效率和精度.首先,采取词性过滤和同义词归并处理对特征项进行初次特征降维,然后提出改进的信息增益方法对特征项进行特征加权运算,最后采用支持向量机(SVM)分类算法对中文网页进行文本分类.理论分析和实验结果都表明本方法比传统方法具有更好的性能和分类效果.

著录项

相似文献

  • 中文文献
  • 外文文献
  • 专利
获取原文

客服邮箱:kefu@zhangqiaokeyan.com

京公网安备:11010802029741号 ICP备案号:京ICP备15016152号-6 六维联合信息科技 (北京) 有限公司©版权所有
  • 客服微信

  • 服务号