Journal: 《计算机应用》 (Journal of Computer Applications)

Candidate Category Search Algorithm for Deep Hierarchical Classification (深层次分类中候选类别搜索算法)

Abstract

To address the low classification accuracy and slow processing speed of deep hierarchical classification, a candidate category search algorithm for the text to be classified is proposed. First, a two-stage approach of search followed by classification is introduced. Using the structure of the category hierarchy tree, the links between related categories, and other implicit domain knowledge, the weights of the category hierarchy are analyzed and the feature terms are updated dynamically, so that a more discriminative feature set is built for each node of the category hierarchy tree. Next, a depth-first search combined with a threshold-based pruning strategy narrows the search range and finds the best candidate categories for the text to be classified. Finally, the classical K Nearest Neighbor (KNN) and Support Vector Machine (SVM) classification algorithms are applied to the candidate categories for classification tests and comparative analysis. The experimental results show that the overall classification performance of the proposed algorithm is superior to that of traditional classification algorithms, and that its average F1 value is about 6% higher than that of a heuristic search algorithm based on a greedy strategy. The algorithm significantly improves the accuracy of deep hierarchical text classification.
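The two-stage idea described in the abstract (search for candidate categories, then classify among them) can be illustrated with a minimal sketch. The abstract does not give the paper's weighting scheme, feature-update rule, or threshold value, so everything below is an assumption made for illustration: cosine similarity stands in for the node scoring function, and `CategoryNode`, `search_candidates`, `knn_classify`, and the toy category tree are hypothetical names and data, not the authors' implementation.

```python
from collections import Counter
from dataclasses import dataclass, field
from math import sqrt
from typing import Dict, List, Optional, Set, Tuple

Vector = Dict[str, float]  # term -> weight (hypothetical sparse representation)


@dataclass
class CategoryNode:
    name: str
    features: Vector  # assumed per-node feature set
    children: List["CategoryNode"] = field(default_factory=list)


def cosine(a: Vector, b: Vector) -> float:
    """Cosine similarity of two sparse vectors (assumed scoring function)."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    na = sqrt(sum(w * w for w in a.values()))
    nb = sqrt(sum(w * w for w in b.values()))
    return dot / (na * nb) if na and nb else 0.0


def search_candidates(root: CategoryNode, doc: Vector,
                      threshold: float) -> List[Tuple[str, float]]:
    """Stage 1: depth-first search with threshold pruning.

    A subtree is skipped when its root node scores below `threshold`.
    Surviving leaf categories are returned, best score first.
    """
    candidates = []
    stack = list(root.children)  # the root is treated as a virtual node
    while stack:
        node = stack.pop()  # LIFO order gives a depth-first traversal
        score = cosine(doc, node.features)
        if score < threshold:
            continue  # prune this node and everything below it
        if node.children:
            stack.extend(node.children)
        else:
            candidates.append((node.name, score))
    return sorted(candidates, key=lambda c: c[1], reverse=True)


def knn_classify(doc: Vector, training: List[Tuple[Vector, str]],
                 candidate_names: Set[str], k: int = 3) -> Optional[str]:
    """Stage 2: KNN majority vote restricted to the candidate categories."""
    pool = [(cosine(doc, vec), label)
            for vec, label in training if label in candidate_names]
    top = sorted(pool, key=lambda p: p[0], reverse=True)[:k]
    votes = Counter(label for _, label in top)
    return votes.most_common(1)[0][0] if votes else None


# Toy data, invented purely for illustration.
example_tree = CategoryNode("root", {}, [
    CategoryNode("sports", {"ball": 1.0, "team": 1.0}, [
        CategoryNode("sports/football", {"ball": 1.0, "goal": 1.0}),
        CategoryNode("sports/chess", {"board": 1.0, "move": 1.0}),
    ]),
    CategoryNode("science", {"atom": 1.0, "cell": 1.0}, [
        CategoryNode("science/physics", {"atom": 1.0, "energy": 1.0}),
    ]),
])
example_doc = {"ball": 1.0, "goal": 1.0, "team": 1.0}
example_training = [
    ({"ball": 1.0, "goal": 1.0}, "sports/football"),
    ({"ball": 1.0, "team": 1.0, "goal": 0.5}, "sports/football"),
    ({"atom": 1.0}, "science/physics"),
]

candidates = search_candidates(example_tree, example_doc, threshold=0.3)
label = knn_classify(example_doc, example_training,
                     {name for name, _ in candidates})
```

In this sketch the pruning threshold trades recall for speed: a higher value discards more subtrees early, which shrinks the set the second-stage classifier must consider but risks dropping the correct category. An SVM could replace the KNN vote in stage 2 without changing the search stage.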
