Conference: International Conference on Smart City and Systems Engineering

Chinese Document Keyword Extraction Algorithm Based on FP-growth


Abstract

To address the heavy computation and complex calculation process of existing keyword extraction algorithms, this paper proposes an FP-Growth-based algorithm for extracting keywords from Chinese documents. The FP-Growth algorithm mines word co-occurrence information, which excludes the interference of noise words. Semantic similarity computation using lexical chains eliminates the influence of synonyms. A feature fusion method then considers the frequency, part of speech, and position of each word: TF-IDF is combined with a "double comparing method" to calculate the weights of these characteristic factors, and a word weight function is built to calculate the final weight of each candidate word. Experimental results show that the proposed method improves the accuracy rate and recall rate by about 10% compared with traditional TF-IDF.
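The co-occurrence mining step can be illustrated with a minimal Python sketch of FP-Growth. The abstract gives no implementation details, so everything here is an assumption for illustration: each sentence is treated as one transaction of already-segmented words, `min_support` is an arbitrary threshold, and the rule "keep only words that appear in a frequent itemset of two or more words" is one plausible reading of how noise words are excluded, not the authors' stated procedure. The helper names (`build_tree`, `fp_growth`, `mine_cooccurrence`) are hypothetical, and the lexical-chain and weighting stages are not covered.

```python
from collections import defaultdict


class Node:
    """One FP-tree node: an item, a count, and links to parent and children."""

    def __init__(self, item, parent):
        self.item = item
        self.parent = parent
        self.count = 0
        self.children = {}


def build_tree(transactions, min_support):
    """Build an FP-tree from (items, count) pairs.

    Returns a header table (item -> list of nodes) and the item frequencies.
    """
    freq = defaultdict(int)
    for items, count in transactions:
        for item in set(items):
            freq[item] += count
    # Drop items below the support threshold.
    freq = {item: c for item, c in freq.items() if c >= min_support}

    root = Node(None, None)
    header = defaultdict(list)
    for items, count in transactions:
        # Order items by descending frequency (ties broken alphabetically).
        ordered = sorted(
            (i for i in set(items) if i in freq),
            key=lambda i: (-freq[i], i),
        )
        node = root
        for item in ordered:
            child = node.children.get(item)
            if child is None:
                child = Node(item, node)
                node.children[item] = child
                header[item].append(child)
            child.count += count
            node = child
    return header, freq


def fp_growth(transactions, min_support, suffix=frozenset()):
    """Return {frozenset(itemset): support} for all frequent itemsets."""
    header, freq = build_tree(transactions, min_support)
    result = {}
    for item, nodes in header.items():
        itemset = suffix | {item}
        result[itemset] = freq[item]
        # Conditional pattern base: the prefix path of every node for `item`.
        conditional = []
        for node in nodes:
            path = []
            parent = node.parent
            while parent is not None and parent.item is not None:
                path.append(parent.item)
                parent = parent.parent
            if path:
                conditional.append((path, node.count))
        # Recurse on the conditional base to grow longer itemsets.
        result.update(fp_growth(conditional, min_support, itemset))
    return result


def mine_cooccurrence(sentences, min_support):
    """Treat each sentence (a list of words) as one transaction."""
    return fp_growth([(words, 1) for words in sentences], min_support)


# Toy input (assumed, already segmented); "noise" occurs only once.
sentences = [
    ["keyword", "extraction", "algorithm"],
    ["keyword", "extraction", "document"],
    ["keyword", "algorithm"],
    ["document", "noise"],
]
patterns = mine_cooccurrence(sentences, min_support=2)

# Assumed filtering rule: keep words found in a frequent itemset of size >= 2.
candidates = {w for itemset in patterns if len(itemset) >= 2 for w in itemset}
print(sorted(candidates))  # ['algorithm', 'extraction', 'keyword']
```

With this toy input, "noise" falls below the support threshold and "document" never co-occurs frequently with another word, so neither becomes a candidate. For real Chinese text the sentences would first need word segmentation, which this sketch leaves out.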
