Source journal: 《西安邮电学院学报》

A Multi-Strategy Method for Chinese Domain Ontology Concept Extraction

Abstract

To improve the precision and recall of Chinese domain ontology concept extraction, this paper proposes a multi-strategy extraction method. The method uses pattern matching to improve the original single-character merging method (Character Combine Method). The resulting concepts are screened by part-of-speech filtering and defect detection to form a user dictionary, which is fed into the concept extraction system for a second round of word segmentation, yielding a candidate concept set. Term frequency-inverse document frequency (TFIDF) is then fused with information entropy to give the TFIDFE method, which computes a weight for each candidate concept to obtain the domain concept set. Experimental results show that the method performs well on precision, recall, and F-measure for domain term extraction.
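The abstract does not state how TFIDF and information entropy are combined in TFIDFE. The sketch below is therefore only one plausible reading, not the paper's formula. It assumes that term frequency is measured over a domain corpus, that inverse document frequency is measured against a separate background corpus, and that the two are multiplied by one plus the entropy of the term's distribution across domain documents. The function names `entropy` and `tfidfe_weights`, the smoothing constants, and the multiplicative fusion are all hypothetical.

```python
import math
from collections import Counter


def entropy(counts):
    """Shannon entropy (base 2) of a list of non-negative counts."""
    total = sum(counts)
    return -sum((c / total) * math.log2(c / total) for c in counts if c > 0)


def tfidfe_weights(domain_docs, background_docs):
    """Hypothetical TFIDFE weight: TF * IDF * (1 + entropy).

    domain_docs / background_docs are lists of token lists, i.e. text
    that has already been segmented. The fusion rule is an assumption;
    the source abstract does not give the actual formula.
    """
    n_tokens = sum(len(doc) for doc in domain_docs)
    term_counts = Counter(tok for doc in domain_docs for tok in doc)
    background_sets = [set(doc) for doc in background_docs]
    n_background = len(background_sets)

    weights = {}
    for term, count in term_counts.items():
        tf = count / n_tokens
        # Smoothed IDF against the background corpus (assumed smoothing).
        df = sum(1 for s in background_sets if term in s)
        idf = math.log((n_background + 1) / (df + 1)) + 1.0
        # Entropy of the term's spread over domain documents: a term used
        # evenly across the domain corpus scores higher than one
        # concentrated in a single document.
        spread = entropy([doc.count(term) for doc in domain_docs])
        weights[term] = tf * idf * (1.0 + spread)
    return weights


if __name__ == "__main__":
    domain = [["ontology", "concept", "the"],
              ["ontology", "the"],
              ["ontology", "extraction"]]
    background = [["the", "weather"], ["the", "news"]]
    ranked = sorted(tfidfe_weights(domain, background).items(),
                    key=lambda kv: kv[1], reverse=True)
    for term, weight in ranked:
        print(f"{term}\t{weight:.3f}")
```

Under these assumptions, a candidate that is frequent in the domain corpus, rare in the background corpus, and evenly spread across domain documents receives the highest weight. A threshold or top-k cut on the ranked list would then give the domain concept set.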
