Journal: Computer Engineering (计算机工程)

Research on Attribute Word Extraction Methods in Chinese Product Reviews

     

Abstract

Existing attribute word extraction methods suffer from relatively low precision and coverage. To address this, the paper uses Baidu Baike together with the co-occurrence proportion of adjacent words after word segmentation to identify new domain-specific words, which reduces the impact of segmentation errors on attribute word recognition. It designs part-of-speech sequence templates, comprising noun and noun phrase templates and verb and verb phrase templates, to obtain attribute word candidates from a corpus of Chinese product reviews, and then filters the candidates using statistical and natural language processing techniques. Experimental results show that, on 3,623 mobile phone reviews, the method obtains 1,732 attribute words with a precision of 0.565, a recall of 0.726, and an F-measure of 0.636, indicating good extraction performance.
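As a rough illustration of the template-matching step and the reported metric, the following Python sketch extracts candidates from an already segmented and tagged sentence, then computes the F-measure as the harmonic mean of precision and recall. The templates in `TEMPLATES`, the tag names, the function names, and the sample sentence are all assumptions made for this example; the abstract does not list the paper's actual templates or filtering rules.

```python
from typing import List, Tuple

# Hypothetical part-of-speech sequence templates (not the paper's actual set).
# Assumed tag convention: "n" = noun, "v" = verb, "d" = adverb, "a" = adjective.
TEMPLATES = [("n",), ("n", "n"), ("v",), ("v", "n")]


def extract_candidates(tagged: List[Tuple[str, str]]) -> List[str]:
    """Return every contiguous word span whose tag sequence matches a template."""
    candidates = []
    for start in range(len(tagged)):
        for template in TEMPLATES:
            window = tagged[start:start + len(template)]
            if tuple(tag for _, tag in window) == template:
                # Chinese words are joined without spaces.
                candidates.append("".join(word for word, _ in window))
    return candidates


def f_measure(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


# Hand-tagged illustrative sentence, roughly "screen resolution is very high".
sentence = [("屏幕", "n"), ("分辨率", "n"), ("很", "d"), ("高", "a")]
candidates = extract_candidates(sentence)
print(candidates)

# Using the rounded precision and recall from the abstract.
score = f_measure(0.565, 0.726)
print(round(score, 4))
```

With the rounded inputs the harmonic mean comes out at about 0.635, which agrees with the reported 0.636 to within the rounding of the published precision and recall. In the paper's pipeline the candidates would still pass through the statistical and natural language processing filters, which this sketch omits.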
