Source: 《情报学报》 (Chinese-language journal)

Time Expression Recognition Based on Conditional Random Fields and User-Defined Rules

         

Abstract

This paper studies the recognition and extraction of time expressions in information extraction. First, to address the shortcomings of rule-based time recognition, a statistical sequence labeling model, the conditional random field (CRF), is applied to the task. The model exploits both the internal and the external features of time expressions, which improves recognition precision. Second, based on an analysis of the recognition results, user-defined rules are applied as a post-processing step to correct recognition errors. This further raises recall and compensates for the low recall that results when a machine learning model acquires incomplete knowledge. Experiments show that in the open test the method reaches a precision of 91.65%, a recall of 88.13%, and an F-measure of 89.85%, all higher than those of traditional methods, indicating that it is an effective approach to time expression recognition.
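The two-stage idea in the abstract (a sequence labeler first, rules afterwards) can be illustrated with a minimal sketch. The abstract does not give the feature templates or the rules, so everything below is an assumption made for illustration: the feature names in `char_features`, the two patterns in `RULES`, and the example sentence are hypothetical. CRF training is omitted, and the tagger's output is stubbed as a list of character spans.

```python
import re

# Hypothetical character classes; the paper's actual feature set is not given in the abstract.
TIME_UNITS = set("年月日时分秒周")
DIGITS = set("0123456789零一二三四五六七八九十百千两")

def char_features(sent, i):
    """Features for character i: internal (the character itself) and external (its neighbours)."""
    ch = sent[i]
    return {
        "char": ch,                                   # internal feature
        "is_digit": ch in DIGITS,                     # internal feature
        "is_unit": ch in TIME_UNITS,                  # internal feature
        "prev": sent[i - 1] if i > 0 else "<BOS>",    # external (left context)
        "next": sent[i + 1] if i < len(sent) - 1 else "<EOS>",  # external (right context)
    }

# Hypothetical post-processing rules, ordered from most to least specific.
RULES = [
    re.compile(r"\d{4}年\d{1,2}月\d{1,2}日"),
    re.compile(r"\d{1,2}月\d{1,2}日"),
]

def apply_rules(sent, spans):
    """Add rule matches that do not overlap any span already found."""
    merged = list(spans)
    for rule in RULES:
        for m in rule.finditer(sent):
            cand = (m.start(), m.end())
            if all(cand[1] <= s or cand[0] >= e for s, e in merged):
                merged.append(cand)
    return sorted(merged)

def prf(gold, pred):
    """Span-level precision, recall and F-measure (exact match)."""
    tp = len(set(gold) & set(pred))
    p = tp / len(pred) if pred else 0.0
    r = tp / len(gold) if gold else 0.0
    f = 2 * p * r / (p + r) if p + r else 0.0
    return p, r, f

sent = "会议于2010年5月3日召开,昨天已通知。"
date_start = sent.find("2010")
gold = [(date_start, date_start + 9), (sent.find("昨天"), sent.find("昨天") + 2)]
crf_spans = [gold[1]]                  # stub: pretend the CRF found only "昨天"
final_spans = apply_rules(sent, crf_spans)
print(prf(gold, crf_spans), prf(gold, final_spans))
```

In this toy case the stubbed tagger misses the full date, and the rule pass recovers it, so recall rises while precision is unchanged. That mirrors the role the abstract assigns to post-processing, though the real rules were derived from the authors' error analysis.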
