Journal: 《软件学报》 (Journal of Software)

Pronominal Anaphora Resolution in Spoken Dialogue

         

Abstract

This paper presents a two-stage pronoun resolution algorithm that requires neither manual cleaning of the corpus nor hand-written predefined rules. In the first stage, a set of new features and machine learning methods classify pronouns into those with nominal (noun-phrase) antecedents and those with non-nominal antecedents (glossed as "non-anaphoric" in the original abstract). In the second stage, the two classes are resolved separately. For pronouns with nominal antecedents, the paper proposes feature extraction and representation methods suited to spoken dialogue, covering the distance between the pronoun and each candidate antecedent as well as syntactic and semantic features, and selects the antecedent by combining these features. For pronouns with non-nominal antecedents, the right frontier rule is adapted into a form that can be extracted automatically from spoken dialogue, and the antecedent is selected according to that rule. Tested on the corpus released by Byron in 2004, the algorithm achieves a precision of 77.0% and a recall of 66.0%. Compared with Byron's work, the method runs fully automatically while also improving resolution performance.
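The two-stage flow described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the `Candidate` and `Segment` types, the `is_nominal` stub standing in for the learned stage-one classifier, the feature weights in `score`, and the choice of the most recent frontier node are all assumptions made for illustration. The `right_frontier` helper implements the textbook right frontier (the path from the root of a discourse tree to its most recent leaf), not the adapted rule the paper proposes.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass
class Candidate:
    """Hypothetical candidate antecedent with toy features."""
    text: str
    distance: int        # utterances between candidate and pronoun
    agrees: bool         # toy syntactic feature: number/gender agreement
    semantic_fit: float  # toy semantic compatibility score in [0, 1]


@dataclass
class Segment:
    """Hypothetical node of a discourse tree; children are in utterance order."""
    text: str
    children: List["Segment"] = field(default_factory=list)


def right_frontier(root: Segment) -> List[Segment]:
    """Return the nodes on the path from the root to the most recent leaf."""
    path = [root]
    while path[-1].children:
        path.append(path[-1].children[-1])
    return path


def score(c: Candidate) -> float:
    # Hand-picked illustrative weights; a real system would learn them.
    return (1.0 if c.agrees else 0.0) + c.semantic_fit - 0.1 * c.distance


def resolve(
    pronoun: str,
    is_nominal: Callable[[str], bool],
    candidates: List[Candidate],
    discourse_root: Segment,
) -> Optional[str]:
    # Stage 1: decide which class the pronoun belongs to.
    if is_nominal(pronoun):
        # Stage 2a: pick the best-scoring noun-phrase candidate.
        if not candidates:
            return None
        return max(candidates, key=score).text
    # Stage 2b: pick the most recent segment on the right frontier.
    return right_frontier(discourse_root)[-1].text


# Toy inputs (invented for this example, not taken from any corpus).
candidates = [
    Candidate("the engine", distance=1, agrees=True, semantic_fit=0.8),
    Candidate("the oranges", distance=2, agrees=False, semantic_fit=0.9),
]
tree = Segment("plan", [
    Segment("move the boxcar"),
    Segment("load the oranges", [Segment("send the train to the city")]),
])
is_nominal = lambda p: p != "that"  # placeholder for a trained classifier

nominal_result = resolve("it", is_nominal, candidates, tree)
non_nominal_result = resolve("that", is_nominal, candidates, tree)
```

Keeping the classifier as a pluggable function mirrors the separation between the two stages: the resolution step only needs a yes/no decision, so the stub could be swapped for any trained model without changing `resolve`.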
