Venue: Conference on the North American Chapter of the Association for Computational Linguistics: Human Language Technologies

An Empirical Study of Automatic Chinese Word Segmentation for Spoken Language Understanding and Named Entity Recognition



Abstract

Word segmentation is usually regarded as the first step in many Chinese natural language processing tasks, yet its impact on these subsequent tasks is relatively under-studied. For example, how can the mismatch problem be solved when an existing word segmenter is applied to new data? Does a better word segmenter yield better performance on the subsequent NLP task? In this work, we make an initial attempt to answer these questions on two related subsequent tasks: semantic slot filling in spoken language understanding and named entity recognition. We propose three techniques to solve the mismatch problem: using word segmentation outputs as additional features, adaptation with partial learning, and taking advantage of the n-best word segmentation list. Experimental results demonstrate the effectiveness of these techniques for both tasks: we achieve an error reduction of about 11% for spoken language understanding and 24% for named entity recognition over the baseline systems.
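The first technique named in the abstract, using word segmentation outputs as additional features, can be illustrated with a minimal sketch. The abstract does not say how the features are encoded, so the B/M/E/S position tagging below (begin, middle, end, single-character word) is an assumption based on common practice, and the function names `segmentation_tags` and `char_features` are hypothetical. The idea is that a character-level tagger for slot filling or named entity recognition sees the segmenter's word boundaries as a soft hint rather than as a hard tokenization.

```python
from typing import Dict, List


def segmentation_tags(words: List[str]) -> List[str]:
    """Map a segmented sentence to one B/M/E/S position tag per character.

    Assumed encoding (not taken from the paper):
    S = single-character word, B = first, M = middle, E = last character.
    """
    tags: List[str] = []
    for word in words:
        if len(word) == 1:
            tags.append("S")
        elif len(word) > 1:
            tags.extend(["B"] + ["M"] * (len(word) - 2) + ["E"])
        # Empty strings contribute no characters and therefore no tags.
    return tags


def char_features(words: List[str]) -> List[Dict[str, str]]:
    """Build per-character feature dicts: the character plus its segmentation tag."""
    chars = [ch for word in words for ch in word]
    tags = segmentation_tags(words)
    return [{"char": ch, "seg": tag} for ch, tag in zip(chars, tags)]


# Hypothetical segmenter output for an example sentence.
features = char_features(["我", "想", "去", "北京", "大学"])
for feature in features:
    print(feature)
```

Feature dicts of this shape could be passed to any character-level sequence labeler. Because the segmentation enters as a feature instead of fixing the token boundaries, a segmenter error need not force an error in the downstream labels.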
