ACM Transactions on Asian Language Information Processing

An Analysis of a High-Performance Japanese Question Answering System

Abstract

Twenty-five Japanese Question Answering systems participated in NTCIR QAC2 subtask 1. Of these, our system SAIQA-QAC2 performed the best, with MRR = 0.607. SAIQA-QAC2 is an improvement on our previous system SAIQA-Ii, which achieved MRR = 0.46 for QAC1. We mainly improved the answer-type determination module and the retrieval module. In general, a fine-grained answer taxonomy improves QA performance, but it is difficult to build an accurate answer extraction module for such a taxonomy because machine learning methods require a huge training corpus and hand-crafted rules are hard to maintain. Therefore, we built a fine-grained system by using a coarse-grained named entity recognizer and a Japanese lexicon, "Nihongo Goi-taikei." Our experiments show that named entity/numerical expression recognition and word sense-based answer extraction were the main contributors to the performance. In addition, we developed a new proximity-based document retrieval module that performs better than BM25. We also compared its performance with MultiText, a conventional proximity-based retrieval method developed for QA.
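
The MRR (mean reciprocal rank) figures quoted above can be read as follows: for each question, take 1/rank of the first correct answer in the system's ranked list (0 if no answer is correct), then average over all questions. The sketch below is a minimal illustration of that standard definition, not the official QAC2 scorer. The function name `mean_reciprocal_rank` and the exact-string matching against a gold answer set are assumptions made for illustration.

```python
from typing import Sequence, Set


def mean_reciprocal_rank(
    ranked_answers: Sequence[Sequence[str]],
    gold_answers: Sequence[Set[str]],
) -> float:
    """Average, over questions, of 1/rank of the first correct answer.

    A question with no correct answer in its list contributes 0.
    Matching is exact string equality (an assumption of this sketch).
    """
    if len(ranked_answers) != len(gold_answers):
        raise ValueError("one gold answer set is required per question")
    if not ranked_answers:
        return 0.0

    total = 0.0
    for candidates, gold in zip(ranked_answers, gold_answers):
        for rank, answer in enumerate(candidates, start=1):
            if answer in gold:
                total += 1.0 / rank
                break  # only the first correct answer counts
    return total / len(ranked_answers)


# Hypothetical toy data: three questions with ranked system answers.
system_output = [["a", "b"], ["x", "y"], ["p"]]
gold = [{"a"}, {"y"}, {"q"}]
score = mean_reciprocal_rank(system_output, gold)
```

In this toy case the first question is answered at rank 1, the second at rank 2, and the third not at all, so the score is (1 + 1/2 + 0) / 3 = 0.5.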