Journal: Information Processing & Management

WabiQA: A Wikipedia-Based Thai Question-Answering System



Abstract

With vast amounts of information digitized and available online, manually finding the answer to a question can be tedious. Search engines help meet information needs, but users still have to read through the retrieved articles to locate the answer to a specific question. The ability to automatically understand users' natural language questions and find the correct answers could therefore prove crucial in information retrieval. Such automatic question-answering solutions have been studied extensively by the natural language processing (NLP) research community. However, most of this work targets questions and information sources written in high-resource languages such as English and Chinese. In this paper, we propose WabiQA, a novel system for automatically answering questions in the Thai language using Thai Wikipedia articles as the knowledge source. Specifically, the proposed method first retrieves the Wikipedia article that is most likely to contain the answer. Then, a bidirectional LSTM model reads the article and locates candidate answers, which are ranked by confidence level and returned to the user. WabiQA won first prize at Thailand's National Software Contest 2019 in the category "Question-Answering Program from Thai Wikipedia," scoring 83.5%, 34.80%, and 45.96% and outperforming the next best competitors' systems by 19.99, 24.26, and 33.10 percentage points in terms of Accuracy@1, EM, and F1, respectively. Furthermore, we also develop a prototype mobile application that aims to assist Thai users with visual impairment using voice-to-speech technology and intelligent question-answer categorization.
The findings of this research not only expand the possibilities for developing intelligent NLP applications for the Thai language using only existing Thai NLP tools, resources, and deep learning technologies, but also shed light on the possibility of applying such techniques to other intelligent NLP tasks for Thai and other low-resource languages, such as reading assessment, writing assistance, and entity linking.
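
The abstract reports EM and F1 without defining them. The sketch below shows one common way such scores are computed for extractive question answering, assuming SQuAD-style definitions: exact match (EM) compares the whole answer string, and F1 measures token overlap between the predicted and gold answers. The function names and the use of pre-tokenized input are illustrative assumptions, not the contest's official scoring code; Thai text has no spaces between words, so a word segmenter would have to produce the token lists.

```python
from collections import Counter
from typing import Sequence


def exact_match(prediction: str, gold: str) -> bool:
    """EM (assumed definition): strings are equal after trimming whitespace."""
    return prediction.strip() == gold.strip()


def f1_score(pred_tokens: Sequence[str], gold_tokens: Sequence[str]) -> float:
    """Token-level F1 (assumed definition): harmonic mean of the precision
    and recall of the tokens shared by the prediction and the gold answer."""
    # Multiset intersection counts each shared token at most as often
    # as it occurs in both sequences.
    common = Counter(pred_tokens) & Counter(gold_tokens)
    overlap = sum(common.values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(gold_tokens)
    return 2 * precision * recall / (precision + recall)
```

Under these assumed definitions a partially correct answer can score zero on EM while still earning credit on F1, which is consistent with the reported F1 (45.96%) being higher than the reported EM (34.80%).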
