Conference: International Conference on Computational Linguistics

Towards building a Robust Industry-scale Question Answering System

Abstract

Industry-scale NLP systems need two features: (1) robustness, meaning that "zero-shot transfer learning" (ZSTL) performance has to be commendable, and (2) efficiency, meaning that systems have to train efficiently and respond instantaneously. In this paper, we introduce the development of a production model called GAAMA (Go Ahead Ask Me Anything), which possesses both characteristics. For robustness, it trains on the recently introduced Natural Questions (NQ) dataset. NQ poses additional challenges over older datasets like SQuAD: (a) QA systems need to read and comprehend an entire Wikipedia article rather than a small passage, and (b) NQ does not suffer from observation bias during construction, resulting in less lexical overlap between the question and the article. GAAMA combines Attention-over-Attention, diversity among attention heads, hierarchical transfer learning, and synthetic data augmentation while remaining computationally inexpensive. Building on top of the powerful BERT_QA model, GAAMA provides a ~2.0% absolute boost in F1 over the industry-scale state-of-the-art (SOTA) system on NQ. Further, we show that GAAMA transfers zero-shot to unseen, important real-life domains, as it yields respectable performance on two benchmarks: the BioASQ dataset and the newly introduced CovidQA dataset.
