TWEETQA: A Social Media Focused Question Answering Dataset


Abstract

As social media grows in popularity and more news and real-time events are reported there, automated question answering systems become critical to the many applications that rely on real-time knowledge. While previous datasets have concentrated on question answering (QA) for formal text such as news and Wikipedia, we present the first large-scale dataset for QA over social media data. To ensure that the tweets we collected are useful, we only gather tweets used by journalists to write news articles. We then ask human annotators to write questions and answers based on these tweets. Unlike other QA datasets such as SQuAD, in which the answers are extractive, we allow the answers to be abstractive. We show that two recently proposed neural models that perform well on formal texts are limited in their performance when applied to our dataset. In addition, even the fine-tuned BERT model still lags behind human performance by a large margin. Our results thus point to the need for improved QA systems targeting social media text.


Bibliographic record

Source
Annual Meeting of the Association for Computational Linguistics | 2019 | pp. 5020-5031 | 12 pages
Authors
Wenhan Xiong; Jiawei Wu; Hong Wang; Vivek Kulkarni; Mo Yu; Shiyu Chang; Xiaoxiao Guo; William Yang Wang
Original format: PDF

