Venue: International Joint Conference on Natural Language Processing; Conference on Empirical Methods in Natural Language Processing

CoSQL: A Conversational Text-to-SQL Challenge Towards Cross-Domain Natural Language Interfaces to Databases



Abstract

We present CoSQL, a corpus for building cross-domain, general-purpose database (DB) querying dialogue systems. It consists of 30k+ turns plus 10k+ annotated SQL queries, obtained from a Wizard-of-Oz (WOZ) collection of 3k dialogues querying 200 complex DBs spanning 138 domains. Each dialogue simulates a real-world DB query scenario with a crowd worker as a user exploring the DB and a SQL expert retrieving answers with SQL, clarifying ambiguous questions, or otherwise informing the user of unanswerable questions. When user questions are answerable by SQL, the expert describes the SQL and execution results to the user, hence maintaining a natural interaction flow. CoSQL introduces new challenges compared to existing task-oriented dialogue datasets: (1) the dialogue states are grounded in SQL, a domain-independent executable representation, instead of domain-specific slot-value pairs, and (2) because testing is done on unseen databases, success requires generalizing to new domains. CoSQL includes three tasks: SQL-grounded dialogue state tracking, response generation from query results, and user dialogue act prediction. We evaluate a set of strong baselines for each task and show that CoSQL presents significant challenges for future research. The dataset.
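To make the idea of a dialogue state grounded in SQL more concrete, the following is a minimal sketch, not an excerpt from the corpus. The `student` table, its rows, the three-turn dialogue, and the helper `run_dialogue` are all hypothetical and were invented for illustration. Each turn is paired with a full SQL query that plays the role a slot-value state would play in a conventional task-oriented dataset, and the query is executed to obtain the result that an expert would then describe to the user.

```python
import sqlite3

# Hypothetical toy database; the schema and rows are illustrative only
# and are not taken from the CoSQL corpus.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE student (id INTEGER PRIMARY KEY, name TEXT, major TEXT, age INTEGER);
    INSERT INTO student VALUES (1, 'Ana', 'Biology', 20);
    INSERT INTO student VALUES (2, 'Ben', 'Physics', 22);
    INSERT INTO student VALUES (3, 'Cal', 'Biology', 25);
""")

# Hypothetical dialogue: each user utterance is paired with the SQL query
# that represents the dialogue state after that turn. Later queries carry
# forward constraints introduced in earlier turns.
dialogue = [
    ("How many students are there?",
     "SELECT COUNT(*) FROM student"),
    ("Only those majoring in Biology.",
     "SELECT COUNT(*) FROM student WHERE major = 'Biology'"),
    ("What are their names, oldest first?",
     "SELECT name FROM student WHERE major = 'Biology' ORDER BY age DESC"),
]


def run_dialogue(connection, turns):
    """Execute the SQL state of every turn and collect the result rows."""
    results = []
    for utterance, sql in turns:
        rows = connection.execute(sql).fetchall()
        results.append((utterance, sql, rows))
    return results


results = run_dialogue(conn, dialogue)
for utterance, sql, rows in results:
    print(f"USER: {utterance}\nSQL:  {sql}\nROWS: {rows}\n")
```

Because the state is an executable query rather than a set of domain-specific slots, the same representation would apply unchanged to a database from any other domain, which is the property the abstract highlights.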
