Rapid Development of a Corpus with Discourse Annotations using Two-stage Crowdsourcing

Abstract

We present a novel approach for rapidly developing a corpus with discourse annotations using crowdsourcing. Although discourse annotations typically require much time and cost owing to their complex nature, we realize discourse annotations in an extremely short time while retaining good quality of the annotations by crowdsourcing two annotation subtasks. In fact, our experiment to create a corpus comprising 30,000 Japanese sentences took less than eight hours to run. Based on this corpus, we also develop a supervised discourse parser and evaluate its performance to verify the usefulness of the acquired corpus.

Bibliographic record

Source: International Conference on Computational Linguistics | 2014 | pp. 269-278 | 10 pages
Authors: Daisuke Kawahara; Yuichiro Machida; Tomohide Shibata; Sadao Kurohashi; Hayato Kobayashi; Manabu Sassano
Original format: PDF

Similar literature

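1. Metadiscourse use in Chinese undergraduates' English compositions: a corpus-based longitudinal comparative study [J]. Ruan Zhoulin. Chinese Journal of Applied Linguistics (English edition). 2019, No. 4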
2. Altruistic Crowdsourcing for Arabic Speech Corpus Annotation [J]. Soumia Bougrine, Hadda Cherroun, Ahmed Abdelali. Procedia Computer Science. 2017, No. 1
3. The GV-LEx corpus of tales in French: Text and speech corpora enriched with lexical, discourse, structural, phonemic and prosodic annotations [J]. Doukhan David, Rosset Sophie, Rilliard Albert. Language Resources and Evaluation. 2015, No. 3
4. Representational issues in annotation: Using the Australian map task corpus to relate prosody and discourse structure [J]. Lesley Stirling, Janet Fletcher, Ilana Mushin. Speech Communication. 2001, No. 1-2
5. Rapid Development of a Corpus with Discourse Annotations using Two-stage Crowdsourcing [C]. Daisuke Kawahara, Yuichiro Machida, Tomohide Shibata. International Conference on Computational Linguistics. 2014
6. Facilitating Corpus Annotation by Improving Annotation Aggregation [D]. Felt, Paul Lewis. 2015
7. Crowdsourcing pneumothorax annotations using machine learning annotations on the NIH chest X-ray dataset [O]. Ross W. Filice, Anouk Stein, Carol C. Wu. 2020
8. Design of Annotation system of Russian Speech Corpus Based on Crowdsourcing [O]. Yanzhou Ma. 2016
