Journal: ACM transactions on intelligent systems

Crowdsourcing Without a Crowd: Reliable Online Species Identification Using Bayesian Models to Minimize Crowd Size



Abstract

We present an incremental Bayesian model that resolves key issues of crowd size and data quality for consensus labeling. We evaluate our method using data collected from a real-world citizen science program, BEEWATCH, which invites members of the public in the United Kingdom to classify (label) photographs of bumblebees as one of 22 possible species. The biological recording domain poses two key and hitherto unaddressed challenges for consensus models of crowdsourcing: (1) the large number of potential species makes classification difficult, and (2) this is compounded by limited crowd availability, stemming from both the inherent difficulty of the task and the lack of relevant skills among the general public. We demonstrate that consensus labels can be reliably found in such circumstances with very small crowd sizes of around three to five users (i.e., through group sourcing). Our incremental Bayesian model, which minimizes crowd size by re-evaluating the quality of the consensus label following each species identification solicited from the crowd, is competitive with a Bayesian approach that uses a larger but fixed crowd size and outperforms majority voting. These results have important ecological applicability: biological recording programs such as BEEWATCH can sustain themselves when resources such as taxonomic experts to confirm identifications by photo submitters are scarce (as is typically the case), and feedback can be provided to submitters in a timely fashion. More generally, our model provides benefits to any crowdsourced consensus labeling task where there is a cost (financial or otherwise) associated with soliciting a label.
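The idea of re-evaluating the consensus label after each solicited identification, and stopping once it is good enough, can be illustrated with a minimal sketch. This is not the authors' model: it assumes a uniform prior over the 22 species, a single shared user accuracy with errors spread evenly over the other species, and a fixed confidence threshold for stopping. The function name `incremental_consensus` and the parameters `accuracy` and `threshold` are hypothetical and chosen only for illustration.

```python
from typing import Iterable, Tuple


def incremental_consensus(
    labels: Iterable[int],
    n_classes: int = 22,       # number of candidate species (from the abstract)
    accuracy: float = 0.7,     # ASSUMPTION: shared probability that a user is correct
    threshold: float = 0.95,   # ASSUMPTION: posterior confidence at which we stop asking
) -> Tuple[int, float, int]:
    """Return (consensus label, its posterior probability, number of labels used)."""
    # Uniform prior over all candidate species.
    posterior = [1.0 / n_classes] * n_classes
    # ASSUMPTION: a wrong answer is equally likely to be any other species.
    wrong = (1.0 - accuracy) / (n_classes - 1)
    used = 0
    for label in labels:
        used += 1
        # Bayes update: multiply by the likelihood of the observed label, then normalize.
        posterior = [
            p * (accuracy if k == label else wrong)
            for k, p in enumerate(posterior)
        ]
        total = sum(posterior)
        posterior = [p / total for p in posterior]
        # Re-evaluate the consensus after every label; stop early if confident enough.
        if max(posterior) >= threshold:
            break
    best = max(range(n_classes), key=posterior.__getitem__)
    return best, posterior[best], used


if __name__ == "__main__":
    # Five users are available, but the loop may stop before consulting all of them.
    print(incremental_consensus([3, 3, 3, 3, 3]))
    print(incremental_consensus([3, 5, 3, 3]))
```

Under these assumed settings, two agreeing labels already push the posterior past the threshold, while a single disagreement costs one additional label. A fixed-crowd approach would instead consume every label in the list before reporting a consensus.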
