Journal: Computer Speech and Language

Exploring speech retrieval from meetings using the AMI corpus



Abstract

Increasing amounts of informal spoken content are being collected, e.g. recordings of meetings, lectures and personal data sources. The amount of this content being captured and the difficulties of manually searching audio data mean that efficient automated search tools are of increasing importance if its full potential is to be realized. Much existing work on speech search has focused on retrieval of clearly defined document units in ad hoc search tasks. We investigate search of informal speech content using an extended version of the AMI meeting collection. A retrieval collection was constructed by augmenting the AMI corpus with a set of ad hoc search requests and manually identified relevant regions of the recorded meetings. Unlike standard ad hoc information retrieval, which focuses primarily on precision, we assume a recall-focused search scenario of a user seeking to retrieve a particular incident occurring within meetings relevant to the query. We explore the relationship between automatic speech recognition (ASR) accuracy, automated segmentation of the meeting into retrieval units, and retrieval behaviour with respect to both precision and recall. Experimental retrieval results show that, for automatically extracted segments, averaged retrieval effectiveness in terms of precision is generally comparable between manual content transcripts and ASR transcripts with high recognition accuracy. However, segments with poor recognition quality become very hard to retrieve and may fall below the retrieval rank position to which a user is willing to search. These changes impact on system effectiveness for recall-focused search tasks. Varied ASR quality across the relevant and non-relevant data means that the rank of some well-recognized relevant segments is actually promoted for ASR transcripts compared to manual ones. This effect is not revealed by the averaged precision-based retrieval evaluation metrics typically used for evaluation of speech retrieval. However, such variations in the ranks of relevant segments can impact considerably on the experience of the user in terms of the order in which retrieved content is presented. Analysis of our results reveals that while relevant longer segments are generally more robust to ASR errors, and consequently retrieved at higher ranks, this is often at the expense of the user needing to engage in longer content playback to locate the relevant content in the audio recording. Our overall conclusion is that it is desirable to minimize the length of retrieval units containing relevant content while seeking to maintain high ranking of these items.
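
The point that averaged precision-based metrics can hide rank changes that matter for recall-focused search can be illustrated with a small sketch. Everything below is a toy illustration and not taken from the paper: the segment ids, the two rankings, the cutoff of 10 and the helper names `average_precision` and `recall_at` are all assumptions. It compares a hypothetical "manual transcript" ranking with a hypothetical "ASR transcript" ranking in which one relevant segment is promoted and the other is pushed below the cutoff.

```python
from fractions import Fraction


def average_precision(ranking, relevant):
    """Average precision of one ranked list (exact, using Fraction)."""
    hits = 0
    total = Fraction(0)
    for rank, segment in enumerate(ranking, start=1):
        if segment in relevant:
            hits += 1
            total += Fraction(hits, rank)  # precision at this relevant hit
    return total / len(relevant) if relevant else Fraction(0)


def recall_at(ranking, relevant, cutoff):
    """Fraction of relevant segments found within the top `cutoff` ranks."""
    found = set(ranking[:cutoff]) & relevant
    return Fraction(len(found), len(relevant))


# Toy data (hypothetical): two relevant segments, ten non-relevant ones.
relevant = {"a", "b"}
non_relevant = [f"n{i}" for i in range(1, 11)]

# Assumed ranking from manual transcripts: relevant segments at ranks 2 and 3.
manual = [non_relevant[0], "a", "b"] + non_relevant[1:]

# Assumed ranking from ASR transcripts: "a" promoted to rank 1,
# poorly recognized "b" demoted to rank 12.
asr = ["a"] + non_relevant + ["b"]

for name, ranking in (("manual", manual), ("asr", asr)):
    print(name,
          "AP =", average_precision(ranking, relevant),
          "recall@10 =", recall_at(ranking, relevant, 10))
```

In this constructed case both rankings have the same average precision, yet a user who stops at rank 10 finds both relevant segments with the first ranking and only one with the second, which is the kind of difference a recall-focused evaluation is meant to expose.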
