WEB Page Collection Using Automatic Document Segmentation for Spoken Document Retrieval


Abstract

In spoken document retrieval, the main factor affecting retrieval performance is speech recognition errors. Refining speech recognition technology can improve recognition performance. However, if a query contains out-of-vocabulary words, the spoken documents related to that query cannot be retrieved. This paper describes spoken document retrieval using document expansion based on WEB pages whose contents are similar to the spoken documents being retrieved. Most spoken documents cover several topics. Each spoken document is therefore automatically divided into segments by topic, and WEB pages similar to the spoken document are then collected using a query derived from each segment. Document expansion using the WEB improved spoken document retrieval performance from 0.364 to 0.401 on the interpolated 11-point average precision metric.
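The abstract reports its result in interpolated 11-point average precision. As a minimal sketch of how that metric is commonly computed for a single query (the function name, the document IDs, and the assumption that the ranked list has no duplicate IDs are illustrative and not taken from the paper):

```python
from typing import Sequence, Set

# The 11 standard recall levels: 0.0, 0.1, ..., 1.0
RECALL_LEVELS = [i / 10 for i in range(11)]


def eleven_point_average_precision(ranked: Sequence[str], relevant: Set[str]) -> float:
    """Interpolated 11-point average precision for one query.

    ranked   -- document IDs in ranked order (assumed unique; hypothetical IDs)
    relevant -- set of IDs judged relevant for the query
    """
    if not relevant:
        raise ValueError("at least one relevant document is required")

    # Record (recall, precision) at each rank where a relevant document appears.
    points = []
    hits = 0
    for rank, doc_id in enumerate(ranked, start=1):
        if doc_id in relevant:
            hits += 1
            points.append((hits / len(relevant), hits / rank))

    total = 0.0
    for level in RECALL_LEVELS:
        # Interpolated precision at a recall level is the highest precision
        # observed at any recall greater than or equal to that level
        # (0.0 if that recall is never reached).
        candidates = [p for r, p in points if r >= level - 1e-12]
        total += max(candidates, default=0.0)

    return total / len(RECALL_LEVELS)


if __name__ == "__main__":
    score = eleven_point_average_precision(["d1", "d2", "d3", "d4"], {"d1", "d3"})
    print(f"{score:.3f}")
```

A system-level figure such as the 0.364 and 0.401 quoted above would typically be the mean of this per-query value over all test queries; the exact averaging procedure used in the paper is not stated in the abstract.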


Bibliographic details

Source
Asia-Pacific Signal and Information Processing Association Annual Summit and Conference | 2011 | pp. 1-4 | 4 pages
Authors
Hiromitsu Nishizaki; Kiyotaka Sugimoto; Yoshihiro Sekiguchi
Original format: PDF
Chinese Library Classification: Computer applications; Signal processing

