
Scalable Browsing for Large Collections: A Case Study


Abstract

Phrase browsing techniques use phrases extracted automatically from a large information collection as a basis for browsing and accessing it. This paper describes a case study that uses an automatically constructed phrase hierarchy to facilitate browsing of an ordinary large Web site. Phrases are extracted from the full text using a novel combination of rudimentary syntactic processing and sequential grammar induction techniques. The interface is simple, robust and easy to use. To convey a feeling for the quality of the phrases that are generated automatically, a thesaurus used by the organization responsible for the Web site is studied and its degree of overlap with the phrases in the hierarchy is analyzed. Our ultimate goal is to amalgamate hierarchical phrase browsing and hierarchical thesaurus browsing: the latter provides an authoritative domain vocabulary and the former augments coverage in areas the thesaurus does not reach.
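The overlap analysis mentioned above can be illustrated with a minimal sketch. It is not the authors' procedure: the abstract does not say how thesaurus terms and extracted phrases were matched, so exact matching after case and whitespace normalization is an assumption here, and the function names (`normalize`, `overlap`) and sample terms are hypothetical.

```python
def normalize(phrase: str) -> str:
    """Lowercase and collapse whitespace so trivially different spellings match."""
    return " ".join(phrase.lower().split())


def overlap(thesaurus_terms, extracted_phrases):
    """Compare a thesaurus vocabulary with automatically extracted phrases.

    Assumption: two entries match only if they are identical after
    normalization. A real study might also stem words or match partial phrases.
    """
    thesaurus = {normalize(t) for t in thesaurus_terms}
    phrases = {normalize(p) for p in extracted_phrases}
    shared = thesaurus & phrases
    return {
        "shared": sorted(shared),
        # Fraction of thesaurus terms that also appear as extracted phrases.
        "thesaurus_coverage": len(shared) / len(thesaurus) if thesaurus else 0.0,
        # Fraction of extracted phrases that are also thesaurus terms.
        "phrase_coverage": len(shared) / len(phrases) if phrases else 0.0,
    }


# Hypothetical sample data, for illustration only.
result = overlap(
    ["Forest Products", "soil erosion"],
    ["forest products", "soil  erosion", "rural development", "food security"],
)
print(result)
```

Phrases that fall outside the shared set are the ones that would extend coverage beyond the thesaurus, which is the role the abstract assigns to phrase browsing.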
