Context-based publication search paradigm in literature digital libraries.


Abstract

This thesis identifies two problems with searching literature digital libraries. (a) There are no effective paper-scoring and ranking mechanisms: without a scoring and ranking system, users are often forced to scan a large and diverse set of publications listed as search results, and may miss the important ones. (b) Topic diffusion is common: publications returned by a keyword-based search query often fall into multiple topic areas, not all of which are of interest to users.

In response to these problems, this thesis proposes a new search paradigm for literature digital libraries, called context-based search, which effectively ranks search outputs and controls the topic diversity of keyword-based search query outputs. The approach can be summarized as follows. During pre-querying, publications are classified into pre-specified ontology-based contexts, and query-independent context scores are attached to papers with respect to their assigned contexts. When a query is posed, relevant contexts are selected, search is performed within the selected contexts, the context scores of publications are revised into relevancy scores with respect to the query at hand and the context they are in, and query outputs are ranked within each relevant context. With the context-based search approach, (1) query output topic diversity is minimized, (2) query output size is reduced, (3) user time spent scanning query results is decreased, and (4) query output ranking accuracy is increased.

In addition to keyword-based search, one important feature in searching literature digital libraries is finding the "related publications" of a given publication. Existing approaches do not take publication topics into account in the relatedness computation, allowing topic diffusion to permeate across query output publications. In this thesis, we propose a new way to measure "relatedness" by incorporating the "contexts" of publications. We define three kinds of context-based relatedness: (a) relatedness between two contexts (context-to-context relatedness), computed from the publications assigned to the contexts and from the context structures in the context hierarchy; (b) relatedness between a context and a paper (paper-to-context relatedness), which is used to rank the relatedness of contexts with respect to a paper; and (c) relatedness between two papers (paper-to-paper relatedness), computed from both paper-to-context and context-to-context relatedness measurements.

Using existing biomedical ontology terms as contexts for genomics-oriented publications, our experiments indicate that the context-based approach is highly accurate and effectively solves the topic diffusion problem across search results.
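The query-time flow described in the abstract (select contexts, search within them, revise context scores into relevancy scores, rank within each context) can be illustrated with a minimal Python sketch. Everything concrete here is an assumption made for illustration and is not taken from the thesis: the toy papers and context names, the term-overlap `text_match` function, the rule that a context is selected when it contains at least one matching paper, and the linear combination weighted by `alpha`.

```python
# Hypothetical pre-query state: paper texts, and query-independent context
# scores for each paper within the contexts it was assigned to.
PAPERS = {
    "p1": "gene expression regulation in yeast",
    "p2": "protein binding and gene expression",
    "p3": "membrane transport of protein complexes",
}
CONTEXT_SCORES = {
    "transcription": {"p1": 0.9, "p2": 0.4},
    "protein binding": {"p2": 0.8, "p3": 0.3},
    "transport": {"p3": 0.7},
}


def text_match(query, text):
    """Fraction of query terms present in the text (stand-in for a real text score)."""
    terms = query.lower().split()
    if not terms:
        return 0.0
    words = set(text.lower().split())
    return sum(term in words for term in terms) / len(terms)


def context_search(query, alpha=0.5):
    """Return {context: [(paper_id, relevancy), ...]}, ranked within each context.

    Assumed scoring: relevancy = alpha * context_score + (1 - alpha) * text_match.
    """
    results = {}
    for context, scores in CONTEXT_SCORES.items():
        ranked = []
        for paper_id, context_score in scores.items():
            match = text_match(query, PAPERS[paper_id])
            if match > 0:
                relevancy = alpha * context_score + (1 - alpha) * match
                ranked.append((paper_id, relevancy))
        # Assumed selection rule: keep a context only if some paper in it matches.
        if ranked:
            ranked.sort(key=lambda pair: pair[1], reverse=True)
            results[context] = ranked
    return results


if __name__ == "__main__":
    for context, ranked in context_search("gene expression").items():
        print(context, [paper_id for paper_id, _ in ranked])
```

With this toy data, the query `gene expression` selects the `transcription` and `protein binding` contexts and leaves `transport` out, and `p1` ranks above `p2` within `transcription` because its context score there is higher.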

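Bibliographic record

  • Author: Ratprasartporn, Nattakarn
  • Author affiliation: Case Western Reserve University
  • Degree-granting institution: Case Western Reserve University
  • Discipline: Computer Science
  • Degree: Ph.D.
  • Year: 2008
  • Pages: 151 p.
  • Total pages: 151
  • Original format: PDF
  • Language of text: English
  • Chinese Library Classification: Automation technology, computer technology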
