Venue: International Conference on World Wide Web

A Hierarchical Dirichlet Model for Taxonomy Expansion for Search Engines

Abstract

Emerging trends and products pose a challenge to modern search engines, which must adapt to the constantly changing needs and interests of users. For example, vertical search engines such as Amazon, eBay, Walmart, Yelp, and Yahoo! Local provide business category hierarchies that let people navigate millions of business listings. The category information also provides important ranking features that can be used to improve the search experience. However, category hierarchies are often manually crafted by human experts and are far from complete. Manually constructed category hierarchies cannot handle ever-changing and sometimes long-tail user information needs. In this paper, we study the problem of how to expand an existing category hierarchy for a search/navigation system to accommodate the information needs of users more comprehensively. We propose a general framework for this task, which has three steps: 1) detecting meaningful missing categories; 2) modeling the category hierarchy using a hierarchical Dirichlet model and predicting the optimal tree structure according to the model; 3) reorganizing the corpus using the complete category structure, i.e., associating each webpage with the relevant categories from the complete category hierarchy. Experimental results demonstrate that our proposed framework generates a high-quality category hierarchy and significantly boosts retrieval performance.
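
The abstract does not give the model's equations, so the following is only an illustrative sketch of the general idea behind step 2, not the authors' method. It assumes each category is summarized by word counts, that a child category's word distribution is drawn from a Dirichlet prior centered on its parent's distribution, and that a newly detected category is attached to the candidate parent with the highest Dirichlet-multinomial likelihood. All function names, the concentration parameter `alpha`, the smoothing constant `eps`, and the toy categories are hypothetical.

```python
import math
from collections import Counter


def smoothed_distribution(counts, vocab, eps=0.01):
    """Turn raw word counts into a distribution that is positive on all of vocab."""
    total = sum(counts.get(w, 0) for w in vocab) + eps * len(vocab)
    return {w: (counts.get(w, 0) + eps) / total for w in vocab}


def dirichlet_multinomial_loglik(child_counts, parent_dist, alpha=10.0):
    """Log-likelihood of the child's counts under a Dirichlet(alpha * parent_dist) prior.

    The multinomial coefficient is omitted because it does not depend on the parent.
    Words with a zero count contribute nothing, so only observed words are visited.
    """
    n_total = sum(child_counts.values())
    score = math.lgamma(alpha) - math.lgamma(alpha + n_total)
    for word, n in child_counts.items():
        prior = alpha * parent_dist[word]
        score += math.lgamma(prior + n) - math.lgamma(prior)
    return score


def best_parent(child_counts, candidates, alpha=10.0):
    """Score every candidate parent and return (best_name, all_scores)."""
    vocab = set(child_counts)
    for counts in candidates.values():
        vocab |= set(counts)
    scores = {
        name: dirichlet_multinomial_loglik(
            child_counts, smoothed_distribution(counts, vocab), alpha
        )
        for name, counts in candidates.items()
    }
    return max(scores, key=scores.get), scores


# Toy data (hypothetical): two existing categories and one newly detected category.
existing = {
    "Restaurants": Counter({"pizza": 5, "menu": 4, "dinner": 3}),
    "Automotive": Counter({"tire": 5, "oil": 4, "repair": 3}),
}
food_trucks = Counter({"taco": 2, "menu": 3, "dinner": 1})

parent, scores = best_parent(food_trucks, existing)
print(parent)
```

In this toy setup the new category shares vocabulary with only one candidate, so that candidate receives the higher score. A larger `alpha` ties a child more tightly to its parent's distribution; a full treatment would search over whole tree structures rather than scoring one attachment at a time.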