A regression framework for learning to rank in web information retrieval.

Abstract

Machine learning approaches to learning ranking functions have recently attracted much interest from both the web information retrieval community and the machine learning community. They hold the promise of improved search-engine relevance and a reduced need for manual parameter tuning. We focus on developing a regression framework for learning to rank with complex loss functions. More specifically, the framework first applies a functional iterative (boosting) algorithm to compute updates for a given loss function and then fits those updates with a standard regression base learner. Drawing on supervised learning methodology from machine learning, we distinguish two types of relevance judgments used as training data: (1) absolute relevance judgments arising from explicit labeling of query-document pairs; and (2) relative relevance judgments extracted from user click-throughs of search results or converted from absolute relevance judgments.

Within this framework, we propose three novel ranking algorithms and illustrate their application to web search ranking. The first calibrates the existing point-wise (univariate) regression loss to incorporate query differences by introducing nuisance parameters into the statistical model, and we present an alternating optimization method that simultaneously learns the retrieval function and the nuisance parameters; this improves on existing approaches to learning to rank with a point-wise regression loss. The second extends gradient boosting methods from point-wise regression losses to complex (multivariate) loss functions. It is based on optimizing quadratic upper bounds of the loss functions, which allows a rigorous convergence analysis of the algorithm; we illustrate its application to pair-wise preference learning to rank for web search, combining preference data with labeled data. The third is a list-wise approach based on minimum effort optimization that takes into account the entire training data within a query at each iteration. We tackle this optimization problem using functional iterative methods in which the update at each iteration is computed by solving an isotonic regression problem. This more global approach yields faster convergence and significantly improved performance of the learned ranking functions over existing state-of-the-art methods.

Experiments are carried out both on data sets obtained from a commercial search engine and on widely used IR benchmark data, namely OHSUMED and TREC. Our results show significant improvements of the proposed methods over existing state-of-the-art methods.
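The core loop of such a regression framework can be illustrated in a few lines. The sketch below assumes NumPy and scikit-learn, uses a placeholder squared-error loss, and introduces hypothetical helper names; it only shows the generic pattern the abstract describes, namely computing an update (here, the negative functional gradient of the loss) and fitting it with a standard regression base learner.

```python
# Minimal sketch of the generic framework: at each boosting iteration,
# compute an update direction from the loss and fit it with a standard
# regression base learner. DecisionTreeRegressor stands in for the base
# learner; the squared-error loss is only a placeholder for the complex
# losses the dissertation targets.
import numpy as np
from sklearn.tree import DecisionTreeRegressor

def boost_ranker(X, y, loss_grad, n_rounds=100, shrinkage=0.1, max_depth=3):
    """Fit an additive model F(x) = sum_m shrinkage * h_m(x) by functional
    gradient descent on an arbitrary differentiable loss."""
    F = np.zeros(len(y))                        # current document scores
    trees = []
    for _ in range(n_rounds):
        residuals = -loss_grad(F, y)            # pseudo-responses (updates)
        h = DecisionTreeRegressor(max_depth=max_depth)
        h.fit(X, residuals)                     # regression base learner
        F += shrinkage * h.predict(X)
        trees.append(h)
    return trees

def predict(trees, X, shrinkage=0.1):
    return shrinkage * sum(t.predict(X) for t in trees)

# Placeholder point-wise loss: squared error, with gradient F - y.
squared_grad = lambda F, y: F - y
```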
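For the second algorithm, the abstract only names the key ingredients: quadratic upper bounds of a pair-wise preference loss, fitted by regression. Below is a minimal sketch of one way such a pair-wise update could look; the margin `tau`, the violation test, and the function names are illustrative assumptions rather than the dissertation's exact formulation.

```python
# Hedged sketch of a pair-wise boosting round: for preference pairs
# (i preferred over j) whose current scores violate a margin tau, the pair
# contributes regression targets that push the two scores apart, and a
# standard regression learner is fit to those targets.
import numpy as np
from sklearn.tree import DecisionTreeRegressor

def pairwise_round(X, pairs, F, tau=1.0, max_depth=3):
    """One boosting round from preference data.

    X     : (n_docs, n_features) feature matrix
    pairs : list of (i, j) with document i preferred over document j
    F     : current scores for the n_docs documents
    """
    rows, targets = [], []
    for i, j in pairs:
        if F[i] < F[j] + tau:                # margin violated: add targets
            rows += [i, j]
            targets += [F[j] + tau,          # pull the preferred doc up
                        F[i] - tau]          # push the other doc down
    if not rows:
        return None                          # all preferences satisfied
    h = DecisionTreeRegressor(max_depth=max_depth)
    h.fit(X[rows], np.asarray(targets))      # fit the update with regression
    return h
```

A full learner would iterate such rounds and blend each fitted tree into the current scores, much as in the generic boosting loop sketched above.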

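The third, list-wise algorithm computes each update by solving an isotonic regression problem. The standard building block for that subproblem is the pool-adjacent-violators algorithm, sketched below as a self-contained function; how a query's documents are mapped onto such a problem is not specified in the abstract, so this is only the generic subroutine.

```python
# Pool-adjacent-violators (PAV): find the non-decreasing sequence closest,
# in weighted squared error, to the input sequence y.
import numpy as np

def isotonic_regression(y, w=None):
    """Return the non-decreasing fit minimizing sum w_i * (fit_i - y_i)^2."""
    y = np.asarray(y, dtype=float)
    w = np.ones_like(y) if w is None else np.asarray(w, dtype=float)
    blocks = []                               # each block: [mean, weight, length]
    for yi, wi in zip(y, w):
        blocks.append([yi, wi, 1])
        # Merge backwards while the monotonicity constraint is violated.
        while len(blocks) > 1 and blocks[-2][0] > blocks[-1][0]:
            m2, w2, n2 = blocks.pop()
            m1, w1, n1 = blocks.pop()
            wt = w1 + w2
            blocks.append([(w1 * m1 + w2 * m2) / wt, wt, n1 + n2])
    return np.concatenate([np.full(n, m) for m, _, n in blocks])

# e.g. isotonic_regression([3, 1, 2, 5, 4]) -> [2., 2., 2., 4.5, 4.5]
```
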
Bibliographic record

  • Author: Zheng, Zhaohui
  • Author affiliation: State University of New York at Buffalo
  • Degree-granting institution: State University of New York at Buffalo
  • Subject: Computer Science
  • Degree: Ph.D.
  • Year: 2008
  • Pagination: 88 p.
  • Total pages: 88
  • Format: PDF
  • Language: eng
  • CLC classification:
  • Keywords:
