首页> 外文期刊>Journal of the American Society for Information Science and Technology >On the Complexity of Rocchio's Similarity-Based Relevance Feedback Algorithm
【24h】

On the Complexity of Rocchio's Similarity-Based Relevance Feedback Algorithm

机译:基于Rocchio相似度的相关反馈算法的复杂度

获取原文
获取原文并翻译 | 示例
           

摘要

Rocchio's similarity-based relevance feedback algorithm, one of the most important query reformation methods in information retrieval, is essentially an adaptive learning algorithm from examples in searching for documents represented by a linear classifier. Despite its popularity in various applications, there is little rigorous analysis of its learning complexity in literature. In this article, the authors prove for the first time that the learning complexity of Rocchio's algorithm is O(d + d~2(log d + log n)) over the discretized vector space {0, ..., n-1}~d, when the inner product similarity measure is used. The upper bound on the learning complexity for searching for documents represented by a monotone linear classifier (q, 0) over {0,..., n - 1 }~d can be improved to, at most, 1 + 2k (n - 1) (log d+ log(n - 1)), where k is the number of nonzero components in q. Several lower bounds on the learning complexity are also obtained for Rocchio's algorithm. For example, the authors prove that Rocchio's algorithm has a lower bound Ω((_2~d)log n) on its learning complexity over the Boolean vector space {0,1}~d.
机译:Rocchio的基于相似度的相关性反馈算法是信息检索中最重要的查询重构方法之一,本质上是从搜索线性分类器表示的文档中的示例中得出的一种自适应学习算法。尽管它在各种应用中很受欢迎,但对其文学中的学习复杂性却缺乏严格的分析。在本文中,作者首次证明了Rocchio算法的学习复杂度在离散向量空间{0,...,n-1}上为O(d + d〜2(log d + log n))。 〜d,当使用内积相似性度量时。在{0,...,n-1}〜d上搜索由单调线性分类器(q,0)表示的文档的学习复杂度上限可以提高到最多1 + 2k(n- 1)(log d + log(n-1)),其中k是q中非零分量的数量。 Rocchio算法还获得了学习复杂度的几个下限。例如,作者证明Rocchio算法在布尔向量空间{0,1}〜d上具有较低的学习复杂度Ω((_ 2〜d)log n)。

著录项

相似文献

  • 外文文献
  • 中文文献
  • 专利
获取原文

客服邮箱:kefu@zhangqiaokeyan.com

京公网安备:11010802029741号 ICP备案号:京ICP备15016152号-6 六维联合信息科技 (北京) 有限公司©版权所有
  • 客服微信

  • 服务号