Source: 《软件学报》 (Journal of Software)

Diversified Ranking on Graphs Based on a Distance Metric

         

Abstract

Expansion relevance, which combines query relevance and diversity in a single function, is one optimization objective for diversified ranking on graphs. Diversified ranking under this objective can be modeled as a submodular optimization problem and solved approximately by the classic greedy algorithm for cardinality-constrained monotone submodular maximization. However, expansion relevance does not directly capture the dissimilarity between a pair of nodes. In addition, existing submodular algorithms are sequential and cannot easily take full advantage of distributed cluster computing platforms such as Spark to improve efficiency. To address these issues, this paper first introduces a distance metric, defined as the sum of personalized PageRank scores over the symmetric difference of the neighbor sets of two nodes, to capture pairwise dissimilarity. Based on this metric, diversified ranking on graphs is formulated as a max-sum k-dispersion problem on a complete graph with metric edge weights, constructed over the query-relevant nodes. A polynomial-time 2-approximation algorithm is proposed to solve the problem. Because the distance computations for different pairs of nodes are independent, a MapReduce algorithm is further developed to parallelize the process. Finally, extensive experiments on real network datasets verify the effectiveness and efficiency of the proposed algorithm.
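The abstract gives the distance metric and the dispersion objective only in words, so a small sketch may help make them concrete. The Python below is an illustration under stated assumptions, not the paper's implementation: the names `distance` and `greedy_dispersion`, the toy graph, and the toy personalized PageRank scores are all hypothetical. The selection step is a standard greedy pairing heuristic for max-sum dispersion (repeatedly take the farthest remaining pair), and whether it matches the 2-approximation algorithm proposed in the paper is an assumption.

```python
from itertools import combinations


def distance(u, v, neighbors, ppr):
    """Sum of personalized PageRank scores over the symmetric
    difference of the neighbor sets of u and v (as the abstract states)."""
    return sum(ppr.get(w, 0.0) for w in neighbors[u] ^ neighbors[v])


def greedy_dispersion(candidates, k, dist):
    """Greedy pairing sketch for max-sum k-dispersion (an assumed stand-in
    for the paper's algorithm): pick the farthest remaining pair until
    fewer than two slots are left, then fill an odd slot arbitrarily."""
    remaining = set(candidates)
    chosen = []
    while len(chosen) + 1 < k and len(remaining) >= 2:
        # Farthest pair among the nodes not yet selected.
        u, v = max(combinations(sorted(remaining), 2),
                   key=lambda pair: dist(pair[0], pair[1]))
        chosen += [u, v]
        remaining -= {u, v}
    if len(chosen) < k and remaining:
        chosen.append(min(remaining))  # arbitrary but deterministic choice
    return chosen


# Hypothetical toy data: neighbor sets of four query-relevant nodes and
# made-up personalized PageRank scores of the neighboring nodes.
neighbors = {"a": {"x", "y"}, "b": {"x"}, "c": {"z"}, "d": {"x", "z"}}
ppr = {"x": 0.5, "y": 0.25, "z": 0.125}


def toy_dist(u, v):
    return distance(u, v, neighbors, ppr)


top2 = greedy_dispersion(["a", "b", "c", "d"], 2, toy_dist)
print(top2)
```

In this toy graph, nodes `a` and `c` share no neighbors, so their distance is the sum of all three scores and they form the farthest pair. Each call to `distance` depends only on its own pair of nodes, which is the independence the abstract relies on for the MapReduce version.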
