Large Scale Discriminative Metric Learning

Abstract

We consider learning a distance metric with the Localized Supervised Metric Learning (LSML) scheme. The metric discriminates between entities characterized by high-dimensional feature attributes with respect to the labels assigned to each entity. LSML is a supervised learning scheme that learns a Mahalanobis distance that pulls together feature vectors with the same label and pushes apart feature vectors with different labels. In this paper, we propose an efficient and scalable implementation of LSML that can process large data sets, in terms of both dimensions and instances. This implementation is written in SystemML, which has an R-like syntax, and is compiled, optimized, and executed on Hadoop. We also propose experimental approaches for tuning LSML parameters, which yield significant analytical and empirical improvements in discriminative measures such as label prediction accuracy. We present experimental results on both synthetic and real-world data (feature vectors representing patients in an Intensive Care Unit, with labels corresponding to different conditions), assessing respectively how well the algorithm scales and how well it works on real-world prediction problems.
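To make the idea of a learned Mahalanobis distance concrete, here is a minimal NumPy sketch. It is not the paper's SystemML implementation, and it is not LSML itself. As a simplifying assumption it uses global within-class and between-class scatter matrices as a stand-in for label-aware neighborhoods, and it takes the eigenvectors with the smallest eigenvalues of their difference as the projection `W`. The function names `mahalanobis` and `fit_projection` and the toy data are hypothetical.

```python
import numpy as np


def mahalanobis(x, y, W):
    """Distance under M = W @ W.T, i.e. Euclidean distance after projecting by W."""
    diff = W.T @ (np.asarray(x, dtype=float) - np.asarray(y, dtype=float))
    return float(np.sqrt(diff @ diff))


def fit_projection(X, labels, k):
    """Illustrative stand-in (assumption, not the paper's algorithm).

    Minimizes trace(W.T @ (S_w - S_b) @ W) subject to W.T @ W = I, where
    S_w is the within-class scatter and S_b is the between-class scatter.
    """
    X = np.asarray(X, dtype=float)
    labels = np.asarray(labels)
    overall_mean = X.mean(axis=0)
    d = X.shape[1]
    S_w = np.zeros((d, d))
    S_b = np.zeros((d, d))
    for c in np.unique(labels):
        Xc = X[labels == c]
        mu = Xc.mean(axis=0)
        S_w += (Xc - mu).T @ (Xc - mu)
        S_b += len(Xc) * np.outer(mu - overall_mean, mu - overall_mean)
    # eigh returns eigenvalues in ascending order, so the first k
    # eigenvectors span the directions with the smallest objective value.
    _, vecs = np.linalg.eigh(S_w - S_b)
    return vecs[:, :k]


# Toy data: the label depends only on the first coordinate;
# the second coordinate is label-independent spread.
X = np.array([[0, 0], [0, 5], [0, -5], [3, 0], [3, 5], [3, -5]], dtype=float)
labels = np.array([0, 0, 0, 1, 1, 1])
W = fit_projection(X, labels, k=1)

same_label = mahalanobis(X[1], X[2], W)   # two points with label 0
diff_label = mahalanobis(X[0], X[3], W)   # one point from each label
```

On this toy data the learned direction ignores the second coordinate, so `same_label` collapses toward zero while `diff_label` keeps the gap between the two classes. A local variant would build the scatter terms from each point's nearest same-label and different-label neighbors instead of from global class means.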