Ranking-Based Autoencoder for Extreme Multi-label Classification

Abstract

Extreme multi-label classification (XML) is an important yet challenging machine learning task that assigns to each instance its most relevant labels from an extremely large label collection, where the numbers of labels, features, and instances can reach thousands or millions. Demand for XML is growing in the Internet industry as business scale and scope increase and data accumulate. Extremely large label collections pose challenges such as computational complexity, inter-label dependency, and noisy labeling. Many methods, based on different mathematical formulations, have been proposed to tackle these challenges. In this paper, we propose a deep learning XML method that uses word-vector-based self-attention followed by a ranking-based autoencoder architecture. The proposed method has three major advantages: 1) the autoencoder simultaneously considers the inter-label dependencies and the feature-label dependencies by projecting labels and features onto a common embedding space; 2) the ranking loss not only improves training efficiency and accuracy but can also be extended to handle noisily labeled data; 3) the efficient attention mechanism improves feature representation by highlighting feature importance. Experimental results on benchmark datasets show that the proposed method is competitive with state-of-the-art methods.
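The abstract does not give the form of the ranking loss. As an illustration only, the sketch below shows a generic margin-based pairwise ranking loss for a single instance: every relevant label should score at least `margin` higher than every irrelevant one. The function name `pairwise_ranking_loss`, the `margin` parameter, and the hinge formulation are assumptions made for this example and are not taken from the paper.

```python
def pairwise_ranking_loss(scores, labels, margin=1.0):
    """Mean hinge loss over all (relevant, irrelevant) label pairs.

    scores: list of real-valued relevance scores, one per label.
    labels: list of 0/1 ground-truth flags (1 = relevant), same length.
    NOTE: a generic illustrative loss, not the paper's exact formulation.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        # No pairs to rank, so there is nothing to penalize.
        return 0.0
    # Penalize each pair where the relevant label does not beat the
    # irrelevant label by at least `margin`.
    total = sum(max(0.0, margin - (p - n)) for p in pos for n in neg)
    return total / (len(pos) * len(neg))


# Hypothetical scores for three labels; only the first label is relevant.
loss = pairwise_ranking_loss([2.0, 0.0, 1.5], [1, 0, 0])
```

In this example the relevant label beats the second label by the full margin but beats the third by only 0.5, so only that pair contributes to the loss. A loss of this kind depends only on the relative order of scores, which is one common reason ranking objectives are preferred over per-label classification losses when the label set is very large.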