IEEE International Symposium on Information Theory

A Provably Convergent Information Bottleneck Solution via ADMM


Abstract

The information bottleneck (IB) method enables optimizing the trade-off between compression of the data and prediction accuracy of the learned representation, and has been applied successfully and robustly to both supervised and unsupervised representation learning problems. However, IB has several limitations. First, the IB problem is hard to optimize. The IB Lagrangian $\mathcal{L}_{IB} := I(X;Z) - \beta I(Y;Z)$ is non-convex, and existing solutions guarantee only local convergence. As a result, the obtained solutions depend on initialization. Second, evaluating a solution is also a challenging task. Conventionally, it resorts to characterizing the information plane, that is, plotting $I(Y;Z)$ versus $I(X;Z)$ for all solutions obtained from different initial points. Furthermore, the IB Lagrangian exhibits phase transitions as the multiplier $\beta$ varies. At phase transitions, both $I(X;Z)$ and $I(Y;Z)$ increase abruptly, and the rate of convergence of existing solutions becomes significantly slower. Recent works on IB adopt variational surrogate bounds to the IB Lagrangian. Although these allow efficient optimization, how close the surrogates are to the IB Lagrangian is unclear. In this work, we solve the IB Lagrangian using augmented Lagrangian methods. With augmented variables, we show that the IB objective can be solved with the alternating direction method of multipliers (ADMM). Different from prior works, we prove that the proposed algorithm is consistently convergent, regardless of the value of $\beta$. Empirically, our gradient-descent-based method yields information plane points comparable to those obtained by conventional Blahut-Arimoto-based solvers, and converges for a wider range of the penalty coefficient than previous ADMM-based solvers.
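For concreteness, the following is a minimal sketch of how the IB Lagrangian and the corresponding information plane point can be evaluated for discrete variables, assuming the standard Markov chain Z - X - Y so that p(y, z) = Σ_x p(x, y) p(z|x). The function names and the random example are illustrative only; this is not the paper's ADMM solver.

```python
import numpy as np

def mutual_information(p_ab):
    """I(A;B) in nats for a joint distribution given as a 2-D array."""
    p_a = p_ab.sum(axis=1, keepdims=True)
    p_b = p_ab.sum(axis=0, keepdims=True)
    mask = p_ab > 0
    return float(np.sum(p_ab[mask] * np.log(p_ab[mask] / (p_a @ p_b)[mask])))

def ib_lagrangian(p_xy, p_z_given_x, beta):
    """Evaluate L_IB = I(X;Z) - beta * I(Y;Z) for a stochastic encoder p(z|x).

    p_xy        : joint distribution of (X, Y), shape (|X|, |Y|)
    p_z_given_x : encoder, shape (|X|, |Z|), each row sums to 1
    Returns the Lagrangian value and the information plane point (I(X;Z), I(Y;Z)).
    """
    p_x = p_xy.sum(axis=1)                 # marginal p(x)
    p_xz = p_x[:, None] * p_z_given_x      # p(x, z) = p(x) p(z|x)
    p_yz = p_xy.T @ p_z_given_x            # p(y, z) under the chain Z - X - Y
    i_xz = mutual_information(p_xz)
    i_yz = mutual_information(p_yz)
    return i_xz - beta * i_yz, (i_xz, i_yz)

# Illustrative usage with a random joint distribution and encoder.
rng = np.random.default_rng(0)
p_xy = rng.random((4, 3)); p_xy /= p_xy.sum()
enc = rng.random((4, 2)); enc /= enc.sum(axis=1, keepdims=True)
value, (i_xz, i_yz) = ib_lagrangian(p_xy, enc, beta=2.0)
print(f"L_IB = {value:.4f}, information plane point = ({i_xz:.4f}, {i_yz:.4f})")
```

Scanning $\beta$ and plotting $I(Y;Z)$ against $I(X;Z)$ for the encoders returned by a solver traces out the information plane described in the abstract.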
