
Cyclical Annealing Schedule: A Simple Approach to Mitigating KL Vanishing

Abstract

Variational autoencoders (VAEs) with an auto-regressive decoder have been applied to many natural language processing (NLP) tasks. The VAE objective consists of two terms, (i) reconstruction and (ii) KL regularization, balanced by a weighting hyper-parameter β. One notorious training difficulty is that the KL term tends to vanish. In this paper we study scheduling schemes for β, and show that KL vanishing is caused by the lack of good latent codes in training the decoder at the beginning of optimization. To remedy this, we propose a cyclical annealing schedule, which repeats the process of increasing β multiple times. This new procedure allows the progressive learning of more meaningful latent codes, by leveraging the informative representations of previous cycles as warm re-starts. The effectiveness of cyclical annealing is validated on a broad range of NLP tasks, including language modeling, dialog response generation and unsupervised language pre-training.
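
As a rough illustration of the idea of repeating the increase of β several times, the following minimal sketch computes a KL weight per training step. The abstract does not specify the shape of the ramp, the number of cycles, or how long β is held at its maximum, so the linear ramp and the hypothetical parameters `n_cycles`, `ramp_fraction`, and `beta_max` are assumptions made here for illustration only.

```python
def cyclical_beta(step, total_steps, n_cycles=4, ramp_fraction=0.5, beta_max=1.0):
    """Return an illustrative KL weight (beta) for a given training step.

    Assumptions (not taken from the abstract): training is split into
    `n_cycles` equal cycles; within each cycle beta rises linearly from 0
    to `beta_max` over the first `ramp_fraction` of the cycle, then stays
    at `beta_max` until the cycle ends and beta restarts from 0.
    """
    if total_steps <= 0 or n_cycles <= 0:
        raise ValueError("total_steps and n_cycles must be positive")
    if not 0.0 < ramp_fraction <= 1.0:
        raise ValueError("ramp_fraction must be in (0, 1]")

    cycle_len = total_steps / n_cycles
    # Relative position inside the current cycle, in [0, 1).
    position = (step % cycle_len) / cycle_len
    # Linear ramp, clipped once the ramp portion of the cycle is over.
    return beta_max * min(1.0, position / ramp_fraction)


if __name__ == "__main__":
    # Print beta at a few steps of a 100-step run with 4 cycles.
    for s in (0, 5, 12, 20, 25, 30):
        print(s, cyclical_beta(s, total_steps=100))
```

In a training loop, the returned value would weight the KL term of the objective, for example `loss = reconstruction_loss + beta * kl_loss`, where both loss names are placeholders for whatever the model computes.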
