Benefits from Variational Regularization in Language Models

Ferner Cornelia; Wegenkittl Stefan

Abstract

Representations from common pre-trained language models have been shown to suffer from the degeneration problem, i.e., they occupy a narrow cone in latent space. This problem can be addressed by enforcing isotropy in latent space. In analogy with variational autoencoders, we suggest applying a token-level variational loss to a Transformer architecture and optimizing the standard deviation of the prior distribution in the loss function as the model parameter to increase isotropy. The resulting latent space is complete and interpretable: any given point is a valid embedding and can be decoded into text again. This allows for text manipulations such as paraphrase generation directly in latent space. Surprisingly, features extracted at the sentence level also show competitive results on benchmark classification tasks.
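The token-level variational loss mentioned in the abstract can be illustrated with the usual VAE-style KL term, here written against a zero-mean Gaussian prior whose standard deviation is kept as a free parameter. This is only a minimal sketch under assumptions: the abstract does not give the exact loss, so the diagonal-Gaussian posterior, the closed-form KL, and the names `token_kl`, `reparameterize`, and `log_prior_sigma` are illustrative and not taken from the paper.

```python
import numpy as np


def token_kl(mu, log_sigma, log_prior_sigma):
    """KL( N(mu, sigma^2) || N(0, prior_sigma^2) ) for each token.

    mu, log_sigma: arrays of shape (..., latent_dim) holding the posterior
    mean and log standard deviation for every token (hypothetical encoder
    outputs). log_prior_sigma: scalar log standard deviation of the prior;
    in a trainable model this would be optimized with the other weights.
    """
    sigma_sq = np.exp(2.0 * log_sigma)
    prior_sq = np.exp(2.0 * log_prior_sigma)
    # Closed-form KL between two univariate Gaussians, per latent dimension.
    kl_per_dim = (
        (log_prior_sigma - log_sigma)
        + (sigma_sq + mu ** 2) / (2.0 * prior_sq)
        - 0.5
    )
    # Sum over latent dimensions to get one value per token.
    return kl_per_dim.sum(axis=-1)


def reparameterize(mu, log_sigma, rng):
    """Draw z = mu + sigma * eps with eps ~ N(0, I)."""
    eps = rng.standard_normal(mu.shape)
    return mu + np.exp(log_sigma) * eps


# Toy shapes: batch of 2 sequences, 5 tokens each, 8 latent dimensions.
rng = np.random.default_rng(0)
mu = rng.standard_normal((2, 5, 8))
log_sigma = np.full((2, 5, 8), -1.0)
log_prior_sigma = 0.0  # prior standard deviation of 1

kl = token_kl(mu, log_sigma, log_prior_sigma)  # one KL value per token
z = reparameterize(mu, log_sigma, rng)         # sampled token embeddings
```

In a full model this KL term would be added to the reconstruction (language modeling) loss, and the gradient with respect to `log_prior_sigma` would adjust the width of the prior during training.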

Bibliographic details

Source: Machine Learning and Knowledge Extraction, 2022, Issue 2, pp. 542-555 (14 pages)
Authors: Ferner Cornelia; Wegenkittl Stefan
Affiliation: Salzburg Univ Appl Sci
Original format: PDF
Language: English
Keywords: language models; regularization; isotropy; generalizability; semantic reasoning

