Probing for Bridging Inference in Transformer Language Models

Abstract

We probe pre-trained transformer language models for bridging inference. We first investigate individual attention heads in BERT and observe that attention heads at higher layers focus on bridging relations more prominently than those at the lower and middle layers; in addition, a few specific attention heads concentrate consistently on bridging. More importantly, our second approach considers the language model as a whole: bridging anaphora resolution is formulated as a masked token prediction task (Of-Cloze test). This formulation produces promising results without any fine-tuning, which indicates that pre-trained language models substantially capture bridging inference. Our further investigation shows that the distance between anaphor and antecedent, and the context provided to the language model, play an important role in the inference.
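As an illustration of the Of-Cloze formulation, the sketch below ranks candidate antecedents by how well a pre-trained masked language model fills an "of [MASK]" slot after the anaphor. This is a minimal sketch using the HuggingFace transformers library, not the authors' code; the example sentence, the candidate list, and the single-token scoring are illustrative assumptions.

# Of-Cloze sketch: rephrase the bridging anaphor ("the door") as
# "the door of [MASK]" and rank candidate antecedents by the model's
# logit at the masked position. Example text and candidates are
# hypothetical; candidates are assumed to be single tokens in the vocab.
import torch
from transformers import BertTokenizer, BertForMaskedLM

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertForMaskedLM.from_pretrained("bert-base-uncased")
model.eval()

text = "I walked into the room. The door of [MASK] was open."
candidates = ["room", "house", "car"]

inputs = tokenizer(text, return_tensors="pt")
# Locate the single [MASK] token in the input sequence.
mask_pos = (inputs["input_ids"][0] == tokenizer.mask_token_id).nonzero(as_tuple=True)[0].item()

with torch.no_grad():
    logits = model(**inputs).logits[0, mask_pos]

# Score each candidate antecedent by its logit at the masked position.
scores = {c: logits[tokenizer.convert_tokens_to_ids(c)].item() for c in candidates}
print(max(scores, key=scores.get))  # highest-scoring candidate, e.g. "room"

Note that this probe requires no fine-tuning: the ranking comes entirely from the pre-trained model's fill-in probabilities, which is what lets the Of-Cloze test measure how much bridging inference the model already captures.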