
How multilingual is Multilingual BERT?

Abstract

In this paper, we show that Multilingual BERT (M-BERT), released by Devlin et al. (2019) as a single language model pre-trained on monolingual corpora in 104 languages, is surprisingly good at zero-shot cross-lingual model transfer, in which task-specific annotations in one language are used to fine-tune the model for evaluation in another language. To understand why, we present a large number of probing experiments, showing that transfer is possible even to languages in different scripts, that transfer works best between typologically similar languages, that monolingual corpora can train models for code-switching, and that the model can find translation pairs. From these results, we can conclude that M-BERT does create multilingual representations, but that these representations exhibit systematic deficiencies affecting certain language pairs.
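To make the zero-shot transfer setup concrete, here is a minimal sketch using the public "bert-base-multilingual-cased" checkpoint via the Hugging Face transformers library. It fine-tunes M-BERT on a toy English sentence-classification task and then evaluates on a Spanish sentence that contributed no training labels. The tiny English/Spanish examples and the binary classification task are illustrative assumptions, not the paper's actual benchmarks or data.

```python
# Minimal sketch of zero-shot cross-lingual transfer with M-BERT.
# Assumes: transformers + torch installed; toy placeholder data.
import torch
from torch.optim import AdamW
from transformers import AutoTokenizer, AutoModelForSequenceClassification

MODEL_NAME = "bert-base-multilingual-cased"  # the public M-BERT checkpoint
tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForSequenceClassification.from_pretrained(MODEL_NAME, num_labels=2)

# Task-specific annotations in one language (here: English) ...
train_texts = ["I loved this film.", "This film was terrible."]
train_labels = torch.tensor([1, 0])

# ... are used to fine-tune the shared multilingual encoder (one toy step).
optimizer = AdamW(model.parameters(), lr=2e-5)
model.train()
batch = tokenizer(train_texts, padding=True, return_tensors="pt")
loss = model(**batch, labels=train_labels).loss
loss.backward()
optimizer.step()

# Evaluation happens in another language (here: Spanish) whose labels were
# never seen during fine-tuning: zero-shot cross-lingual transfer.
model.eval()
test_batch = tokenizer(["Esta película fue horrible."], return_tensors="pt")
with torch.no_grad():
    pred = model(**test_batch).logits.argmax(dim=-1)
print(pred)  # predicted class index for the Spanish sentence
```

In a real experiment the fine-tuning loop would iterate over a full labeled dataset in the source language, and evaluation would be run over a labeled test set in each target language to measure how accuracy varies with script and typological similarity.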
