German's Next Language Model

Abstract

In this work we present the experiments which led to the creation of our BERT- and ELECTRA-based German language models, GBERT and GELECTRA. By varying the input training data, model size, and the presence of Whole Word Masking (WWM), we were able to attain SoTA performance across a set of document classification and named entity recognition (NER) tasks for both base- and large-size models. We adopt an evaluation-driven approach in training these models, and our results indicate that both adding more data and utilizing WWM improve model performance. By benchmarking against existing German models, we show that these models are the best German models to date. Our trained models will be made publicly available to the research community.
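The abstract names Whole Word Masking (WWM) as one of the training choices that was varied. A minimal Python sketch of the idea is given below; it assumes WordPiece-style "##" continuation markers and uses illustrative tokens and a hypothetical helper function, not the paper's actual training code.

```python
import random

def whole_word_masking(tokens, mask_prob=0.15, mask_token="[MASK]"):
    """Conceptual sketch of Whole Word Masking (WWM).

    Standard masked language modelling masks individual WordPiece
    subtokens; WWM instead masks every subtoken of a selected word,
    so the model cannot recover a masked piece from its neighbours
    within the same word.
    """
    # Group subtokens into whole words: a token starting with "##"
    # continues the word begun by the previous token.
    words = []
    for tok in tokens:
        if tok.startswith("##") and words:
            words[-1].append(tok)
        else:
            words.append([tok])

    masked = []
    for word in words:
        if random.random() < mask_prob:
            # Mask all pieces of the chosen word at once.
            masked.extend([mask_token] * len(word))
        else:
            masked.extend(word)
    return masked


if __name__ == "__main__":
    random.seed(0)
    # "Sprachmodell" split by a WordPiece tokenizer into two pieces.
    tokens = ["das", "deutsche", "Sprach", "##modell", "ist", "gut"]
    print(whole_word_masking(tokens, mask_prob=0.3))
```

In contrast, per-subtoken masking could leave "Sprach" visible while masking only "##modell", which makes the prediction task easier; masking whole words is the harder objective the abstract credits with improving performance.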
