
Method, apparatus, device, and storage medium for acquiring word vectors based on a language model


Abstract

PROBLEM TO BE SOLVED: To provide a method that avoids the risk of information leakage caused by learning at character granularity, strengthens a language model's ability to learn word-meaning information, accelerates the convergence of word vectors, and improves the training effect. SOLUTION: The method inputs each of at least two first sample text corpora into a language model, and the language model outputs a context vector for the first word mask in each first sample text corpus. The word vector corresponding to each first word mask is determined based on a first word vector parameter matrix, a second word vector parameter matrix, and a fully connected matrix, respectively. The language model and the fully connected matrix are then trained based on the word vectors corresponding to the first word masks in the at least two first sample text corpora, to obtain the word vectors. [Selected drawing] Fig. 1
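The training step in the abstract can be illustrated with a small sketch. The abstract does not say how the two word vector parameter matrices and the fully connected matrix are combined, so the code below assumes that each matrix scores the context vector against the vocabulary and that the three score vectors are summed before a softmax. All names (`predict_masked_word`, `train_fc_step`, the toy matrices) are hypothetical. For brevity the context vectors are fixed stand-ins for language-model output and only the fully connected matrix is updated, whereas the abstract trains the language model as well.

```python
import math


def matvec(matrix, vector):
    """Multiply a (vocab x dim) matrix by a dim-length vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def predict_masked_word(context, first_matrix, second_matrix, fc_matrix):
    """Return a probability for each vocabulary word at the masked position.

    Assumption: the three matrices each score the context vector and the
    scores are summed. The abstract does not specify this combination.
    """
    s1 = matvec(first_matrix, context)
    s2 = matvec(second_matrix, context)
    s3 = matvec(fc_matrix, context)
    return softmax([a + b + c for a, b, c in zip(s1, s2, s3)])


def train_fc_step(contexts, targets, first_matrix, second_matrix, fc_matrix, lr=0.1):
    """One gradient step on fc_matrix over a batch of masked samples.

    Updates fc_matrix in place and returns the mean cross-entropy loss
    measured before the update.
    """
    vocab, dim = len(fc_matrix), len(fc_matrix[0])
    grad = [[0.0] * dim for _ in range(vocab)]
    loss = 0.0
    for context, target in zip(contexts, targets):
        probs = predict_masked_word(context, first_matrix, second_matrix, fc_matrix)
        loss -= math.log(probs[target])
        for w in range(vocab):
            # Gradient of softmax cross-entropy w.r.t. the score of word w.
            err = probs[w] - (1.0 if w == target else 0.0)
            for d in range(dim):
                grad[w][d] += err * context[d]
    n = len(contexts)
    for w in range(vocab):
        for d in range(dim):
            fc_matrix[w][d] -= lr * grad[w][d] / n
    return loss / n


if __name__ == "__main__":
    # Toy data: two samples (the abstract requires at least two), vocab of 3.
    contexts = [[1.0, 0.0], [0.0, 1.0]]   # stand-in context vectors
    targets = [0, 2]                      # indices of the masked words
    first = [[0.1, 0.0], [0.0, 0.1], [0.2, -0.1]]
    second = [[0.0, 0.1], [0.1, 0.0], [-0.1, 0.2]]
    fc = [[0.0, 0.0] for _ in range(3)]
    for step in range(3):
        print(step, train_fc_step(contexts, targets, first, second, fc))
```

Keeping the two word vector parameter matrices fixed in this sketch is a simplification; in a fuller version the language model that produces the context vectors would receive gradients in the same step.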
