Apparatus for generating a statistical sequence model called class bi-multigram model with bigram dependencies assumed between adjacent sequences

Abstract

An apparatus generates a statistical class sequence model, called a class bi-multigram model, from input training strings of discrete-valued units, where bigram dependencies are assumed between adjacent variable-length sequences of at most N units, and where class labels are assigned to the sequences. The apparatus counts the number of times each sequence of units occurs, as well as the number of times each pair of sequences co-occurs in the input training strings. An initial bigram probability distribution over all pairs of sequences is computed as the number of times the two sequences co-occur, divided by the number of times the first sequence occurs in the input training strings. Then the input sequences are classified into a pre-specified desired number of classes. Further, an estimate of the bigram probability distribution of the sequences is calculated by using an EM algorithm to maximize the likelihood of the input training strings computed with the input probability distributions. These processes are then performed iteratively to generate the statistical class sequence model.
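The counting and initialization step described in the abstract can be illustrated with a minimal sketch. It assumes the training string is a list of discrete units and that "co-occur" means the second sequence starts immediately after the first one ends. The function name `initial_bigram_probs` and the parameter `max_len` (standing in for N) are hypothetical, chosen only for illustration. The classification and EM re-estimation steps are not covered here.

```python
from collections import Counter


def initial_bigram_probs(units, max_len):
    """Relative-frequency estimate p(second | first) for adjacent sequences.

    Assumption: a pair (first, second) co-occurs when `second` starts
    at the position where `first` ends. Sequences have 1..max_len units.
    """
    n = len(units)
    seq_counts = Counter()   # occurrences of each sequence
    pair_counts = Counter()  # occurrences of each adjacent pair

    for i in range(n):
        for a in range(1, max_len + 1):
            if i + a > n:
                break
            first = tuple(units[i:i + a])
            seq_counts[first] += 1
            for b in range(1, max_len + 1):
                if i + a + b > n:
                    break
                second = tuple(units[i + a:i + a + b])
                pair_counts[(first, second)] += 1

    # Pair count divided by the count of the first sequence.
    return {
        (first, second): count / seq_counts[first]
        for (first, second), count in pair_counts.items()
    }


probs = initial_bigram_probs(list("abab"), max_len=2)
print(probs[(("a",), ("b",))])
```

These values are raw relative frequencies taken over every possible segmentation of the string, so they serve only as a starting point for the iterative re-estimation that the abstract describes.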