Journal: IEEE Transactions on Information Theory

Efficient Computation of Normalized Maximum Likelihood Codes for Gaussian Mixture Models With Its Applications to Clustering



Abstract

This paper addresses the problem of estimating, from a given data sequence, the number of mixture components of a Gaussian mixture model (GMM). Our approach is to compute the normalized maximum likelihood (NML) code length of the data sequence relative to a GMM and then, following the minimum description length principle, to select the mixture size that minimizes this code length. For finite domains, Kontkanen and Myllymäki proposed a method for efficiently computing the NML code length for specific models; for general model classes over infinite domains, however, how to compute the NML code length efficiently has remained open. We first propose a general method for calculating the NML code length for a general exponential family, and then apply it to the efficient computation of the NML code length for a GMM. The key idea is to restrict the data domain and to combine this restriction with a generating-function technique for computing the normalization term of the GMM. Using artificial datasets, we empirically demonstrate that our estimate of the mixture size converges to the true one significantly faster than estimates based on other criteria.
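To make the notion of an NML code length concrete, the following is a minimal sketch of the finite-domain case that the abstract mentions, not of the paper's GMM method. It assumes a multinomial model with K categories, for which the code length of a sequence is the negative maximized log-likelihood plus the logarithm of a normalizer C_K(n) summed over all sequences of length n. The recurrence C_{K+2}(n) = C_{K+1}(n) + (n/K) C_K(n) used here is the linear-time scheme commonly attributed to Kontkanen and Myllymäki; the function names are hypothetical and chosen only for this illustration.

```python
import math


def multinomial_nml_normalizer(num_categories, n):
    """Return C_K(n), the NML normalizer of a K-category multinomial for n samples."""
    if n == 0 or num_categories == 1:
        return 1.0  # only one possible maximized likelihood value, equal to 1
    c_prev = 1.0  # C_1(n)
    # C_2(n) computed directly; Python evaluates 0.0 ** 0 as 1.0, as required here.
    c_curr = sum(
        math.comb(n, h) * (h / n) ** h * ((n - h) / n) ** (n - h)
        for h in range(n + 1)
    )
    # Recurrence: C_{k+2}(n) = C_{k+1}(n) + (n / k) * C_k(n)
    for k in range(1, num_categories - 1):
        c_prev, c_curr = c_curr, c_curr + (n / k) * c_prev
    return c_curr


def nml_code_length(counts):
    """NML code length in nats of a sequence summarized by its category counts."""
    n = sum(counts)
    # Maximized log-likelihood: sum_i c_i * log(c_i / n), skipping empty categories.
    max_log_lik = sum(c * math.log(c / n) for c in counts if c > 0)
    return -max_log_lik + math.log(multinomial_nml_normalizer(len(counts), n))


if __name__ == "__main__":
    print(multinomial_nml_normalizer(3, 2))
    print(nml_code_length([1, 1]))
```

For continuous models such as a GMM, the corresponding normalizer is an integral over an unbounded data domain, which is why the abstract describes restricting the data domain before computing it.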
