Journal: Optik: Zeitschrift für Licht- und Elektronenoptik = Journal for Light- and Electronoptic

Clustering gene expression data analysis using an improved EM algorithm based on multivariate elliptical contoured mixture models


Abstract

Clustering gene expression data is an important research topic in bioinformatics because knowing which genes act similarly can lead to the discovery of important biological information. Many clustering algorithms have been applied to gene clustering. The multivariate Gaussian distribution is frequently used as the component density of finite mixture models for clustering; however, clusters in real datasets are not necessarily normally distributed. To make the clustering algorithm more adaptable, this paper proposes a new scheme for clustering gene expression data based on multivariate elliptical contoured mixture models (MECMMs). To reduce the over-reliance on initialization, we propose an improved expectation maximization (EM) algorithm that adds and deletes initial values in the classical EM algorithm; the number of clusters is treated as a parameter and inferred with the QAIC criterion. The improved EM algorithm based on MECMMs is tested on several simulated and real gene expression datasets and compared extensively with other clustering algorithms. Our results indicate that the improved EM clustering algorithm is superior to the classical EM algorithm and the support vector machines (SVMs) algorithm, and that it can be widely used for gene clustering.
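For readers unfamiliar with the baseline the abstract refers to, the following is a minimal sketch of classical EM for a finite mixture, with the number of clusters chosen by an information criterion. It is not the paper's method: it assumes one-dimensional Gaussian components rather than multivariate elliptical contoured ones, uses plain AIC rather than QAIC, and omits the add/delete step for initial values. The function names (`fit_gmm_1d`, `aic`, `select_k`) are hypothetical and chosen only for illustration.

```python
import math
import random


def normal_pdf(x, mean, var):
    """Density of a 1-D normal distribution with the given mean and variance."""
    return math.exp(-((x - mean) ** 2) / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def mixture_density(x, weights, means, variances):
    """Mixture density at x, floored to avoid log(0) and division by zero."""
    total = sum(w * normal_pdf(x, m, v) for w, m, v in zip(weights, means, variances))
    return max(total, 1e-300)


def fit_gmm_1d(data, k, iters=100, seed=0):
    """Classical EM for a 1-D Gaussian mixture (assumption: not the paper's MECMM).

    Returns (weights, means, variances, log_likelihood).
    """
    rng = random.Random(seed)
    n = len(data)
    grand_mean = sum(data) / n
    grand_var = sum((x - grand_mean) ** 2 for x in data) / n
    # Random initialization: the classical EM result depends on this choice.
    weights = [1.0 / k] * k
    means = rng.sample(data, k)
    variances = [max(grand_var, 1e-6)] * k
    for _ in range(iters):
        # E-step: posterior probability that each point belongs to each component.
        resp = []
        for x in data:
            total = mixture_density(x, weights, means, variances)
            resp.append([w * normal_pdf(x, m, v) / total
                         for w, m, v in zip(weights, means, variances)])
        # M-step: re-estimate parameters from the weighted points.
        for j in range(k):
            nj = max(sum(r[j] for r in resp), 1e-12)
            weights[j] = nj / n
            means[j] = sum(r[j] * x for r, x in zip(resp, data)) / nj
            var_j = sum(r[j] * (x - means[j]) ** 2 for r, x in zip(resp, data)) / nj
            variances[j] = max(var_j, 1e-6)  # floor to keep components from collapsing
    loglik = sum(math.log(mixture_density(x, weights, means, variances)) for x in data)
    return weights, means, variances, loglik


def aic(loglik, k):
    """Plain AIC (assumption: stands in for QAIC); k means, k variances, k-1 free weights."""
    n_params = 3 * k - 1
    return 2 * n_params - 2.0 * loglik


def select_k(data, candidates=(1, 2, 3, 4)):
    """Fit one mixture per candidate k and return (best_k, scores); lower AIC is better."""
    scores = {k: aic(fit_gmm_1d(data, k)[3], k) for k in candidates}
    return min(scores, key=scores.get), scores
```

Calling `select_k(values)` on a list of floats returns the candidate cluster count with the lowest AIC together with the per-candidate scores. Because the sketch uses a single random initialization per fit, its result can depend on `seed`, which is the sensitivity the abstract's improved EM algorithm is meant to reduce.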
