Asia-Pacific Bioinformatics Conference (APBC 2003); February 2003; Adelaide (AU)

Model-Based Clustering in Gene Expression Microarrays: An Application to Breast Cancer Data

Abstract

In microarray studies, clustering techniques are often applied to derive meaningful insights from the data. In the past, hierarchical methods have been the primary clustering tool for this task. However, attention is now turning to model-based clustering approaches. Hierarchical algorithms have mainly been applied heuristically to these cluster analysis problems, and a major limitation of these methods is their inability to determine the number of clusters. Thus there is a need for a model-based approach to these clustering problems. To this end, McLachlan et al. (2002) developed a mixture model-based algorithm (EMMIX-GENE) for the clustering of tissue samples. To further investigate the EMMIX-GENE procedure as a model-based approach, we present a case study involving the application of EMMIX-GENE to the breast cancer data studied recently in van't Veer et al. (2002). Our analysis considers the problem of clustering the tissue samples on the basis of the genes, which is a non-standard problem because the number of genes greatly exceeds the number of tissue samples in a typical study. We demonstrate how EMMIX-GENE can be useful in reducing the initial set of genes to a more computationally manageable size. The results of this analysis also emphasise the difficulty of separating two tissue groups on the basis of a particular subset of genes. These results also shed light on why supervised methods have such a high misallocation error rate for the breast cancer data.
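
To make concrete the claim that a model-based approach can address the number of clusters, here is a minimal sketch. It is not the EMMIX-GENE procedure and does not use the breast cancer data: it fits a plain one-dimensional Gaussian mixture by EM to synthetic values and compares candidate numbers of components with the Bayesian information criterion (BIC). The choice of BIC, the function names (`fit_mixture`, `bic`, `choose_components`), and the simulated data are all assumptions made for illustration.

```python
import math
import random


def normal_pdf(x, mean, var):
    """Density of a univariate normal distribution."""
    return math.exp(-(x - mean) ** 2 / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def fit_mixture(data, g, iters=200):
    """Fit a g-component 1-D Gaussian mixture by EM (illustrative helper).

    Returns (log_likelihood, weights, means, variances).
    """
    n = len(data)
    srt = sorted(data)
    # Deterministic start: means at evenly spaced quantiles, shared overall variance.
    means = [srt[int((k + 0.5) * n / g)] for k in range(g)]
    overall_mean = sum(data) / n
    overall_var = sum((x - overall_mean) ** 2 for x in data) / n
    variances = [overall_var] * g
    weights = [1.0 / g] * g
    for _ in range(iters):
        # E-step: posterior probability that each point belongs to each component.
        resp = []
        for x in data:
            dens = [weights[k] * normal_pdf(x, means[k], variances[k]) for k in range(g)]
            total = sum(dens)
            resp.append([d / total for d in dens])
        # M-step: weighted updates of mixing proportions, means and variances.
        for k in range(g):
            nk = sum(r[k] for r in resp)
            weights[k] = nk / n
            means[k] = sum(r[k] * x for r, x in zip(resp, data)) / nk
            var_k = sum(r[k] * (x - means[k]) ** 2 for r, x in zip(resp, data)) / nk
            variances[k] = max(var_k, 1e-6)  # floor guards against a collapsing component
    loglik = sum(
        math.log(sum(weights[k] * normal_pdf(x, means[k], variances[k]) for k in range(g)))
        for x in data
    )
    return loglik, weights, means, variances


def bic(loglik, g, n):
    """BIC with g means, g variances and g - 1 free mixing proportions."""
    n_params = 3 * g - 1
    return -2.0 * loglik + n_params * math.log(n)


def choose_components(data, candidates=(1, 2)):
    """Return the candidate number of components with the smallest BIC, plus all scores."""
    scores = {g: bic(fit_mixture(data, g)[0], g, len(data)) for g in candidates}
    return min(scores, key=scores.get), scores


# Synthetic stand-in data (assumption): two well-separated groups of 60 values each.
random.seed(0)
data = [random.gauss(0.0, 1.0) for _ in range(60)] + [random.gauss(6.0, 1.0) for _ in range(60)]
best_g, scores = choose_components(data)
print(best_g, scores)
```

Because every candidate model yields a likelihood, the fits can be compared on a common scale, which a purely heuristic hierarchical clustering does not provide. In a real microarray setting the data are high-dimensional with far more genes than samples, which is why the abstract stresses reducing the gene set first.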