Model-based clustering of Gaussian copulas for mixed data

Marbac Matthieu; Biernacki Christophe; Vandewalle Vincent

Abstract

Clustering of mixed data is important yet challenging due to a shortage of conventional distributions for such data. In this article, we propose a mixture model of Gaussian copulas for clustering mixed data. Indeed, copulas, and Gaussian copulas in particular, are powerful tools for easily modeling the distribution of multivariate variables. This model clusters data sets with continuous, integer, and ordinal variables (all having a cumulative distribution function) by considering the intra-component dependencies in a similar way to the Gaussian mixture. Indeed, each component of the Gaussian copula mixture produces a correlation coefficient for each pair of variables and its univariate margins follow standard distributions (Gaussian, Poisson, and ordered multinomial) depending on the nature of the variable (continuous, integer, or ordinal). As an interesting by-product, this model generalizes many well-known approaches and provides tools for visualization based on its parameters. Bayesian inference is performed with a Metropolis-within-Gibbs sampler. Numerical experiments on simulated and real data illustrate the benefits of the proposed model: flexible and meaningful parameterization combined with visualization features.
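To make the component structure described in the abstract concrete, the following is a minimal sketch of how a single Gaussian copula component could generate one continuous, one integer, and one ordinal variable. It is not the authors' implementation: it only simulates from one component and does not perform the Metropolis-within-Gibbs inference or the mixture step. All function names, the correlation matrix, and the margin parameters are illustrative assumptions.

```python
import math
import random
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal, the latent scale of the copula


def cholesky(matrix):
    """Lower-triangular Cholesky factor of a small positive-definite matrix."""
    n = len(matrix)
    lower = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            partial = sum(lower[i][k] * lower[j][k] for k in range(j))
            if i == j:
                lower[i][j] = math.sqrt(matrix[i][i] - partial)
            else:
                lower[i][j] = (matrix[i][j] - partial) / lower[j][j]
    return lower


def poisson_quantile(u, rate):
    """Smallest integer k with P(X <= k) >= u for X ~ Poisson(rate)."""
    u = min(u, 1.0 - 1e-12)  # guard against u rounding to exactly 1.0
    k = 0
    pmf = math.exp(-rate)
    cdf = pmf
    while cdf < u:
        k += 1
        pmf *= rate / k
        cdf += pmf
    return k


def ordinal_quantile(u, probs):
    """Smallest level index whose cumulative probability reaches u."""
    cdf = 0.0
    for level, p in enumerate(probs):
        cdf += p
        if u <= cdf:
            return level
    return len(probs) - 1  # fallback if rounding keeps the total below u


def sample_component(n, corr, mean, sd, rate, probs, rng):
    """Draw n rows (continuous, integer, ordinal) from one copula component."""
    lower = cholesky(corr)
    rows = []
    for _ in range(n):
        eps = [rng.gauss(0.0, 1.0) for _ in range(3)]
        # Correlated latent Gaussian vector z = L @ eps with unit variances.
        z = [sum(lower[i][k] * eps[k] for k in range(i + 1)) for i in range(3)]
        # Uniform scores that carry the dependence between the variables.
        u = [STD_NORMAL.cdf(v) for v in z]
        rows.append((
            mean + sd * z[0],                # Gaussian margin (inverse CDF of u[0])
            poisson_quantile(u[1], rate),    # Poisson margin
            ordinal_quantile(u[2], probs),   # ordered multinomial margin
        ))
    return rows


# Illustrative parameters (assumptions, not values from the article).
rng = random.Random(0)
corr = [[1.0, 0.6, -0.3],
        [0.6, 1.0, 0.2],
        [-0.3, 0.2, 1.0]]
sample = sample_component(2000, corr, mean=10.0, sd=2.0, rate=4.0,
                          probs=[0.2, 0.5, 0.3], rng=rng)
```

In this sketch the correlation matrix plays the role of the per-component dependence parameter, while the three quantile transforms supply the Gaussian, Poisson, and ordered multinomial margins. A full mixture would draw a component label first and then apply that component's own correlation matrix and margins.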


Bibliographic details

Source
Communications in Statistics, 2017, issue 24, pp. 11635-11656 (22 pages)
Authors
Marbac Matthieu; Biernacki Christophe; Vandewalle Vincent
Author affiliations

Inria Lille, 40 Ave Halley, F-59650 Villeneuve d'Ascq, France | Univ Lille 1, Villeneuve d'Ascq, France;

Inria Lille, 40 Ave Halley, F-59650 Villeneuve d'Ascq, France | Univ Lille 1, Villeneuve d'Ascq, France | CNRS, Paris, France;

Inria Lille, 40 Ave Halley, F-59650 Villeneuve d'Ascq, France | Univ Lille 2, EA 2694, Lille, France

Original format: PDF
Language: English
Keywords
Clustering; Gaussian copula; Metropolis-within-Gibbs algorithm; mixed data; mixture models; visualization;


Similar literature

1. Vine copulas for mixed data: multi-view clustering for mixed data beyond meta-Gaussian dependencies [J]. Tekumalla Lavanya Sita, Rajan Vaibhav, Bhattacharyya Chiranjib. Machine Learning, 2017, no. 9-10.
2. Gaussian Copula Mixed Models for Clustered Mixed Outcomes, With Application in Developmental Toxicology [J]. Wu Beilei, de Leon Alexander R. Journal of Agricultural, Biological, and Environmental Statistics, 2014, no. 1.
3. Learning causal structure from mixed data with missing values using Gaussian copula models [J]. Cui Ruifei, Groot Perry, Heskes Tom. Statistics and Computing, 2019, no. 2.
4. Vine Copulas for Mixed Data: Multi-view Clustering for Mixed Data Beyond Meta-Gaussian Dependencies [C]. Lavanya Sita Tekumalla, Vaibhav Rajan, Chiranjib Bhattacharyya. European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases, 2017.
5. Latent Gaussian Copula Model for High Dimensional Mixed Data, and Its Applications [D]. Quan, Xiaoyun. 2020.
6. Bayesian Gaussian Copula Factor Models for Mixed Data [O]. Jared S. Murray, David B. Dunson, Lawrence Carin.
7. Model-based clustering of Gaussian copulas for mixed data [O]. Matthieu Marbac, Christophe Biernacki, Vincent Vandewalle. 2017.
8. Model-Based Gaussian and Non-Gaussian Clustering [R]. Banfield, J. D., Raftery, A. E. 1989.
