Journal of the American Statistical Association

Variable Selection for Skewed Model-Based Clustering: Application to the Identification of Novel Sleep Phenotypes


Abstract

In sleep research, applying finite mixture models to sleep characteristics captured through multiple data types, including self-reported sleep diaries, wrist-worn movement monitoring (actigraphy), and brain-wave recording (polysomnography), may suggest new phenotypes that reflect underlying disease mechanisms. However, applying a mixture model directly is challenging because there are many sleep variables from which to choose, and sleep variables are often highly skewed even in homogeneous samples. Moreover, previous sleep research findings indicate that some of the most clinically interesting solutions will be those that incorporate all three data types. Thus, we present two novel skewed variable selection algorithms based on the multivariate skew normal (MSN) distribution: one that selects the best set of variables ignoring data type, and another that embraces the exploratory nature of clustering and suggests multiple statistically plausible sets of variables that each incorporate all data types. Through a simulation study, we empirically compare our approach with other asymmetric and normal dimension reduction strategies for clustering. Finally, we demonstrate our methods using a sample of older adults with and without insomnia. The proposed MSN-based variable selection algorithm appears to be suitable for both MSN and multivariate normal cluster distributions, especially with moderate to large sample sizes. Supplementary materials for this article are available online.
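The abstract names the MSN distribution without stating its density. As a rough illustration only, the sketch below evaluates the log-density under one common parameterization (Azzalini's form, f(y) = 2 φ_p(y; ξ, Ω) Φ(αᵀ ω⁻¹ (y − ξ)), with ω the diagonal matrix of marginal scales). Whether the article uses this exact parameterization is an assumption; the function name `msn_logpdf` and its arguments are hypothetical, and this is not the authors' variable selection algorithm.

```python
import math
import numpy as np

def msn_logpdf(y, xi, omega, alpha):
    """Log-density of a multivariate skew normal in Azzalini's form (assumed).

    y     : (n, p) observations, or a single (p,) observation
    xi    : (p,) location vector
    omega : (p, p) positive-definite scale matrix
    alpha : (p,) skewness vector; alpha = 0 recovers the multivariate normal
    """
    y = np.atleast_2d(np.asarray(y, dtype=float))
    xi = np.asarray(xi, dtype=float)
    omega = np.asarray(omega, dtype=float)
    alpha = np.asarray(alpha, dtype=float)
    p = xi.shape[0]
    diff = y - xi                                   # shape (n, p)

    # Multivariate normal log-density via a Cholesky factor of omega.
    chol = np.linalg.cholesky(omega)
    z = np.linalg.solve(chol, diff.T)               # shape (p, n)
    maha = np.sum(z ** 2, axis=0)                   # squared Mahalanobis distances
    log_det = 2.0 * np.sum(np.log(np.diag(chol)))
    log_phi = -0.5 * (p * math.log(2.0 * math.pi) + log_det + maha)

    # Skewing factor: standard normal CDF at alpha^T omega_diag^{-1} (y - xi).
    scale = np.sqrt(np.diag(omega))
    t = (diff / scale) @ alpha                      # shape (n,)
    # erfc keeps precision in the lower tail; it underflows only for extreme t.
    cdf = np.array([0.5 * math.erfc(-v / math.sqrt(2.0)) for v in t])

    return math.log(2.0) + log_phi + np.log(cdf)
```

In a mixture-model setting, a function like this would supply the per-cluster log-likelihood terms. Setting `alpha` to zero reduces it to the ordinary multivariate normal log-density, which is one way to see why an MSN-based method can also accommodate normal clusters.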
