Archetypal Analysis With Missing Data: See All Samples by Looking at a Few Based on Extreme Profiles

Epifanio Irene; Ibanez M. Victoria; Simo Amelia

Abstract

In this article, we propose several methodologies for handling missing or incomplete data in archetype analysis (AA) and archetypoid analysis (ADA). AA seeks to find archetypes, which are convex combinations of data points, and to approximate the samples as mixtures of those archetypes. In ADA, the representative archetypal data belong to the sample, that is, they are actual data points. With the proposed procedures, missing data are neither discarded nor filled in beforehand by imputation, and the theoretical properties regarding the location of the archetypes are guaranteed, unlike in previous approaches. The new procedures adapt the AA algorithm either by considering the missing values in the computation of the solution or by skipping them. In the first case, the solutions of previous approaches are modified to fulfill the theory, and a new procedure is proposed in which the missing values are updated by the fitted values. In the second case, the procedure is based on estimating the dissimilarities between samples and projecting these dissimilarities into a new space, where AA or ADA is applied; those results are then used to provide a solution in the original space. A comparative analysis is carried out in a simulation study, with favorable results. The methodology is also applied to two real datasets: a well-known climate dataset and a global development dataset. We illustrate how these unsupervised methodologies allow complex data to be understood, even by nonexperts. Supplementary materials for this article are available online.
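The second route described in the abstract (estimate dissimilarities while skipping missing values, then project them into a new space) can be illustrated with a minimal sketch. It assumes a common form of the partial distance strategy, in which squared differences are summed over the coordinates observed in both samples and rescaled by the share of coordinates used, followed by classical multidimensional scaling. The function names `partial_distances` and `classical_mds` are hypothetical, the toy data are invented for illustration, and the abstract does not confirm that the article uses exactly these formulas. The AA or ADA step that would then run on the projected coordinates is omitted.

```python
import numpy as np


def partial_distances(X):
    """Pairwise Euclidean distances computed only on coordinates observed in both rows.

    Assumed rescaling: sum of squared differences over the shared coordinates,
    multiplied by p / (number of shared coordinates), where p is the dimension.
    """
    X = np.asarray(X, dtype=float)
    n, p = X.shape
    observed = ~np.isnan(X)
    D = np.zeros((n, n))
    for i in range(n):
        for j in range(i + 1, n):
            shared = observed[i] & observed[j]
            m = shared.sum()
            if m == 0:
                # No coordinate in common: the distance cannot be estimated.
                D[i, j] = D[j, i] = np.nan
                continue
            diff = X[i, shared] - X[j, shared]
            D[i, j] = D[j, i] = np.sqrt(p / m * np.sum(diff ** 2))
    return D


def classical_mds(D, k):
    """Classical (Torgerson) MDS: embed a distance matrix in k dimensions."""
    n = D.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n          # centering matrix
    B = -0.5 * J @ (D ** 2) @ J                  # double-centered squared distances
    eigvals, eigvecs = np.linalg.eigh(B)
    order = np.argsort(eigvals)[::-1][:k]        # k largest eigenvalues
    lam = np.clip(eigvals[order], 0.0, None)     # guard against small negative values
    return eigvecs[:, order] * np.sqrt(lam)


# Toy data with missing entries encoded as NaN (illustrative only).
X = np.array([[0.0, 0.0, 0.0],
              [3.0, 4.0, np.nan],
              [np.nan, 4.0, 0.0],
              [3.0, 0.0, 0.0]])

D = partial_distances(X)
Y = classical_mds(D, k=2)   # coordinates on which AA or ADA could then be run
```

With ADA the selected archetypoids are actual samples, so results obtained in the projected space map back to rows of the original data by index; this is one plausible reading of how a solution is carried back to the original space.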

Bibliographic details

Source
The American Statistician | 2020, Issue 2 | pp. 169-183 | 15 pages
Authors
Epifanio Irene; Ibanez M. Victoria; Simo Amelia
Author affiliations
Univ Jaume 1 Dept Matemat IMAC Castellon de La Plana 12071 Spain (all three authors)
Original format: PDF
Language: English
Keywords
Archetype analysis; Incomplete dataset; Multidimensional scaling; Partial distance strategy
