Canonical Correlation Analysis on Data With Censoring and Error Information

Sun J.; Keates S.

Abstract

We developed a probabilistic model for canonical correlation analysis (CCA) for the case in which the associated datasets are incomplete. This case arises when data entries either contain measurement errors or are censored (i.e., nonignorably missing) because of uncertainties in instrument calibration and the physical limitations of devices and experimental conditions. The aim of the model is to estimate the true correlation coefficients by eliminating the effects of measurement errors and extracting useful information from the censored data. Because exact inference is not possible for the proposed model, we developed a modified variational Expectation-Maximization (EM) algorithm in which the posteriors of the latent variables are approximated as normal distributions. In the experiments, the accuracy of the modified E-step approximation is first demonstrated empirically by comparison with hybrid Monte Carlo (HMC) sampling. Further experiments were carried out on synthetic datasets with different numbers of censored entries and different correlation coefficient settings to compare the proposed algorithm with a maximum a posteriori (MAP) solution and a Markov Chain-EM solution. The results showed that the variational EM solution compares favorably with the MAP solution and approaches the accuracy of Markov Chain-EM while maintaining computational simplicity. Finally, we applied the proposed algorithm to finding the properties of galaxy groups that are most correlated with X-ray luminosity.
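To make the motivation concrete, the following is a minimal sketch of why naive CCA underestimates the true correlation when data carry measurement errors and censoring. It is not the authors' variational EM algorithm. It only runs classical sample CCA (via QR and SVD) on clean synthetic data and on a corrupted copy. The function names `cca_correlations` and `simulate`, the loading vectors, the noise level, and the detection limit are all assumptions chosen for illustration, and censored entries are naively replaced by the limit value.

```python
import numpy as np


def cca_correlations(X, Y):
    """Classical sample canonical correlations of two views (rows = samples)."""
    Xc = X - X.mean(axis=0)
    Yc = Y - Y.mean(axis=0)
    # Orthonormal bases of the centered column spaces.
    Qx, _ = np.linalg.qr(Xc)
    Qy, _ = np.linalg.qr(Yc)
    # Singular values of Qx^T Qy are the canonical correlations.
    s = np.linalg.svd(Qx.T @ Qy, compute_uv=False)
    return np.clip(s, 0.0, 1.0)


def simulate(n=5000, error_sd=1.0, limit=-0.5, seed=0):
    """Return (clean, corrupted) canonical correlations for synthetic data.

    All parameter values are illustrative assumptions, not taken from the paper.
    """
    rng = np.random.default_rng(seed)
    z = rng.standard_normal((n, 1))       # shared latent variable
    a = np.array([[1.0, 0.5]])            # hypothetical loadings for view X
    b = np.array([[0.8, -0.3]])           # hypothetical loadings for view Y
    X = z @ a + 0.5 * rng.standard_normal((n, 2))
    Y = z @ b + 0.5 * rng.standard_normal((n, 2))

    # Corrupt X with additive measurement error.
    X_obs = X + error_sd * rng.standard_normal((n, 2))
    # Left-censor Y at a detection limit; the naive treatment keeps the limit value.
    Y_obs = np.maximum(Y, limit)

    return cca_correlations(X, Y), cca_correlations(X_obs, Y_obs)


if __name__ == "__main__":
    clean, corrupted = simulate()
    print("clean data     :", clean)
    print("corrupted data :", corrupted)
```

With these settings the leading canonical correlation from the corrupted data falls below the one from the clean data, which is the bias that a model accounting for error and censoring information is meant to remove.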


Bibliographic details

Source
Neural Networks and Learning Systems, IEEE Transactions on | 2013, No. 12 | pp. 1909-1919 | 11 pages
Authors
Sun J.; Keates S.
Author affiliation

School of Engineering, Computing and Applied Mathematics, The University of Abertay Dundee, Dundee, U.K.

Original format: PDF
Language: English
Keywords
Canonical correlation analysis (CCA); censored data; latent variable model; measurement errors
