Journal: IEEE Transactions on Neural Networks and Learning Systems

Canonical Correlation Analysis on Data With Censoring and Error Information

Abstract

We developed a probabilistic model for canonical correlation analysis (CCA) for the case in which the associated datasets are incomplete. This case arises when data entries either contain measurement errors or are censored (i.e., nonignorably missing) because of uncertainties in instrument calibration and physical limitations of devices and experimental conditions. The aim of our model is to estimate the true correlation coefficients by eliminating the effects of measurement errors and extracting useful information from censored data. As exact inference is not possible for the proposed model, we developed a modified variational Expectation-Maximization (EM) algorithm in which the posteriors of the latent variables are approximated as normal distributions. In the experiments, the approximation accuracy of the modified E-step is first demonstrated empirically by comparison with hybrid Monte Carlo (HMC) sampling. Further experiments were carried out on synthetic datasets with different numbers of censored entries and different correlation coefficient settings to compare the proposed algorithm with a maximum a posteriori (MAP) solution and a Markov Chain-EM solution. The results showed that the variational EM solution compares favorably against the MAP solution and approaches the accuracy of the Markov Chain-EM solution while maintaining computational simplicity. Finally, we applied the proposed algorithm to find the properties of galaxy groups that are most correlated with X-ray luminosity.
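To make the motivation concrete, the following is a minimal sketch of why naive CCA can underestimate the true correlation when data carry measurement errors or are censored. It is not the variational EM algorithm described in the abstract: it uses classical CCA (via QR decomposition and SVD) on synthetic data, and the loadings, noise levels, detection limit, and the helper name `canonical_correlations` are all illustrative assumptions.

```python
import numpy as np

def canonical_correlations(X, Y):
    """Sample canonical correlations of classical (non-probabilistic) CCA."""
    Xc = X - X.mean(axis=0)
    Yc = Y - Y.mean(axis=0)
    # Orthonormal bases for the column spaces of the centered data.
    Qx, _ = np.linalg.qr(Xc)
    Qy, _ = np.linalg.qr(Yc)
    # The singular values of Qx^T Qy are the canonical correlations.
    s = np.linalg.svd(Qx.T @ Qy, compute_uv=False)
    return np.clip(s, 0.0, 1.0)

rng = np.random.default_rng(0)
n = 5000

# Assumed generative setup: one shared latent variable drives both views.
z = rng.normal(size=(n, 1))
X_true = z @ np.array([[1.0, 0.5]]) + 0.3 * rng.normal(size=(n, 2))
Y_true = z @ np.array([[0.8, 0.6]]) + 0.3 * rng.normal(size=(n, 2))

# Measurement error: extra independent noise added to every entry.
X_noisy = X_true + 1.0 * rng.normal(size=(n, 2))
Y_noisy = Y_true + 1.0 * rng.normal(size=(n, 2))

# Censoring: values below an assumed detection limit are recorded as the limit.
limit = 0.0
Y_cens = np.maximum(Y_true, limit)

rho_true = canonical_correlations(X_true, Y_true)[0]
rho_noisy = canonical_correlations(X_noisy, Y_noisy)[0]
rho_cens = canonical_correlations(X_true, Y_cens)[0]

print(f"clean data:        {rho_true:.3f}")
print(f"measurement error: {rho_noisy:.3f}")
print(f"censored Y:        {rho_cens:.3f}")
```

In this setup the leading canonical correlation estimated from the noisy or censored data falls below the one estimated from the clean data, which is the bias that a model accounting for errors and censoring aims to correct.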
