Journal: Journal of the Royal Statistical Society. Series C, Applied Statistics

Correlating two continuous variables subject to detection limits in the context of mixture distributions

Abstract

In individuals who are infected with human immunodeficiency virus (HIV), distributions of quantitative HIV ribonucleic acid measurements may be highly left censored, with an extra spike below the limit of detection (LD) of the assay. In univariate analysis, a two-component mixture model with the lower component entirely supported on [0, LD] is recommended to model this extra spike better. Let LD_1 and LD_2 be the limits of detection for the two HIV viral load measurements. When estimating the correlation coefficient between two different measures of viral load obtained from each of a sample of patients, a bivariate Gaussian mixture model is recommended to better model the extra spike on [0, LD_1] and [0, LD_2] when the proportion below LD is incompatible with the left-hand tail of a bivariate Gaussian distribution. When the proportion of both variables falling below LD is very large, the parameters of the lower component may not be estimable, since almost all observations from the lower component fall below LD. A partial solution is to assume that the lower component's entire support is on [0, LD_1] x [0, LD_2]. Maximum likelihood is used to estimate the parameters of the lower and higher components. To evaluate whether there is a lower component, we apply a Monte Carlo approach to assess the p-value of the likelihood ratio test, together with two information criteria: a bootstrap-based information criterion and a cross-validation-based information criterion. We provide simulation results to evaluate the performance of this approach and compare it with two ad hoc estimators and a single-component bivariate Gaussian likelihood estimator. These methods are applied to the data from a cohort study of HIV-infected men in Rio de Janeiro, Brazil, and the data from the Women's Interagency HIV oral study. These results emphasize the need for caution when estimating correlation coefficients from data with a large proportion of non-detectable values when the proportion below LD is incompatible with the left-hand tail of a bivariate Gaussian distribution.
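
The univariate building block described in the abstract can be illustrated with a small sketch. This is not the authors' code. It assumes that the upper component is Gaussian on whatever scale the measurements are analysed, and that the lower component lies entirely below LD, so a non-detect has probability p + (1 - p) * Phi((LD - mu) / sigma). The function names, the parameter names (p, mu, sigma) and the toy data are all hypothetical.

```python
import math

def normal_cdf(x, mu, sigma):
    # Gaussian CDF via the error function (standard library only).
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))

def normal_logpdf(x, mu, sigma):
    z = (x - mu) / sigma
    return -0.5 * z * z - math.log(sigma) - 0.5 * math.log(2.0 * math.pi)

def mixture_loglik(values, ld, p, mu, sigma):
    """Log-likelihood of a two-component mixture under left censoring.

    values: list of measurements; None marks a value below the detection limit.
    p:      weight of the lower component (assumed entirely below ld).
    mu, sigma: mean and standard deviation of the Gaussian upper component.
    """
    # A non-detect comes either from the lower component (all of its mass)
    # or from the left-hand tail of the upper component.
    p_below = p + (1.0 - p) * normal_cdf(ld, mu, sigma)
    total = 0.0
    for v in values:
        if v is None:
            total += math.log(p_below)
        else:
            # Detected values can only come from the upper component.
            total += math.log(1.0 - p) + normal_logpdf(v, mu, sigma)
    return total

# Toy data (made up): many non-detects, detected values far above LD.
values = [None] * 6 + [3.0, 3.2, 2.8, 3.1]
ld = 1.7
single = mixture_loglik(values, ld, p=0.0, mu=3.0, sigma=0.2)  # no lower component
mixed = mixture_loglik(values, ld, p=0.6, mu=3.0, sigma=0.2)   # with lower component
print(single, mixed)
```

Setting p = 0 recovers the ordinary single-component censored Gaussian likelihood. In the toy data the mixture fits far better, because the Gaussian left-hand tail alone cannot account for so many non-detects. The bivariate version in the abstract would additionally need the joint Gaussian probability of [0, LD_1] x [0, LD_2] and the conditional terms for pairs in which only one value is censored; those are omitted here.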
