BMC Bioinformatics

Utilization of two sample t-test statistics from redundant probe sets to evaluate different probe set algorithms in GeneChip studies

Abstract

Background: The choice of probe set algorithm for expression summary in a GeneChip study has a great impact on subsequent gene expression data analysis. Spiked-in cRNAs with known concentrations are often used to assess the relative performance of probe set algorithms. Because spiked-in cRNAs do not represent endogenously expressed genes in experiments, it is increasingly important to have methods for studying whether a particular probe set algorithm is more appropriate for a specific dataset without using such external reference data.

Results: We propose using the probe set redundancy feature to evaluate the performance of probe set algorithms, and present three approaches for analyzing data variance and result bias using two sample t-test statistics from redundant probe sets: 1) analyzing redundant probe set variance based on t-statistic rank order, 2) computing the correlation of t-statistics between redundant probe sets, and 3) analyzing the co-occurrence of replicate redundant probe sets representing differentially expressed genes. We applied these approaches to expression summary data generated from three datasets using the individual probe set algorithms MAS5.0, dChip, or RMA, as well as combinations of options from the three algorithms. Results from the three approaches were similar within each individual expression summary dataset and were in good agreement with findings previously reported by others. We also demonstrate the validity of our findings by independent experimental methods.

Conclusion: All three proposed approaches allowed us to assess the performance of probe set algorithms using the probe set redundancy feature. The analyses of redundant probe set variance based on t-statistic rank order and of the correlation of t-statistics between redundant probe sets provide useful tools for data variance analysis, and the co-occurrence of replicate redundant probe sets representing differentially expressed genes allows estimation of result bias. The results also suggest that individual probe set algorithms have dataset-specific performance.
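
To make approaches 2 and 3 more concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the authors' implementation: the abstract does not say which t-test variant, pairing scheme, or significance cutoff was used, so the pooled-variance Student's t-statistic, the symmetrized pairing, the `cutoff` value, and all function names and toy data below are hypothetical choices. Approach 1 (variance by t-statistic rank order) is not covered.

```python
from itertools import combinations
from math import sqrt
from statistics import mean, variance


def two_sample_t(group_a, group_b):
    """Two sample t-statistic with pooled variance (assumes equal variances)."""
    n_a, n_b = len(group_a), len(group_b)
    pooled_var = ((n_a - 1) * variance(group_a)
                  + (n_b - 1) * variance(group_b)) / (n_a + n_b - 2)
    return (mean(group_a) - mean(group_b)) / sqrt(pooled_var * (1 / n_a + 1 / n_b))


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    return cov / sqrt(sum((x - mx) ** 2 for x in xs) * sum((y - my) ** 2 for y in ys))


def redundant_pairs(t_by_probe_set, gene_of):
    """Pair the t-statistics of probe sets that map to the same gene."""
    by_gene = {}
    for probe_set, t in t_by_probe_set.items():
        by_gene.setdefault(gene_of[probe_set], []).append(t)
    pairs = []
    for ts in by_gene.values():
        pairs.extend(combinations(ts, 2))
    return pairs


def redundant_correlation(pairs):
    """Approach 2 (sketch): correlation of t-statistics between redundant probe sets.

    Each pair is entered in both orders so the result does not depend on
    which member of a pair happens to be listed first.
    """
    xs = [a for a, b in pairs] + [b for a, b in pairs]
    ys = [b for a, b in pairs] + [a for a, b in pairs]
    return pearson(xs, ys)


def co_occurrence(pairs, cutoff):
    """Approach 3 (sketch): among redundant pairs in which at least one member
    passes |t| >= cutoff, the fraction in which both members pass."""
    called = [(abs(a) >= cutoff, abs(b) >= cutoff) for a, b in pairs]
    any_called = [c for c in called if c[0] or c[1]]
    if not any_called:
        return float("nan")
    return sum(1 for c in any_called if c[0] and c[1]) / len(any_called)


if __name__ == "__main__":
    # Hypothetical log-scale expression summaries: (control, treated) per probe set.
    expression = {
        "ps1": ([7.0, 7.2, 6.9], [9.1, 9.0, 9.3]),
        "ps2": ([6.5, 6.6, 6.4], [8.4, 8.7, 8.5]),
        "ps3": ([5.0, 5.1, 4.9], [5.0, 5.2, 4.9]),
        "ps4": ([5.5, 5.4, 5.6], [5.6, 5.4, 5.5]),
    }
    # Hypothetical annotation: ps1/ps2 target geneA, ps3/ps4 target geneB.
    gene_of = {"ps1": "geneA", "ps2": "geneA", "ps3": "geneB", "ps4": "geneB"}

    t_stats = {ps: two_sample_t(ctrl, trt) for ps, (ctrl, trt) in expression.items()}
    pairs = redundant_pairs(t_stats, gene_of)
    print("correlation of t-statistics:", redundant_correlation(pairs))
    print("co-occurrence at |t| >= 4:", co_occurrence(pairs, cutoff=4.0))
```

Running the same computation on expression summaries produced by different probe set algorithms for one dataset would, under these assumptions, give one correlation and one co-occurrence value per algorithm to compare. Higher agreement between redundant probe sets would be read as lower variance or bias for that dataset.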
