
Arrow plot: a new graphical tool for selecting up and down regulated genes and genes differentially expressed on sample subgroups



Abstract

Background: A common task in analyzing microarray data is to determine which genes are differentially expressed across two (or more) kinds of tissue samples, or across samples subjected to different experimental conditions. Several statistical methods have been proposed for this purpose, generally based on measures of distance between classes. Biological samples are known to be heterogeneous because of factors such as molecular subtypes or genetic background, which are often unknown to the experimenter. For instance, in experiments that involve the molecular classification of tumors, it is important to identify significant subtypes of cancer. Bimodal or multimodal distributions often reflect the presence of mixtures of subsamples. Consequently, some genes are differentially expressed only in sample subgroups, and these are missed by the usual statistical approaches. In this paper we propose a new graphical tool that identifies not only up-regulated and down-regulated genes, but also genes that are differentially expressed in different subclasses. The tool is based on two measures of distance between samples: the overlapping coefficient (OVL) between two densities and the area under the receiver operating characteristic (ROC) curve (AUC). The proposed methodology was implemented in the open-source R software.

Results: The method was applied to a publicly available dataset and to a simulated dataset. We compared our results with those obtained using several standard methods for detecting differentially expressed genes: the Welch t-statistic, fold change (FC), rank products (RP), average difference (AD), weighted average difference (WAD), moderated t-statistic (modT), intensity-based moderated t-statistic (ibmT), significance analysis of microarrays (samT), and area under the ROC curve (AUC). On both datasets, the differentially expressed genes with bimodal or multimodal distributions were not all selected by the standard selection procedures. We also compared our results with (i) the area between the ROC curve and the rising area (ABCR) and (ii) the test for not proper ROC curves (TNRC). We found our methodology to be more comprehensive, because it detects both bimodal and multimodal distributions and allows the two samples to have different variances. Another advantage of our method is that the behavior of different kinds of differentially expressed genes can be analyzed graphically.

Conclusion: Our results indicate that the arrow plot is a new, flexible, and useful tool for the analysis of gene expression profiles from microarrays.
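The two distance measures named in the abstract, OVL and AUC, can be illustrated for a single gene with a short Python sketch. This is not the authors' R implementation: the Gaussian kernel density estimate, the normal-reference bandwidth rule, the grid size, and all function names (`auc`, `kde`, `ovl`) are assumptions made for illustration only.

```python
import numpy as np


def auc(x, y):
    """Empirical AUC: P(Y > X) + 0.5 * P(Y == X) over all pairs (x_i, y_j)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    diff = y[:, None] - x[None, :]
    return float(np.mean((diff > 0) + 0.5 * (diff == 0)))


def rule_of_thumb_bandwidth(sample):
    """Normal-reference bandwidth 1.06 * sd * n^(-1/5) (an assumed choice)."""
    s = np.asarray(sample, dtype=float)
    return 1.06 * s.std(ddof=1) * len(s) ** (-0.2)


def kde(sample, grid, bandwidth):
    """Gaussian kernel density estimate of `sample`, evaluated on `grid`."""
    s = np.asarray(sample, dtype=float)
    z = (grid[:, None] - s[None, :]) / bandwidth
    norm = len(s) * bandwidth * np.sqrt(2.0 * np.pi)
    return np.exp(-0.5 * z ** 2).sum(axis=1) / norm


def ovl(x, y, n_grid=2048):
    """Overlapping coefficient: integral of min(f_x, f_y) over a common grid.

    Assumes both samples are non-constant, so the bandwidths are positive.
    """
    bx, by = rule_of_thumb_bandwidth(x), rule_of_thumb_bandwidth(y)
    pad = 4.0 * max(bx, by)
    lo = min(np.min(x), np.min(y)) - pad
    hi = max(np.max(x), np.max(y)) + pad
    grid = np.linspace(lo, hi, n_grid)
    m = np.minimum(kde(x, grid, bx), kde(y, grid, by))
    # Trapezoidal rule written out by hand to avoid NumPy version differences.
    return float(np.sum((m[1:] + m[:-1]) * 0.5 * np.diff(grid)))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    control = rng.normal(0.0, 1.0, size=200)
    # Hypothetical gene expressed differently in two hidden subgroups.
    mixed = np.concatenate([rng.normal(-3.0, 1.0, 100), rng.normal(3.0, 1.0, 100)])
    print("AUC:", auc(control, mixed))
    print("OVL:", ovl(control, mixed))
```

In the simulated mixture, the two subgroups shift in opposite directions, so the AUC stays close to 0.5 while the OVL is well below 1. A gene of this kind would be missed by a ranking based on AUC alone, which is the situation the abstract describes for bimodal and multimodal distributions.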
