
A data integration methodology for systems biology


Abstract

Different experimental technologies measure different aspects of a system, and do so to differing depth and breadth. High-throughput assays have inherently high false-positive and false-negative rates. Moreover, each technology carries systematic biases of a different nature. These differences make network reconstruction from multiple data sets difficult and error-prone. Additionally, because of the rapid rate of progress in biotechnology, there is usually no curated exemplar data set from which one might estimate data integration parameters. To address these concerns, we have developed data integration methods that can handle multiple data sets differing in statistical power, type, size, and network coverage without requiring a curated training data set. Our methodology is general purpose and may be applied to integrate data from any existing or future technology. Here we outline our methods and then demonstrate their performance by applying them to simulated data sets. The results show that these methods select true-positive data elements much more accurately than classical approaches. In a companion paper, we demonstrate the applicability of our approach to biological data. We have integrated our methodology into a free, open-source software package named POINTILLIST.
