A binary search approach to whole-genome data analysis

Abstract

A sequence analysis-oriented, binary search-like algorithm was transformed into a sensitive and accurate analysis tool for processing whole-genome data. The advantage of the algorithm over previous methods is its ability to detect, with equal accuracy, the margins of both short and long genome fragments enriched in up-regulated signals. The score of an enriched genome fragment reflects the difference between the actual concentration of up-regulated signals in the fragment and the chromosome signal baseline. The "divide-and-conquer"-type algorithm detects a series of nonintersecting fragments of various lengths with locally optimal scores. The procedure is then applied to the detected fragments in a nested manner by recalculating the lower-than-baseline signals in the chromosome. The algorithm was applied to simulated whole-genome data, and its sensitivity and specificity were compared with those of several alternative algorithms. The algorithm was also tested with four biological tiling array datasets: Arabidopsis (i) expression and (ii) histone 3 lysine 27 trimethylation ChIP-on-chip datasets, and Saccharomyces cerevisiae (iii) spliced intron data and (iv) chromatin remodeling factor binding sites. The results demonstrate the power of the algorithm in identifying both short up-regulated fragments (such as exons and transcription factor binding sites) and long, even moderately up-regulated, zones at their precise genome margins. The algorithm generates an accurate whole-genome landscape that could be used for cross-comparison of signals across the same genome in evolutionary and general genomic studies.
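
The abstract does not give the algorithm itself, so the following Python is only an illustrative sketch of the general idea, not the authors' method. It assumes a per-probe signal list, takes the chromosome-wide mean as the baseline, scores a fragment as the sum of (signal minus baseline), finds the best-scoring fragment with a Kadane-style scan, and then recurses on the regions to its left and right so that the reported fragments never intersect. The names best_segment, enriched_fragments, and min_score are hypothetical, and the nested re-analysis of detected fragments is not covered.

```python
from typing import List, Tuple


def best_segment(x: List[float], lo: int, hi: int,
                 baseline: float) -> Tuple[float, int, int]:
    """Kadane-style scan of x[lo:hi]; returns (score, start, end), end exclusive.

    The score of a segment is the sum of (x[i] - baseline) over the segment.
    """
    best = (0.0, lo, lo)
    run, run_start = 0.0, lo
    for i in range(lo, hi):
        if run <= 0.0:
            # A non-positive running sum cannot help a later segment: restart here.
            run, run_start = 0.0, i
        run += x[i] - baseline
        if run > best[0]:
            best = (run, run_start, i + 1)
    return best


def enriched_fragments(x: List[float],
                       min_score: float = 1.0) -> List[Tuple[int, int, float]]:
    """Return nonintersecting (start, end, score) fragments, sorted by position.

    Assumption: the baseline is the mean signal over the whole input, and
    min_score is an arbitrary cutoff chosen for illustration only.
    """
    if not x:
        return []
    baseline = sum(x) / len(x)
    found: List[Tuple[int, int, float]] = []

    def split(lo: int, hi: int) -> None:
        if lo >= hi:
            return
        score, start, end = best_segment(x, lo, hi, baseline)
        if score < min_score:
            return
        split(lo, start)                  # search left of the best fragment
        found.append((start, end, score))
        split(end, hi)                    # search right of the best fragment

    split(0, len(x))
    return found


# Toy input: one short and one long run of up-regulated probes.
signal = [0] * 10 + [1] * 4 + [0] * 10 + [1] * 20 + [0] * 10
for start, end, score in enriched_fragments(signal):
    print(start, end, round(score, 2))
```

In this toy input the long run is found first and the short run is recovered by the recursive call on the left-hand remainder, which is one simple way a divide-and-conquer scheme can report short and long fragments side by side.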