
Development of bioinformatics applications for prediction and validation of polymorphisms in soybean genome using EST data.


Abstract

Soybean is a major commercial crop and an important raw material in the manufacture of food, cosmetic and industrial products. DNA polymorphisms in the soybean genome across different cultivars can be used to increase disease resistance, improve oil and seed quality, enhance tolerance to extreme conditions and increase soybean productivity. A whole-genome sequence for soybean is unlikely to be completed soon. Expressed sequence tag (EST) sequencing has been performed as an economically feasible alternative for deriving information about soybean genes, and about 330,000 soybean EST sequences are available in GenBank. This study focuses on the development of high-throughput bioinformatics tools for prediction and validation of polymorphisms in the soybean genome using the available EST data. The tools developed include (i) an automated EST processing pipeline (EST-PAGE), (ii) an automated polymorphism discovery pipeline (SNP-PHAGE), (iii) software that uses machine learning programs to confirm the polymorphisms (SNP-PHAGE-ML), and (iv) new algorithms to predict polymorphisms in the soybean genome from EST data. The EST-PAGE pipeline for EST processing was built with several useful features, such as plate and spot record management, graphical and statistical features for analyzing plate and library statistics, annotation, and automated processing. This software has been used to analyze about 300,000 soybean ESTs and 100,000 ESTs from livestock species. The SNP-PHAGE software discovers polymorphisms in amplicons sequenced from different genotypes or cultivars, performs haplotype analysis and facilitates GenBank (dbSNP) submissions. Approximately 6,000 soybean polymorphisms were discovered using this pipeline. The SNP-PHAGE-ML software adds a machine learning program to the polymorphism discovery package SNP-PHAGE to confirm positive polymorphisms. An accuracy of 97.3% was achieved in five-fold cross-validation. This package will reduce the time and cost of SNP discovery by decreasing expert intervention. New algorithms were developed to predict polymorphisms, and experimental testing of these predictions resulted in the discovery of approximately 1,000 confirmed polymorphisms. The new targeted-amplification method gave better results than random amplification of sequence-tagged sites (STS) from unique 3' ESTs. The bioinformatics tools developed were generalized to be applicable to a wide variety of species other than soybean and will be made available as open source.
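The abstract does not describe how the prediction algorithms work, so the following is only a minimal illustrative sketch of one common way to flag candidate polymorphisms in redundant EST data. It assumes the reads have already been aligned to equal length, and it reports a column only when at least two different bases are each seen in a minimum number of reads. The function name `candidate_snps` and the parameter `min_minor_count` are hypothetical and are not taken from EST-PAGE or SNP-PHAGE.

```python
from collections import Counter


def candidate_snps(aligned_reads, min_minor_count=2):
    """Flag alignment columns where two or more bases are well supported.

    Illustrative assumption: every read is a string of equal length taken
    from one multiple alignment, with '-' for gaps and 'N' for unknowns.
    """
    if not aligned_reads:
        return []
    length = len(aligned_reads[0])
    if any(len(read) != length for read in aligned_reads):
        raise ValueError("all aligned reads must have the same length")

    hits = []
    for col in range(length):
        counts = Counter(read[col].upper() for read in aligned_reads)
        # Keep only unambiguous bases seen in enough reads; a base seen
        # once is more likely a sequencing error than a real allele.
        supported = {
            base: n
            for base, n in counts.items()
            if base in "ACGT" and n >= min_minor_count
        }
        if len(supported) >= 2:
            hits.append((col, supported))
    return hits


# Toy alignment: column 3 has T (3 reads) and A (2 reads);
# column 6 has a single C, which is treated as a likely error.
reads = [
    "ACGTACGT",
    "ACGTACGT",
    "ACGAACGT",
    "ACGAACGT",
    "ACGTACCT",
]
print(candidate_snps(reads))  # [(3, {'T': 3, 'A': 2})]
```

Requiring each allele to appear in several reads trades sensitivity for fewer false positives. Candidates found this way would still need the kind of experimental confirmation the abstract reports.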
