Source: Journal of genetics

Exact Tandem Repeats Analyzer (E-TRA): A new program for DNA sequence mining


Abstract

Exact Tandem Repeats Analyzer 1.0 (E-TRA) combines sequence motif searches with keywords such as "organs", "tissues", "cell lines" and "development stages" to find simple exact tandem repeats as well as non-simple repeats. Compared with other repeat-finder programs, E-TRA offers several advanced repeat search parameters and options: it not only accepts GenBank, FASTA and expressed sequence tag (EST) sequence files, but also analyzes multiple files containing multiple sequences. The tandem repeat motif lengths that E-TRA can find range from one to one thousand. Advanced user-defined parameters and options let researchers apply different minimum repeat-count criteria to different motif lengths simultaneously. One of the most interesting features of genomes is the presence of relatively short tandem repeats (TRs). These repeated DNA sequences are found in both prokaryotes and eukaryotes, distributed almost at random throughout the genome. Some tandem repeats play important roles in the regulation of gene expression, whereas others have no known biological function as yet. Nevertheless, they have proven very useful in DNA profiling and genetic linkage analysis studies. To demonstrate the use of E-TRA, we used 5,465,605 human EST sequences derived from 18,814,550 GenBank EST sequences. Our results indicated that 12.44% (679,800) of the human EST sequences contained simple and non-simple repeat string patterns varying from one to 126 nucleotides in length. The results also revealed that human organs, tissues, cell lines and developmental stages differed in the number of repeats as well as in repeat composition, indicating that the distribution of expressed tandem repeats among tissues or organs is not random, and thus differs from that of the un-transcribed repeats found in genomes.
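The idea of applying a different minimum repeat count to each motif length can be illustrated with a minimal Python sketch. This is not E-TRA's implementation, which the abstract does not describe. The function names `find_exact_tandem_repeats` and `is_primitive`, the `min_copies` mapping, and the example thresholds are assumptions made for illustration, and the search uses a plain regular-expression backreference in place of whatever algorithm the program uses.

```python
import re
from typing import Dict, List, NamedTuple


class TandemRepeat(NamedTuple):
    start: int    # 0-based offset of the first copy in the sequence
    motif: str    # the repeated unit
    copies: int   # number of consecutive exact copies


def is_primitive(motif: str) -> bool:
    """True if the motif is not itself a repetition of a shorter unit."""
    return motif not in (motif + motif)[1:-1]


def find_exact_tandem_repeats(seq: str, min_copies: Dict[int, int]) -> List[TandemRepeat]:
    """Find exact tandem repeats in a DNA string.

    min_copies maps motif length -> minimum number of consecutive copies
    (hypothetical parameter; each threshold must be at least 2).
    """
    seq = seq.upper()
    hits: List[TandemRepeat] = []
    for motif_len, threshold in sorted(min_copies.items()):
        if threshold < 2:
            raise ValueError("a tandem repeat needs at least 2 copies")
        # Group 1 captures a candidate motif; \1{n-1,} demands further exact copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (motif_len, threshold - 1))
        # finditer scans left to right and reports non-overlapping matches only.
        for match in pattern.finditer(seq):
            motif = match.group(1)
            # Skip e.g. "TT" at length 2: it is the mononucleotide "T" in disguise.
            if not is_primitive(motif):
                continue
            copies = len(match.group(0)) // motif_len
            hits.append(TandemRepeat(match.start(), motif, copies))
    return sorted(hits, key=lambda h: h.start)


if __name__ == "__main__":
    # Assumed thresholds: 5+ copies for length 1, 3+ for lengths 2 and 3.
    thresholds = {1: 5, 2: 3, 3: 3}
    for hit in find_exact_tandem_repeats("GGACACACACTTTTTTGCAGCAGCATT", thresholds):
        print(hit)
```

Run on the toy sequence above, the sketch reports (AC)4 at offset 2, (T)6 at offset 10 and (GCA)3 at offset 16. Keeping the thresholds in one mapping lets short motifs, which recur often by chance, be held to a stricter repeat count than longer ones in a single pass over the input.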
