Journal: Nucleic Acids Research

GeneTack database: genes with frameshifts in prokaryotic genomes and eukaryotic mRNA sequences



Abstract

Database annotations of prokaryotic genomes and eukaryotic mRNA sequences pay relatively little attention to frame transitions that disrupt protein-coding genes. Frame transitions (frameshifts) can be caused by sequencing errors or by indel mutations inside protein-coding regions. Other observed frameshifts are related to recoding events (which evolved to control the expression of some genes). Earlier, we developed GeneTack, an algorithm and software program for ab initio frameshift finding in intronless genes. Here, we describe a database (freely available at http://topaz.gatech.edu/GeneTack/db.html) containing genes with frameshifts (fs-genes) predicted by GeneTack. The database includes 206,991 fs-genes from 1106 complete prokaryotic genomes and 45,295 frameshifts predicted in mRNA sequences from 100 eukaryotic genomes. The whole set of fs-genes was grouped into clusters based on sequence similarity between fs-proteins (conceptually translated fs-genes), conservation of the frameshift position and frameshift direction (−1, +1). The fs-genes can be retrieved by similarity search against a given query sequence via a web interface, by fs-gene cluster browsing, etc. Clusters of fs-genes are characterized with respect to their likely origin, such as pseudogenization, phase variation, etc. The largest clusters contain fs-genes with programmed frameshifts (related to recoding events).
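To make the notion of a frame transition concrete, the following minimal sketch shows how a single-nucleotide deletion inside a coding region changes every downstream codon of the conceptual translation. It is only an illustration and is not the GeneTack algorithm; the function name `translate`, the toy sequence and the deletion position are all hypothetical, and the standard genetic code is assumed.

```python
from itertools import product

# Standard genetic code (assumed), codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def translate(seq: str) -> str:
    """Conceptually translate a DNA sequence in frame 0.

    Trailing bases that do not fill a complete codon are ignored.
    """
    return "".join(
        CODON_TABLE[seq[i:i + 3]] for i in range(0, len(seq) - 2, 3)
    )


# Hypothetical toy gene: ATG GCT GAA ACC GGT TAA
gene = "ATGGCTGAAACCGGTTAA"

# Delete one nucleotide (index 5), as an indel or sequencing error might.
fs_gene = gene[:5] + gene[6:]

print(translate(gene))     # intact reading frame
print(translate(fs_gene))  # downstream residues change and the stop codon is lost
```

In this toy case the intact gene translates to `MAETG*`, while the sequence with the deletion translates to `MAKPV`: the protein diverges downstream of the deletion and the original stop codon is no longer read in frame.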
