Nucleic Acids Research

GeneTack database: genes with frameshifts in prokaryotic genomes and eukaryotic mRNA sequences

Abstract

Database annotations of prokaryotic genomes and eukaryotic mRNA sequences pay relatively little attention to frame transitions that disrupt protein-coding genes. Frame transitions (frameshifts) can be caused by sequencing errors or by indel mutations inside protein-coding regions. Other observed frameshifts are related to recoding events (which evolved to control the expression of some genes). Earlier, we developed GeneTack, an algorithm and software program for ab initio frameshift finding in intronless genes. Here, we describe a database (freely available at ) containing genes with frameshifts (fs-genes) predicted by GeneTack. The database includes 206 991 fs-genes from 1106 complete prokaryotic genomes and 45 295 frameshifts predicted in mRNA sequences from 100 eukaryotic genomes. The whole set of fs-genes was grouped into clusters based on sequence similarity between fs-proteins (conceptually translated fs-genes), conservation of the frameshift position, and frameshift direction (−1, +1). The fs-genes can be retrieved through a web interface by similarity search against a given query sequence, by fs-gene cluster browsing, etc. Clusters of fs-genes are characterized with respect to their likely origin, such as pseudogenization, phase variation, etc. The largest clusters contain fs-genes with programed frameshifts (related to recoding events).
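
To make the notion of an fs-protein (a conceptually translated fs-gene) concrete, the following minimal Python sketch translates a coding sequence across a single frameshift. It is an illustration only and is not taken from GeneTack or the database: the function names, the 0-based `fs_pos` coordinate, and the convention that a +1 shift skips one nucleotide while a −1 shift re-reads one nucleotide are all assumptions made here, as is the use of the standard genetic code.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order (assumed here;
# some prokaryotes use a different translation table).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(seq):
    """Translate an uppercase DNA string codon by codon, ignoring a trailing partial codon."""
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, len(seq) - 2, 3))


def translate_fs_gene(seq, fs_pos, direction):
    """Conceptually translate a gene containing one frameshift.

    seq       -- uppercase DNA string starting at the first codon
    fs_pos    -- hypothetical 0-based index (a positive multiple of 3) where
                 reading in the original frame stops
    direction -- +1 skips one nucleotide, -1 re-reads one nucleotide
                 (convention assumed for this sketch)
    """
    if direction not in (1, -1):
        raise ValueError("direction must be +1 or -1")
    if fs_pos < 3 or fs_pos % 3 != 0 or fs_pos > len(seq):
        raise ValueError("fs_pos must be a positive multiple of 3 within seq")
    upstream = seq[:fs_pos]                # read in the original frame
    downstream = seq[fs_pos + direction:]  # read in the shifted frame
    return translate(upstream) + translate(downstream)


if __name__ == "__main__":
    # One extra nucleotide after the second codon, bypassed by a +1 shift.
    print(translate_fs_gene("ATGGCTCGAATAA", 6, +1))
    # The last nucleotide of the second codon is re-read by a -1 shift.
    print(translate_fs_gene("ATGAAAAGGCTTAA", 6, -1))
```

In this sketch the frameshift position and direction are given as inputs; predicting them from sequence alone is the task the abstract attributes to GeneTack and is not attempted here.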
