Source: Nucleic Acids Research

The Bologna Annotation Resource (BAR 3.0): improving protein functional annotation

Abstract

BAR 3.0 updates our server BAR (Bologna Annotation Resource) for predicting protein structural and functional features from sequence. We increase the data volume, the query capabilities and the information conveyed to the user. The core of BAR 3.0 is a graph-based clustering procedure applied to UniProtKB sequences under strict pairwise similarity criteria (sequence identity ≥40% with alignment coverage ≥90%). Each cluster carries the available annotation downloaded from UniProtKB, GO, PFAM and PDB. After statistical validation, GO terms and PFAM domains become cluster-specific and are used to annotate new sequences that enter the cluster by satisfying the similarity constraints. BAR 3.0 includes 28 869 663 sequences in 1 361 773 clusters, of which 22.2% (22 241 661 sequences) and 47.4% (24 555 055 sequences) have at least one validated GO term and one PFAM domain, respectively. PDB structures are present in 1.4% of the clusters (36% of all sequences), and each such cluster is associated with a hidden Markov model that allows building template-target alignments suitable for structural modeling. A further 3 399 026 sequences are singletons. BAR 3.0 offers an improved search interface, allowing queries by UniProtKB accession, FASTA sequence, GO term, PFAM domain, organism, PDB and ligand(s). When evaluated on the CAFA2 targets, BAR 3.0 largely outperforms our previous version and scores among state-of-the-art methods. BAR 3.0 is publicly available.
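The clustering step described in the abstract can be illustrated with a small sketch. It assumes that a cluster is a connected component of the graph whose edges are sequence pairs meeting both thresholds; the abstract says only "graph-based", so this is an assumption. The function name `cluster_sequences`, the tuple layout of `alignments` and the toy identifiers are hypothetical, and the sketch is not the BAR 3.0 implementation.

```python
from collections import defaultdict

MIN_IDENTITY = 40.0  # percent sequence identity, threshold quoted in the abstract
MIN_COVERAGE = 90.0  # percent alignment coverage, threshold quoted in the abstract


def cluster_sequences(ids, alignments):
    """Group sequence ids into connected components (assumed cluster definition).

    alignments: iterable of (id_a, id_b, identity_pct, coverage_pct) tuples.
    Returns (clusters, singletons): a list of sets and a list of ids.
    """
    parent = {i: i for i in ids}

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b, identity, coverage in alignments:
        # An edge exists only if BOTH similarity constraints hold.
        if identity >= MIN_IDENTITY and coverage >= MIN_COVERAGE:
            root_a, root_b = find(a), find(b)
            if root_a != root_b:
                parent[root_a] = root_b

    groups = defaultdict(set)
    for i in ids:
        groups[find(i)].add(i)

    clusters = [g for g in groups.values() if len(g) > 1]
    singletons = [next(iter(g)) for g in groups.values() if len(g) == 1]
    return clusters, singletons


# Toy data (made up for illustration only).
ids = ["P1", "P2", "P3", "P4"]
alignments = [
    ("P1", "P2", 55.0, 95.0),  # passes both thresholds
    ("P2", "P3", 42.0, 91.0),  # passes both thresholds
    ("P3", "P4", 80.0, 60.0),  # high identity but coverage too low: no edge
]
clusters, singletons = cluster_sequences(ids, alignments)
print(clusters, singletons)
```

In this toy run P1 and P3 fall into the same cluster although they are never aligned directly, because both are linked to P2. P4 stays a singleton because its only alignment fails the coverage constraint.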