The M5nr: a novel non-redundant database containing protein sequences and annotations from multiple sources and associated tools

Andreas Wilke; Travis Harrison; Jared Wilkening; Dawn Field; Elizabeth M Glass; Nikos Kyrpides; Konstantinos Mavrommatis; Folker Meyer


Abstract

Background: Computing sequence similarity results is becoming a limiting factor in metagenome analysis. Sequence similarity search results encoded in an open, exchangeable format have the potential to reduce the need for computational reanalysis of these data sets. A prerequisite for sharing similarity results is a common reference.

Description: We introduce a mechanism for automatically maintaining a comprehensive, non-redundant protein database and for creating a quarterly release of this resource. In addition, we present tools for translating similarity searches into many annotation namespaces, e.g. KEGG or NCBI's GenBank.

Conclusions: The data and tools we present allow the creation of multiple result sets from a single computation, so that computational results for large sequence data sets can be shared between groups.
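The idea in the abstract, one non-redundant reference with results re-expressed in several annotation namespaces, can be illustrated with a minimal Python sketch. Everything here is an assumption made for illustration: the abstract does not say how sequences are keyed, so the MD5 digest used as the key, the function names (sequence_key, build_nonredundant, translate_hits), the record layout, and the example identifiers are all hypothetical and are not the authors' tools or data.

```python
import hashlib
from collections import defaultdict


def sequence_key(seq: str) -> str:
    """Hypothetical key: MD5 hex digest of the normalised sequence (an assumption)."""
    return hashlib.md5(seq.strip().upper().encode("ascii")).hexdigest()


def build_nonredundant(records):
    """Collapse identical sequences from several sources into one entry each.

    records: iterable of (source, source_id, annotation, sequence) tuples.
    Returns (sequences, annotations), both keyed by the sequence digest.
    """
    sequences = {}
    annotations = defaultdict(dict)
    for source, source_id, annotation, seq in records:
        key = sequence_key(seq)
        # Store each distinct sequence once, however many sources carry it.
        sequences.setdefault(key, seq.strip().upper())
        # Keep every source's own identifier and annotation for that sequence.
        annotations[key][source] = (source_id, annotation)
    return sequences, annotations


def translate_hits(hits, annotations, namespace):
    """Re-express similarity hits (query, key, score) in one annotation namespace."""
    translated = []
    for query, key, score in hits:
        entry = annotations.get(key, {}).get(namespace)
        if entry is not None:
            translated.append((query, entry[0], entry[1], score))
    return translated


# Made-up example records; the identifiers and annotations are not real entries.
records = [
    ("KEGG", "kegg:example1", "hypothetical kinase", "MKTAYIAKQR"),
    ("GenBank", "gb:example2", "protein kinase", "mktayiakqr"),
    ("GenBank", "gb:example3", "transporter", "MSLLNVPAGK"),
]
sequences, annotations = build_nonredundant(records)

# One similarity computation against the shared reference ...
hits = [("read_1", sequence_key("MKTAYIAKQR"), 52.0)]

# ... yields one result set per namespace without searching again.
kegg_view = translate_hits(hits, annotations, "KEGG")
genbank_view = translate_hits(hits, annotations, "GenBank")
```

In this sketch the similarity search is run once against the digest-keyed reference, and each namespace view is produced by a lookup, which mirrors the "multiple result sets using a single computation" claim in the Conclusions.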


Bibliographic information

Source: BMC Bioinformatics, 2012, Issue 1
Authors: Andreas Wilke; Travis Harrison; Jared Wilkening; Dawn Field; Elizabeth M Glass; Nikos Kyrpides; Konstantinos Mavrommatis; Folker Meyer
Original format: PDF
Chinese Library Classification: Biological sciences

