Journal: Database

MisPred: a resource for identification of erroneous protein sequences in public databases


Abstract

Correct prediction of the structure of protein-coding genes of higher eukaryotes is still a difficult task; therefore, public databases are heavily contaminated with mispredicted sequences. The high rate of misprediction has serious consequences because it significantly affects the conclusions that may be drawn from genome-scale sequence analyses of eukaryotic genomes. Here we present the MisPred database and computational pipeline that provide efficient means for the identification of erroneous sequences in public databases. The MisPred database contains a collection of abnormal, incomplete and mispredicted protein sequences from 19 metazoan species identified as erroneous by MisPred quality control tools in the UniProtKB/Swiss-Prot, UniProtKB/TrEMBL, NCBI/RefSeq and EnsEMBL databases. Major releases of the database are automatically generated and updated regularly. The database (http://www.mispred.com) is easily accessible through a simple web interface coupled to a powerful query engine and a standard web service. The content is completely or partially downloadable in a variety of formats. Database URL: http://www.mispred.com
