Journal: IEEE/ACM Transactions on Computational Biology and Bioinformatics

Protein Classification with Extended-Sequence Coding by Sliding Window

Abstract

A large number of unclassified sequences is still found in public databases, which suggests that there is still a need for new investigations in the area. In this contribution, we present a methodology based on Artificial Neural Networks for protein functional classification. A new protein coding scheme, called here Extended-Sequence Coding by Sliding Windows, is presented with the goal of overcoming some of the difficulties of the well-known method Sequence Coding by Sliding Window. The new protein coding scheme uses more than one sliding window length with a weight factor that is proportional to the window length, avoiding the ambiguity problem without ignoring the identity of small subsequences. Accuracy for Sequence Coding by Sliding Windows ranged from 60.1 to 77.7 percent for the first bacterium protein set and from 61.9 to 76.7 percent for the second one, whereas the accuracy for the proposed Extended-Sequence Coding by Sliding Windows scheme ranged from 70.7 to 97.1 percent for the first bacterium protein set and from 61.1 to 93.3 percent for the second one. Additionally, protein sequences classified inconsistently by the Artificial Neural Networks were analyzed by CD-Search, revealing that there is some disagreement in public repositories and calling attention to the relevant issue of error propagation in annotated databases due to incorrectly transferred annotations.
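
The abstract does not spell out the coding scheme in detail, so the following is only a minimal sketch of the general idea it describes: features are collected with several sliding window lengths, and each window length contributes with a weight proportional to that length. The function names, the default window lengths (1, 2, 3), and the use of relative frequencies are all assumptions made for illustration and are not taken from the paper.

```python
from collections import Counter
from typing import Dict, Iterable


def sliding_window_counts(sequence: str, window: int) -> Counter:
    """Count every contiguous subsequence of length `window` in `sequence`."""
    return Counter(
        sequence[i:i + window] for i in range(len(sequence) - window + 1)
    )


def extended_sequence_coding(
    sequence: str, windows: Iterable[int] = (1, 2, 3)
) -> Dict[str, float]:
    """Hypothetical multi-window coding of a protein sequence.

    Assumption: each feature is the relative frequency of a subsequence
    among all windows of the same length, multiplied by the window length,
    so that longer windows carry proportionally more weight.
    """
    features: Dict[str, float] = {}
    for w in windows:
        counts = sliding_window_counts(sequence, w)
        total = sum(counts.values())
        for subsequence, n in counts.items():
            # Weight factor proportional to the window length (assumed form).
            features[subsequence] = w * n / total
    return features


# Toy sequence, not taken from either bacterium protein set.
features = extended_sequence_coding("MKVLAAGMKV")
```

To feed such features to a neural network, the dictionary would still have to be mapped onto a fixed-length input vector, for example by fixing an ordered vocabulary of subsequences in advance; the abstract does not say how the authors handle this step.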
