Journal: IEEE Transactions on NanoBioscience

Protein Superfamily Classification Using Fuzzy Rule-Based Classifier

Abstract

In this paper, we propose a fuzzy rule-based classifier for assigning amino acid sequences to protein superfamilies. While the most popular methods for protein classification rely on sequence alignment, our approach is alignment-free and therefore more human-readable. Like other alignment-independent methods, it uses the distribution of contiguous patterns of n amino acids (n-grams) in the sequences as features. Our approach first extracts a large number of features from a set of training sequences, then selects only the best of them using a proposed feature ranking method. Using these features, a novel steady-state genetic algorithm for extracting fuzzy classification rules from data then generates a compact set of interpretable fuzzy rules. The generated rules are simple and human-understandable, so biologists can use them for classification, or apply their own expertise to interpret or even modify them. To evaluate the performance of our fuzzy rule-based classifier, we compared it with the conventional non-fuzzy C4.5 algorithm as well as with several other fuzzy classifiers. The comparison was carried out by classifying protein sequences from five superfamily classes, downloaded from a public-domain database. The results show that the generated fuzzy rules are more interpretable, with an acceptable improvement in classification accuracy.
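The n-gram features and fuzzy rule antecedents mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `ngram_frequencies` and `triangular`, the example sequence, the choice of n = 2, and the triangular membership shape are all assumptions made for illustration, since the abstract does not specify them.

```python
from collections import Counter


def ngram_frequencies(sequence, n=2):
    """Return the relative frequency of each contiguous n-gram in a sequence.

    Hypothetical helper: the abstract only says that n-gram distributions
    are used as features, not how they are normalized.
    """
    sequence = sequence.upper()
    total = len(sequence) - n + 1  # number of n-gram windows
    if total <= 0:
        return {}
    counts = Counter(sequence[i:i + n] for i in range(total))
    return {gram: count / total for gram, count in counts.items()}


def triangular(x, left, peak, right):
    """Triangular membership function, a common choice in fuzzy classifiers.

    Assumption: the paper's actual membership functions are not given
    in the abstract.
    """
    if x <= left or x >= right:
        return 0.0
    if x <= peak:
        return (x - left) / (peak - left)
    return (right - x) / (right - peak)


# Made-up amino acid sequence, used only to exercise the functions.
features = ngram_frequencies("MKVLAAGIVKV", n=2)

# Degree to which the frequency of the 2-gram "KV" is "medium",
# under an arbitrary fuzzy set over [0.0, 0.5] peaking at 0.25.
kv_is_medium = triangular(features.get("KV", 0.0), 0.0, 0.25, 0.5)
```

A fuzzy rule in this setting would combine several such membership degrees (for example, "if the frequency of KV is medium and the frequency of AA is low, then the class is superfamily X") and assign the sequence to the class whose rule fires most strongly. Which n-grams appear in the rules, and how the rules are evolved by the genetic algorithm, is described in the paper itself rather than in the abstract.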