Published in: IEEE/ACM Transactions on Computational Biology and Bioinformatics

Subcellular Localization Prediction through Boosting Association Rules


Abstract

Computational methods for predicting protein subcellular localization have used various types of features, including N-terminal sorting signals, amino acid compositions, and text annotations from protein databases. Our approach uses no biological knowledge such as sorting signals or homologues, only protein sequence information. The method divides a protein sequence into short k-mer sequence fragments, which can be mapped to word features in document classification. A large number of class association rules are mined from protein sequence examples spanning the N-terminus to the C-terminus. A boosting algorithm is then applied to those rules to build the final classifier. Experimental results on benchmark data sets show that our method performs well in terms of both classification performance and test coverage. The results also imply that the k-mer sequence features that determine subcellular locations do not necessarily occur at specific positions in a protein sequence. An online prediction service implementing our method is available at http://isoft.postech.ac.kr/research/BCAR/subcell.
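
As a rough illustration of the first two steps described in the abstract (splitting sequences into k-mers and mining class association rules), the following minimal Python sketch may help. It is not the authors' implementation: the function names `kmers` and `mine_rules`, the thresholds `min_support` and `min_confidence`, the restriction to single k-mer antecedents, and the toy sequences are all assumptions made for illustration. The boosting step that combines the rules into a classifier is not shown.

```python
from collections import Counter, defaultdict


def kmers(sequence, k=3):
    """Return the set of overlapping k-mers in a protein sequence."""
    return {sequence[i:i + k] for i in range(len(sequence) - k + 1)}


def mine_rules(examples, k=3, min_support=2, min_confidence=0.6):
    """Mine single-antecedent rules of the form (k-mer -> location).

    examples: list of (sequence, location_label) pairs.
    Returns (kmer, label, support, confidence) tuples, where support is
    the number of sequences with that label containing the k-mer and
    confidence is support divided by all sequences containing the k-mer.
    """
    kmer_count = Counter()             # sequences containing each k-mer
    pair_count = defaultdict(Counter)  # k-mer -> label -> sequence count
    for sequence, label in examples:
        # A set is used, so position within the sequence is ignored.
        for fragment in kmers(sequence, k):
            kmer_count[fragment] += 1
            pair_count[fragment][label] += 1

    rules = []
    for fragment, by_label in pair_count.items():
        for label, support in by_label.items():
            confidence = support / kmer_count[fragment]
            if support >= min_support and confidence >= min_confidence:
                rules.append((fragment, label, support, confidence))
    # Highest confidence first, then highest support, then alphabetical.
    return sorted(rules, key=lambda r: (-r[3], -r[2], r[0]))


# Toy, made-up sequences and labels, for illustration only.
toy_examples = [
    ("MKLVAAG", "mitochondrion"),
    ("GGMKLTT", "mitochondrion"),
    ("PPKKRKV", "nucleus"),
    ("AKKRKAA", "nucleus"),
]
rules = mine_rules(toy_examples, k=3)
for rule in rules:
    print(rule)
```

Because each sequence is reduced to a set of k-mers, a rule fires wherever its k-mer occurs, which mirrors the abstract's observation that informative k-mers need not sit at a fixed position. A boosting algorithm would then assign weights to such rules and combine them into the final classifier.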