Journal: IEEE/ACM Transactions on Computational Biology and Bioinformatics

A Guaranteed Similarity Metric Learning Framework for Biological Sequence Comparison



Abstract

Sequence similarity is a key mathematical notion for classification and phylogenetic studies in biology, and the distance and similarity between two sequences have been widely studied. Over the past decades, similarity (distance) metric learning has been one of the most active topics in machine learning and data mining, including their applications in bioinformatics. It is feasible to use machine learning to learn a similarity metric from biological data. In this paper, we propose a novel framework of guaranteed similarity metric learning (GMSL) to perform alignment of biological sequences in any feature vector space. It introduces the (ϵ,γ,τ)-goodness similarity theory to Mahalanobis metric learning. As a theoretically guaranteed similarity metric learning approach, GMSL guarantees that the learned similarity function performs well in classification and clustering. Our experiments on the most commonly used datasets demonstrate that our approach outperforms state-of-the-art biological sequence alignment methods and other similarity metric learning algorithms in both accuracy and stability.
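
To make the phrase "Mahalanobis metric in a feature vector space" concrete, here is a minimal sketch. It is not the GMSL algorithm from the paper: the k-mer frequency feature map, the function names `kmer_vector` and `mahalanobis_distance`, and the identity matrix used for `L` are all assumptions chosen for illustration. A metric learning method would learn `L` (or M = LᵀL) from labeled data instead of fixing it.

```python
from collections import Counter
from itertools import product

import numpy as np


def kmer_vector(seq, k=2, alphabet="ACGT"):
    """Map a sequence to a normalized k-mer frequency vector.

    Hypothetical feature map for illustration; the abstract only says
    "any feature vector space" and does not prescribe this choice.
    """
    kmers = ["".join(p) for p in product(alphabet, repeat=k)]
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    vec = np.array([counts[km] for km in kmers], dtype=float)
    total = vec.sum()
    return vec / total if total > 0 else vec


def mahalanobis_distance(x, y, L):
    """Compute sqrt((x - y)^T M (x - y)) with M = L^T L.

    Parameterizing M through L keeps it positive semidefinite.
    """
    diff = L @ (x - y)
    return float(np.sqrt(diff @ diff))


x = kmer_vector("ACGTACGTAC")
y = kmer_vector("ACGTTCGTAC")

# Assumption: identity L, so the distance reduces to plain Euclidean.
L = np.eye(x.size)
d_identity = mahalanobis_distance(x, y, L)
```

With `L` set to the identity the result equals the Euclidean distance between the two feature vectors; a learned `L` would stretch or shrink directions of the feature space so that sequences of the same class end up closer together.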
