Journal of Computational Biology

BBK* (Branch and Bound Over K*): A Provable and Efficient Ensemble-Based Protein Design Algorithm to Optimize Stability and Binding Affinity Over Large Sequence Spaces

Abstract

Computational protein design (CPD) algorithms that compute binding affinity, Ka, search for sequences with an energetically favorable free energy of binding. Recent work shows that three principles improve the biological accuracy of CPD: ensemble-based design, continuous flexibility of backbone and side-chain conformations, and provable guarantees of accuracy with respect to the input. However, previous methods that use all three design principles are single-sequence (SS) algorithms, which are very costly: linear in the number of sequences and thus exponential in the number of simultaneously mutable residues. To address this computational challenge, we introduce BBK*, a new CPD algorithm whose key innovation is the multisequence (MS) bound: BBK* efficiently computes a single provable upper bound to approximate Ka for a combinatorial number of sequences, and avoids SS computation for all provably suboptimal sequences. Thus, to our knowledge, BBK* is the first provable, ensemble-based CPD algorithm to run in time sublinear in the number of sequences. Computational experiments on 204 protein design problems show that BBK* finds the tightest binding sequences while approximating Ka for up to 10^5-fold fewer sequences than the previous state-of-the-art algorithms, which require exhaustive enumeration of sequences. Furthermore, for 51 protein–ligand design problems, BBK* provably approximates Ka up to 1982-fold faster than the previous state-of-the-art iMinDEE/A*/K* algorithm. Therefore, BBK* not only accelerates protein designs that are possible with previous provable algorithms, but also efficiently performs designs that are too large for previous methods.
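
The pruning idea behind a multisequence bound can be illustrated with a toy best-first branch-and-bound search. The sketch below is not the BBK* algorithm: it assumes a hypothetical score that is a product of per-position integer factors, whereas BBK* bounds Ka over conformational ensembles. The function name `best_first_search` and the `scores` table are illustrative assumptions. What the sketch does share with the abstract's description is the principle that one optimistic upper bound on a partial sequence covers every full sequence extending it, so provably suboptimal sequences are never scored individually.

```python
import heapq
from math import prod


def best_first_search(scores):
    """Toy best-first branch and bound over a sequence space.

    scores[i][a] is a hypothetical positive integer factor for choosing
    residue a at position i; a full sequence's score is the product of
    its factors. Returns (best_sequence, best_score, n_full_scored).
    """
    n = len(scores)
    # suffix_bound[i] = product of the largest factor at positions i..n-1,
    # an upper bound on what any completion of a partial sequence can add.
    suffix_bound = [1] * (n + 1)
    for i in range(n - 1, -1, -1):
        suffix_bound[i] = suffix_bound[i + 1] * max(scores[i].values())

    # Max-heap via negated bounds; the root is the empty partial sequence.
    heap = [(-suffix_bound[0], ())]
    n_full_scored = 0
    while heap:
        neg_bound, partial = heapq.heappop(heap)
        if len(partial) == n:
            # For a full sequence the bound is exact, and no remaining
            # node has a larger bound, so this sequence is optimal.
            return partial, -neg_bound, n_full_scored
        i = len(partial)
        assigned = prod(scores[j][a] for j, a in enumerate(partial))
        for residue, factor in scores[i].items():
            child = partial + (residue,)
            if len(child) == n:
                n_full_scored += 1
            # One bound covers all full sequences that extend `child`.
            bound = assigned * factor * suffix_bound[i + 1]
            heapq.heappush(heap, (-bound, child))
    raise ValueError("empty search space")


# Hypothetical 4-position, 3-residue design problem (3**4 = 81 sequences).
scores = [
    {"A": 1, "L": 5, "W": 2},
    {"A": 4, "L": 1, "W": 1},
    {"A": 1, "L": 1, "W": 6},
    {"A": 3, "L": 2, "W": 1},
]
best, best_score, n_full_scored = best_first_search(scores)
print(best, best_score, n_full_scored)
```

In this toy instance the search returns the optimal sequence after scoring only 3 of the 81 full sequences, because subtrees whose bound falls below the best candidate are never expanded. A single-sequence approach would instead score all 81, and that count grows exponentially with the number of mutable positions.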