Journal of Bioinformatics and Computational Biology

Weighted similarity-based clustering of chemical structures and bioactivity data in early drug discovery

Abstract

The modern process of discovering candidate molecules in the early drug discovery phase includes a wide range of approaches to extract vital information from the intersection of biology and chemistry. A typical compound selection strategy clusters compounds by chemical similarity to obtain representative, chemically diverse compounds, without incorporating potency information. In this paper, we propose an integrative clustering approach that uses both biological (compound efficacy) and chemical (structural features) data sources to discover a subset of compounds with aligned structural and biological properties. The datasets are integrated at the similarity level by assigning complementary weights to produce a weighted similarity matrix, which serves as a generic input to any clustering algorithm. This new analysis workflow is a semi-supervised method since, after the clusters are determined, a secondary analysis identifies differentially expressed genes associated with the derived integrated cluster(s) to further explain the compound-induced biological effects inside the cell. Datasets from two oncology drug development projects are used to illustrate how the weighted similarity-based clustering approach integrates multi-source, high-dimensional information to aid drug discovery. Compounds that are structurally and biologically similar to the reference compounds are discovered using the proposed integrative approach.
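The integration step described in the abstract can be illustrated with a minimal sketch. It assumes that "complementary weights" means a convex combination, w * S_chem + (1 - w) * S_bio, of two compound-by-compound similarity matrices. The abstract does not name the similarity measures, so the Tanimoto coefficient on binary fingerprints and a scaled Euclidean similarity on activity profiles are illustrative choices, and the function names, toy data, and the weight w = 0.5 are all hypothetical rather than taken from the paper.

```python
import numpy as np

def tanimoto_similarity(fp):
    """Pairwise Tanimoto similarity between rows of a binary fingerprint matrix."""
    fp = np.asarray(fp, dtype=float)
    shared = fp @ fp.T                        # bits set in both compounds
    counts = fp.sum(axis=1)                   # bits set in each compound
    union = counts[:, None] + counts[None, :] - shared
    safe_union = np.where(union > 0, union, 1.0)
    # Assumption: two all-zero fingerprints are treated as identical.
    return np.where(union > 0, shared / safe_union, 1.0)

def activity_similarity(act):
    """Similarity between bioactivity profiles: 1 - Euclidean distance / max distance."""
    act = np.asarray(act, dtype=float)
    diff = act[:, None, :] - act[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=2))
    max_dist = dist.max()
    if max_dist == 0:
        return np.ones_like(dist)
    return 1.0 - dist / max_dist

def weighted_similarity(s_chem, s_bio, w):
    """Combine two similarity matrices with complementary weights w and 1 - w."""
    if not 0.0 <= w <= 1.0:
        raise ValueError("w must lie in [0, 1]")
    return w * s_chem + (1.0 - w) * s_bio

# Toy data (hypothetical): 4 compounds, 6 fingerprint bits, 2 activity readouts.
fingerprints = np.array([
    [1, 1, 0, 0, 1, 0],
    [1, 1, 0, 0, 0, 0],
    [0, 0, 1, 1, 0, 1],
    [0, 0, 1, 1, 1, 1],
])
activity = np.array([
    [7.9, 6.1],
    [7.5, 6.0],
    [5.2, 8.3],
    [5.0, 8.1],
])

s_chem = tanimoto_similarity(fingerprints)
s_bio = activity_similarity(activity)
s_weighted = weighted_similarity(s_chem, s_bio, w=0.5)
print(np.round(s_weighted, 3))
```

Setting w = 1 reduces the result to purely structure-based similarity and w = 0 to purely activity-based similarity. Because the combined matrix is an ordinary similarity matrix, it (or the dissimilarity 1 - S) can be passed to whichever clustering routine accepts a precomputed matrix.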