Journal: JMIR Medical Informatics

A Graph Convolutional Network–Based Method for Chemical-Protein Interaction Extraction: Algorithm Development


Abstract

Background: Extracting the interactions between chemicals and proteins from the biomedical literature is important for many biomedical tasks, such as drug discovery, precision medicine, and knowledge graph construction. Several computational methods have been proposed for automatic chemical-protein interaction (CPI) extraction. However, most of these models cannot effectively learn semantic and syntactic information from the complex sentences found in biomedical texts.

Objective: To address this problem, we propose a method that effectively encodes syntactic information from long text for CPI extraction.

Methods: Because syntactic information can be captured from dependency graphs, graph convolutional networks (GCNs) have recently drawn increasing attention in natural language processing. To investigate the performance of a GCN on CPI extraction, this paper proposes a novel GCN-based model. By using the dependency structure of the input sentences, the model can effectively capture both sequential information and long-range syntactic relations between words.

Results: We evaluated our model on the ChemProt corpus released by BioCreative VI; it achieved an F-score of 65.17%, which is 1.07% higher than that of the state-of-the-art system proposed by Peng et al. As indicated by the significance test (P [...]

Conclusions: Our model can obtain more information from the dependency graph than previously proposed models. Experimental results suggest that it is competitive with state-of-the-art methods and significantly outperforms other methods on the ChemProt corpus, which is the benchmark data set for CPI extraction.
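To make the Methods description more concrete, the following is a minimal sketch of a single graph convolution step over a dependency graph. It is not the authors' model: the toy sentence, the dependency edges, the two-dimensional token vectors, the identity weight matrix, and the helper names (`build_adjacency`, `matmul`, `gcn_layer`) are all assumptions made for illustration, and the sequential encoding mentioned in the abstract is not shown. The sketch uses a common row-normalized formulation, H' = ReLU(D^-1 A H W), where A is the dependency adjacency matrix with self-loops.

```python
from typing import List, Tuple

Matrix = List[List[float]]


def build_adjacency(num_tokens: int, edges: List[Tuple[int, int]]) -> Matrix:
    """Build a symmetric adjacency matrix with self-loops from (head, dependent) pairs."""
    adj = [[0.0] * num_tokens for _ in range(num_tokens)]
    for i in range(num_tokens):
        adj[i][i] = 1.0  # self-loop so each token keeps its own features
    for head, dep in edges:
        adj[head][dep] = 1.0
        adj[dep][head] = 1.0  # assumption: dependency arcs are treated as undirected
    return adj


def matmul(a: Matrix, b: Matrix) -> Matrix:
    """Plain matrix product of two nested-list matrices."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def gcn_layer(adj: Matrix, features: Matrix, weight: Matrix) -> Matrix:
    """One layer: average each token's neighborhood, project with W, apply ReLU."""
    normalized = [[v / sum(row) for v in row] for row in adj]  # D^-1 A
    hidden = matmul(matmul(normalized, features), weight)
    return [[max(0.0, v) for v in row] for row in hidden]


# Hypothetical toy sentence; "inhibits" (index 1) is the root of the parse.
tokens = ["aspirin", "inhibits", "COX-1"]
edges = [(1, 0), (1, 2)]  # (head, dependent) pairs, made up for this example
features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]  # made-up token vectors
weight = [[1.0, 0.0], [0.0, 1.0]]  # identity weights, chosen only for readability

adj = build_adjacency(len(tokens), edges)
output = gcn_layer(adj, features, weight)
```

After one layer, each token vector is the average of its own vector and those of its direct syntactic neighbors. Stacking k such layers lets information travel k hops along the dependency tree, which is one way long-range syntactic relations between a chemical mention and a protein mention can be brought together even when the two are far apart in the word sequence.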
