Journal: BMC Genomics

Can the vector space model be used to identify biological entity activities?



Abstract

Background: Biological systems are commonly described as networks of entity interactions. Some interactions are already known and form part of current knowledge in the life sciences. Others remain unknown for long periods and are frequently discovered by chance. In this work we present a model that predicts these unknown interactions from a textual collection using the vector space model (VSM), a well-known and established information retrieval model. We extend the retrieval ability of the VSM with a transitive closure approach. Our objective is to use the VSM to identify known interactions from the literature and construct a network. Based on the interactions established in the network, our model applies the transitive closure to predict and rank new interactions.

Results: We tested and validated our model using a collection of patent claims issued from 1976 to 2005. Out of 266,528 possible interactions in our network, the model identified 1,027 known interactions and predicted 3,195 new interactions. When the model was iterated according to patent issue dates, interactions found in a given past year were often confirmed by patent claims that were not in the collection and were issued in more recent years. Most confirming patent claims were found among the top 100 new interactions obtained from each subnetwork. We also found papers on the Web that confirm newly inferred interactions. For instance, the best new interaction inferred by our model is between the neurotransmitter adrenaline and the androgen receptor gene. We found a paper that reports the partial dependence of the antiapoptotic effect of adrenaline on the androgen receptor.

Conclusions: The VSM extended with a transitive closure approach provides a good way to identify biological interactions from textual collections. In the context of literature-based discovery specifically, the extended VSM helps to identify and rank relevant new interactions even if these interactions occur in only a few documents in the collection. Consequently, we have developed an efficient method for extracting the best potential results, and restricting attention to them, as candidate advances in the life sciences, even when indications of these results are not easily observed in a large mass of documents.
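
The general idea of combining a vector space model with a transitive closure step can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the term weighting, the similarity threshold, or the ranking function, so the raw occurrence counts, the cosine threshold of 0.3, and the product-of-similarities score below are all assumptions, as are the function names and the placeholder entities `ent_a`, `ent_b`, and `ent_c`.

```python
import math
from collections import Counter
from itertools import combinations


def entity_vectors(docs, entities):
    """Represent each entity as a vector of its occurrence counts per document.

    Assumption: raw counts with whitespace tokenization; the abstract does not
    say which weighting scheme is used.
    """
    tokenized = [Counter(doc.lower().split()) for doc in docs]
    return {e: [tokens[e] for tokens in tokenized] for e in entities}


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def known_interactions(vectors, threshold=0.3):
    """Treat entity pairs whose similarity reaches the threshold as known edges.

    Assumption: the threshold value is arbitrary and chosen for illustration.
    """
    known = {}
    for a, b in combinations(sorted(vectors), 2):
        score = cosine(vectors[a], vectors[b])
        if score >= threshold:
            known[(a, b)] = score
    return known


def predict_interactions(known):
    """One transitive step: if a-b and b-c are known but a-c is not, propose a-c.

    Assumption: a candidate is scored by the product of the two edge
    similarities, keeping the best score over all intermediate entities.
    """
    neighbours = {}
    for (a, b), score in known.items():
        neighbours.setdefault(a, {})[b] = score
        neighbours.setdefault(b, {})[a] = score

    predicted = {}
    for linked in neighbours.values():
        for a, c in combinations(sorted(linked), 2):
            if (a, c) in known:
                continue  # already an established interaction
            score = linked[a] * linked[c]
            predicted[(a, c)] = max(score, predicted.get((a, c), 0.0))
    # Highest-scoring candidates first.
    return sorted(predicted.items(), key=lambda item: item[1], reverse=True)


# Toy collection: ent_a and ent_c never co-occur, but both co-occur with ent_b.
docs = [
    "ent_a activates ent_b",
    "ent_b regulates ent_c",
    "ent_a binds ent_b",
    "ent_c depends on ent_b",
]
vectors = entity_vectors(docs, ["ent_a", "ent_b", "ent_c"])
known = known_interactions(vectors)
ranked = predict_interactions(known)
```

In this toy collection the only predicted pair is `ent_a` and `ent_c`, linked through `ent_b`. A full transitive closure would repeat the prediction step over longer paths; a single step is shown here to keep the sketch short.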
