Conference: IEEE International Conference on Bioinformatics and Biomedicine

Cascade word embedding to sentence embedding: A class label enhanced approach to phenotype extraction

Abstract

In molecular biology, phenotypes are often described using complex semantics and diverse biomedical expressions, thereby facilitating the development of named entity recognition (NER). Here, we propose a novel approach to recognizing plant phenotypes by cascading word embedding to sentence embedding with class label enhancement. We used a word embedding method to find high-frequency phenotypes, and the original sentences containing them served as input to a sentence embedding method. Using this cascaded approach, we identified author-specific phenotypic expressions. In addition, we integrated a negative class label enhanced (NCLE) algorithm into our method to further optimize the training model of Sen2Vec. We used 56,748 PubMed abstracts on the model organism Arabidopsis thaliana to test the effectiveness of our approach, which resulted in a 135% increase in the number of new phenotypic descriptions compared with the original phenotype ontology.
