Supervised Learning and Knowledge-Based Approaches Applied to Biomedical Word Sense Disambiguation

Abstract

Word sense disambiguation (WSD) is an important step in biomedical text mining: it assigns an unambiguous concept to an ambiguous term, which improves the accuracy of biomedical information extraction systems. In this work we followed supervised and knowledge-based disambiguation approaches, with the best results obtained by supervised means. In the supervised method we used bag-of-words as local features and word embeddings as global features. In the knowledge-based method we combined word embeddings, textual concept definitions extracted from the UMLS database, and concept association values calculated from MeSH co-occurrence counts in MEDLINE articles. Also in the knowledge-based method, we tested different word embedding averaging functions for calculating the surrounding context vectors, with the goal of giving more weight to the words closest to the ambiguous term. The MSH WSD dataset, the most common dataset for evaluating biomedical concept disambiguation, was used to evaluate our methods. We obtained a top accuracy of 95.6% by supervised means, while the best knowledge-based accuracy was 87.4%. Our results show that word embedding models improved the disambiguation accuracy, proving to be a powerful resource in the WSD task.
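The knowledge-based idea in the abstract (a context vector built from word embeddings, weighted toward the words nearest the ambiguous term, and compared against concept definitions) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the toy 3-dimensional embeddings, the two candidate senses of "cold" and their definition words, the inverse-distance weighting, and the use of cosine similarity for scoring. The MeSH co-occurrence association values mentioned in the abstract are left out.

```python
import math

# Toy 3-dimensional embeddings; all values are invented for illustration.
EMBEDDINGS = {
    "patient": [0.9, 0.1, 0.0],
    "infection": [0.8, 0.2, 0.1],
    "fever": [0.7, 0.3, 0.0],
    "weather": [0.0, 0.2, 0.9],
    "temperature": [0.1, 0.1, 0.8],
    "winter": [0.0, 0.3, 0.7],
}

# Hypothetical sense inventory: each candidate concept is represented by
# a few words standing in for its textual definition.
DEFINITIONS = {
    "common_cold": ["infection", "fever", "patient"],
    "cold_temperature": ["temperature", "weather", "winter"],
}


def weighted_average(words, weights):
    """Weighted mean of the embeddings of the in-vocabulary words."""
    dim = len(next(iter(EMBEDDINGS.values())))
    total = [0.0] * dim
    weight_sum = 0.0
    for word, weight in zip(words, weights):
        vec = EMBEDDINGS.get(word)
        if vec is None:  # skip out-of-vocabulary words
            continue
        for i, x in enumerate(vec):
            total[i] += weight * x
        weight_sum += weight
    if weight_sum == 0.0:
        return total  # zero vector when no word is in the vocabulary
    return [x / weight_sum for x in total]


def context_vector(tokens, target_index):
    """Average the context embeddings, weighting each word by 1 / distance
    to the ambiguous term (an assumed weighting function)."""
    words, weights = [], []
    for i, token in enumerate(tokens):
        if i == target_index:
            continue
        words.append(token)
        weights.append(1.0 / abs(i - target_index))
    return weighted_average(words, weights)


def cosine(u, v):
    """Cosine similarity; returns 0.0 if either vector has zero norm."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def disambiguate(tokens, target_index):
    """Return the best-scoring concept and the score of every candidate."""
    ctx = context_vector(tokens, target_index)
    scores = {}
    for concept, words in DEFINITIONS.items():
        definition_vec = weighted_average(words, [1.0] * len(words))
        scores[concept] = cosine(ctx, definition_vec)
    return max(scores, key=scores.get), scores


sentence = ["the", "patient", "had", "a", "cold", "and", "fever"]
best, scores = disambiguate(sentence, sentence.index("cold"))
print(best)  # common_cold
```

Swapping the `1.0 / abs(i - target_index)` weight for another decreasing function of distance is one way to compare averaging functions in the spirit of the abstract; a constant weight of 1.0 recovers the plain average.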

Bibliographic details

Journal: Journal of Integrative Bioinformatics
Authors: Rui Antunes; Sérgio Matos
Year (volume), issue: 2017 (14), 4
Year: 2017
Pages: 20170051
Total pages: 8
Original format: PDF
CLC classification: Biology
Keywords: Biomedical text mining; information extraction; word embeddings

Similar literature

1. Supervised Learning and Knowledge-Based Approaches Applied to Biomedical Word Sense Disambiguation [J]. Rui Antunes, Sérgio Matos. Journal of Integrative Bioinformatics. 2017, issue 4.
2. Knowledge-based biomedical word sense disambiguation: comparison of approaches [J]. Antonio J Jimeno-Yepes, Alan R Aronson. BMC Bioinformatics. 2010, issue 1.
3. Semi-supervised Learning with Induced Word Senses for State of the Art Word Sense Disambiguation [J]. Başkaya Osman, Jurgens David. The Journal of Artificial Intelligence Research. 2016, issue 10.
4. Word sense disambiguation: an evaluation study of semi-supervised approaches with word embeddings [C]. Samuel Sousa, Evangelos Milios, Lilian Berton. International Joint Conference on Neural Networks. 2020.
5. Towards high-performance word sense disambiguation by combining rich linguistic knowledge and machine learning approaches [D]. Chen, Jinying. 2006.
6. Knowledge-based biomedical word sense disambiguation: comparison of approaches [O]. Antonio J Jimeno-Yepes, Alan R Aronson. 2010.
7. Knowledge-based biomedical word sense disambiguation: comparison of approaches [O]. Antonio J Jimeno Yepes, Alan R Aronson. 2010.
