Published in: International Joint Conference on Neural Networks

Word sense disambiguation: an evaluation study of semi-supervised approaches with word embeddings

Abstract

Word Sense Disambiguation (WSD) is a well-known problem in Natural Language Processing (NLP): automatically determining the most appropriate sense of a word in context. Several machine learning-based approaches have been proposed to tackle the ambiguity of language, but the lack of labeled data for training supervised models makes semi-supervised learning (SSL) an attractive option. Furthermore, word embeddings have been shown to be an effective way to improve results on NLP tasks. This paper therefore adapts semi-supervised algorithms to WSD, using word embeddings from Word2Vec, FastText, and BERT models combined with part-of-speech tags as input. We conduct a systematic evaluation of four graph-based SSL models, analyzing how the results are influenced by their hyperparameters, the distance measures used to build the graphs, the percentage of labeled data, and architectural variations of the word embeddings. We show that SSL algorithms given 10% labeled data are strong baselines on the noun and adjective subsets. Additionally, these algorithms need no further training to disambiguate new words, which makes them competitive with supervised systems.
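
As a rough illustration of the kind of graph-based SSL pipeline the abstract describes, the sketch below builds a k-nearest-neighbour graph from cosine similarity between context vectors and spreads sense labels from a small labeled subset (10% here) to the rest. It uses plain iterative label propagation, which is an assumption on our part and not necessarily one of the four models evaluated in the paper. The function names `cosine_knn_graph` and `label_propagation` are hypothetical, random vectors stand in for Word2Vec, FastText, or BERT embeddings, and part-of-speech features are omitted.

```python
import numpy as np

def cosine_knn_graph(X, k):
    """Symmetric kNN affinity matrix weighted by cosine similarity."""
    Xn = X / np.linalg.norm(X, axis=1, keepdims=True)
    S = Xn @ Xn.T
    np.fill_diagonal(S, -np.inf)             # exclude self-loops
    idx = np.argsort(-S, axis=1)[:, :k]      # k most similar neighbours per node
    rows = np.repeat(np.arange(len(X)), k)
    cols = idx.ravel()
    W = np.zeros_like(S)
    W[rows, cols] = np.clip(S[rows, cols], 0.0, None)  # drop negative similarities
    return np.maximum(W, W.T)                # make the graph undirected

def label_propagation(W, y, n_classes, n_iter=100):
    """y holds a sense id for labeled nodes and -1 for unlabeled ones."""
    labeled = y >= 0
    Y0 = np.zeros((len(y), n_classes))
    Y0[labeled, y[labeled]] = 1.0            # one-hot rows for labeled nodes
    deg = W.sum(axis=1, keepdims=True)
    deg[deg == 0] = 1.0                      # guard against isolated nodes
    P = W / deg                              # row-normalised transition matrix
    F = Y0.copy()
    for _ in range(n_iter):
        F = P @ F                            # each node averages its neighbours
        F[labeled] = Y0[labeled]             # clamp the known labels
    return F.argmax(axis=1)

# Toy data (an assumption, not real embeddings): two senses of one word,
# each a noisy cluster around its own centre in an 8-dimensional space.
rng = np.random.default_rng(0)
dim, per_sense = 8, 20
centers = np.zeros((2, dim))
centers[0, 0] = 3.0
centers[1, 1] = 3.0
X = np.vstack([c + 0.3 * rng.standard_normal((per_sense, dim)) for c in centers])
true = np.repeat([0, 1], per_sense)

# Reveal 10% of the labels: two occurrences per sense.
seed_idx = [0, 1, per_sense, per_sense + 1]
y = np.full(len(X), -1)
y[seed_idx] = true[seed_idx]

W = cosine_knn_graph(X, k=10)
pred = label_propagation(W, y, n_classes=2)
unlabeled = y < 0
acc = float((pred[unlabeled] == true[unlabeled]).mean())
print(f"accuracy on unlabeled occurrences: {acc:.2f}")
```

Swapping the cosine similarity for another distance, or changing `k` and the labeled fraction, mirrors the kinds of choices the study reports varying. Note also that the propagation step involves no parameter fitting, only the graph and the seed labels.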