A Context-Aware Approach for the Identification of Complex Words in Natural Language Texts


Abstract

This paper evaluates the effect of context on the identification of complex words in natural language texts. The approach automatically tags each word as either complex or not, based on two sets of features: base features that pertain only to the target word, and contextual features that take the context of the target word into account. We experimented with several supervised machine learning models, and trained and tested the approach on the SemEval-2016 dataset. Results show that adding contextual features significantly improves the identification of complex words, raising the F-measure from 0.184 to 0.260.
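The size of the reported improvement can be checked with a few lines of arithmetic. The sketch below is illustrative only: the abstract does not say which F-measure variant was used or give the underlying precision and recall, so `f_measure` assumes the standard weighted harmonic mean (F1 when `beta` is 1), and the function and constant names are hypothetical. Only the two scores, 0.184 and 0.260, come from the abstract.

```python
def f_measure(precision: float, recall: float, beta: float = 1.0) -> float:
    """Weighted harmonic mean of precision and recall (F1 when beta == 1).

    Assumption: the abstract does not state which variant the authors report.
    """
    if precision == 0.0 and recall == 0.0:
        return 0.0
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


def relative_gain(baseline: float, improved: float) -> float:
    """Improvement of `improved` over `baseline`, as a fraction of the baseline."""
    return (improved - baseline) / baseline


# The two scores reported in the abstract.
BASE_ONLY_F = 0.184      # base features only
WITH_CONTEXT_F = 0.260   # base + contextual features

absolute_gain = WITH_CONTEXT_F - BASE_ONLY_F
gain = relative_gain(BASE_ONLY_F, WITH_CONTEXT_F)
print(f"absolute gain: {absolute_gain:.3f}")  # prints "absolute gain: 0.076"
print(f"relative gain: {gain:.1%}")           # prints "relative gain: 41.3%"
```

Under these assumptions, the contextual features add 0.076 to the F-measure, a relative gain of about 41% over the base-feature score.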

Bibliographic details

Source
IEEE International Conference on Semantic Computing, 2017, pp. 97-100 (4 pages)
Authors
Elnaz Davoodi; Leila Kosseim; Matthew Mongrain
Original format
PDF
Keywords
Context; Training; Pragmatics; Natural languages; Complexity theory; Encyclopedias

