
The Role of Approximate Negators in Modeling the Automatic Detection of Negation in Tweets



Abstract

Although improvements have been made in the performance of sentiment analysis tools, the automatic detection of negated text (which affects negative sentiment prediction) still presents challenges. More research is needed on new forms of negation beyond prototypical negation cues such as "not" or "never." The present research reports findings on the role of a set of words called "approximate negators," namely "barely," "hardly," "rarely," "scarcely," and "seldom," which, on specific occasions (such as when attached to a word from the non-affirmative adverb "any" family), can operationalize negation styles not yet explored. Using a corpus of 6,500 tweets, human annotation allowed for the identification of 17 recurrent usages of these words as negatives (such as "very seldom") which, along with findings from the literature, helped engineer specific features that guided a machine learning classifier in predicting negated tweets. The machine learning experiments also modeled negation scope (i.e., which specific words in the text are negated) by employing lexical and dependency graph information. Promising results included F1 values ranging from 0.71 to 0.89 for negation detection and from 0.79 to 0.88 for scope detection. Future work will be directed to the application of these findings in automatic sentiment classification, further exploration of patterns in the data (such as part-of-speech recurrences for these new types of negation), and the investigation of sarcasm, formal language, and exaggeration as themes that emerged from observations during corpus annotation.
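
To make the kind of lexical feature mentioned in the abstract concrete, the following is a minimal sketch, not the thesis's actual feature set. It flags an approximate negator that is followed within a short window by a word from the "any" family. The five negators come from the abstract; the `ANY_FAMILY` list, the `MAX_GAP` window size, the tokenizer, and the function names are all illustrative assumptions.

```python
import re

# The five approximate negators named in the abstract.
APPROXIMATE_NEGATORS = ("barely", "hardly", "rarely", "scarcely", "seldom")

# Assumption: an illustrative subset of the "any" family;
# the thesis may rely on a different or longer list.
ANY_FAMILY = ("any", "anybody", "anyone", "anything", "anywhere", "anymore")

# Assumption: allow up to two intervening tokens between the cue and the "any" word.
MAX_GAP = 2


def tokenize(text):
    """Lowercase the text and split it into simple alphabetic tokens."""
    return re.findall(r"[a-z']+", text.lower())


def find_negator_any_pairs(text, max_gap=MAX_GAP):
    """Return (negator, any_word) pairs where the any-word follows the negator
    with at most `max_gap` tokens in between."""
    tokens = tokenize(text)
    pairs = []
    for i, tok in enumerate(tokens):
        if tok not in APPROXIMATE_NEGATORS:
            continue
        # Look at the next max_gap + 1 tokens after the negator.
        window = tokens[i + 1 : i + 2 + max_gap]
        for follower in window:
            if follower in ANY_FAMILY:
                pairs.append((tok, follower))
                break
    return pairs


if __name__ == "__main__":
    for tweet in ("There is barely any coffee left",
                  "I hardly ever see anyone here",
                  "We rarely go out"):
        print(tweet, "->", find_negator_any_pairs(tweet))
```

A boolean such as `bool(find_negator_any_pairs(tweet))` could serve as one binary input to a classifier. The dependency-graph information the abstract mentions for scope detection would require a syntactic parser and is not covered by this sketch.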

Bibliographic record

  • Author: Palomino, Norma
  • Author affiliation: Syracuse University
  • Degree-granting institution: Syracuse University
  • Subjects: Artificial intelligence; Linguistics; Computer science; Web studies
  • Degree: D.P.S.
  • Year: 2018
  • Pages: 203
  • Original format: PDF
  • Language: English
