Sentiment analysis of Twitter data.

Abstract

Sentiment analysis and opinion mining have become a research hotspot with the rapid development of social networking websites. Twitter is a typical social network application, with millions of users expressing their sentiment every day. In this work, we comprehensively explored the methodologies applied to sentiment classification of Twitter data: lexicon-based, rule-based, and machine learning-based methods. Our dataset was crawled and manually cleaned following the principle of Naturally Annotated Big Data. It contains 20,000 tweets covering ten popular topics.

For lexicon-based methods, we experimented with the Simple Word Count approach and the Feature Scoring approach, using the most popular sentiment lexicons and semantic resources, namely the MPQA subjectivity lexicon, SentiWordNet, the Vader Sentiment Lexicon, Bing Liu's lexicon, and General Inquirer. We built customized sentiment lexicons, designed feature scores, and compared ten classifiers on real-world Twitter data. Further, we designed Linguistic Inference Rules (LIR) to improve lexicon-based classifiers. LIR aims to handle negation, valence shift, and contrast conjunctions in natural language. For machine learning-based methods, we used state-of-the-art supervised learning models: Naive Bayes, Maximum Entropy, and Support Vector Machines. Two sets of features are compared. The first is Bag-of-Words with N-grams; the second is Part-of-Speech linguistic annotation.
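To make the lexicon-based idea concrete, the following is a minimal sketch of a word-count classifier with a simple negation rule. It is not the thesis's implementation: the toy `LEXICON`, the `NEGATORS` set, the three-token `NEGATION_WINDOW`, and the function names are all assumptions chosen for illustration, and the abstract does not specify how LIR scopes negation.

```python
import re

# Hypothetical toy lexicon (assumption); the thesis draws on MPQA,
# SentiWordNet, Bing Liu's lexicon, and other resources instead.
LEXICON = {"good": 1, "great": 1, "love": 1, "happy": 1,
           "bad": -1, "awful": -1, "hate": -1, "sad": -1}

# Assumed negation cues and scope; not taken from the thesis.
NEGATORS = {"not", "no", "never", "n't", "cannot"}
NEGATION_WINDOW = 3  # number of following tokens whose polarity is flipped


def tokenize(text):
    """Lowercase the text and split contractions such as "don't" into "do" + "n't"."""
    text = text.lower().replace("n't", " n't")
    return re.findall(r"n't|[a-z]+", text)


def score(text):
    """Sum lexicon polarities, flipping the sign of tokens that follow a negator."""
    total = 0
    negate_left = 0
    for tok in tokenize(text):
        if tok in NEGATORS:
            negate_left = NEGATION_WINDOW
            continue
        polarity = LEXICON.get(tok, 0)
        if negate_left > 0:
            polarity = -polarity
            negate_left -= 1
        total += polarity
    return total


def classify(text):
    """Map the summed score to a three-way label."""
    s = score(text)
    if s > 0:
        return "positive"
    if s < 0:
        return "negative"
    return "neutral"
```

Under these assumptions, `classify("I do not love this phone")` flips the polarity of "love" because it falls inside the negation window. Valence shifters and contrast conjunctions, which the abstract also mentions, would need additional rules that this sketch does not attempt.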

Bibliographic record

  • Author

    Yuan, Bo

  • Author affiliation

    Rensselaer Polytechnic Institute

  • Degree-granting institution: Rensselaer Polytechnic Institute
  • Subject: Computer science
  • Degree: M.S.
  • Year: 2016
  • Pages: 60
  • Original format: PDF
  • Language: English
