Journal: International journal of open source software & processes

A Novel Approach to Optimize the Performance of Hadoop Frameworks for Sentiment Analysis

Abstract

Twitter is one of the most popular microblogging services, with millions of active users. It is a hub for a massive collection of data arriving from various sources. On Twitter, users most often express their views, opinions, thoughts, emotions, or feelings about a particular topic, product, or service that interests or concerns them. This makes Twitter both a source of an enormous amount of data and a useful platform for understanding the sentiment behind a particular product or, for that matter, anything expressed in tweets. Importantly, this massive collection of data is not merely redundant; it contains useful information. In this context, sentiment analysis of Twitter data gains enormous importance. Sentiment analysis is a good approach for classifying the opinions expressed by individuals (tweeters) into sentiments such as positive, negative, or neutral. Implementing sentiment analysis algorithms with conventional tools leads to high computation time, which makes them less effective. Hence, state-of-the-art tools and techniques that enable faster computation are needed for sentiment analysis. The Apache Hadoop framework is one such option: it supports distributed data computing and has been commonly adopted for a variety of use cases. In this article, the author identifies factors affecting the performance of sentiment analysis algorithms based on the Hadoop framework and proposes an approach for optimizing that performance. The experimental results demonstrate the potential of the proposed approach.