Conference: 2016 5th International Conference on Informatics, Electronics and Vision

An effective approach of intrinsic and extrinsic domain relevance technique for feature extraction in opinion mining



Abstract

Opinion mining, also known as sentiment analysis, is the task of identifying and extracting subjective information from source materials using natural language processing, text analysis, and computational linguistics. It has gained considerable importance in recent years because of the growing number of blogs, forums, and other social networks that contain large volumes of opinions. Feature extraction is an important step in opinion mining: it identifies the properties on which opinions are expressed. An existing method identifies features based on Intrinsic and Extrinsic Domain Relevance (IEDR), which exploits the difference in an opinion feature's statistics across two corpora. That approach uses syntactic rules to process review sentences and a well-known, generalized weight equation based on the Term Frequency-Inverse Document Frequency (TF-IDF) statistic to calculate domain relevance, and it often fails to identify many legitimate features. In this paper, we propose an effective variant of the IEDR technique for feature extraction. Our approach adds a handful of extended syntactic rules for processing review sentences and optimizes the calculation of domain relevance by modifying the weight equation. To verify the approach, we applied it, alongside the existing IEDR approach, to two real-world review corpora. Our approach shows a marked improvement in finding opinion features and outperforms the existing IEDR method.
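To make the two-corpus idea concrete, the sketch below scores candidate terms against a domain review corpus and a domain-independent corpus, and keeps those that score high in the first and low in the second. It is only an illustration under stated assumptions, not the weight equation or the selection rule used in the paper: the weight `(1 + log tf) * log(N / df)` is one common TF-IDF variant, the "relevance" here is simply the mean weight over documents, and the function names, toy corpora, and thresholds are all hypothetical.

```python
import math
from collections import Counter


def relevance_scores(docs):
    """Map each term to its mean TF-IDF weight over a corpus.

    Assumption: weight = (1 + log tf) * log(N / df), a common TF-IDF
    variant, averaged over all documents. This is a simplified stand-in
    for a domain-relevance score, not the paper's equation.
    """
    n_docs = len(docs)
    counts = [Counter(doc) for doc in docs]
    doc_freq = Counter(term for c in counts for term in c)
    scores = {}
    for term, df in doc_freq.items():
        idf = math.log(n_docs / df)
        total = sum((1 + math.log(c[term])) * idf for c in counts if c[term] > 0)
        scores[term] = total / n_docs
    return scores


def select_features(candidates, domain_docs, background_docs, min_intrinsic, max_extrinsic):
    """Keep candidates that score high in-domain and low out-of-domain.

    Both thresholds are hypothetical tuning parameters.
    """
    intrinsic = relevance_scores(domain_docs)
    extrinsic = relevance_scores(background_docs)
    return [
        term
        for term in candidates
        if intrinsic.get(term, 0.0) >= min_intrinsic
        and extrinsic.get(term, 0.0) <= max_extrinsic
    ]


# Toy, hand-made corpora (tokenized); not data from the paper.
domain = [
    ["the", "battery", "life", "is", "great"],
    ["the", "screen", "is", "sharp"],
    ["battery", "drains", "fast"],
    ["the", "camera", "is", "good"],
]
background = [
    ["the", "market", "is", "up"],
    ["the", "election", "is", "close"],
    ["rain", "is", "expected"],
]

features = select_features(["battery", "screen", "the", "is"], domain, background, 0.3, 0.1)
print(features)
```

In this toy setup the function words score too low in the domain corpus to pass the intrinsic threshold, while "battery" and "screen" pass both tests. In practice the candidate list would come from syntactic rules applied to the review sentences, as the abstract describes.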
