Journal of Theoretical and Applied Information Technology

A SELF-ORGANIZING MAP ALGORITHM USING ONLY A TESTING DATA SET WITH THE ONE-DIMENSIONAL VECTORS AND AN ODDS RATIO COEFFICIENT FOR ENGLISH SENTIMENT CLASSIFICATION IN A PARALLEL SYSTEM

Abstract

Sentiment classification has been studied with many different approaches over many years because it is significant in everyday life, for example in political activities, commodity production, and commercial activities. In this study, we propose a new model that uses unsupervised learning for big-data sentiment classification. We use a Self-Organizing Map (SOM) algorithm to cluster all the sentences of each document of the testing data set into either the positive polarity or the negative polarity. The testing data set comprises 8,500,000 English documents: 4,250,000 positive and 4,250,000 negative. We do not use any data set other than this testing data set. We do not use any one-dimensional vectors based on vector space modeling (VSM), nor any multi-dimensional vectors based on the VSM. We also do not use any multi-dimensional vectors based on the sentiment lexicons of our basis English sentiment dictionary (bESD). We use only one-dimensional vectors based on the sentiment lexicons of the bESD. The valences and polarities of the sentiment lexicons of the bESD are calculated by using an Odds Ratio Coefficient (ORC) through the Google search engine with the AND and OR operators. For each document of the testing data set, the SOM clusters all the sentences of the document into either the positive or the negative group on a map, and the sentiment classification of the document is identified entirely from this map. We tested the proposed model in both a sequential environment and a distributed network system and achieved 88.14% accuracy on the testing data set. The execution time of the proposed model in the sequential system is greater than that in the parallel network environment. The results of the proposed model can be widely used in applications and research on sentiment classification.
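
The sentence-clustering step described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the abstract does not give the vector layout, the map size, or the training schedule, so everything below is an assumption made for illustration. In particular, the LEXICON valences are invented (in the paper they come from the bESD via the ORC), the map has only two nodes, node weights start at small symmetric values for reproducibility, each node is labeled by the sign of its weight sum, and the document verdict is a simple majority vote over its sentences.

```python
import math

# Hypothetical valences for illustration only; the paper derives them from the bESD.
LEXICON = {"love": 1.2, "great": 1.5, "good": 1.0,
           "bad": -1.0, "hate": -1.2, "terrible": -1.5}
DIM = 4  # assumed fixed vector length


def sentence_vector(sentence, dim=DIM):
    """Valences of the lexicon terms found in the sentence, zero-padded to `dim`."""
    valences = [LEXICON[w] for w in sentence.lower().split() if w in LEXICON]
    return (valences + [0.0] * dim)[:dim]


def best_matching_unit(nodes, x):
    """Index of the node whose weights are closest to x (squared Euclidean distance)."""
    return min(range(len(nodes)),
               key=lambda j: sum((a - b) ** 2 for a, b in zip(nodes[j], x)))


def train_som(vectors, dim=DIM, epochs=20, lr0=0.5, sigma=0.5):
    """Train a two-node map; nodes 0 and 1 are neighbours on a line."""
    # Deterministic, symmetric start (an assumption; random init is more common).
    nodes = [[-0.1] * dim, [0.1] * dim]
    for epoch in range(epochs):
        lr = lr0 * (1.0 - epoch / epochs)  # linearly decaying learning rate
        for x in vectors:
            bmu = best_matching_unit(nodes, x)
            for j, w in enumerate(nodes):
                # Gaussian neighbourhood: 1 for the winner, smaller for its neighbour.
                h = math.exp(-((j - bmu) ** 2) / (2.0 * sigma ** 2))
                for k in range(dim):
                    w[k] += lr * h * (x[k] - w[k])
    return nodes


def classify_document(text):
    """Cluster the sentences of one document, then take a majority vote."""
    sentences = [s.strip() for s in text.split(".") if s.strip()]
    vectors = [sentence_vector(s) for s in sentences]
    nodes = train_som(vectors)
    # Assumed labeling rule: a node is positive when its weights sum above zero.
    node_label = ["positive" if sum(w) > 0 else "negative" for w in nodes]
    labels = [node_label[best_matching_unit(nodes, v)] for v in vectors]
    positives = labels.count("positive")
    # Ties fall to "negative" here; this choice is arbitrary.
    verdict = "positive" if positives > len(labels) - positives else "negative"
    return labels, verdict
```

For a toy review such as "I love this phone. The screen is great. The battery is bad.", classify_document labels the first two sentences positive and the third negative, so the document as a whole is classified as positive. Note that the map is trained on the sentences of the document itself, which mirrors the abstract's claim that only the testing data is used.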