Journal: Computer Technology and Development (《计算机技术与发展》)

SVM-Based Short-Text Sentiment Classification with High-Dimensional Hybrid Features

         

Abstract

Short texts are sparse, informal, and often lack a clear topic. To address these characteristics, we propose a high-dimensional hybrid feature model based on a support vector machine (SVM). First, we introduce six types of features that capture both semantics and sentiment: emoticon features, word-cluster features, part-of-speech tagging features, n-gram features, negation features, and sentiment-lexicon features. For each type we describe its concept, extraction method, and output form. Second, we validate the model with five-fold cross-validation on a dataset based on the Sixth Chinese Opinion Analysis Evaluation (COAE2014): the average precision is 84.69%, the average recall is 83.13%, and the average F1 value is 83.90%. Third, we examine how the SVM penalty (regularization) parameter affects the experimental results. Finally, we compare the model with a one-step three-class method, Recursive Auto Encoder, and Doc2vec; the results show that the proposed model is more effective for short-text sentiment classification.
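As a rough illustration of how the six feature types might be merged into one sparse, high-dimensional vector for a linear SVM, the following Python sketch may help. It is not the authors' implementation: the names `EMOTICONS`, `NEGATIONS`, `LEXICON`, `CLUSTERS`, and `extract_features` are hypothetical, the toy resources stand in for the lexicon and word clusters the paper actually uses, the input is assumed to be already segmented into tokens, and the negation rule (flip the polarity of the next lexicon word) is a simplifying assumption. The small `f1_score` helper only checks that the reported 84.69% and 83.13% figures are consistent with the reported F1 value.

```python
from collections import Counter

# Hypothetical toy resources; the abstract does not list the real ones.
EMOTICONS = {":)", ":(", "[哈哈]", "[泪]"}
NEGATIONS = {"不", "没", "没有"}
LEXICON = {"好": 1, "喜欢": 1, "差": -1, "讨厌": -1}   # word -> polarity
CLUSTERS = {"好": "c1", "喜欢": "c1", "差": "c2"}      # word -> cluster id


def extract_features(tokens, pos_tags=None, n=2):
    """Map one pre-tokenized short text to a sparse feature dict.

    pos_tags, if given, must align with tokens (one tag per token).
    """
    feats = Counter()
    negated = False
    for i, tok in enumerate(tokens):
        if tok in EMOTICONS:                      # emoticon feature
            feats["emo=" + tok] += 1
        if tok in CLUSTERS:                       # word-cluster feature
            feats["cluster=" + CLUSTERS[tok]] += 1
        if pos_tags is not None:                  # part-of-speech feature
            feats["pos=" + pos_tags[i]] += 1
        if tok in NEGATIONS:                      # negation feature
            feats["neg_count"] += 1
            negated = True
            continue
        if tok in LEXICON:                        # sentiment-lexicon feature
            polarity = -LEXICON[tok] if negated else LEXICON[tok]
            feats["lex_pos" if polarity > 0 else "lex_neg"] += 1
            negated = False                       # assumption: scope ends here
    for k in range(1, n + 1):                     # n-gram features, 1..n
        for i in range(len(tokens) - k + 1):
            feats["ng=" + " ".join(tokens[i:i + k])] += 1
    return dict(feats)


def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    print(extract_features(["不", "喜欢", ":("]))
    print(round(f1_score(84.69, 83.13), 2))
```

Feature dicts of this shape could then be vectorized and passed to any linear SVM implementation, with the penalty parameter tuned inside the five-fold cross-validation loop described in the abstract.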
