International Conference on Cyber and IT Service Management
Feature selection based on Genetic algorithm, particle swarm optimization and principal component analysis for opinion mining cosmetic product review

Abstract

Opinion mining is a technique for automatically processing textual data from opinion sentences to produce sentiment information. Also called sentiment analysis, it involves building a system that collects and classifies opinions in product reviews by understanding, extracting, and processing the text of each opinion sentence and labeling it as positive, negative, or neutral. One of the most widely used techniques for data classification is the Support Vector Machine (SVM). An SVM identifies the separating hyperplane that maximizes the margin between two classes. However, SVM is weak at selecting parameters or suitable features. This research improves on previous work by combining SVM with feature selection and comparing three feature selection methods, Genetic Algorithm, Particle Swarm Optimization, and Principal Component Analysis, to determine which one best improves SVM classification accuracy. The dataset consists of cosmetic product reviews downloaded from www.amazon.com. Measurement is based on SVM accuracy after adding each feature selection method. Evaluation used 10-fold cross-validation, and accuracy was measured using the confusion matrix and the ROC curve. SVM alone obtained an average accuracy of 82.00% and an average AUC of 0.988. After integrating SVM with feature selection, Genetic Algorithm gave an average accuracy of 94.00% and an average AUC of 0.984, Particle Swarm Optimization gave an average accuracy of 97.00% and an average AUC of 0.988, and Principal Component Analysis gave an average accuracy of 83.00% and an average AUC of 0.809. In conclusion, SVM showed the greatest accuracy improvement when integrated with Particle Swarm Optimization feature selection, with accuracy rising from 82.00% to 97.00%.
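The evaluation protocol named in the abstract (10-fold cross-validation, confusion matrix, AUC from the ROC curve) can be illustrated with a minimal pure-Python sketch. This is not the authors' code. The function names, the contiguous fold layout, and the binary positive/negative labeling are assumptions made for illustration, and no SVM or feature selection step is included.

```python
from typing import List, Sequence, Tuple


def k_fold_indices(n_samples: int, k: int = 10) -> List[Tuple[List[int], List[int]]]:
    """Split range(n_samples) into k contiguous (train, test) index pairs.

    Assumption: folds are contiguous and unshuffled; shuffle the data first
    if the reviews are ordered by class.
    """
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    base, extra = divmod(n_samples, k)
    folds = []
    start = 0
    for i in range(k):
        size = base + (1 if i < extra else 0)  # first `extra` folds get one more
        test = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_samples))
        folds.append((train, test))
        start += size
    return folds


def confusion_matrix(y_true: Sequence[int], y_pred: Sequence[int]) -> Tuple[int, int, int, int]:
    """Return (tp, fp, fn, tn) for binary labels where 1 means positive."""
    tp = fp = fn = tn = 0
    for t, p in zip(y_true, y_pred):
        if t == 1 and p == 1:
            tp += 1
        elif t == 0 and p == 1:
            fp += 1
        elif t == 1 and p == 0:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def accuracy(tp: int, fp: int, fn: int, tn: int) -> float:
    """Fraction of correct predictions: (tp + tn) / total."""
    return (tp + tn) / (tp + fp + fn + tn)


def roc_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """Area under the ROC curve, computed as the probability that a random
    positive example receives a higher score than a random negative one
    (ties count as one half)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy labels and classifier scores, invented purely for illustration.
y_true = [1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 1, 0]
scores = [0.9, 0.4, 0.3, 0.6, 0.8, 0.1]
cm = confusion_matrix(y_true, y_pred)
acc = accuracy(*cm)
auc = roc_auc(y_true, scores)
folds = k_fold_indices(25, k=10)
```

In a full experiment, a classifier would be trained on each fold's training indices and scored on its test indices, and the per-fold accuracy and AUC values would be averaged, which is how average figures such as those quoted in the abstract are typically obtained.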