Cost-sensitive classifier for spam detection on news media Twitter accounts

Abstract

Social media are increasingly used as sources in mainstream news coverage. However, because news updates so rapidly, it is easy to fall into the trap of accepting everything as true. Spam content usually refers to information that goes viral and skews users' views on subjects. To this end, this paper introduces a new approach for detecting spam tweets using a Cost-Sensitive Classifier that includes Random Forest. Tweets were first annotated manually, and four different sets of features were then extracted from them. Afterward, four machine learning algorithms were cross-validated to determine the best base classifier for spam detection. Finally, the class imbalance problem was addressed by resampling and by incorporating arbitrary misclassification costs into the learning process. Results showed that the proposed approach helped mitigate overfitting and reduced classification error, achieving an overall accuracy of 89.14% in training and 76.82% in testing.
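
The abstract does not say how the misclassification costs enter the classifier, so the following is only a minimal sketch of one common cost-sensitive decision rule: given class probabilities from any base classifier (a Random Forest, for instance), predict the class with the lowest expected cost. The function name, the cost matrix values, and the class encoding (0 = legitimate, 1 = spam) are illustrative assumptions, not taken from the paper.

```python
from typing import Sequence


def min_expected_cost_class(
    probs: Sequence[float], cost: Sequence[Sequence[float]]
) -> int:
    """Return the class index with the lowest expected misclassification cost.

    probs[i]   : estimated probability that the true class is i
    cost[i][j] : cost of predicting class j when the true class is i
    """
    n_classes = len(probs)
    # Expected cost of predicting j = sum over true classes i of P(i) * cost[i][j].
    expected = [
        sum(probs[i] * cost[i][j] for i in range(n_classes))
        for j in range(n_classes)
    ]
    return min(range(n_classes), key=expected.__getitem__)


# Hypothetical cost matrix (an assumption for illustration only):
# class 0 = legitimate tweet, class 1 = spam tweet.
# Missing a spam tweet (true 1, predicted 0) costs 5x a false alarm.
COST = [
    [0.0, 1.0],
    [5.0, 0.0],
]

# The base classifier gives spam only 30% probability, but the asymmetric
# costs still make "spam" the cheaper prediction (0.7 vs. 1.5 expected cost).
print(min_expected_cost_class([0.7, 0.3], COST))  # prints 1
```

With equal off-diagonal costs the rule reduces to picking the most probable class, so the cost matrix is what shifts the decision toward the minority class.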

Bibliographic record

Source: 2017 XLIII Latin American Computer Conference, 2017, pp. 1-6 (6 pages)
Conference location: Cordoba (AR)
Authors: Georvic Tur; Masun Nabhan Homsi
Author affiliation: Department of Computer Science and Information Technology, Simon Bolívar University, Valle Sartenejas, Baruta, Edo. Miranda Apartado 89000, Caracas, Venezuela
Original format: PDF
Language: English
Keywords: Training; Feature extraction; Twitter; Labeling; Testing; Radio frequency; Media

Similar literature

1. A Multi-Classifier Approach for Twitter Spam Detection Using Innovative ANN-FDT Algorithm [J]. M. Arunkrishna, B. Mukunthan. Indian Journal of Computer Science and Engineering, 2020, No. 5.
2. Twitter spam account detection based on clustering and classification methods [J]. Adewole Kayode Sakariyah, Hang Tao, Wu Wanqing. Journal of Supercomputing, 2020, No. 7.
3. Detection of spam-posting accounts on Twitter [J]. Inuwa-Dutse Isa, Liptrott Mark, Korkontzelos Ioannis. Neurocomputing, 2018, No. NOVa13.
4. Cost-sensitive classifier for spam detection on news media Twitter accounts [C]. Georvic Tur, Masun Nabhan Homsi. Latin American Computing Conference, 2017.
5. National television news and newspapers as media salience, Twitter as public salience: An agenda-setting effects analysis [D]. Vargo, Chris J. 2011.
6. News consumption patterns on Twitter: fragmentation study on the online news media network [O]. Ford Lumban Gaol, Ardian Maulana, Tokuro Matsuo. 2020.
7. Spam Detection on Twitter Using Traditional Classifiers [O]. M. Mccord, M. Chuah. 2013.
