Feature engineering for detecting spammers on Twitter: Modelling and analysis

Wafa Herzallah; Hossam Faris; Omar Adwan

Abstract

Twitter is a social networking website that has gained a lot of popularity around the world in the last decade. This popularity has made Twitter a common target for spammers and malicious users who spread unwanted advertisements, viruses and phishing attacks. In this article, we review the latest research to determine the most effective features that have been investigated for spam detection in the literature. These features are collected to build a comprehensive data set that can be used to develop more robust and accurate spammer detection models. The new data set is tested using popular classifiers (naive Bayes, support vector machines, multilayer perceptron neural networks, decision trees, random forests and k-nearest neighbour). The prediction performance of these classifiers is evaluated and compared using several evaluation metrics. A further analysis is carried out to identify the features that have a higher impact on the accuracy of spam detection. Three techniques are used and compared for this analysis: change of mean square error (CoM), information gain (IG) and the Relief-F method. The top five features identified by each technique are then used to rebuild the detection models. Experimental results show that most of the developed classifiers obtained high evaluation results on the comprehensive data set constructed in this work. The experiments also reveal the important role that some features play in identifying spammers in the social network: the reputation of the account, the average length of the tweet, the average number of mentions per tweet, the age of the account and the average time between posts.
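The information gain (IG) ranking mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not describe how the authors computed IG, so everything below is an assumption made for illustration: the median-based binary split, the function names `entropy`, `information_gain` and `top_features`, and the eight made-up accounts in `ROWS` (they are not data from the paper). The feature names loosely follow those listed in the abstract.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())


def information_gain(values, labels):
    """IG of one numeric feature after a binary split at its median.

    The median split is an illustrative assumption, not the paper's method.
    """
    ordered = sorted(values)
    mid = len(ordered) // 2
    if len(ordered) % 2:
        threshold = ordered[mid]
    else:
        threshold = (ordered[mid - 1] + ordered[mid]) / 2
    low = [y for x, y in zip(values, labels) if x <= threshold]
    high = [y for x, y in zip(values, labels) if x > threshold]
    # Weighted entropy that remains after the split (empty parts are skipped).
    remainder = sum(
        len(part) / len(labels) * entropy(part) for part in (low, high) if part
    )
    return entropy(labels) - remainder


def top_features(rows, labels, names, k=5):
    """Return the k (name, IG) pairs with the highest information gain."""
    scores = {
        name: information_gain([row[i] for row in rows], labels)
        for i, name in enumerate(names)
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)[:k]


# Hypothetical toy accounts: (account age in days, mentions per tweet, tweet length).
NAMES = ["account_age_days", "avg_mentions_per_tweet", "avg_tweet_length"]
ROWS = [
    (30, 3.0, 40), (20, 2.5, 120), (45, 4.0, 60), (10, 3.5, 100),
    (900, 0.5, 50), (1200, 0.2, 110), (700, 1.0, 70), (1500, 0.8, 90),
]
LABELS = [1, 1, 1, 1, 0, 0, 0, 0]  # 1 = spammer, 0 = legitimate (made up)

if __name__ == "__main__":
    for name, score in top_features(ROWS, LABELS, NAMES, k=2):
        print(f"{name}: {score:.3f}")
```

In this toy set the age and mention features separate the two classes perfectly, while tweet length carries no information, so a top-k selection would drop it before a classifier is retrained on the reduced feature set.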

Bibliographic record

Source
Journal of Information Science, 2018, Issue 2, pp. 230-247 (18 pages)
Authors
Wafa Herzallah; Hossam Faris; Omar Adwan;
Author affiliations

Business Information Technology, King Abdullah II School of Information Technology, The University of Jordan, Jordan;

Business Information Technology, King Abdullah II School of Information Technology, The University of Jordan, Jordan;

Business Information Technology, King Abdullah II School of Information Technology, The University of Jordan, Jordan;

Indexed in: Science Citation Index (SCI), USA; Engineering Index (EI), USA
Original format: PDF
Language: English
Keywords
Classifiers; detection; feature engineering; spam; spam features; spammers; Twitter;

Similar literature

1. Detecting and Characterizing Arab Spammers Campaigns in Twitter [J]. Reem Alharthi, Areej Alhothali, Kawthar Moria. Procedia Computer Science, 2019, Issue 1.
2. Detecting Streaming of Twitter Spam Using Hybrid Method [J]. Murugan N. Senthil, Devi G. Usha. Wireless Personal Communications: An International Journal, 2018, Issue 2.
3. Detecting spamming activities in twitter based on deep-learning technique [J]. Tingmin Wu, Sheng Wen, Shigang Liu. Concurrency and Computation, 2017, Issue 19.
4. Binary and Continuous Feature Engineering Analysis on Twitter Data Stream for Classification of Spam Messages [C]. Cinu C. Kiliroor, C. Valliyammai. International Conference on Communication, Devices, and Computing, 2020.
5. Detecting Abusive Arabic Language Twitter Accounts Using a Multidimensional Analysis Model [D]. Abozinadah, Ehab. 2017.
6. Detecting Binge Drinking and Alcohol-Related Risky Behaviours from Twitter’s Users: An Exploratory Content- and Topology-Based Analysis [O]. Cristina Crocamo, Marco Viviani, Francesco Bartoli. 2020.
7. Multi-Class Imbalance in Text Classification: A Feature Engineering Approach to Detect Cyberbullying in Twitter [O]. Bandeh Ali Talpur, Declan O’Sullivan. 2020.
