Spam detection of Twitter traffic: A framework based on random forests and non-uniform feature sampling

Abstract

Law Enforcement Agencies play a crucial role in the analysis of open data and need effective techniques to filter out troublesome information. In a real scenario, Law Enforcement Agencies analyze social networks such as Twitter, monitoring events and profiling accounts. Unfortunately, among the huge number of internet users, there are people who use microblogs to harass others or to spread malicious content. Classifying users and identifying spammers is a useful technique for relieving Twitter traffic of uninformative content. This work proposes a framework that exploits non-uniform feature sampling inside a gray-box Machine Learning System, using a variant of the Random Forests algorithm to identify spammers in Twitter traffic. Experiments are carried out on a popular Twitter dataset and on a new dataset of Twitter users. The newly provided Twitter dataset is made up of users labeled as spammers or legitimate users, each described by 54 features. Experimental results demonstrate the effectiveness of the enriched feature sampling method.
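
The abstract does not spell out how the non-uniform feature sampling works, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes each tree of a small forest receives a feature subset drawn without replacement with probability proportional to a per-feature weight, and it uses one-split decision stumps in place of full trees to stay short. The names `sample_features`, `fit_stump`, `fit_forest`, and `predict`, the weights, and the toy data are all hypothetical.

```python
import random
from collections import Counter


def sample_features(weights, k, rng):
    """Draw k distinct feature indices; each draw is proportional to weight."""
    remaining = list(range(len(weights)))
    chosen = []
    for _ in range(k):
        w = [weights[i] for i in remaining]
        pick = rng.choices(remaining, weights=w)[0]
        chosen.append(pick)
        remaining.remove(pick)  # without replacement
    return chosen


def fit_stump(X, y, features):
    """Return (feature, threshold, left_label, right_label) with fewest errors."""
    best = None
    for f in features:
        for t in sorted({row[f] for row in X}):
            for left, right in ((0, 1), (1, 0)):
                errors = sum(
                    (left if row[f] <= t else right) != label
                    for row, label in zip(X, y)
                )
                if best is None or errors < best[0]:
                    best = (errors, f, t, left, right)
    return best[1:]


def fit_forest(X, y, weights, n_trees=25, k=2, seed=0):
    """Train stumps on bootstrap samples with non-uniform feature subsets."""
    rng = random.Random(seed)
    n = len(X)
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]  # bootstrap sample
        feats = sample_features(weights, k, rng)    # non-uniform subset
        forest.append(fit_stump([X[i] for i in idx], [y[i] for i in idx], feats))
    return forest


def predict(forest, row):
    """Majority vote over all stumps (1 = spammer, 0 = legitimate)."""
    votes = Counter(l if row[f] <= t else r for f, t, l, r in forest)
    return votes.most_common(1)[0][0]


# Hypothetical toy data: feature 0 is informative, features 1 and 2 are noise.
X = [[1, 0, 1], [2, 1, 0], [3, 0, 0], [4, 1, 1],
     [5, 0, 1], [6, 1, 0], [7, 0, 0], [8, 1, 1]]
y = [0, 0, 0, 0, 1, 1, 1, 1]
forest = fit_forest(X, y, weights=[0.8, 0.1, 0.1])
print(predict(forest, [9, 1, 1]))
```

In this sketch the weights are fixed by hand. In a real system they would have to come from some estimate of feature relevance, which the abstract does not describe.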

Bibliographic details

Source
Proceedings of the 2016 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining | 2016 | pp. 811-817 | 7 pages
Conference location: San Francisco (US)
Authors
Claudia Meda; Edoardo Ragusa; Christian Gianoglio; Rodolfo Zunino; Augusto Ottaviano; Eugenio Scillia; Roberto Surlinelli
Author affiliations

Dept. of Electric, Electronic and Telecommunications, Engineering and Naval Architecture DITEN, University of Genoa, Genoa, Italy;

Dept. of Electric, Electronic and Telecommunications, Engineering and Naval Architecture DITEN, University of Genoa, Genoa, Italy;

Dept. of Electric, Electronic and Telecommunications, Engineering and Naval Architecture DITEN, University of Genoa, Genoa, Italy;

Dept. of Electric, Electronic and Telecommunications, Engineering and Naval Architecture DITEN, University of Genoa, Genoa, Italy;

Ministry of the Interior - Department of Public Security, Italian National Police, Genoa - Italy;

Ministry of the Interior - Department of Public Security, Italian National Police, Genoa - Italy;

Ministry of the Interior - Department of Public Security, Italian National Police, Genoa - Italy;

Original format: PDF
Language: English
Keywords
Twitter; Classification algorithms; Training; Vegetation; Feature extraction; Algorithm design and analysis;

Similar literature

1. A hybrid classification method for Twitter spam detection based on differential evolution and random forest [J]. Bazzaz Abkenar Sepideh, Mahdipour Ebrahim, Jameii Seyed Mahdi. Concurrency and Computation: Practice and Experience. 2021, no. 21
2. Statistical Features-Based Real-Time Detection of Drifted Twitter Spam [J]. IEEE Transactions on Information Forensics and Security. 2017, no. 4
3. An intelligent system for spam detection and identification of the most relevant features based on evolutionary Random Weight Networks [J]. Faris Hossam, Al-Zoubi Ala M., Heidari Ali Asghar. Information Fusion. 2019
4. Spam detection of Twitter traffic: A framework based on random forests and non-uniform feature sampling [C]. Claudia Meda, Edoardo Ragusa, Christian Gianoglio. IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining. 2016
5. Automatic detection of adrenal gland abnormality using the random forest classification framework combined with histogram analysis [D]. Saiprasad, Ganesh. 2013
6. Multi-feature fusion framework for sarcasm identification on twitter data: A machine learning based approach [O]. Christopher Ifeanyi Eke, Azah Anir Norman, Liyana Shuib. 2021
7. Statistical Features-Based Real-Time Detection of Drifted Twitter Spam [O]. Chen C, Wang Y, Zhang J. 2017
