Detection of Possible Illicit Messages Using Natural Language Processing and Computer Vision on Twitter and Linked Websites

Granizo Sergio L.; Valdivieso Caraguay Angel Leonardo; Barona Lopez Lorena Isabel; Hernandez-Alvarez Myriam

Abstract

Human trafficking is a global problem that strips millions of victims of their dignity. Social networks are now used to spread this crime online through covert messages that promote these illegal services. Because law enforcement resources are limited, it is vital to automatically detect messages that may be related to this crime and could serve as clues. In this paper, we use natural language processing to identify Twitter messages that could promote these illegal services and exploit minors. The images and URLs found in suspicious messages are processed and classified by gender and age group, making it possible to detect photographs of people under 14 years of age. The method is as follows. First, tweets with hashtags related to minors are mined in real time. These tweets are preprocessed to eliminate noise and misspelled words and are then classified as suspicious or not. Next, geometric features of the face and torso are selected using Haar models. By applying a Support Vector Machine (SVM) and a Convolutional Neural Network (CNN), we recognize gender and age group, taking into account torso information and its proportional relationship with the head, even when the face details are blurred. The SVM model using only torso features performs better than the CNN.
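
The tweet preprocessing step mentioned in the abstract can be illustrated with a minimal Python sketch. The abstract does not state the exact cleaning rules, so everything here is an assumption made for illustration: the function name `preprocess_tweet`, the regular expressions, and the choice to shorten long runs of a repeated character to two. Correction of misspelled words and the suspicious/not-suspicious classifier are not covered.

```python
import re

# All patterns below are illustrative assumptions, not the paper's rules.
URL_RE = re.compile(r"https?://\S+")      # links to follow up separately
MENTION_RE = re.compile(r"@\w+")          # user mentions treated as noise
HASHTAG_RE = re.compile(r"#(\w+)")        # hashtag text without the '#'
REPEAT_RE = re.compile(r"(.)\1{2,}")      # 3+ copies of the same character


def preprocess_tweet(text):
    """Return (clean_text, hashtags, urls) for one raw tweet string."""
    urls = URL_RE.findall(text)
    no_urls = URL_RE.sub(" ", text)
    # Collect hashtags after URL removal so URL fragments are not counted.
    hashtags = [tag.lower() for tag in HASHTAG_RE.findall(no_urls)]
    clean = MENTION_RE.sub(" ", no_urls)
    clean = clean.replace("#", " ")           # keep hashtag words as tokens
    clean = REPEAT_RE.sub(r"\1\1", clean)     # "sooooo" becomes "soo"
    clean = re.sub(r"[^\w\s]", " ", clean)    # drop punctuation
    clean = " ".join(clean.lower().split())   # lowercase, collapse spaces
    return clean, hashtags, urls


if __name__ == "__main__":
    raw = "Sooooo cool!!! @user check #Example http://example.com/a"
    print(preprocess_tweet(raw))
```

Returning the URLs separately reflects the abstract's statement that links found in suspicious messages are processed further; how those links are actually fetched and analyzed is not described in this record.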

Bibliographic details

Source: Quality Control, Transactions | 2020, issue 2020 | pp. 44534-44546 | 13 pages
Authors: Granizo Sergio L.; Valdivieso Caraguay Angel Leonardo; Barona Lopez Lorena Isabel; Hernandez-Alvarez Myriam
Author affiliation (all four authors): Escuela Politec Nacl, Dept Informat & Comp Sci (DICC), Quito 170517, Ecuador
Original format: PDF
Language: English
Keywords: Twitter; Natural language processing; Tagging; Support vector machines; Torso; Computer vision; CNN; features detection; image classification; SVM
