Foreign-language journal: Quality Control, Transactions

Detection of Possible Illicit Messages Using Natural Language Processing and Computer Vision on Twitter and Linked Websites

Abstract

Human trafficking is a global problem that strips millions of victims of their dignity. Social networks are currently used to spread this crime online through covert messages that promote these illegal services. Because law enforcement resources are limited, it is vital to automatically detect messages that may be related to this crime and that could also serve as clues. In this paper, we use natural language processing to identify Twitter messages that could promote these illegal services and exploit minors. The images and URLs found in suspicious messages are processed and classified by gender and age group, making it possible to detect photographs of people under 14 years of age. The method is as follows. First, tweets with hashtags related to minors are mined in real time. These tweets are preprocessed to eliminate noise and misspelled words, and are then classified as suspicious or not suspicious. Next, geometric features of the face and torso are selected using Haar models. By applying a Support Vector Machine (SVM) and a Convolutional Neural Network (CNN), we are able to recognize gender and age group by taking into account torso information and its proportional relationship with the head, even when the face details are blurred. The results show that the SVM model using only torso features achieves higher performance than the CNN.
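The preprocessing step mentioned in the abstract (eliminating noise from tweets before classification) can be illustrated with a minimal sketch. The abstract does not specify the authors' actual cleaning rules, so everything below is an assumption made for illustration: the function names `extract_urls` and `clean_tweet`, the regular expressions, and the choice to collapse repeated characters as a crude stand-in for spelling normalization. URLs are collected before cleaning because the abstract states that linked websites are processed separately.

```python
import re
from typing import List

# Hypothetical patterns; the paper's real cleaning rules are not given in the abstract.
URL_RE = re.compile(r"https?://\S+")
MENTION_RE = re.compile(r"@\w+")
REPEAT_RE = re.compile(r"(.)\1{2,}")      # three or more of the same character
NON_WORD_RE = re.compile(r"[^\w\s#]")     # keep word characters, whitespace, and '#'


def extract_urls(tweet: str) -> List[str]:
    """Return the URLs in a tweet so linked pages can be examined separately."""
    return URL_RE.findall(tweet)


def clean_tweet(tweet: str) -> str:
    """Strip common noise from a tweet while keeping its hashtags."""
    text = URL_RE.sub(" ", tweet)          # drop links
    text = MENTION_RE.sub(" ", text)       # drop user mentions
    text = text.lower()
    text = REPEAT_RE.sub(r"\1\1", text)    # "heyyyy" becomes "heyy"
    text = NON_WORD_RE.sub(" ", text)      # drop punctuation and symbols
    return " ".join(text.split())          # normalize whitespace


if __name__ == "__main__":
    sample = "Heyyyy @user check THIS out!!! https://example.com/a #Topic"
    print(extract_urls(sample))
    print(clean_tweet(sample))
```

Hashtags are preserved here because the abstract says tweets are mined by hashtag; a real system would likely add dictionary-based spelling correction, which this sketch does not attempt.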
