Discovering New Sensitive Words Based on Sensitive Information Categorization


Abstract

Sensitive word detection has grown in importance with the rise of internet technologies. At the same time, some internet users spread sensitive content containing unhealthy information. Improving the accuracy of sensitive information classification and finding new sensitive words have therefore become urgent demands in network information security. On the one hand, sensitive information classification results are inaccurate; on the other hand, existing research methods cannot find new sensitive information, that is, they do not identify it automatically. We mainly improve an existing, well-performing machine learning classification algorithm, and experimental results show that this method significantly improves classification accuracy. In addition, by studying word similarity algorithms based on HowNet and CiLin, we can continually expand the database of sensitive words (i.e., discover new sensitive words). Through these methods, we obtain better accuracy and realize a new sensitive word discovery technique, both of which are analyzed and presented in the paper.
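As a rough illustration of the lexicon-expansion idea described in the abstract, the sketch below adds a candidate word to the sensitive-word set when its similarity to any seed word reaches a threshold. This is not the authors' algorithm: the abstract does not specify the similarity formula or the threshold, so `jaccard_similarity` (a toy character-overlap score standing in for a HowNet/CiLin similarity), `expand_lexicon`, the threshold of 0.6, and the example words are all assumptions made for illustration.

```python
from typing import Callable, Iterable, Set


def jaccard_similarity(a: str, b: str) -> float:
    """Toy stand-in for a HowNet/CiLin word similarity (assumption):
    Jaccard overlap of the character sets of the two words."""
    chars_a, chars_b = set(a), set(b)
    if not chars_a or not chars_b:
        return 0.0
    return len(chars_a & chars_b) / len(chars_a | chars_b)


def expand_lexicon(
    seeds: Set[str],
    candidates: Iterable[str],
    similarity: Callable[[str, str], float],
    threshold: float,
) -> Set[str]:
    """Return the seeds plus every candidate whose similarity to at least
    one seed word is >= threshold. Single pass: candidates are compared
    against the original seeds only, not against newly added words."""
    expanded = set(seeds)
    for word in candidates:
        if word in expanded:
            continue
        if any(similarity(word, seed) >= threshold for seed in seeds):
            expanded.add(word)
    return expanded


# Hypothetical seed lexicon and candidate words, for illustration only.
seeds = {"gamble", "gambling"}
candidates = ["gambler", "garden", "weather"]
new_words = expand_lexicon(seeds, candidates, jaccard_similarity, 0.6) - seeds
print(new_words)
```

In practice the similarity function would be replaced by a sememe-based or thesaurus-based measure, and the threshold would need tuning against labeled data; comparing only against the original seeds keeps a single noisy match from pulling in further unrelated words.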


Bibliographic details

Source
International Conference on Artificial Intelligence and Security | 2019 | pp. 338-346 | 9 pages
Conference location: New York (US)
Authors
Panyu Liu; Yangyang Li; Zhiping Cai; Shuhui Chen;
Author affiliations

Innovation Center and Mobile Internet Development and Research Center, China Academy of Electronics and Information Technology, Beijing 100041, China; College of Computer, National University of Defense Technology, Changsha 410073, Hunan, China;

Innovation Center and Mobile Internet Development and Research Center, China Academy of Electronics and Information Technology, Beijing 100041, China;

College of Computer, National University of Defense Technology, Changsha 410073, Hunan, China;

Original format: PDF
Keywords
Sensitive words; Sensitive information classification; Natural language processing; New word discovery;


