Conference: International Conference on Artificial Intelligence and Security

Discovering New Sensitive Words Based on Sensitive Information Categorization

Abstract

Sensitive word detection has become increasingly important with the rapid growth of internet technologies, as some internet users spread sensitive content containing unhealthy information. Improving the accuracy of sensitive information classification and discovering new sensitive words have therefore become urgent demands in network information security. Existing approaches have two shortcomings: their classification results are inaccurate, and they cannot automatically identify new sensitive information. We improve an existing machine learning classification algorithm, and experimental results show that the improved method significantly increases classification accuracy. In addition, by studying word similarity algorithms based on HowNet and CiLin, we continually expand the database of sensitive words, that is, we discover new sensitive words. The paper analyzes and presents the resulting gains in accuracy and the new sensitive word discovery technique.
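
The abstract does not give the similarity formula, so the following is only a minimal sketch of the general idea of expanding a sensitive-word database by word similarity. It assumes a toy similarity measure that compares hierarchical category codes in the spirit of a thesaurus such as CiLin; the codes in TOY_CODES, the function names prefix_similarity and expand_lexicon, and the 0.8 threshold are all hypothetical choices made for illustration, not the authors' method or real HowNet or CiLin data.

```python
from typing import Callable, Dict, Iterable, Set

# Hypothetical category codes, invented for illustration only.
# They imitate the hierarchical style of a thesaurus but are not real CiLin entries.
TOY_CODES: Dict[str, str] = {
    "gamble": "Hj01A01",
    "bet": "Hj01A02",
    "wager": "Hj01A03",
    "lottery": "Hj01B01",
    "garden": "Bn05C01",
}


def prefix_similarity(a: str, b: str) -> float:
    """Fraction of leading code characters shared by two words (0.0 to 1.0).

    Words without a code are treated as having no similarity to anything.
    """
    code_a, code_b = TOY_CODES.get(a), TOY_CODES.get(b)
    if code_a is None or code_b is None:
        return 0.0
    shared = 0
    for x, y in zip(code_a, code_b):
        if x != y:
            break
        shared += 1
    return shared / max(len(code_a), len(code_b))


def expand_lexicon(
    seeds: Iterable[str],
    candidates: Iterable[str],
    similarity: Callable[[str, str], float] = prefix_similarity,
    threshold: float = 0.8,
) -> Set[str]:
    """Return the seed words plus every candidate close enough to some seed.

    Candidates are compared against the original seeds only, so one
    borderline addition cannot pull in further words on its own.
    """
    seed_set = set(seeds)
    lexicon = set(seed_set)
    for word in candidates:
        if word in lexicon:
            continue
        best = max((similarity(word, s) for s in seed_set), default=0.0)
        if best >= threshold:
            lexicon.add(word)
    return lexicon


if __name__ == "__main__":
    found = expand_lexicon({"gamble"}, ["bet", "wager", "lottery", "garden", "unknown"])
    print(sorted(found))
```

In this sketch a candidate joins the database only when its best similarity to a known sensitive word reaches the threshold, so the threshold controls the trade-off between discovering more new words and admitting unrelated ones. Any other similarity function with the same two-word signature could be passed in through the similarity parameter.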
