Journal: 《计算机工程与科学》 (Computer Engineering & Science)

An Extended Winnowing Plagiarism Detection Algorithm

Abstract

Plagiarism is a common problem in both academia and education. Commercial plagiarism detection systems are technically mature, but their high running time and monetary cost make them a poor fit for routine, real-time, lightweight tasks such as checking student assignments. We propose an extension of the classic Winnowing algorithm, which is based on text fingerprints: while computing the hash value of a text block, the extended algorithm also records the block's location and length. The location and length stored in each fingerprint can then be used to locate and mark plagiarized text blocks in the original documents. We describe algorithms for fingerprint extraction, text location, and merging of plagiarism fingerprint indexes based on the extended Winnowing, and we run functional and performance experiments to test them. The experimental results and the algorithm's behavior in a deployed application show that the extension has only a slight effect on performance and that it can meet the needs of small to medium-sized applications on ordinary hardware. The extended algorithm keeps the lightweight, efficient, reliable, and flexible character of the original Winnowing while extending its functionality and improving its adaptability and practical value.
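The idea of storing a location and a length next to each hash can be illustrated with a minimal Python sketch. It is not the paper's implementation: the normalization rule, the CRC32 hash, the parameters k and w, and the names Fingerprint, fingerprints, and matching_spans are all assumptions made for illustration. The span merging at the end is only a rough stand-in for the index merging step mentioned in the abstract.

```python
import zlib
from typing import List, NamedTuple, Tuple


class Fingerprint(NamedTuple):
    hash: int    # hash of one k-gram of the normalized text
    start: int   # offset of that k-gram in the ORIGINAL text
    length: int  # number of original characters the k-gram spans


def normalize(text: str) -> Tuple[str, List[int]]:
    """Keep alphanumeric characters, lowercased, and remember where each came from."""
    chars, offsets = [], []
    for i, ch in enumerate(text):
        if ch.isalnum():
            for c in ch.lower():  # lower() may expand to several characters
                chars.append(c)
                offsets.append(i)
    return "".join(chars), offsets


def fingerprints(text: str, k: int = 5, w: int = 4) -> List[Fingerprint]:
    """Winnowing: pick the rightmost minimal k-gram hash in every window of w hashes."""
    norm, offsets = normalize(text)
    if len(norm) < k:
        return []
    hashes = [zlib.crc32(norm[i:i + k].encode("utf-8"))
              for i in range(len(norm) - k + 1)]
    selected, last = [], -1
    for left in range(max(1, len(hashes) - w + 1)):
        window = hashes[left:left + w]
        smallest = min(window)
        # index of the rightmost occurrence of the minimum inside this window
        idx = left + len(window) - 1 - window[::-1].index(smallest)
        if idx != last:  # record each selected position only once
            last = idx
            begin = offsets[idx]
            end = offsets[idx + k - 1] + 1
            selected.append(Fingerprint(smallest, begin, end - begin))
    return selected


def matching_spans(suspect: str, source: str,
                   k: int = 5, w: int = 4) -> List[Tuple[int, int]]:
    """Return merged (begin, end) spans of `suspect` whose fingerprints occur in `source`."""
    source_hashes = {fp.hash for fp in fingerprints(source, k, w)}
    merged: List[Tuple[int, int]] = []
    for fp in fingerprints(suspect, k, w):
        if fp.hash not in source_hashes:
            continue
        begin, end = fp.start, fp.start + fp.length
        if merged and begin <= merged[-1][1]:
            # overlapping or touching spans are merged into one marked region
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((begin, end))
    return merged
```

Because the offsets refer to the original text rather than the normalized one, each returned span can be used directly to highlight the suspect passage, for example with suspect[begin:end]. Under these assumptions, any shared run of at least w + k - 1 normalized characters produces at least one common fingerprint.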
