Design and implementation of a text plagiarism detection method using OMUCS and sequence alignment technique

Abstract

The present invention relates to a text plagiarism search system and a plagiarism checking method for determining whether a text has been plagiarized. The method comprises: an original document sentence classification step of dividing an input original document into sentence units; an original document word classification step of dividing the sentences obtained in the original document sentence classification step into word units; a comparative document sentence classification step of dividing an input comparative document into sentence units; a comparative document word classification step of dividing the sentences obtained in the comparative document sentence classification step into word units; a same word checking step of comparing the original sentences and the comparative sentences, as divided into word units by the two word classification steps, and identifying the words that are the same in each compared pair of sentences; an OMUCS calculation step of applying the same words identified in the same word checking step to OMUCS[original text][comparative text] with a modified cosine similarity; and a step of comparing the result output by the OMUCS calculation step with a first threshold value and determining, according to the similarity between the original sentence and the comparative sentence, whether the sentences are similar.
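The sequence of steps in the abstract can be illustrated with a minimal Python sketch. The abstract does not define OMUCS or the exact form of its modified cosine similarity, so the score below is an ordinary cosine similarity over word counts used purely as a stand-in. All function names, the naive sentence and word splitting rules, and the `first_threshold` default of 0.8 are assumptions made for illustration and are not taken from the patent.

```python
import math
import re
from collections import Counter


def split_sentences(document):
    """Divide a document into sentence units (naive punctuation-based split)."""
    parts = re.split(r"(?<=[.!?])\s+", document.strip())
    return [p for p in parts if p]


def split_words(sentence):
    """Divide a sentence into lower-cased word units."""
    return re.findall(r"\w+", sentence.lower())


def same_words(original_words, comparative_words):
    """Identify words present in both sentences, with their shared counts."""
    return Counter(original_words) & Counter(comparative_words)


def sentence_similarity(original_words, comparative_words):
    """Stand-in score: plain cosine similarity of word-count vectors.

    Only the shared words contribute to the dot product. This is NOT the
    patent's OMUCS computation, which the abstract does not specify.
    """
    a, b = Counter(original_words), Counter(comparative_words)
    dot = sum(a[w] * b[w] for w in same_words(original_words, comparative_words))
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def similar_sentence_pairs(original_doc, comparative_doc, first_threshold=0.8):
    """Return (original, comparative, score) for pairs at or above the threshold."""
    pairs = []
    comparative = [(c, split_words(c)) for c in split_sentences(comparative_doc)]
    for o in split_sentences(original_doc):
        o_words = split_words(o)
        for c, c_words in comparative:
            score = sentence_similarity(o_words, c_words)
            if score >= first_threshold:
                pairs.append((o, c, score))
    return pairs
```

Calling `similar_sentence_pairs(original_text, comparative_text)` compares every original sentence against every comparative sentence and keeps the pairs whose score reaches the threshold. A real implementation would replace `sentence_similarity` with the OMUCS-based measure described in the full patent text.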

Bibliographic details

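  • Publication number: KR100711277B1
  • Patent type:
  • Publication date: 2007-04-25
  • Original format: PDF
  • Applicant/patentee:
  • Application number: KR20050097563
  • Inventors: 김지수; 한상용
  • Filing date: 2005-10-17
  • Classification: G06F17/21
  • Country: KR
  • Database entry time: 2022-08-21 20:32:23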
