Journal: Journal of Engineering Research

Strategic enhancement of the collaborative framework for novelty in retrieval from digital textual data corpus by deploying DPSC and RBWM algorithms for forensic analysis



Abstract

This paper proposes two algorithms embedded in an integrated system: a Dynamic Path Selection Clustering (DPSC) algorithm for document clustering and a Rearward Binary Window Match (RBWM) algorithm for the user's search engine. The DPSC algorithm is derived from the concept of Google's crawler technique, implemented in offline processing, and the RBWM algorithm is derived from techniques used in other search algorithms. The proposed system is intended to give an appropriate data structure to the content of the input dataset. The input is the Enron dataset, which is large in volume and unstructured. The system integrates four otherwise independent units into a single framework: data preprocessing, document clustering, mapping of clusters, and the search engine. With this refined, integrated framework the system is likely to support evidence gathering better, because a retrieval system defined only as a simple search engine tends to return more irrelevant information. Many existing systems in forensic departments consist of a simple search engine with no other processing, and irrelevant retrieval is seen in them to a large extent. Consequently, this paper proposes an integrated system that automates the process using the well-defined units listed above. This systematic approach is meant to make adequate use of digital textual evidence, which helps identify crimes more quickly. The outcomes of the proposed system are analyzed by computing precision and recall values and comparing them with the results of metasearch engines such as Dogpile and Metacrawler, to test the efficacy of retrieval.
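
The abstract evaluates retrieval with precision and recall but does not show how they are computed. The following is a minimal sketch of the standard definitions for a single query, not the authors' evaluation code. The function name `precision_recall` and the document IDs are hypothetical and chosen only for illustration; they are not results from the paper.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for one query.

    precision = relevant documents retrieved / all documents retrieved
    recall    = relevant documents retrieved / all relevant documents
    """
    retrieved_set = set(retrieved)
    relevant_set = set(relevant)
    hits = len(retrieved_set & relevant_set)
    # Guard against division by zero when either set is empty.
    precision = hits / len(retrieved_set) if retrieved_set else 0.0
    recall = hits / len(relevant_set) if relevant_set else 0.0
    return precision, recall


# Hypothetical document IDs, for illustration only.
retrieved = ["mail_01", "mail_02", "mail_03", "mail_04"]
relevant = ["mail_02", "mail_04", "mail_07"]

p, r = precision_recall(retrieved, relevant)
print(f"precision={p:.2f} recall={r:.2f}")  # prints "precision=0.50 recall=0.67"
```

A system that returns many irrelevant documents, as the abstract says of plain search engines, would show up here as low precision even when recall is high.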
