Conference: International Conference on Computing for Geospatial Research and Application

An Efficient Technique for Searching Very Large Files with Fuzzy Criteria Using the Pigeonhole Principle


Abstract

Big Data is the term for the exponential growth of data on the Internet. Its importance lies not in how large the data are, but in what information can be obtained by analyzing them. Such analysis helps businesses make smarter decisions and reduce time and cost. To carry it out, one must search the very large files that make up Big Data, and at that scale sequential search is prohibitively inefficient in terms of both time and energy. Any new technique that allows efficient search in very large files is therefore in high demand. This paper presents an innovative approach to efficient searching with fuzzy criteria in very large information systems (Big Data). Organizing efficient access to a large amount of information by an "approximate" or "fuzzy" indication is a rather complicated Computer Science problem. The usual solution relies on a brute-force approach, which results in a sequential look-up of the file and, in many cases, substantially undermines system performance. The technique suggested in this paper takes a different approach based on the Pigeonhole Principle. It searches for binary strings that approximately match a given request. It substantially reduces the number of sequential search operations and improves speed, cost, and energy use by several orders of magnitude. The paper presents a more elaborate scheme for the suggested approach built on a new data structure called the FuzzyFind Dictionary. This scheme is more accurate than a basic application of the suggested method, and it is also much faster than sequential search.
