Data Analysis Computer System and Method For Parallelized and Modularized Analysis of Big Data

Abstract

The focus of the present invention is the modular analysis of Big Data, encompassing parallelization, chunking, and distributed analysis applications. Typical application scenarios include: (i) data that do not reside in one database but exist in multiple non-identical databases, where analysis has to take place in situ rather than by combining all databases into one big database; (ii) data that exceed the working memory of the largest available computer and have to be broken into smaller pieces that are analyzed separately, with the results combined; (iii) data encompassing several distinct data types that have to be analyzed separately by methods specific to each data type, with the results combined; (iv) data encompassing several distinct data types that have to be analyzed separately by analysts with knowledge/skills specific to each data type, with the results combined; and (v) data analysis that has to take place over time as new data come in, with results incrementally improved until the analysis objectives are met or no more data are available. The present Big Data Parallelization/Modularization data analysis system and method (“BDP/M”) is implemented in general purpose digital computers and is capable of dealing with the above scenarios of Big Data analysis, as well as any scenario where parallel, distributed, federated, chunked, and serialized Big Data analysis is desired without compromising efficiency and correctness.
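To make scenario (ii) concrete, the following is a minimal Python sketch of chunked analysis with combined results. It is not the BDP/M method, which the abstract does not specify; it only illustrates the general idea using a deliberately simple analysis (mean and variance) whose per-chunk results can be merged exactly. The names `Summary`, `analyze_chunk`, `combine`, `chunks`, and `analyze_in_chunks` are hypothetical and introduced here for illustration only.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator, List


@dataclass(frozen=True)
class Summary:
    """Mergeable partial result for one chunk (hypothetical helper type)."""
    n: int = 0         # number of values seen
    mean: float = 0.0  # running mean
    m2: float = 0.0    # sum of squared deviations from the mean

    @property
    def variance(self) -> float:
        # Population variance; 0.0 for an empty summary.
        return self.m2 / self.n if self.n else 0.0


def analyze_chunk(values: Iterable[float]) -> Summary:
    """Analyze one chunk on its own, using Welford's single-pass update."""
    n, mean, m2 = 0, 0.0, 0.0
    for x in values:
        n += 1
        delta = x - mean
        mean += delta / n
        m2 += delta * (x - mean)
    return Summary(n, mean, m2)


def combine(a: Summary, b: Summary) -> Summary:
    """Merge two partial results without revisiting the raw data."""
    if a.n == 0:
        return b
    if b.n == 0:
        return a
    n = a.n + b.n
    delta = b.mean - a.mean
    mean = a.mean + delta * b.n / n
    m2 = a.m2 + b.m2 + delta * delta * a.n * b.n / n
    return Summary(n, mean, m2)


def chunks(data: List[float], size: int) -> Iterator[List[float]]:
    """Split data into pieces of at most `size` items (stand-in for real chunking)."""
    for start in range(0, len(data), size):
        yield data[start:start + size]


def analyze_in_chunks(data: List[float], size: int) -> Summary:
    """Analyze each chunk separately, then fold the partial results together."""
    total = Summary()
    for chunk in chunks(data, size):
        total = combine(total, analyze_chunk(chunk))
    return total
```

Because `combine` only needs the per-chunk summaries, the same merge step could in principle also fold in results from separate databases (scenario i) or from data arriving later (scenario v). Analyses whose partial results cannot be merged exactly would need a different combination strategy, which this sketch does not attempt to cover.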