Source: Computer Physics Communications

ROOT - A C++ framework for petabyte data storage, statistical analysis and visualization



Abstract

ROOT is an object-oriented C++ framework conceived in the high-energy physics (HEP) community, designed for storing and analyzing petabytes of data in an efficient way. Any instance of a C++ class can be stored into a ROOT file in a machine-independent compressed binary format. In ROOT the TTree object container is optimized for statistical data analysis over very large data sets by using vertical data storage techniques. These containers can span a large number of files on local disks, the web, or a number of different shared file systems. In order to analyze this data, the user can choose from a wide set of mathematical and statistical functions, including linear algebra classes, numerical algorithms such as integration and minimization, and various methods for performing regression analysis (fitting). In particular, the RooFit package allows the user to perform complex data modeling and fitting, while the RooStats library provides abstractions and implementations for advanced statistical tools. Multivariate classification methods based on machine learning techniques are available via the TMVA package. Central to these analysis tools are the histogram classes, which provide binning of one- and multi-dimensional data. Results can be saved in high-quality graphical formats like PostScript and PDF or in bitmap formats like JPG or GIF. The result can also be stored into ROOT macros that allow a full recreation and rework of the graphics. Users typically create their analysis macros step by step, making use of the interactive C++ interpreter CINT, while running over small data samples. Once the development is finished, they can run these macros at full compiled speed over large data sets, using on-the-fly compilation, or by creating a stand-alone batch program. Finally, if processing farms are available, the user can reduce the execution time of intrinsically parallel tasks (e.g. data mining in HEP) by using PROOF, which will take care of optimally distributing the work over the available resources in a transparent way.
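
The two ideas the abstract leans on most, vertical (column-wise) storage and histogram binning, can be illustrated with a small stand-alone C++ sketch. It does not use ROOT and is not how TTree or the ROOT histogram classes are implemented: the names `EventRow`, `EventColumns`, `Histogram1D` and `histogramEnergy`, and the fields `energy`, `momentum` and `charge`, are hypothetical and chosen only for illustration. The point is that when each quantity lives in its own contiguous array, an analysis reading one quantity never touches the others.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Row-wise ("horizontal") view of one event: all fields kept together.
// Hypothetical fields, chosen only for illustration.
struct EventRow {
    double energy;
    double momentum;
    int    charge;
};

// Column-wise ("vertical") layout: one contiguous array per field.
// A scan over a single quantity reads only that column.
struct EventColumns {
    std::vector<double> energy;
    std::vector<double> momentum;
    std::vector<int>    charge;

    void append(const EventRow& e) {
        energy.push_back(e.energy);
        momentum.push_back(e.momentum);
        charge.push_back(e.charge);
    }

    std::size_t size() const { return energy.size(); }
};

// Minimal fixed-width 1D histogram (an assumption-level sketch, not ROOT's class).
class Histogram1D {
public:
    Histogram1D(std::size_t nbins, double lo, double hi)
        : lo_(lo), hi_(hi), counts_(nbins, 0), underflow_(0), overflow_(0) {
        assert(nbins > 0 && lo < hi);
    }

    // Values below lo go to underflow, values at or above hi go to overflow,
    // and NaN values are ignored.
    void fill(double x) {
        if (std::isnan(x)) return;
        if (x < lo_)  { ++underflow_; return; }
        if (x >= hi_) { ++overflow_;  return; }
        const double width = (hi_ - lo_) / static_cast<double>(counts_.size());
        std::size_t bin = static_cast<std::size_t>((x - lo_) / width);
        if (bin >= counts_.size()) bin = counts_.size() - 1;  // guard against rounding
        ++counts_[bin];
    }

    std::size_t count(std::size_t bin) const { return counts_.at(bin); }
    std::size_t underflow() const { return underflow_; }
    std::size_t overflow() const { return overflow_; }

private:
    double lo_;
    double hi_;
    std::vector<std::size_t> counts_;
    std::size_t underflow_;
    std::size_t overflow_;
};

// Bin the energy column only; momentum and charge are never read.
Histogram1D histogramEnergy(const EventColumns& events,
                            std::size_t nbins, double lo, double hi) {
    Histogram1D h(nbins, lo, hi);
    for (double e : events.energy) {
        h.fill(e);
    }
    return h;
}
```

A caller would fill an `EventColumns` with `append`, then call `histogramEnergy(events, 4, 0.0, 4.0)` to obtain four unit-width bins over the range from 0 to 4. In a real ROOT analysis the equivalent work would go through the framework's own container and histogram classes rather than hand-written types like these.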
