Data Editing on Large Data Sets

Abstract

Analyzing large data sets often begins with an exploratory stage whose purpose is, first, to develop a basic understanding of the data and its interrelationships and, second, to prepare and clean up the data for hypothesis formulation and testing. This preliminary phase usually requires facilities found in research data management systems, text editors, graphics packages, and statistics packages, and it usually also requires the analyst to write special programs to clean up and prepare the data for analysis. This paper describes a technique, now implemented as a single computational tool called a data editor, that combines a cross section of facilities from these systems, with emphasis on research database manipulation and subsetting techniques. The data editor provides an interactive environment for exploring and manipulating data sets, with particular attention to the implications of large data sets. It uses a relational data model and a self-describing binary data format that allows data to be transported to other data analysis packages. Some impacts of editing large data sets are discussed. A technique for manipulating portions or subsets of large data sets without physical replication is introduced, and an experimental command structure and operating environment are presented. (ERA citation 06:032800)
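
The abstract does not say how subsets are manipulated without physical replication. One common way to get that effect is to represent a subset as a list of row indices into a single shared table, so that selecting and editing never copies the underlying values. The sketch below illustrates that general idea only; it is not the report's implementation, and every name in it (`Table`, `SubsetView`, `where`, `assign`, `full_view`) is hypothetical.

```python
from typing import Callable, Dict, List, Sequence


class Table:
    """A tiny column-oriented table: column name -> list of values."""

    def __init__(self, columns: Dict[str, List[float]]) -> None:
        lengths = {len(values) for values in columns.values()}
        if len(lengths) > 1:
            raise ValueError("all columns must have the same length")
        self.columns = columns
        self.n_rows = lengths.pop() if lengths else 0


class SubsetView:
    """A logical subset: row indices into a shared Table, with no copied data."""

    def __init__(self, table: Table, rows: Sequence[int]) -> None:
        self.table = table
        self.rows = list(rows)

    def where(self, column: str, pred: Callable[[float], bool]) -> "SubsetView":
        # Narrow the subset; only the index list is new, the data is shared.
        col = self.table.columns[column]
        return SubsetView(self.table, [i for i in self.rows if pred(col[i])])

    def values(self, column: str) -> List[float]:
        # Materialize one column of the subset on demand.
        col = self.table.columns[column]
        return [col[i] for i in self.rows]

    def assign(self, column: str, value: float) -> None:
        # Edit in place: the change is written to the shared table.
        col = self.table.columns[column]
        for i in self.rows:
            col[i] = value


def full_view(table: Table) -> SubsetView:
    """A view that covers every row of the table."""
    return SubsetView(table, range(table.n_rows))


# Illustrative data (made up): -999.0 stands in for a bad reading.
table = Table({"temp": [12.5, -999.0, 14.1, 13.3], "site": [1, 1, 2, 2]})

site2 = full_view(table).where("site", lambda s: s == 2)
bad = full_view(table).where("temp", lambda t: t == -999.0)
bad.assign("temp", 0.0)  # edits the shared table through the subset
```

Each view here costs memory proportional to the number of selected rows rather than to the size of the selected data, which is the property that matters for large data sets; an edit made through one view is visible through every other view of the same table.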
