Petabyte Scale Data Mining: Dream or Reality?


Abstract

Science is becoming very data intensive. Today's astronomy datasets with tens of millions of galaxies already present substantial challenges for data mining. In less than 10 years the catalogs are expected to grow to billions of objects, and image archives will reach Petabytes. Imagine having a 100GB database in 1996, when disk scanning speeds were 30MB/s, and database tools were immature. Such a task today is trivial, almost manageable with a laptop. We think that the issue of a PB database will be very similar in six years. In this paper we scale our current experiments in data archiving and analysis on the Sloan Digital Sky Survey data six years into the future. We analyze these projections and look at the requirements of performing data mining on such data sets. We conclude that the task scales rather well: we could do the job today, although it would be expensive. There do not seem to be any show-stoppers that would prevent us from storing and using a Petabyte dataset six years from today.


Bibliographic record

Source
Conference on Survey and Other Telescope Technologies and Discoveries; Aug 27-28, 2002; Waikoloa, Hawaii, USA | 2002 | pp. 333-338 | 6 pages
Conference location: Waikoloa, HI (US)
Authors
Alexander S. Szalay; Jim Gray; Jan Vandenberg
Author affiliation

Department of Physics and Astronomy, The Johns Hopkins University, Baltimore, MD 21218

Original format: PDF
Language: English
Chinese Library Classification: Machinery and instrument industry
Keywords
data mining; large-scale computing; databases; spatial statistics


