Towards Building a Comprehensive Data Mart

Abstract

To uncover new relationships or patterns, one must first build a corpus of data, or what some call a data mart. However, when we use the Internet to build this corpus, we must ask how we can be sure that we have collected all the pertinent data and maximized coverage. Hundreds of search engines are available on the Internet today. Which one is best? Is one better for one problem and a second better for another? Are meta-search engines better than individual search engines? In this paper we look at one possible approach to developing a methodology that maximizes coverage. Before presenting this methodology, we first motivate the need for increased coverage. We next investigate how we can obtain ground truth and what insight the ground truth can give us into the size of the Internet and the capabilities of search engines. We conclude by developing a methodology in which we compare a number of search engines and examine how we can increase overall coverage and thus develop a more inclusive data mart.

Bibliographic details

Source
Conference on Data Mining and Knowledge Discovery: Theory, Tools, and Technology VI; April 12-13, 2004; Orlando, FL, US | 2004 | pp. 236-245 | 10 pages
Conference location: Orlando, FL (US)
Authors
Douglas Boulware; John Salerno; Richard Bleich; Michael Hinman

Author affiliation
AFRL/IFEA, Rome, NY 13441

Original format: PDF
Language: English
Chinese Library Classification: data processing, data processing systems; computer applications
Keywords
knowledge discovery; data marts; internet; coverage

