Published in: Knowledge and Information Systems

Distributed mining of classification rules

Abstract

Many successful data-mining techniques and systems have been developed. These techniques usually apply to centralized databases with less restrictive requirements on learning and response time. Comparatively little effort has gone into mining distributed databases or into real-time issues. In this paper, we investigate issues of fast distributed data mining. We assume that merging the distributed databases into a single one would either be too costly (distributed case) or that the individual fragments would be non-uniform, so that mining only one fragment would bias the result (fragmented case). The goal is to classify the objects $O$ of the database into one of several mutually exclusive classes $C_i$. Our approach to making mining fast and feasible is as follows. From each data site or fragment $db_k$, only a single rule $r_{ik}$ is generated for each class $C_i$. A small subset $\{r_{i1}, \ldots, r_{ih}\}$ of these individual rules is selected to form a rule set $R_i$ for each class $C_i$. These rule subsets adequately represent the hidden knowledge of the entire database. Various selection criteria to form $R_i$ are discussed, both theoretically and experimentally.
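
The pipeline described in the abstract (one rule per class per fragment, then a small selected subset per class) can be illustrated with a minimal sketch. The abstract does not state what form the rules take or which selection criteria the paper uses, so everything concrete here is an assumption made for illustration: a rule is a single `attribute == value` test, the local rule is the test with the highest local precision, the selection criterion is "top h by precision, then support", and classification sums the precision of matching rules. The names `Rule`, `local_rule`, `select_rules`, `mine`, and `classify` are hypothetical.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    cls: str          # class C_i that the rule predicts
    attr: str         # attribute tested by the rule
    value: object     # attribute value the rule requires
    precision: float  # precision measured on the fragment that produced it
    support: int      # local records of class C_i matched by the rule

    def matches(self, record):
        return record.get(self.attr) == self.value


def local_rule(fragment, cls):
    """Generate the single rule r_ik for class `cls` from one fragment db_k.

    Assumption: the rule is the attribute == value test with the highest
    local precision, ties broken by support.
    """
    matched = Counter()  # (attr, value) -> number of records matched
    correct = Counter()  # (attr, value) -> matched records labelled `cls`
    for record, label in fragment:
        for item in record.items():
            matched[item] += 1
            if label == cls:
                correct[item] += 1
    if not correct:
        return None  # the class does not occur in this fragment
    best = max(correct, key=lambda it: (correct[it] / matched[it], correct[it]))
    return Rule(cls, best[0], best[1], correct[best] / matched[best], correct[best])


def select_rules(rules, h):
    """Form R_i by keeping h rules (assumed criterion: precision, then support)."""
    ranked = sorted(rules, key=lambda r: (r.precision, r.support), reverse=True)
    return ranked[:h]


def mine(fragments, classes, h):
    """Build one rule set R_i per class without merging the fragments."""
    rule_sets = {}
    for cls in classes:
        candidates = [local_rule(fragment, cls) for fragment in fragments]
        rule_sets[cls] = select_rules([r for r in candidates if r is not None], h)
    return rule_sets


def classify(record, rule_sets):
    """Return the class whose rule set matches best, or None if nothing matches."""
    scores = {
        cls: sum(r.precision for r in rules if r.matches(record))
        for cls, rules in rule_sets.items()
    }
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else None


# Two toy fragments standing in for two data sites; each entry is (record, label).
db_1 = [
    ({"color": "red", "shape": "round"}, "apple"),
    ({"color": "yellow", "shape": "long"}, "banana"),
    ({"color": "red", "shape": "round"}, "apple"),
]
db_2 = [
    ({"color": "green", "shape": "round"}, "apple"),
    ({"color": "yellow", "shape": "long"}, "banana"),
    ({"color": "yellow", "shape": "round"}, "apple"),
]

rule_sets = mine([db_1, db_2], classes=["apple", "banana"], h=2)
print(classify({"color": "green", "shape": "round"}, rule_sets))  # apple
print(classify({"color": "yellow", "shape": "long"}, rule_sets))  # banana
```

Only the per-class rules leave each site, so the amount of data exchanged grows with the number of sites and classes rather than with the number of records. Swapping in a different selection criterion means changing only `select_rules`.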