
Data preparation for KDD through automatic reasoning based on description logic



Abstract

Without data preparation, data mining algorithms cannot operate on data within the knowledge discovery in databases (KDD) process. In fact, the success of later KDD phases largely depends on the data preparation stage. Mechanisms that prepare data automatically save considerable time and resources within the KDD process. These resources then become available for later, less automatable stages, such as results interpretation. We propose a general-purpose mechanism, applicable to multiple domains, to improve the data preparation phase of the KDD process. The mechanism processes input data and automatically converts it to a format, based on a known syntax, that is suitable for applying different data preparation techniques. It is based on description logic. Taking a generic UML2 data model as a reference, the mechanism checks whether any given XML data source can be transformed and modelled as a subsumption or instance of that UML2 model. It thus automatically identifies a consistent, unambiguous and finite set of XSLT transformations that prepare the data for the application of data mining techniques, obviating the need to expend resources on the preliminary preparation and formatting stage. The proposed mechanism was applied to structurally complex data from four different domains. To test the validity of the proposal, we applied data mining techniques to extract knowledge from the prepared data. The sound results obtained across these domains confirm that the proposal is applicable to any XML data source, and that it is correct, computationally efficient and time-saving during the data preparation phase.
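The check-then-transform idea in the abstract can be illustrated with a minimal Python sketch. It is only an illustration under stated assumptions, not the authors' mechanism: a plain structural comparison stands in for the description-logic reasoning, a small dictionary (`REFERENCE_MODEL`, hypothetical) stands in for the generic UML2 data model, and a hand-written flattening step stands in for the derived XSLT transformations. All names and the sample XML are invented for this example.

```python
import xml.etree.ElementTree as ET

# Hypothetical reference model: element name -> required child fields.
# This is a stand-in for the generic UML2 data model, not the authors' model.
REFERENCE_MODEL = {"record": {"id", "value"}}


def conforms(root, model):
    """Return True if every modelled element exists and has its required children.

    This is a plain structural check used in place of description-logic
    subsumption/instance reasoning.
    """
    for tag, required in model.items():
        elements = list(root.iter(tag))
        if not elements:
            return False
        for element in elements:
            present = {child.tag for child in element}
            if not required <= present:
                return False
    return True


def to_rows(root, model):
    """Flatten modelled elements into a list of dicts, one per element.

    Stands in for the XSLT transformations; fields outside the model are dropped.
    """
    rows = []
    for tag, required in model.items():
        for element in root.iter(tag):
            rows.append({field: element.findtext(field) for field in sorted(required)})
    return rows


def prepare(xml_text, model=REFERENCE_MODEL):
    """Return tabular rows if the XML fits the model, otherwise None."""
    root = ET.fromstring(xml_text.strip())
    if not conforms(root, model):
        return None
    return to_rows(root, model)


SAMPLE_XML = """
<dataset>
  <record><id>1</id><value>3.5</value></record>
  <record><id>2</id><value>4.0</value><note>extra</note></record>
</dataset>
"""

rows = prepare(SAMPLE_XML)
```

Here `prepare` returns rows ready for a mining algorithm only when the source fits the reference model. A source missing a required field yields `None` instead of a partially prepared table.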

Bibliographic record

  • Source
    Information Systems | 2014, Issue 8 | pp. 54-72 | 19 pages
  • Author affiliations

    Universidad a Distancia de Madrid, Facultad de Ensenanzas Tecnicas, Camino de la Fonda, 20, 28400 Collado Villalba, Madrid, Spain;

    Universidad a Distancia de Madrid, Facultad de Ensenanzas Tecnicas, Camino de la Fonda, 20, 28400 Collado Villalba, Madrid, Spain;

    Universidad a Distancia de Madrid, Facultad de Ensenanzas Tecnicas, Camino de la Fonda, 20, 28400 Collado Villalba, Madrid, Spain;

    Universidad Politecnica de Madrid, School of Computer Science, Campus de Montegancedo, s 28660 Boadilla del Monte, Madrid, Spain;

  • Indexed in: Science Citation Index (SCI, USA); Engineering Index (EI, USA)
  • Original format: PDF
  • Text language: eng
  • Chinese Library Classification:
  • Keywords

    KDD; Data preparation; Data mining; Description logic; Automatic reasoning;

