Conference: Second International Conference on Data Mining

A fuzzy-based conceptual KDD approach: The SaintEtiQ system



Abstract

Knowledge Discovery in Databases (KDD) systems are generally designed to extract knowledge nuggets from data, i.e. very precise, hidden knowledge, rather than to provide a global view of the database. Moreover, the knowledge representation is often unintelligible to the user, so a post-processing visualization step is necessary. We therefore propose a fuzzy-based summarization system named SaintEtiQ, which provides summaries at different levels that together cover the whole database. Summaries are the output concepts of an incremental conceptual clustering algorithm applied to database records. Concept formation is the fundamental activity that structures objects into a concise form of knowledge that can be used efficiently in the future. It includes the classification of new objects based on a subset of their properties (the prediction ability), as well as the qualitative understanding of those objects based on the generated knowledge (the observation ability). In our approach, database records may be either crisp or fuzzy (imprecise, uncertain or missing), and their representation is accordingly extended to fuzzy sets. Moreover, fuzzy background knowledge, represented by a fuzzy relational thesaurus (FRT) on each attribute, is essential to the generalization step of the system: this fuzzy domain knowledge allows us to induce higher-level intents of concepts, each representing part of the database. FRTs are built a priori by domain experts on both numerical and nominal attributes, and are coupled with a fuzzy discretization process performed on numerical attributes using Zadeh's linguistic variables. Using background knowledge in a concept learning process has the essential advantage of providing a common vocabulary between the user and the system, and of introducing a well-understood learning bias, rather than producing technical and unintelligible summaries based only on mathematical measures.
Furthermore, the fuzzy set-based concept representation allows the system to introduce flexibility into the learning task and to improve the accuracy of concept descriptions, i.e. database summaries. Finally, another major feature of our approach is that summaries are naturally described by tuples of fuzzy labels, so they can be stored in the database, analyzed and queried like any other data.
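
The fuzzy discretization step mentioned in the abstract can be illustrated with a minimal sketch. It is not the SaintEtiQ implementation: the attribute `age`, the labels `young`, `adult` and `old`, their trapezoidal shapes, and the helper names `trapezoid_membership`, `fuzzify` and `describe` are all assumptions chosen for illustration. The sketch only shows how a crisp numerical value could be rewritten as a fuzzy set over the labels of a linguistic variable, which is the kind of label-based description the abstract says summaries are built from.

```python
from typing import Dict, Tuple

# A trapezoidal fuzzy set (a, b, c, d) with a <= b <= c <= d:
# membership rises on [a, b], equals 1 on [b, c], falls on [c, d].
Trapezoid = Tuple[float, float, float, float]


def trapezoid_membership(x: float, shape: Trapezoid) -> float:
    """Degree in [0, 1] to which x belongs to a trapezoidal fuzzy set."""
    a, b, c, d = shape
    if b <= x <= c:
        return 1.0
    if x <= a or x >= d:
        return 0.0
    if x < b:
        return (x - a) / (b - a)
    return (d - x) / (d - c)


# Hypothetical linguistic variable for a numerical attribute "age".
# The labels and breakpoints are illustrative, not taken from the paper.
VOCABULARY: Dict[str, Dict[str, Trapezoid]] = {
    "age": {
        "young": (0, 0, 25, 35),
        "adult": (25, 35, 55, 65),
        "old": (55, 65, 120, 120),
    },
}


def fuzzify(x: float, labels: Dict[str, Trapezoid]) -> Dict[str, float]:
    """Map a crisp value to the labels it matches with a non-zero degree."""
    result: Dict[str, float] = {}
    for label, shape in labels.items():
        degree = trapezoid_membership(x, shape)
        if degree > 0.0:
            result[label] = degree
    return result


def describe(record: Dict[str, float]) -> Dict[str, Dict[str, float]]:
    """Rewrite each known numerical attribute of a record with fuzzy labels."""
    return {
        attribute: fuzzify(value, VOCABULARY[attribute])
        for attribute, value in record.items()
        if attribute in VOCABULARY
    }


if __name__ == "__main__":
    print(describe({"age": 30}))
    print(describe({"age": 40}))
```

With these assumed shapes, a value that falls between two labels belongs partly to both, while a value in the flat part of a trapezoid belongs fully to a single label. Overlapping labels are what let a description stay flexible near category boundaries.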
