Journal: Soft Computing: A Fusion of Foundations, Methodologies and Applications

A fuzzy similarity-based rough set approach for attribute selection in set-valued information systems


Abstract

Databases obtained from search engines, market data, and records of patients' symptoms and behaviours are common examples of set-valued data, in which a set of values is associated with a single entity. In real-world data, irrelevant attributes reduce both the speed and the predictive accuracy of expert analysis, because of high dimensionality and insignificant information, respectively. Attribute selection means choosing those attributes that are, ideally, both necessary and sufficient to describe the target knowledge. Rough set-based approaches can handle the uncertainty present in real-valued information systems after a discretization step. In this paper, we introduce a novel approach for attribute selection in set-valued information systems based on tolerance rough set theory. We define a fuzzy tolerance relation between two objects using a similarity threshold. We find reducts with the degree-of-dependency method, selecting the best subsets of attributes in order to obtain more knowledge from the information system. To validate the proposed method, we establish results analogous to those of classical rough set theory. Moreover, we present a greedy algorithm, together with illustrative examples, that demonstrates the approach without checking every pair of attributes in a set-valued decision system. Examples of calculating a reduct of an incomplete information system with the proposed approach are also given. The proposed approach is compared with fuzzy rough-assisted attribute selection on one real benchmark dataset, and with three existing attribute selection approaches on six real benchmark datasets, to show the superiority of the proposed work.
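The abstract does not give the paper's definitions, so the following Python sketch is only one plausible reading of the pipeline it describes: a threshold-based tolerance relation on set-valued attributes, a degree of dependency, and a greedy search for a reduct. The Jaccard similarity, the function names (jaccard, tolerance_class, dependency, greedy_reduct) and the toy data are all assumptions made for illustration and are not taken from the paper.

```python
from typing import Dict, FrozenSet, List, Sequence, Set

# One object maps each attribute name to a set of values (set-valued data).
Obj = Dict[str, FrozenSet[str]]


def jaccard(a: FrozenSet[str], b: FrozenSet[str]) -> float:
    """Assumed similarity measure: |a & b| / |a | b| (1.0 if both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def tolerance_class(data: Sequence[Obj], i: int,
                    attrs: Sequence[str], threshold: float) -> Set[int]:
    """Objects whose similarity to object i is >= threshold on every attribute."""
    return {
        j for j, other in enumerate(data)
        if all(jaccard(data[i][a], other[a]) >= threshold for a in attrs)
    }


def dependency(data: Sequence[Obj], decisions: Sequence[str],
               attrs: Sequence[str], threshold: float) -> float:
    """Fraction of objects whose tolerance class carries a single decision."""
    consistent = sum(
        1 for i in range(len(data))
        if all(decisions[j] == decisions[i]
               for j in tolerance_class(data, i, attrs, threshold))
    )
    return consistent / len(data)


def greedy_reduct(data: Sequence[Obj], decisions: Sequence[str],
                  attrs: Sequence[str], threshold: float) -> List[str]:
    """Forward selection: add the attribute with the largest dependency gain
    until the dependency of the full attribute set is reached."""
    full = dependency(data, decisions, attrs, threshold)
    selected: List[str] = []
    current = dependency(data, decisions, selected, threshold)
    while current < full:
        best = max(
            (a for a in attrs if a not in selected),
            key=lambda a: dependency(data, decisions, selected + [a], threshold),
        )
        selected.append(best)
        current = dependency(data, decisions, selected, threshold)
    return selected


# Hypothetical toy decision system, invented for this sketch.
data: List[Obj] = [
    {"lang": frozenset({"en", "fr"}), "hobby": frozenset({"chess"})},
    {"lang": frozenset({"en", "fr"}), "hobby": frozenset({"golf"})},
    {"lang": frozenset({"de"}), "hobby": frozenset({"chess"})},
    {"lang": frozenset({"de"}), "hobby": frozenset({"golf"})},
]
decisions = ["yes", "yes", "no", "no"]

reduct = greedy_reduct(data, decisions, ["hobby", "lang"], threshold=0.5)
print(reduct)
```

In this toy system the "lang" attribute alone separates the two decision classes, so the greedy search stops after one step. A forward search of this kind evaluates one candidate attribute at a time instead of enumerating attribute subsets, which is the usual motivation for a greedy strategy; whether the paper's algorithm proceeds in exactly this way cannot be determined from the abstract.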
