To address the discretization of continuous attributes in rough set theory, a discretization method based on cut-point selection is proposed. First, the condition attributes are ranked by importance, and effective heuristic rules are used as the basis for obtaining near-optimal cut points. The discretized data are then generated under two constraints: the information entropy and the consistency of the decision table. Finally, the algorithm is evaluated on UCI data sets and compared experimentally with other algorithms. The experimental results show that the algorithm is effective, and that it remains computationally efficient when attribute values occur with high frequency and the number of samples is large.
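The general idea of entropy-guided cut-point selection under a consistency constraint can be illustrated with a minimal Python sketch. It is not the authors' algorithm: the abstract does not specify the attribute-importance measure or the heuristic rules, so the sketch assumes that candidate cuts are midpoints between adjacent distinct values, that cuts are ranked by weighted class entropy, and that selection stops once the discretized table is as consistent as the original one. All function names (`candidate_cuts`, `cut_entropy`, `consistency`, `select_cuts`) are hypothetical.

```python
from collections import Counter, defaultdict
from math import log2


def candidate_cuts(values):
    """Midpoints between adjacent distinct values (assumed candidate set)."""
    distinct = sorted(set(values))
    return [(a + b) / 2 for a, b in zip(distinct, distinct[1:])]


def entropy(labels):
    """Shannon entropy of the decision labels."""
    n = len(labels)
    return -sum(c / n * log2(c / n) for c in Counter(labels).values()) if n else 0.0


def cut_entropy(values, labels, cut):
    """Weighted class entropy of the two sides produced by one cut."""
    left = [y for x, y in zip(values, labels) if x < cut]
    right = [y for x, y in zip(values, labels) if x >= cut]
    n = len(labels)
    return len(left) / n * entropy(left) + len(right) / n * entropy(right)


def discretize(value, cuts):
    """Interval index of a value: the number of cuts it reaches or exceeds."""
    return sum(value >= c for c in cuts)


def consistency(rows, labels):
    """Fraction of objects whose condition values determine a single decision."""
    groups = defaultdict(set)
    for row, y in zip(rows, labels):
        groups[tuple(row)].add(y)
    return sum(len(groups[tuple(row)]) == 1 for row in rows) / len(rows)


def select_cuts(data, labels):
    """Greedily add low-entropy cuts until consistency matches the raw table."""
    n_attrs = len(data[0])
    columns = [[row[j] for row in data] for j in range(n_attrs)]
    # Assumed ranking: lowest weighted entropy first, across all attributes.
    pool = sorted(
        (cut_entropy(columns[j], labels, c), j, c)
        for j in range(n_attrs)
        for c in candidate_cuts(columns[j])
    )
    target = consistency(data, labels)
    chosen = [[] for _ in range(n_attrs)]
    for _, j, c in pool:
        rows = [[discretize(v, chosen[k]) for k, v in enumerate(row)] for row in data]
        if consistency(rows, labels) >= target:
            break  # constraint met: no further cuts are needed
        chosen[j].append(c)
    return [sorted(cuts) for cuts in chosen]


data = [[1.0, 5.0], [2.0, 5.0], [3.0, 6.0], [4.0, 6.0]]
labels = [0, 0, 1, 1]
cuts = select_cuts(data, labels)
```

On this toy table a single cut on the first attribute already separates the two decision classes, so the second attribute receives no cut. A greedy ranking like this one is only near-optimal in general; the stopping condition is what keeps the decision table from losing consistency.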