Journal: 《漳州师范学院学报(自然科学版)》 (Journal of Zhangzhou Normal University, Natural Science Edition)

An Improved Decision Tree Algorithm Based on Double-Attribute Nodes and Partial Matching

Abstract

In a decision tree algorithm, even when two "best" attributes exist, only one of them is chosen, at random, as the root or node attribute. As a result, the algorithm produces relatively few classification rules. In addition, a decision tree classifies test instances by full matching, so a test instance matches at most one classification rule and sometimes none, which lowers classification accuracy. To address these problems, we propose an improved decision tree algorithm based on double-attribute nodes and partial matching (DAID3). First, if two "best" attributes have equal or nearly equal information entropy, DAID3 builds the node from both attributes and uses their individual values and their value combinations as branches, so each training instance may be covered by more than one classification rule. Second, when a new instance is classified, several paths may match at a branch node; to select the best path, each branch node is assigned a node strength. Finally, if no path from the root to a leaf fully matches the test instance, DAID3 finds the longest path that partially matches it and returns the class label with the greatest strength at that path's end node. To make this lookup straightforward, each branch node also stores a node class label. Experimental results show that, compared with the decision tree algorithm, DAID3 produces more classification rules and achieves higher classification accuracy.
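The three steps in the abstract can be illustrated with a small Python sketch. It is not the authors' implementation, and several details are assumptions made only for illustration: the closeness threshold `tol`, the definition of node strength as the majority-class share at a node, the rule that full matches are ranked by strength while partial matches are ranked by path length and then strength, and every function name (`choose_attrs`, `build`, `predict`).

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum(c / n * math.log2(c / n) for c in Counter(labels).values())

def split_entropy(rows, labels, attr):
    """Weighted entropy of the labels after partitioning on attr."""
    groups = {}
    for row, y in zip(rows, labels):
        groups.setdefault(row[attr], []).append(y)
    return sum(len(g) / len(labels) * entropy(g) for g in groups.values())

def choose_attrs(rows, labels, attrs, tol=0.05):
    """Best attribute, plus the runner-up when its entropy is within tol (assumed threshold)."""
    ranked = sorted(attrs, key=lambda a: split_entropy(rows, labels, a))
    if len(ranked) > 1:
        gap = split_entropy(rows, labels, ranked[1]) - split_entropy(rows, labels, ranked[0])
        if gap <= tol:
            return ranked[:2]
    return ranked[:1]

def build(rows, labels, attrs, tol=0.05):
    """Build a tree whose nodes test one or two attributes."""
    counts = Counter(labels)
    label, top = counts.most_common(1)[0]
    # Assumed strength: share of the majority class among instances at this node.
    node = {"label": label, "strength": top / len(labels), "branches": {}}
    if len(counts) == 1 or not attrs:
        return node
    chosen = choose_attrs(rows, labels, attrs, tol)
    rest = [a for a in attrs if a not in chosen]
    # Single-attribute tests, plus the combined test for a double-attribute node.
    tests = [(a,) for a in chosen]
    if len(chosen) == 2:
        tests.append(tuple(chosen))
    for test in tests:
        groups = {}
        for row, y in zip(rows, labels):
            key = tuple((a, row[a]) for a in test)
            sub_rows, sub_labels = groups.setdefault(key, ([], []))
            sub_rows.append(row)
            sub_labels.append(y)
        for key, (sub_rows, sub_labels) in groups.items():
            node["branches"][key] = build(sub_rows, sub_labels, rest, tol)
    return node

def _walk(node, instance, depth):
    """Follow every matching branch and return (rank, label) of the best path end."""
    matches = [child for key, child in node["branches"].items()
               if all(instance.get(a) == v for a, v in key)]
    if not matches:
        full = not node["branches"]  # True at a leaf, False at a dead end
        # Full matches outrank partial ones; partial ones prefer longer paths.
        rank = (full, 0 if full else depth, node["strength"])
        return rank, node["label"]
    return max((_walk(c, instance, depth + 1) for c in matches), key=lambda r: r[0])

def predict(tree, instance):
    """Class label of the best fully or partially matching path."""
    return _walk(tree, instance, 0)[1]

# Toy data: attributes "a" and "b" have equal split entropy, so the root uses both.
rows = [
    {"a": 0, "b": 0, "c": 0}, {"a": 0, "b": 1, "c": 0}, {"a": 1, "b": 0, "c": 0},
    {"a": 1, "b": 1, "c": 0}, {"a": 1, "b": 1, "c": 1},
]
labels = ["no", "no", "no", "yes", "no"]
tree = build(rows, labels, ["a", "b", "c"])
print(predict(tree, {"a": 1, "b": 1, "c": 0}))  # full match through the combined branch
print(predict(tree, {"a": 1, "b": 1, "c": 2}))  # unseen value of "c": partial match
```

In this toy data the first query is matched by three root branches (`a=1`, `b=1`, and the combination `a=1, b=1`); only the combined branch leads to a pure leaf, so the strength ranking picks it. The second query has no full match, so the sketch falls back to the label stored at the strongest branch node it reached.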
