首页> 中文期刊> 《软件学报》 >潜在属性空间树分类器

潜在属性空间树分类器

         

摘要

A framework of latent attribute space tree classifier (LAST) is proposed in this paper. LAST transforms data from the original attribute space into the latent attribute space, which is easier for data separation or more suitable for tree classifier, so that the decision boundary of the traditional decision tree can be extended and its generalization ability can be improved. This paper presents two SVD (singular value decomposition) oblique decision tree (SODT) algorithms based on the LAST framework. SODT first performs SVD on global and/or local data to construct orthogonal latent attribute space. Then, traditional decision tree or tree nodes are built in that space. Finally, SODT obtains the approximately optimal oblique decision tree of the original space. SODT can not only handle datasets with similar or different distribution between global and local data, but also can make full use of the structure information of the labelled and unlabelled data and produce the same classification results no matter how the observations are arranged. Besides, the time complexity of SODT is identical to that of the univariate decision tree. Experimental results show that compared with the traditional univariate decision tree algorithm C4.5 and the oblique decision tree algorithms OC1 and CART-LC, SODT gives higher classification accuracy, more stable decision tree size and comparable tree-construction time as C4.5, which is much less than that of OC1 and CART-LC.%提出一种潜在属性空间树分类器(latent attribute space tree classifier,简称LAST)框架,通过将原属性空间变换到更容易分离数据或更符合决策树分类特点的潜在属性空间,突破传统决策树算法的决策面局限,改善树分类器的泛化性能.在LAST框架下,提出了两种奇异值分解斜决策树(SVD (singular value decomposition) oblique decision tree,简称SODT)算法,通过对全局或局部数据进行奇异值分解,构建正交的潜在属性空间,然后在潜在属性空间内构建传统的单变量决策树或树节点,从而间接获得原空间内近似最优的斜决策树.SODT算法既能够处理整体数据与局部数据分布相同或不同的数据集,又可以充分利用有标签和无标签数据的结构信息,分类结果不受样本随机重排的影响,而且时间复杂度还与单变量决策树算法相同.在复杂数据集上的实验结果表明,与传统的单变量决策树算法和其他斜决策树算法相比,SODT算法的分类准确率更高,构建的决策树大小更稳定,整体分类性能更鲁棒,决策树构建时间与C4.5算法相近,而远小于其他斜决策树算法.

著录项

  • 来源
    《软件学报》 |2009年第7期|1735-1745|共11页
  • 作者

    何萍; 徐晓华; 陈崚;

  • 作者单位

    南京航空航天大学;

    信息科学与技术学院;

    计算机科学与工程系;

    江苏;

    南京;

    210016;

    扬州大学;

    信息工程学院;

    计算机科学与工程系;

    江苏;

    扬州;

    225009;

    南京航空航天大学;

    信息科学与技术学院;

    计算机科学与工程系;

    江苏;

    南京;

    210016;

    扬州大学;

    信息工程学院;

    计算机科学与工程系;

    江苏;

    扬州;

    225009;

  • 原文格式 PDF
  • 正文语种 chi
  • 中图分类 人工智能理论;
  • 关键词

    分类; 决策树; 潜在属性空间; 奇异值分解;

相似文献

  • 中文文献
  • 外文文献
  • 专利
获取原文

客服邮箱:kefu@zhangqiaokeyan.com

京公网安备:11010802029741号 ICP备案号:京ICP备15016152号-6 六维联合信息科技 (北京) 有限公司©版权所有
  • 客服微信

  • 服务号