
PAC learning a decision tree with pruning.


Abstract

This dissertation investigates various theoretical effects of pruning a decision tree. Empirical results have shown that pruning can improve the accuracy of an induced decision tree; pruning also leads to more concise rules. First, we provide a pruning algorithm based on the rank of a decision tree. A bound on the error due to pruning by the rank of a decision tree is determined under the assumptions of an equally likely distribution over the instance space and a deterministic tree labelling rule. This bound is then used with recent results in learning theory to determine a sample size sufficient for Probably Approximately Correct (PAC) identification of decision trees with pruning. We also discuss other pruning rules and their effects on the error due to pruning. With a nondeterministic tree labelling rule, we show that the upper bound of the average pruning error is less than or equal to 0.5 under an equally likely distribution.

In a realistic learning environment it is often not possible to obtain a large enough sample to guarantee PAC learning. For those cases, we provide several methods for a posteriori evaluation of the accuracy of a pruned decision tree, including a method that estimates a lower bound for the worst possible confidence factor.

Finally, we develop conditions under which pruning is necessary for better prediction accuracy as well as for concept simplification, and we analyze why pruning is necessary in realistic learning situations. We generalize a previous result to larger training sets. A Bayesian analysis shows that the average prediction accuracy of the pruned tree increases, and the effect of description noise becomes stronger, as the size of the training set increases. For very large training sets, the pruned tree has prediction accuracy equal to that of the unpruned tree.
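The rank-based pruning described above can be sketched as follows, assuming the standard recursive rank definition for binary decision trees (a leaf has rank 0; an internal node whose subtrees have ranks r1 and r2 has rank max(r1, r2) if they differ, and r1 + 1 if they are equal). The `Node` class and the majority-label collapse rule below are illustrative assumptions, not the dissertation's exact algorithm:

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Node:
    label: int                     # majority class of the examples at this node
    left: Optional["Node"] = None  # both children None => leaf
    right: Optional["Node"] = None

def rank(t: Node) -> int:
    """Recursive rank of a binary decision tree."""
    if t.left is None and t.right is None:
        return 0
    rl, rr = rank(t.left), rank(t.right)
    return max(rl, rr) if rl != rr else rl + 1

def prune_to_rank(t: Node, r: int) -> Node:
    """Return a tree of rank <= r, collapsing offending subtrees to
    majority-label leaves (one simple pruning rule; others are possible)."""
    if t.left is None and t.right is None:
        return t
    if r == 0:
        return Node(t.label)
    left = prune_to_rank(t.left, r)
    right = prune_to_rank(t.right, r)
    pruned = Node(t.label, left, right)
    # Children have rank <= r, so pruned has rank <= r + 1; it exceeds r
    # only when both children reach rank r, in which case we collapse.
    if rank(pruned) > r:
        return Node(t.label)
    return pruned
```

For example, a complete tree of depth 2 has rank 2; pruning it with `r = 1` collapses the root to a single majority-label leaf, while `r = 2` leaves it unchanged.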
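The "sample size sufficient for PAC identification" mentioned in the abstract can be illustrated with the generic consistent-learner bound for a finite hypothesis class, m ≥ (1/ε)(ln|H| + ln(1/δ)); the dissertation's exact bound for pruned trees may differ, and the ln|H| value used below is hypothetical:

```python
import math

def pac_sample_size(ln_h: float, epsilon: float, delta: float) -> int:
    """Generic finite-class PAC bound: with at least
    (1/epsilon) * (ln|H| + ln(1/delta)) examples, any hypothesis consistent
    with the sample has error <= epsilon with probability >= 1 - delta."""
    return math.ceil((ln_h + math.log(1.0 / delta)) / epsilon)

# Rank-bounded binary decision trees over finitely many features form a
# finite class; plugging in an assumed ln|H| for illustration:
m = pac_sample_size(ln_h=200.0, epsilon=0.1, delta=0.05)
```

Note the bound grows only logarithmically in 1/δ but linearly in ln|H| and 1/ε, which is why restricting to low-rank (pruned) trees, a smaller class, reduces the sample size needed.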

Bibliographic details

  • Author

    Kim, Hyunsoo

  • Affiliation

    University of Florida

  • Degree grantor: University of Florida
  • Subjects: Business Administration, General; Business Administration, Management; Artificial Intelligence; Computer Science
  • Degree: Ph.D.
  • Year: 1992
  • Pages: 167 p.
  • Total pages: 167
  • Format: PDF
  • Language: English

