
Decision Trees for Uncertain Data



Abstract

Traditional decision tree classifiers work with data whose values are known and precise. We extend such classifiers to handle data with uncertain information. Value uncertainty arises in many applications during the data collection process. Example sources of uncertainty include measurement/quantization errors, data staleness, and multiple repeated measurements. With uncertainty, the value of a data item is often represented not by one single value, but by multiple values forming a probability distribution. Rather than abstracting uncertain data by statistical derivatives (such as mean and median), we discover that the accuracy of a decision tree classifier can be much improved if the "complete information" of a data item (taking into account the probability density function (pdf)) is utilized. We extend classical decision tree building algorithms to handle data tuples with uncertain values. Extensive experiments have been conducted which show that the resulting classifiers are more accurate than those using value averages. Since processing pdfs is computationally more costly than processing single values (e.g., averages), decision tree construction on uncertain data is more CPU demanding than that for certain data. To tackle this problem, we propose a series of pruning techniques that can greatly improve construction efficiency.

