Journal: IEEE Transactions on Knowledge and Data Engineering

Test-cost sensitive classification on data with missing values



Abstract

In the area of cost-sensitive learning, inductive learning algorithms have been extended to handle different types of costs to better represent misclassification errors. Most previous work has focused only on how to deal with misclassification costs. In this paper, we address the equally important issue of how to handle the test costs associated with querying the missing values in a test case. When one or more attributes have missing values in a test case, it may or may not be worthwhile to make the extra effort to obtain those values, depending on how much the new values would improve accuracy. We consider how to integrate test-cost-sensitive learning with the handling of missing values in a unified framework that includes model building and a testing strategy. The testing strategy determines which attributes to test in order to minimize the sum of the classification costs and test costs. We show how to instantiate this framework in two popular machine learning algorithms: decision trees and the naive Bayesian method. We empirically evaluate the test-cost-sensitive methods for handling missing values on several data sets.
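
The trade-off described in the abstract, paying a test cost only when the expected drop in misclassification cost justifies it, can be illustrated with a small sketch. This is not the paper's algorithm: it is a one-step (myopic) lookahead on top of a hand-written naive Bayes model, and every name and number in it (`worth_testing`, `PRIOR`, `LIKELIHOODS`, `COST`, the attribute `"a"`) is a hypothetical chosen for illustration.

```python
from typing import Dict, List

# Hypothetical toy model (assumed values, not taken from the paper).
PRIOR: List[float] = [0.5, 0.5]                      # P(class)
LIKELIHOODS: Dict[str, Dict[str, List[float]]] = {   # P(attr=value | class)
    "a": {"pos": [0.9, 0.1], "neg": [0.1, 0.9]},
}
COST: List[List[float]] = [[0.0, 100.0],             # COST[pred][true]
                           [100.0, 0.0]]


def posterior_given(prior, likelihoods, observed):
    """Naive Bayes posterior over classes given the observed attribute values."""
    scores = list(prior)
    for attr, value in observed.items():
        for c in range(len(scores)):
            scores[c] *= likelihoods[attr][value][c]
    total = sum(scores)
    return [s / total for s in scores]


def expected_cost(posterior, cost):
    """Expected misclassification cost of the cheapest prediction."""
    return min(
        sum(cost[pred][true] * p for true, p in enumerate(posterior))
        for pred in range(len(cost))
    )


def worth_testing(attr, prior, likelihoods, observed, cost, test_cost):
    """True if testing `attr` lowers the expected total cost (myopic lookahead)."""
    current = posterior_given(prior, likelihoods, observed)
    cost_now = expected_cost(current, cost)
    cost_after = 0.0
    for value, per_class in likelihoods[attr].items():
        # P(attr=value | observed), marginalising over the class.
        p_value = sum(per_class[c] * current[c] for c in range(len(current)))
        if p_value == 0.0:
            continue
        updated = posterior_given(prior, likelihoods, {**observed, attr: value})
        cost_after += p_value * expected_cost(updated, cost)
    return test_cost + cost_after < cost_now


if __name__ == "__main__":
    for test_cost in (20.0, 50.0):
        decision = worth_testing("a", PRIOR, LIKELIHOODS, {}, COST, test_cost)
        print(f"test cost {test_cost}: perform test = {decision}")
```

With these assumed numbers, classifying without the attribute has an expected cost of 50, and knowing the attribute lowers it to about 10, so a test priced at 20 pays for itself while one priced at 50 does not. A full testing strategy would also have to consider sequences of tests and their combined cost, which this single-step check ignores.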
