Conference: IEEE International Conference on Information Visualization

Visualizing Distributions and Classification Accuracy

Abstract

Data mining is the search for novel, actionable information within data. The number of records in the data being analyzed is only one factor, and perhaps a small one, in determining the complexity of a given data mining technique. Most complexity in data mining arises from the distribution of values contained in the data, not from the number of records. In this paper we use straightforward histogram-based visualizations to gain insight into how a well-studied data mining technique, the naive-Bayes classifier, performs under various discretization schemes for both continuous and discrete values. The resulting visualization system provides users with a tool that describes the underlying model of the data used by the classifier. Exploratory visualizations of the distributions of training data can be selected based on expert domain knowledge and then combined to apply to the test data.
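
To make the idea of a histogram-based naive-Bayes model under different discretization schemes concrete, the following is a minimal sketch, not the system described in the paper. It assumes two common schemes (equal-width and equal-frequency binning) and Laplace smoothing; the abstract names neither, and all function names here (`equal_width_edges`, `equal_frequency_edges`, `fit`, `predict`) are hypothetical.

```python
import math
from bisect import bisect_right
from collections import Counter, defaultdict


def equal_width_edges(values, bins):
    """Interior cut points splitting [min, max] into `bins` equal-width intervals."""
    lo, hi = min(values), max(values)
    step = (hi - lo) / bins
    return [lo + step * i for i in range(1, bins)]


def equal_frequency_edges(values, bins):
    """Interior cut points placing roughly the same number of records in each bin."""
    ordered = sorted(values)
    n = len(ordered)
    return [ordered[(n * i) // bins] for i in range(1, bins)]


def fit(rows, labels, edges_per_attr):
    """Build one histogram per (class, attribute) pair, plus class counts.

    These per-class histograms are the model a naive-Bayes classifier uses,
    and they are also what a histogram-based visualization would plot.
    """
    hists = defaultdict(Counter)
    for row, label in zip(rows, labels):
        for j, x in enumerate(row):
            hists[(label, j)][bisect_right(edges_per_attr[j], x)] += 1
    return hists, Counter(labels)


def predict(row, hists, class_counts, edges_per_attr, alpha=1.0):
    """Return the most probable class, using Laplace-smoothed bin frequencies.

    The smoothing constant `alpha` is an assumption of this sketch.
    """
    total = sum(class_counts.values())

    def log_posterior(c):
        score = math.log(class_counts[c] / total)
        for j, x in enumerate(row):
            edges = edges_per_attr[j]
            b = bisect_right(edges, x)
            n_bins = len(edges) + 1
            likelihood = (hists[(c, j)][b] + alpha) / (class_counts[c] + alpha * n_bins)
            score += math.log(likelihood)
        return score

    return max(class_counts, key=log_posterior)
```

Swapping `equal_width_edges` for `equal_frequency_edges` changes only the cut points, so the same training data yields different histograms and potentially different predictions. This illustrates why the distribution of values, rather than the record count, drives the behavior of the classifier.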