Journal: Information Visualization
Quality-based guidance for exploratory dimensionality reduction

Abstract

High-dimensional data sets containing hundreds of variables are difficult to explore, as traditional visualization methods often are unable to represent such data effectively. This is commonly addressed by employing dimensionality reduction prior to visualization. Numerous dimensionality reduction methods are available. However, few reduction approaches take the importance of several structures into account and few provide an overview of structures existing in the full high-dimensional data set. For exploratory analysis, as well as for many other tasks, several structures may be of interest. Exploration of the full high-dimensional data set without reduction may also be desirable. This paper presents flexible methods for exploratory analysis and interactive dimensionality reduction. Automated methods are employed to analyse the variables, using a range of quality metrics, providing one or more measures of 'interestingness' for individual variables. Through ranking, a single value of interestingness is obtained, based on several quality metrics, that is usable as a threshold for the most interesting variables. An interactive environment is presented in which the user is provided with many possibilities to explore and gain understanding of the high-dimensional data set. Guided by this, the analyst can explore the high-dimensional data set and interactively select a subset of the potentially most interesting variables, employing various methods for dimensionality reduction. The system is demonstrated through a use-case analysing data from a DNA sequence-based study of bacterial populations.
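The ranking step described in the abstract, in which several quality metrics are merged into a single interestingness value that can then be thresholded, might be sketched roughly as follows. This is only an illustration under stated assumptions: the abstract does not say which metrics are used or how ranks are combined, so the metric names (`metric_a`, `metric_b`), the mean-rank aggregation, the rescaling to [0, 1], and the helper names `rank_descending`, `interestingness`, and `select_variables` are all hypothetical.

```python
def rank_descending(scores):
    """Rank scores so that 1 is the highest; tied scores share the average rank."""
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    ranks = [0.0] * len(scores)
    pos = 0
    while pos < len(order):
        end = pos
        # Extend the run while the next score ties with the current one.
        while end + 1 < len(order) and scores[order[end + 1]] == scores[order[pos]]:
            end += 1
        shared = (pos + end) / 2 + 1  # average of the 1-based positions in the run
        for k in range(pos, end + 1):
            ranks[order[k]] = shared
        pos = end + 1
    return ranks


def interestingness(metric_scores):
    """Combine per-variable scores from several metrics into one value in [0, 1].

    metric_scores maps a metric name to a list with one score per variable.
    Assumption: higher scores mean "more interesting" for every metric, and
    metrics are combined by averaging their ranks (not confirmed by the source).
    """
    per_metric_ranks = [rank_descending(s) for s in metric_scores.values()]
    n = len(per_metric_ranks[0])
    mean_rank = [
        sum(r[i] for r in per_metric_ranks) / len(per_metric_ranks)
        for i in range(n)
    ]
    # Rescale so the best possible mean rank (1) maps to 1.0 and the worst (n) to 0.0.
    return [1.0 if n == 1 else (n - rank) / (n - 1) for rank in mean_rank]


def select_variables(names, values, threshold):
    """Keep the variables whose combined interestingness reaches the threshold."""
    return [name for name, v in zip(names, values) if v >= threshold]


# Hypothetical scores for four variables under two hypothetical metrics.
names = ["a", "b", "c", "d"]
metrics = {
    "metric_a": [4.0, 1.0, 3.0, 2.0],
    "metric_b": [0.9, 0.1, 0.2, 0.8],
}
combined = interestingness(metrics)
selected = select_variables(names, combined, threshold=0.5)
```

With these made-up scores, variable `a` ranks first under both metrics and `b` last, so a threshold of 0.5 keeps `a`, `c`, and `d`. In an interactive setting such as the one the abstract describes, the threshold would presumably be adjusted by the analyst rather than fixed in code.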
