Journal: IEEE Intelligent Systems & Their Applications

Scoring levels of categorical variables with heterogeneous data


Abstract

Heterogeneous (mixed-type) data present significant challenges in both supervised and unsupervised learning. The situation is even more complicated when nominal variables have several levels (values), which makes using an indicator variable for every categorical level infeasible. In unsupervised learning, several fairly involved, computationally intensive, nonlinear multivariate techniques iteratively alternate data transformations with optimal scoring, seeking to optimize an objective based on a covariance matrix. Our goal is to find a computationally efficient and flexible method for mapping categorical variables to numeric scores in mixed-type data. We attempt to go beyond optimizing second-order statistics (such as covariance) and enable distance-based methods by exploring mutual relationships or bumps of dependencies between variables. This is a new objective for a scoring method that is based on patterns learned from all the available variables.
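
To make the contrast between indicator variables and numeric scores concrete, here is a minimal sketch on made-up data. It is not the scoring method of the paper, which the abstract does not specify. As a stand-in, each level is scored by the mean of one numeric variable within that level. The names `indicator_encode`, `mean_scores`, `colour`, and `weight` are hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def indicator_encode(values):
    """Return one 0/1 indicator column per distinct level (columns sorted by level name)."""
    levels = sorted(set(values))
    return [[1 if v == lvl else 0 for lvl in levels] for v in values]


def mean_scores(categories, numeric):
    """Score each level by the mean of a numeric variable within that level.

    Illustrative stand-in only; this is NOT the scoring method from the paper.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for c, x in zip(categories, numeric):
        sums[c] += x
        counts[c] += 1
    return {c: sums[c] / counts[c] for c in sums}


# Hypothetical toy data: one nominal variable and one numeric variable.
colour = ["red", "blue", "red", "green", "blue", "green"]
weight = [1.0, 4.0, 3.0, 10.0, 6.0, 12.0]

# Indicator encoding: the number of columns grows with the number of levels.
indicators = indicator_encode(colour)

# Score mapping: a single numeric column regardless of the number of levels.
scores = mean_scores(colour, weight)
scored_column = [scores[c] for c in colour]

print(len(indicators[0]), "indicator columns")
print(scored_column)
```

With three levels the indicator encoding already needs three columns, and a variable with thousands of levels would need thousands. The score mapping always yields one numeric column, which distance-based methods can use directly.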
