IEEE Transactions on Neural Networks and Learning Systems

Density-Preserving Sampling: Robust and Efficient Alternative to Cross-Validation for Error Estimation



Abstract

Estimation of the generalization ability of a classification or regression model is an important issue, as it indicates the expected performance on previously unseen data and is also used for model selection. Currently used generalization error estimation procedures, such as cross-validation (CV) or bootstrap, are stochastic and, thus, require multiple repetitions in order to produce reliable results, which can be computationally expensive, if not prohibitive. The correntropy-inspired density-preserving sampling (DPS) procedure proposed in this paper eliminates the need for repeating the error estimation procedure by dividing the available data into subsets that are guaranteed to be representative of the input dataset. This allows the production of low-variance error estimates with an accuracy comparable to 10 times repeated CV at a fraction of the computations required by CV. This method can also be used for model ranking and selection. This paper derives the DPS procedure and investigates its usability and performance using a set of public benchmark datasets and standard classifiers.
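To illustrate the core idea of splitting data into mutually representative subsets, here is a minimal sketch, assuming a kernel-density ranking as a simplified stand-in for the paper's correntropy-based criterion. The helper `density_preserving_folds` and all parameter choices are hypothetical and not the authors' algorithm: points are ranked by estimated local density and dealt round-robin into folds, so each fold spans the full density spectrum of the input data.

```python
# Sketch only: KDE ranking substitutes for the correntropy criterion of DPS.
import numpy as np
from sklearn.neighbors import KernelDensity

def density_preserving_folds(X, n_folds=2, bandwidth=1.0):
    """Assign each sample in X to one of n_folds subsets whose
    density profiles roughly match that of the full dataset."""
    kde = KernelDensity(bandwidth=bandwidth).fit(X)
    log_density = kde.score_samples(X)          # per-sample log-density
    order = np.argsort(log_density)             # sparse -> dense regions
    folds = np.empty(len(X), dtype=int)
    folds[order] = np.arange(len(X)) % n_folds  # round-robin over density ranks
    return folds

# Usage: a single density-preserving split stands in for repeated random CV splits.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2))
folds = density_preserving_folds(X, n_folds=2)
train, test = X[folds == 0], X[folds == 1]
```

Because each subset covers low- and high-density regions alike, a single split yields a low-variance error estimate, which is the property the paper exploits to avoid repeating the estimation procedure.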


