Journal: Computers & Operations Research

Optimization-based feature selection with adaptive instance sampling


Abstract

Preprocessing the data to filter out redundant and irrelevant features is one of the most important steps in the data mining process. Careful feature selection may reduce the computational time of inducing subsequent models and improve the quality of those models. Using fewer features often leads to simpler, easier-to-interpret models, and selecting important features can yield useful insights into the application. The feature selection problem is inherently a combinatorial optimization problem. This paper builds on a metaheuristic called the nested partitions method, which has been shown to be particularly effective for the feature selection problem. Specifically, we focus on the scalability of the method and show that its performance is vastly improved by incorporating random sampling of instances. Furthermore, we develop an adaptive variant of the algorithm that dynamically determines the required sample rate. The adaptive algorithm is shown to perform very well when applied to a set of standard test problems.
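
As a rough illustration of the instance-sampling idea mentioned in the abstract, the sketch below scores a candidate feature subset on a random sample of instances and doubles the sample rate until repeated scores agree. It is not the paper's nested partitions method or its adaptive rule: the leave-one-out 1-nearest-neighbour score, the doubling schedule, and all names and parameters (`subset_score`, `sampled_score`, `adaptive_score`, `start_rate`, `tol`, `repeats`) are assumptions made for this example.

```python
import random
import statistics


def subset_score(X, y, features, rows):
    """Leave-one-out 1-nearest-neighbour accuracy on the given rows,
    using only the chosen feature columns (an assumed quality measure)."""
    correct = 0
    for i in rows:
        best_dist, best_label = float("inf"), None
        for j in rows:
            if i == j:
                continue
            d = sum((X[i][f] - X[j][f]) ** 2 for f in features)
            if d < best_dist:
                best_dist, best_label = d, y[j]
        correct += best_label == y[i]
    return correct / len(rows)


def sampled_score(X, y, features, rate, rng, repeats=5):
    """Mean and spread of the score over several random instance samples."""
    n = len(X)
    k = min(n, max(2, int(rate * n)))
    scores = [
        subset_score(X, y, features, rng.sample(range(n), k))
        for _ in range(repeats)
    ]
    return statistics.mean(scores), statistics.pstdev(scores)


def adaptive_score(X, y, features, rng, start_rate=0.05, tol=0.02):
    """Double the sample rate until repeated scores agree within tol
    (hypothetical stopping rule), or until all instances are used."""
    rate = start_rate
    while True:
        mean, spread = sampled_score(X, y, features, rate, rng)
        if spread <= tol or rate >= 1.0:
            return mean, rate
        rate = min(1.0, rate * 2)
```

Calling `adaptive_score(X, y, [0], random.Random(0))` returns the estimated score for feature subset `[0]` together with the sample rate at which the estimate stabilised; a search procedure could use that score to compare candidate subsets without scanning every instance each time.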
