International Conference on Artificial Neural Networks

Unsupervised Feature Selection via Local Total-Order Preservation



Abstract

Without class labels, unsupervised feature selection methods choose a subset of features that faithfully maintains the intrinsic structure of the original data. Conventional methods assume that the exact values of the pairwise sample distances used in structure regularization are reliable. However, this assumption imposes strict restrictions on feature selection and causes more features to be kept for data representation. Motivated by this, we propose Unsupervised Feature Selection via Local Total-order Preservation (UFSLTP). In particular, we characterize a local structure by a novel total-order relation, which relies only on comparisons of pairwise sample distances. To obtain a desirable feature subset, we map the total-order relation into a probability space and preserve the relation by minimizing the difference between the probability distributions computed before and after feature selection. Owing to the inherent nature of machine learning and the total-order relation, fewer features are needed to represent the data without adversely affecting performance. Moreover, we propose two efficient methods, Adaptive Neighbors Selection (ANS) and Uniform Neighbors Serialization (UNS), to reduce computational complexity and improve performance. Experiments on benchmark datasets demonstrate that the proposed method significantly outperforms state-of-the-art methods. Compared with competing methods on clustering performance, it achieves average improvements of 31.01% in NMI and 14.44% in Silhouette Coefficient.
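The abstract's core idea, comparing pairwise distances rather than trusting their exact values, and preserving that ordering through probability distributions, can be illustrated with a minimal sketch. The paper does not specify its exact formulation, so the sketch below makes two labeled assumptions: each sample's distances to its nearest neighbors are mapped to a probability distribution via a softmax over negative distances (so relative order dominates), and the preservation objective is the KL divergence between the distributions computed on all features and on a candidate subset. All function names are hypothetical.

```python
import numpy as np

def neighbor_order_probs(X, k=5):
    """For each sample, turn distances to its k nearest neighbors into a
    probability distribution (softmax over negative distances), so the
    relative ordering of neighbors, not exact distance values, dominates.
    Assumed formulation, not the paper's exact one."""
    d = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
    np.fill_diagonal(d, np.inf)                 # exclude self-distance
    idx = np.argsort(d, axis=1)[:, :k]          # indices of k nearest neighbors
    nd = np.take_along_axis(d, idx, axis=1)     # their distances
    e = np.exp(-nd)
    return idx, e / e.sum(axis=1, keepdims=True)

def order_preservation_loss(X, feats, k=5):
    """KL divergence between neighbor-order distributions computed before
    and after restricting the data to a candidate feature subset `feats`.
    A subset that keeps the local distance ordering yields a small loss."""
    idx, p = neighbor_order_probs(X, k)
    # distances over the candidate subset, evaluated at the same neighbors
    d_sub = np.linalg.norm(X[:, None, feats] - X[None, :, feats], axis=-1)
    nd = np.take_along_axis(d_sub, idx, axis=1)
    e = np.exp(-nd)
    q = e / e.sum(axis=1, keepdims=True)
    return np.sum(p * np.log(p / q)) / X.shape[0]
```

Under this sketch, selecting the full feature set reproduces the original distributions exactly (zero loss), and a greedy or relaxed search over subsets would minimize the loss; the paper's ANS and UNS components would additionally restrict which neighbors are compared to cut the quadratic cost.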
