Soft Subspace Clustering Algorithm for Imbalanced Data

程铃钫; 杨天鹏; 陈黎飞

Abstract

K-means-type soft subspace clustering algorithms cannot cluster imbalanced data effectively because of the uniform effect. To address this problem, a new partition-based algorithm is proposed for soft subspace clustering of imbalanced data. First, a bi-weighting method is proposed in which each attribute is assigned a feature weight and each cluster is assigned a cluster weight reflecting its importance for clustering. Second, a new distance measure for mixed-type data is proposed to balance the differences between attributes of different types and between categorical attributes with different numbers of categories. Third, an objective function based on the bi-weighting method is defined for subspace clustering of imbalanced data, and the expressions for optimizing both the cluster weights and the feature weights are derived. A series of experiments on real-world data sets shows that the bi-weighting method learns more accurate soft subspaces for the clusters hidden in imbalanced data. Compared with existing K-means-type soft subspace clustering algorithms, the proposed algorithm achieves higher clustering accuracy on imbalanced data, with an improvement of nearly 50% on the bioinformatics data used in the experiments.
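
The abstract does not give the distance measure or the objective function, so the following is only a minimal sketch of the general idea of bi-weighting on mixed-type data, not the paper's method. All names (`attribute_distance`, `weighted_distance`, `assign`) are hypothetical. The sketch assumes that feature weights are kept per cluster, that numeric attributes are pre-scaled to [0, 1], that categorical attributes use simple 0/1 mismatch, and that the cluster weight simply multiplies the distance.

```python
def attribute_distance(x_j, c_j, categorical):
    """Per-attribute dissimilarity: 0/1 mismatch for categorical values,
    squared difference for numeric values (assumed pre-scaled to [0, 1])."""
    if categorical:
        return 0.0 if x_j == c_j else 1.0
    return (x_j - c_j) ** 2


def weighted_distance(x, center, feature_weights, categorical_mask):
    """Feature-weighted distance between a point and one cluster center."""
    return sum(
        w * attribute_distance(x_j, c_j, cat)
        for x_j, c_j, w, cat in zip(x, center, feature_weights, categorical_mask)
    )


def assign(x, centers, feature_weights, cluster_weights, categorical_mask):
    """Return the index of the cluster with the smallest bi-weighted score.

    Assumption: the cluster weight multiplies the distance. The paper's
    actual objective function is not stated in the abstract.
    """
    scores = [
        cluster_weights[k]
        * weighted_distance(x, centers[k], feature_weights[k], categorical_mask)
        for k in range(len(centers))
    ]
    return min(range(len(scores)), key=scores.__getitem__)


# Toy data: one numeric attribute and one categorical attribute.
centers = [(0.2, "a"), (0.8, "b")]
feature_weights = [(0.5, 0.5), (0.9, 0.1)]  # one weight vector per cluster
categorical_mask = (False, True)
point = (0.5, "b")

# Equal cluster weights versus a much smaller weight on cluster 0.
unweighted = assign(point, centers, feature_weights, (1.0, 1.0), categorical_mask)
reweighted = assign(point, centers, feature_weights, (0.1, 1.0), categorical_mask)
```

In this toy setting the point is assigned to cluster 1 under equal cluster weights and to cluster 0 once cluster 0 receives a much smaller weight, which illustrates how a cluster-level weight can shift assignments independently of the feature weights.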

Bibliographic record

Source
《计算机应用》 (Journal of Computer Applications) | 2017, No. 10 | pp. 2952-2957 | 6 pages
Authors
程铃钫; 杨天鹏; 陈黎飞
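Author affiliations

Jinshan College, Fujian Agriculture and Forestry University, Fuzhou 350002;

School of Mathematics and Computer Science, Fujian Normal University, Fuzhou 350117;

School of Mathematics and Computer Science, Fujian Normal University, Fuzhou 350117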

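Original format: PDF
Language of text: Chinese
Chinese Library Classification: TP274.2
Keywords
soft subspace clustering; imbalanced data; feature weight; cluster weight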
