An Effective Clustering Algorithm for Large-Scale Transaction Databases

Chen Ning; Chen An; et al.

Abstract

This paper studies the clustering of large-scale transaction databases and proposes a two-step clustering algorithm, CATD. Clustering transactions can reveal potentially useful patterns that help improve product profit. The algorithm first divides the database into partitions and, within each partition, applies a hierarchical clustering algorithm to group the transactions into subclusters; the number of subclusters is controlled by an inter-cluster distance parameter. It then performs global clustering over all subclusters, using a k-medoids algorithm to obtain a set of k global clusters and to identify noise. Because it relies on partitioning and on a support vector representation of the clusters, the algorithm scans the original database only once and the clustering itself is carried out in main memory, so it can handle large databases.
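
The two steps described in the abstract can be sketched roughly as follows. This is only an illustrative sketch and not the authors' implementation: the abstract does not state the distance measure, the exact form of the support vector, or the noise criterion, so the Jaccard distance, the per-item support fractions, the weighted Jaccard distance between summaries, the `min_size` noise rule, the first-k medoid seeding, and all function and parameter names below are assumptions made for the example.

```python
from collections import Counter
from itertools import combinations


def jaccard_distance(a, b):
    """Jaccard distance between two item sets (assumed metric, not from the abstract)."""
    union = a | b
    return 1.0 - len(a & b) / len(union) if union else 0.0


def local_single_link(partition, max_dist):
    """Merge the two closest clusters (single link) until they are farther apart than max_dist."""
    clusters = [[t] for t in partition]
    while len(clusters) > 1:
        best = None
        for i, j in combinations(range(len(clusters)), 2):
            d = min(jaccard_distance(a, b) for a in clusters[i] for b in clusters[j])
            if best is None or d < best[0]:
                best = (d, i, j)
        d, i, j = best
        if d > max_dist:
            break
        clusters[i].extend(clusters.pop(j))  # j > i, so index i stays valid
    return clusters


def support_vector(cluster):
    """Summarise a subcluster as (size, {item: fraction of its transactions containing the item})."""
    counts = Counter(item for t in cluster for item in t)
    return len(cluster), {item: c / len(cluster) for item, c in counts.items()}


def summary_distance(u, v):
    """Weighted Jaccard distance between two support vectors (assumed metric)."""
    items = set(u) | set(v)
    shared = sum(min(u.get(i, 0.0), v.get(i, 0.0)) for i in items)
    total = sum(max(u.get(i, 0.0), v.get(i, 0.0)) for i in items)
    return 1.0 - shared / total if total else 0.0


def k_medoids(vectors, k, max_iter=20):
    """Plain alternating k-medoids; the first k vectors seed the medoids (a simplification)."""
    medoids = list(range(k))
    labels = []
    for _ in range(max_iter):
        labels = [min(range(k), key=lambda c: summary_distance(v, vectors[medoids[c]]))
                  for v in vectors]
        updated = []
        for c in range(k):
            members = [i for i, lab in enumerate(labels) if lab == c]
            if not members:  # keep the old medoid if its cluster emptied
                updated.append(medoids[c])
                continue
            updated.append(min(members, key=lambda m: sum(
                summary_distance(vectors[m], vectors[o]) for o in members)))
        if updated == medoids:
            break
        medoids = updated
    return labels, medoids


def catd_sketch(transactions, partition_size, max_dist, k, min_size=2):
    """Two-step clustering: local single link per partition, then k-medoids over the summaries."""
    summaries = []
    for start in range(0, len(transactions), partition_size):  # each transaction is read once
        partition = transactions[start:start + partition_size]
        summaries.extend(support_vector(sub) for sub in local_single_link(partition, max_dist))
    noise = [vec for size, vec in summaries if size < min_size]  # assumed noise rule
    kept = [vec for size, vec in summaries if size >= min_size]
    labels, medoids = k_medoids(kept, k)  # requires k <= len(kept)
    return kept, labels, medoids, noise


transactions = [
    {"milk", "bread"}, {"milk", "bread", "butter"}, {"beer", "chips"}, {"beer", "chips", "salsa"},
    {"milk", "butter", "bread"}, {"milk", "bread"}, {"beer", "salsa", "chips"}, {"beer", "chips"},
    {"caviar"},
]
kept, labels, medoids, noise = catd_sketch(transactions, partition_size=4, max_dist=0.5, k=2)
print(labels, noise)
```

Only the subcluster summaries, not the raw transactions, reach the global step, which is what lets the second phase run in memory after a single pass over the data. In this sketch `max_dist` plays the role of the inter-cluster distance parameter that controls how many subclusters each partition yields.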

Bibliographic record

Source
Journal of Software (《软件学报》) | 2001, No. 4 | pp. 475-484 | 10 pages
Authors
Chen Ning; Chen An; et al.
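Author affiliations

Academy of Mathematics and Systems Science, Chinese Academy of Sciences;

School of Management, Beijing University of Aeronautics and Astronautics;

Academy of Mathematics and Systems Science, Chinese Academy of Sciences;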

Economics and Mathematics Institute;

The Chinese Academy of Sciences;

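Original format: PDF
Full-text language: Chinese
Chinese Library Classification: Information processing (information handling)
Keywords
data mining; cluster analysis; hierarchical clustering; single-link distance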

