A parallel k-means clustering algorithm is proposed to meet the need for efficient processing of large data sets; its clustering time decreases approximately linearly as the number of host nodes increases. To better balance the detection rate against the false positive rate, a re-positioning algorithm based on the minimum squared error is also presented. Compared with the algorithm of Li Na et al., the re-positioning algorithm improves the detection rate by 5% and reduces the false positive rate by 1.1%. Experimental results show that the proposed algorithm not only improves clustering efficiency but also detects both known and unknown attacks more effectively.
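The abstract does not say how the work is divided among host nodes. As a rough illustration only, the sketch below uses a common data-parallel formulation of k-means, which is an assumption and not necessarily the authors' scheme: each node assigns the points in its own shard to the nearest centroid and returns per-cluster sums and counts, and a coordinator merges those statistics to update the centroids. The names `nearest`, `partial_stats`, and `parallel_kmeans` are hypothetical, and threads stand in for host nodes.

```python
from concurrent.futures import ThreadPoolExecutor


def nearest(point, centroids):
    """Index of the centroid with the smallest squared Euclidean distance."""
    return min(
        range(len(centroids)),
        key=lambda j: sum((p - c) ** 2 for p, c in zip(point, centroids[j])),
    )


def partial_stats(shard, centroids):
    """Work done by one (hypothetical) host node: per-cluster sums and counts."""
    dim = len(centroids[0])
    sums = [[0.0] * dim for _ in centroids]
    counts = [0] * len(centroids)
    for point in shard:
        j = nearest(point, centroids)
        counts[j] += 1
        for d in range(dim):
            sums[j][d] += point[d]
    return sums, counts


def parallel_kmeans(shards, centroids, iterations=10):
    """Coordinator: merge per-shard statistics, then update the centroids."""
    centroids = [list(c) for c in centroids]
    # Assumption: one worker thread per shard stands in for one host node.
    with ThreadPoolExecutor(max_workers=len(shards)) as pool:
        for _ in range(iterations):
            # All shard results are collected before any centroid is changed.
            results = list(pool.map(lambda s: partial_stats(s, centroids), shards))
            for j in range(len(centroids)):
                total = sum(counts[j] for _, counts in results)
                if total == 0:
                    continue  # an empty cluster keeps its previous centroid
                centroids[j] = [
                    sum(sums[j][d] for sums, _ in results) / total
                    for d in range(len(centroids[j]))
                ]
    return centroids
```

Because each shard is processed independently within an iteration, the assignment step scales with the number of nodes, while the merge step only handles one small set of sums and counts per node. Python threads do not give a real speedup for this CPU-bound loop; the sketch shows the structure of the computation, not the reported performance.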