MapReduce-based k-prototypes clustering method for big data

Abstract

Big data clustering is a challenging task that arises in many application domains. Traditional clustering methods cannot handle large-scale data. Furthermore, big data often contain mixed types, with both numerical and categorical attributes. We therefore propose in this paper a parallelization of the k-prototypes clustering method (MR-KP) using the MapReduce model to handle large-scale mixed data. Experimental results show that MR-KP scales well with increasing data set sizes and achieves close to linear speedup while maintaining clustering accuracy.
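The abstract does not give implementation details, so the following is only a minimal single-process sketch of how one k-prototypes iteration could be split into map and reduce steps. It assumes the standard k-prototypes distance (squared Euclidean distance on numerical attributes plus a weight gamma times the number of categorical mismatches) and the usual prototype update (mean for numerical attributes, mode for categorical ones). The names `GAMMA`, `map_phase`, `reduce_phase`, and `kprototypes_iteration` are hypothetical, and this is not the authors' MR-KP code.

```python
from collections import Counter, defaultdict

GAMMA = 1.0  # assumed weight of the categorical mismatch term


def distance(point, proto, gamma=GAMMA):
    """Mixed distance: squared Euclidean on numeric part plus gamma * mismatches."""
    (num, cat), (pnum, pcat) = point, proto
    numeric = sum((a - b) ** 2 for a, b in zip(num, pnum))
    mismatches = sum(a != b for a, b in zip(cat, pcat))
    return numeric + gamma * mismatches


def map_phase(split, prototypes):
    """Emit (index of nearest prototype, point) for each point in one data split."""
    for point in split:
        nearest = min(range(len(prototypes)),
                      key=lambda j: distance(point, prototypes[j]))
        yield nearest, point


def reduce_phase(points):
    """Build a new prototype: per-attribute mean (numeric) and mode (categorical)."""
    nums = [p[0] for p in points]
    cats = [p[1] for p in points]
    mean = tuple(sum(col) / len(col) for col in zip(*nums))
    mode = tuple(Counter(col).most_common(1)[0][0] for col in zip(*cats))
    return mean, mode


def kprototypes_iteration(splits, prototypes):
    """Run one assignment/update round over all splits."""
    groups = defaultdict(list)
    for split in splits:  # in a real cluster, each split would go to its own mapper
        for key, point in map_phase(split, prototypes):
            groups[key].append(point)  # stands in for the shuffle step
    # An empty cluster keeps its previous prototype.
    return [reduce_phase(groups[j]) if groups[j] else prototypes[j]
            for j in range(len(prototypes))]
```

Each point here is a pair `(numeric_tuple, categorical_tuple)`. Calling `kprototypes_iteration` repeatedly until the prototypes stop changing would give the full clustering loop. In an actual MapReduce job the prototypes would have to be distributed to every mapper before each round.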

Bibliographic details

Source
IEEE International Conference on Data Science and Advanced Analytics, 2015, pp. 1-7 (7 pages)
Authors
Ben Haj Kacem Mohamed Aymen; Ben Ncir Chiheb-Eddine; Essoussi Nadia
Original format
PDF
Keywords
Big data; Clustering algorithms; Clustering methods; Computational modeling; Data models; Numerical models; Prototypes; K-prototypes method; MapReduce model; Mixed data

Similar literature

1. One-pass MapReduce-based clustering method for mixed large scale data [J]. Ben HajKacem Mohamed Aymen, Ben Ncir Chiheb-Eddine, Essoussi Nadia. Journal of Intelligent Information Systems, 2019, No. 3.
2. Hengam: A MapReduce-Based Distributed Data Warehouse for Big Data [J]. Mohammadhossein Barkhordari, Mahdi Niamanesh. International Journal of Artificial Life Research, 2018, No. 1.
3. MapReduce-based K-Prototypes Clustering Method for Big Data [C]. Mohamed Aymen Ben HajKacem, Chiheb-Eddine Ben Ncir, Nadia Essoussi. IEEE International Conference on Data Science and Advanced Analytics, 2015.
4. Performance evaluation of big data placement structures in MapReduce-based data warehouse systems [D]. Hasan, Mohammad Rakibul. 2016.
5. Review of methods for handling confounding by cluster and informative cluster size in clustered data [O]. Shaun Seaman, Menelaos Pavlou, Andrew Copas.
6. K-prototypes Algorithm for Clustering Schools Based on The Student Admission Data in IPB University [O]. Sri Sulastri, Lismayani Usman, Utami Dyah Syafitri. 2021.
