Smart Cities Symposium

Scalable parallel SVM on cloud clusters for large datasets classification


Abstract

This paper proposes a new parallel support vector machine (PSVM) that is efficient in terms of time complexity. The support vector machine is one of the popular classifiers for data analysis and pattern classification. However, an SVM requires large memory (on the order of 100 GB or more) to process big data (i.e., on the order of 1 TB or more). This paper proposes executing SVMs in parallel on several clusters to analyze and classify big data. In this approach, the data are divided into n equal partitions. Each partition is used by an individual cluster to train an SVM. The outcomes of the SVMs executed on the several clusters are then combined by another SVM, referred to as the final SVM. The inputs to this final SVM are the support vectors (SVs) of the SVMs that were trained on the different clusters, while the desired output for each SV is its corresponding class output. We evaluated the proposed method on high performance computing (HPC) clusters and Amazon cloud clusters (ACC) using different benchmark datasets. Experimental results show that, compared to the existing stand-alone SVM, the proposed method is efficient in terms of training time while keeping the error rate and memory requirement minimal.
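The abstract describes a partition-and-merge scheme: split the data into n equal parts, train one SVM per part on its own cluster, then train a final SVM on the union of the resulting support vectors. The sketch below illustrates that flow on a single machine with scikit-learn; the function name, parameters, and the local loop standing in for the per-cluster training are assumptions for illustration, not the authors' implementation.

import numpy as np
from sklearn.svm import SVC

def parallel_svm_sketch(X, y, n_partitions=4, **svm_params):
    """Train one SVM per data partition, then a final SVM on the
    collected support vectors and their original labels.

    Hypothetical helper mimicking the multi-cluster scheme locally."""
    # Split the data into n roughly equal, randomly assigned partitions.
    partitions = np.array_split(np.random.permutation(len(X)), n_partitions)

    sv_list, sv_labels = [], []
    for part in partitions:
        # In the paper, each of these fits would run on a separate cluster.
        clf = SVC(**svm_params).fit(X[part], y[part])
        # Keep only this partition's support vectors and their labels.
        sv_list.append(clf.support_vectors_)
        sv_labels.append(y[part][clf.support_])

    # The final SVM is trained on the union of all partitions' support vectors.
    final_clf = SVC(**svm_params).fit(np.vstack(sv_list),
                                      np.concatenate(sv_labels))
    return final_clf

For example, parallel_svm_sketch(X_train, y_train, n_partitions=8, kernel='rbf', C=1.0) would mimic training across eight clusters, with only the support vectors passed on to the final merge stage.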
