On the Organization of Cluster Voting with Massive Distributed Streams


Abstract

Data processing is one of the central challenges of Big Data. In this paper we investigate optimal processing algorithms for massive data streams and propose a new one, the multi-buffer based majority algorithm. The algorithm maintains O(n) time complexity and selects prevalent elements with frequencies as low as 1%. Our experiments indicate that the multi-buffer based majority algorithm improves both accuracy and efficiency. We also use the multi-buffer based algorithm to process data streams on a single system and on a distributed system; these experiments indicate that it can achieve better performance on a distributed system. Finally, we explain the experimental results and identify several major factors that influence result accuracy: stream size, the range of elements in the stream, the frequency of the predominant elements, and our buffer sets.
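The abstract does not describe the internals of the multi-buffer based majority algorithm, so it cannot be reproduced here. As a point of reference only, the following is a minimal sketch of the classical Misra-Gries frequent-elements summary, a well-known generalization of the majority vote that also runs in O(n) time. It is a swapped-in technique, not the authors' method. The function names `misra_gries` and `frequent_elements` are hypothetical, and the second exact-count pass assumes the data can be re-read, which a true one-pass stream would not allow.

```python
import math
from collections import Counter
from typing import Dict, Hashable, Iterable, List


def misra_gries(stream: Iterable[Hashable], k: int) -> Dict[Hashable, int]:
    """Return candidates for elements occurring more than n/k times.

    At most k - 1 counters are kept, so memory is O(k). Every decrement is
    paid for by an earlier increment, so total work is O(n).
    """
    if k < 2:
        raise ValueError("k must be at least 2")
    counters: Dict[Hashable, int] = {}
    for item in stream:
        if item in counters:
            counters[item] += 1
        elif len(counters) < k - 1:
            counters[item] = 1
        else:
            # No free counter: decrement all and drop those that reach zero.
            for key in list(counters):
                counters[key] -= 1
                if counters[key] == 0:
                    del counters[key]
    return counters


def frequent_elements(data: List[Hashable], threshold: float) -> Dict[Hashable, int]:
    """Exact counts of elements whose relative frequency exceeds threshold.

    Assumption: data is a list that can be scanned twice. The first pass
    collects candidates; the second removes false positives.
    """
    if not 0 < threshold < 1:
        raise ValueError("threshold must be in (0, 1)")
    k = math.ceil(1 / threshold)  # a 1% threshold gives k = 100
    candidates = misra_gries(data, k)
    exact = Counter(x for x in data if x in candidates)
    cutoff = threshold * len(data)
    return {x: c for x, c in exact.items() if c > cutoff}


if __name__ == "__main__":
    # One element at 1.5% frequency among otherwise distinct values.
    sample = ["hot"] * 150 + list(range(9850))
    print(frequent_elements(sample, 0.01))
```

With a 1% threshold the summary holds at most 99 counters regardless of stream length, which is the regime the abstract refers to. The candidate set can contain false positives, hence the verification pass.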


Bibliographic details

Source: International Conference on Computing for Geospatial Research and Application, 2014, pp. 55-62 (8 pages)
Authors: Alhudhaif Adi; Yan Tong; Berkovich Simon
Original format: PDF
Keywords: big data clusterization; cloud computing; majority algorithm; stream processing


