IEEE International Conference on Parallel and Distributed Systems

Parallelizing Machine Learning Optimization Algorithms on Distributed Data-Parallel Platforms with Parameter Server


Abstract

In the big data era, machine learning optimization algorithms usually need to be designed and implemented on widely-used distributed computing platforms such as Apache Hadoop, Spark, and Flink. However, these general-purpose distributed computing platforms do not themselves focus on parallelizing machine learning optimization algorithms. In this paper, we present a parallel optimization algorithm framework for scalable machine learning and empirically evaluate synchronous Elastic Averaging SGD (EASGD) and other distributed SGD-based optimization algorithms. First, we design a distributed machine learning optimization algorithm framework on top of Apache Spark by adopting the parameter server architecture. Then, we design and implement the widely-used distributed synchronous EASGD and several other popular SGD-based optimization algorithms, such as Adadelta and Adam, on top of the framework. In addition, we evaluate the performance of synchronous distributed EASGD against the other distributed optimization algorithms implemented on the same framework. Finally, to explore the optimal setting of the mini-batch size in large-scale distributed optimization, we further analyze the empirical linear scaling rule originally proposed for the single-node environment. Experimental results show that our parallel optimization algorithm framework achieves good flexibility and scalability. Moreover, the distributed synchronous EASGD running on the proposed framework attains competitive convergence performance and is about 5.7% faster than the other distributed SGD-based optimization algorithms. Experimental results also verify that the empirical linear scaling rule holds only until the mini-batch size exceeds a certain threshold on large-scale benchmarks in the distributed environment.
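To make the evaluated algorithm concrete, below is a minimal sketch of one synchronous EASGD round with a logical parameter server, following the standard EASGD update rule: each worker takes a local SGD step that is elastically pulled toward a center variable held on the server, and the server then moves the center toward the average of the workers. This is an illustrative NumPy simulation under stated assumptions, not the paper's Spark implementation; the toy least-squares objective, the grad() helper, and the hyperparameters eta and rho are assumptions for illustration only.

```python
import numpy as np

def grad(x, batch):
    """Stochastic gradient of a toy least-squares objective on one mini-batch.
    (Hypothetical stand-in for the model's real gradient.)"""
    A, b = batch
    return A.T @ (A @ x - b) / len(b)

def easgd_round(workers, x_tilde, batches, eta=0.05, rho=1.0):
    """One synchronous EASGD round.

    Each worker i does:  x_i <- x_i - eta * g_i(x_i) - alpha * (x_i - x_tilde)
    The server does:     x_tilde <- x_tilde + alpha * sum_i (x_i - x_tilde)
    with alpha = eta * rho (the elastic coefficient).
    """
    alpha = eta * rho
    new_workers = [
        x_i - eta * grad(x_i, batch) - alpha * (x_i - x_tilde)
        for x_i, batch in zip(workers, batches)
    ]
    # The server update uses the workers' values from the start of the round.
    x_tilde = x_tilde + alpha * sum(x_i - x_tilde for x_i in workers)
    return new_workers, x_tilde

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    d, p, n = 10, 4, 32                      # dimension, workers, mini-batch size
    x_true = rng.normal(size=d)
    # Each worker holds its own data partition, as on a data-parallel platform.
    batches = []
    for _ in range(p):
        A = rng.normal(size=(n, d))
        batches.append((A, A @ x_true + 0.01 * rng.normal(size=n)))
    workers = [np.zeros(d) for _ in range(p)]
    x_tilde = np.zeros(d)
    for _ in range(200):
        workers, x_tilde = easgd_round(workers, x_tilde, batches)
    print("distance of center to x_true:", np.linalg.norm(x_tilde - x_true))
```

For context, the linear scaling rule analyzed in the abstract is the common heuristic of multiplying the learning rate by k whenever the mini-batch size is multiplied by k; the reported finding is that, in the distributed setting, this heuristic holds only up to a certain mini-batch-size threshold.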
