IEEE International Conference on Acoustics, Speech and Signal Processing

Adaptive Distributed Stochastic Gradient Descent for Minimizing Delay in the Presence of Stragglers

Abstract

We consider the setting where a master wants to run a distributed stochastic gradient descent (SGD) algorithm on n workers, each holding a subset of the data. Distributed SGD may suffer from the effect of stragglers, i.e., slow or unresponsive workers that cause delays. One solution studied in the literature is to wait at each iteration for the responses of the fastest k < n workers before updating the model, where k is a fixed parameter. The choice of k presents a trade-off between the runtime (i.e., convergence rate) of SGD and the error of the model. Towards optimizing this error-runtime trade-off, we investigate distributed SGD with adaptive k. We first design an adaptive policy for varying k that optimizes the trade-off based on an upper bound, which we derive, on the error as a function of the wall-clock time. Then, we propose an algorithm for adaptive distributed SGD that is based on a statistical heuristic. We implement our algorithm and provide numerical simulations which confirm our intuition and theoretical analysis.
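The abstract only outlines the fastest-k scheme, so below is a minimal Python simulation sketch of one plausible reading of it. The toy least-squares objective, the exponential delay model for stragglers, and the linear schedule in adaptive_k that grows k with wall-clock time are all illustrative assumptions; the paper's actual adaptive policy is derived from an error upper bound and a statistical heuristic not reproduced here.

```python
# Minimal sketch of fastest-k distributed SGD with an adaptive k,
# written from the abstract alone. All names and the schedule in
# adaptive_k are illustrative assumptions, not the authors' policy.
import numpy as np

rng = np.random.default_rng(0)

n_workers = 10  # n: total workers, each holding a data shard
dim = 5         # model dimension for a toy least-squares problem

# Toy data split across workers: worker i holds (A[i], b[i]).
A = [rng.normal(size=(20, dim)) for _ in range(n_workers)]
b = [a @ np.ones(dim) + 0.1 * rng.normal(size=20) for a in A]

def worker_gradient(i, x):
    """Gradient of the local least-squares loss at worker i."""
    return A[i].T @ (A[i] @ x - b[i]) / len(b[i])

def adaptive_k(t, t_max, k_min=2, k_max=n_workers):
    """Illustrative schedule: wait for few workers early (fast, noisy
    steps) and for more workers later (slow, accurate steps)."""
    frac = min(t / t_max, 1.0)
    return int(round(k_min + frac * (k_max - k_min)))

x = np.zeros(dim)
wallclock, t_max, lr = 0.0, 50.0, 0.1
while wallclock < t_max:
    k = adaptive_k(wallclock, t_max)
    # Each worker's response time; heavy-tailed delays model stragglers.
    delays = rng.exponential(1.0, size=n_workers)
    fastest = np.argsort(delays)[:k]
    # The master waits only until the k-th fastest worker responds.
    wallclock += np.sort(delays)[k - 1]
    grad = np.mean([worker_gradient(i, x) for i in fastest], axis=0)
    x -= lr * grad
```

With a fixed small k each step is cheap but the averaged gradient is noisy; with a fixed large k each step is accurate but gated on stragglers. Growing k over time is one simple way to trade fast early progress for low final error, which is the trade-off the paper's adaptive policy optimizes.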

