Published in: International Conference on Machine Learning

The Non-IID Data Quagmire of Decentralized Machine Learning

Abstract

Many large-scale machine learning (ML) applications need to perform decentralized learning over datasets generated at different devices and locations. Such datasets pose a significant challenge to decentralized learning because their different contexts result in significant data distribution skew across devices/locations. In this paper, we take a step toward better understanding this challenge by presenting a detailed experimental study of decentralized DNN training on a common type of data skew: skewed distribution of data labels across devices/locations. Our study shows that: (i) skewed data labels are a fundamental and pervasive problem for decentralized learning, causing significant accuracy loss across many ML applications, DNN models, training datasets, and decentralized learning algorithms; (ii) the problem is particularly challenging for DNN models with batch normalization; and (iii) the degree of data skew is a key determinant of the difficulty of the problem. Based on these findings, we present SkewScout, a system-level approach that adapts the communication frequency of decentralized learning algorithms to the (skew-induced) accuracy loss between data partitions. We also show that group normalization can recover much of the accuracy loss of batch normalization.
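To make the notion of a "degree of data skew" concrete, the sketch below splits a labeled dataset across partitions with a tunable fraction of label-sorted data. It is an illustration only: the abstract does not specify how the study partitions data, so the function names (`partition_by_label_skew`, `label_histogram`), the `skew` parameter, and the block-assignment scheme are all assumptions made for this example.

```python
import random
from collections import Counter


def partition_by_label_skew(labels, num_partitions, skew, seed=0):
    """Split sample indices into partitions with a tunable label skew.

    Hypothetical scheme (not taken from the paper):
      skew = 0.0 -> all indices are spread uniformly at random.
      skew = 1.0 -> indices are sorted by label and dealt out in
                    contiguous blocks, so each partition sees few labels.
    """
    if not 0.0 <= skew <= 1.0:
        raise ValueError("skew must be in [0, 1]")

    rng = random.Random(seed)
    indices = list(range(len(labels)))
    rng.shuffle(indices)

    # The first `cut` shuffled indices form the label-sorted (skewed) share.
    cut = int(round(skew * len(indices)))
    skewed_part = sorted(indices[:cut], key=lambda i: labels[i])
    uniform_part = indices[cut:]

    partitions = [[] for _ in range(num_partitions)]

    # Skewed share: one contiguous label-sorted block per partition.
    for pos, idx in enumerate(skewed_part):
        partitions[pos * num_partitions // len(skewed_part)].append(idx)

    # Uniform share: round-robin over the already shuffled remainder.
    for pos, idx in enumerate(uniform_part):
        partitions[pos % num_partitions].append(idx)

    return partitions


def label_histogram(labels, partition):
    """Count how often each label occurs inside one partition."""
    return Counter(labels[i] for i in partition)


if __name__ == "__main__":
    # Toy dataset: 400 samples, 4 balanced classes.
    toy_labels = [i % 4 for i in range(400)]
    for skew in (0.0, 0.5, 1.0):
        parts = partition_by_label_skew(toy_labels, num_partitions=4, skew=skew)
        print(f"skew={skew}")
        for p, part in enumerate(parts):
            print(f"  partition {p}: {dict(label_histogram(toy_labels, part))}")
```

Comparing the per-partition histograms at different `skew` values shows how the label distribution seen by each device or location drifts away from the global distribution as the skewed share grows.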