Journal: IEEE Transactions on Information Theory

On the Fundamental Limits of Coded Data Shuffling for Distributed Machine Learning


Abstract

We consider the data shuffling problem in a distributed learning system, in which a master node is connected to a set of worker nodes, via a shared link, in order to communicate a set of files to the worker nodes. The master node has access to a database of files. In every shuffling iteration, each worker node processes a new subset of files, and has excess storage to partially cache the remaining files, assuming the cached files are uncoded. The caches of the worker nodes are updated every iteration, and they should be designed to satisfy any possible unknown permutation of the files in subsequent iterations. For this problem, we characterize the exact load-memory trade-off for worst-case shuffling by deriving the minimum communication load for a given storage capacity per worker node. As a byproduct, the exact load-memory trade-off for any shuffling is characterized when the number of files is equal to the number of worker nodes. We propose a novel deterministic coded shuffling scheme, which improves the state of the art, by exploiting the cache memories to create coded functions that can be decoded by several worker nodes. Then, we prove the optimality of our proposed scheme by deriving a matching lower bound and showing that the placement phase of the proposed coded shuffling scheme is optimal over all shuffles.
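
As a toy illustration of the general idea of a coded transmission that several workers can decode from their caches, the sketch below uses two workers and two files. It is not the scheme proposed in the paper: the file contents, the half-file split, the cache layout, and the helper `xor_bytes` are all assumptions made for illustration. Each worker stores the file it is currently processing plus half of the other file, and the shuffle swaps the two files.

```python
def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Bytewise XOR of two equal-length byte strings (hypothetical helper)."""
    if len(a) != len(b):
        raise ValueError("inputs must have equal length")
    return bytes(x ^ y for x, y in zip(a, b))


# Two illustrative files of equal size, each split into two halves.
file_a = b"AAAAaaaa"
file_b = b"BBBBbbbb"
half = len(file_a) // 2
a1, a2 = file_a[:half], file_a[half:]
b1, b2 = file_b[:half], file_b[half:]

# Assumed storage before the shuffle: each worker holds its current file
# in full and caches one uncoded half of the other file.
worker1 = {"A1": a1, "A2": a2, "B1": b1}  # processing A, caches B1
worker2 = {"B1": b1, "B2": b2, "A2": a2}  # processing B, caches A2

# Shuffle: worker 1 must now process B (missing B2),
# worker 2 must now process A (missing A1).
# The master broadcasts a single coded message over the shared link.
coded = xor_bytes(a1, b2)

# Each worker cancels the part it already stores.
recovered_b2 = xor_bytes(coded, worker1["A1"])
recovered_a1 = xor_bytes(coded, worker2["B2"])

new_file_w1 = worker1["B1"] + recovered_b2  # worker 1 now has B
new_file_w2 = recovered_a1 + worker2["A2"]  # worker 2 now has A

# Communication load in bytes: coded broadcast vs. sending both halves.
coded_load = len(coded)
uncoded_load = len(b2) + len(a1)
print(coded_load, uncoded_load)
```

In this toy setting the single XOR message serves both workers, so the broadcast carries half as many bytes as sending the two missing halves separately. The paper's scheme and its matching lower bound address the general case of many workers, many files, and arbitrary permutations.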
