Journal: IEEE Transactions on Parallel and Distributed Systems

Parallelizing Word2Vec in Shared and Distributed Memory

Abstract

Word2vec is a widely used algorithm for extracting low-dimensional vector representations of words. State-of-the-art algorithms including those by Mikolov et al. [1], [2] have been parallelized for multi-core CPU architectures, but are based on vector-vector operations with "Hogwild" updates that are memory-bandwidth intensive and do not efficiently use computational resources. In this paper, we propose "HogBatch" by improving reuse of various data structures in the algorithm through the use of minibatching and negative sample sharing, hence allowing us to express the problem using matrix multiply operations. We also explore different techniques to distribute word2vec computation across nodes in a computer cluster, and demonstrate good strong scalability up to 32 nodes. The new algorithm is particularly suitable for modern multi-core/many-core architectures, especially Intel's latest Knights Landing processors, and allows us to scale up the computation near linearly across cores and nodes, and process hundreds of millions of words per second, which is the fastest word2vec implementation to the best of our knowledge. We released the source code for reproducible research and general usage.
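
To illustrate why sharing negative samples across a minibatch turns vector-vector work into a matrix multiply, here is a minimal NumPy sketch of one skip-gram negative-sampling update. It is not the authors' released implementation: the function name `shared_negative_batch_step`, the parameter names, the learning rate, and the choice to batch several input words against one target word are all assumptions made for illustration.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def shared_negative_batch_step(W_in, W_out, input_ids, target_id, negative_ids, lr=0.025):
    """One negative-sampling update for a minibatch that shares its negatives.

    Hypothetical helper, for illustration only.
    W_in, W_out  : (V, D) input and output embedding matrices, updated in place
    input_ids    : (B,) indices of the input words in the minibatch
    target_id    : index of the positive (target) word shared by the batch
    negative_ids : (K,) indices of the negative samples shared by the batch
    Returns the negative log-likelihood of the batch before the update.
    """
    out_ids = np.concatenate(([target_id], negative_ids))  # (1 + K,)
    X = W_in[input_ids]   # (B, D) copy of the input vectors
    Y = W_out[out_ids]    # (1 + K, D) copy of target + negative vectors

    labels = np.zeros(len(out_ids))
    labels[0] = 1.0       # only the first output row is the true target

    # All B * (1 + K) dot products come from a single matrix multiply,
    # instead of one vector-vector product per (input, sample) pair.
    scores = sigmoid(X @ Y.T)          # (B, 1 + K)
    err = labels[None, :] - scores     # gradient of the log-likelihood w.r.t. the logits

    grad_X = err @ Y      # (B, D), second matrix multiply
    grad_Y = err.T @ X    # (1 + K, D), third matrix multiply

    # np.add.at accumulates correctly even if an index appears more than once.
    np.add.at(W_in, input_ids, lr * grad_X)
    np.add.at(W_out, out_ids, lr * grad_Y)

    return -(np.log(scores[:, 0]).sum() + np.log(1.0 - scores[:, 1:]).sum())
```

Because every input word in the batch is scored against the same `1 + K` output rows, those rows are loaded once and reused `B` times, which is the kind of data reuse the abstract attributes to minibatching and negative sample sharing.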