Parallelizing Word2Vec in Shared and Distributed Memory

Ji Shihao; Satish Nadathur; Li Sheng; Dubey Pradeep K.

Abstract

Word2vec is a widely used algorithm for extracting low-dimensional vector representations of words. State-of-the-art algorithms including those by Mikolov et al. [1], [2] have been parallelized for multi-core CPU architectures, but are based on vector-vector operations with "Hogwild" updates that are memory-bandwidth intensive and do not efficiently use computational resources. In this paper, we propose "HogBatch" by improving reuse of various data structures in the algorithm through the use of minibatching and negative sample sharing, hence allowing us to express the problem using matrix multiply operations. We also explore different techniques to distribute word2vec computation across nodes in a computer cluster, and demonstrate good strong scalability up to 32 nodes. The new algorithm is particularly suitable for modern multi-core/many-core architectures, especially Intel's latest Knights Landing processors, and allows us to scale up the computation near linearly across cores and nodes, and process hundreds of millions of words per second, which is the fastest word2vec implementation to the best of our knowledge. We released the source code for reproducible research and general usage.
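The minibatching and negative-sample-sharing idea in the abstract can be illustrated with a minimal sketch. It is not the authors' released code. It assumes skip-gram with negative sampling, where the context words of one target word form a minibatch and all of them share the same negative samples, so the scores and both gradients become small matrix products. The names `init_embeddings`, `hogbatch_step`, `w_in`, `w_out`, and the learning rate are hypothetical, and the pure-Python `matmul` stands in for an optimized GEMM call.

```python
import math
import random


def matmul(a, b):
    """Multiply an (n x k) list-of-lists by a (k x m) list-of-lists."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(a):
    return [list(col) for col in zip(*a)]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def init_embeddings(vocab_size, dim, seed=0):
    """Hypothetical helper: small random input vectors, zero output vectors."""
    rng = random.Random(seed)
    w_in = [[(rng.random() - 0.5) / dim for _ in range(dim)] for _ in range(vocab_size)]
    w_out = [[0.0] * dim for _ in range(vocab_size)]
    return w_in, w_out


def hogbatch_step(w_in, w_out, context_ids, target_id, negative_ids, lr=0.025):
    """One minibatched update; every context word shares the same negatives.

    Returns the negative-sampling loss measured before the update.
    """
    out_ids = [target_id] + list(negative_ids)
    x = [w_in[i][:] for i in context_ids]        # B x D batch of input vectors
    y = [w_out[j][:] for j in out_ids]           # (K+1) x D target + negatives
    scores = matmul(x, transpose(y))             # B x (K+1), one matrix product
    labels = [1.0] + [0.0] * len(negative_ids)   # target is positive, rest negative

    loss = 0.0
    err = []
    for row in scores:
        err_row = []
        for s, t in zip(row, labels):
            p = sigmoid(s)
            q = min(max(p, 1e-12), 1.0 - 1e-12)  # clamp only to keep log finite
            loss -= math.log(q if t == 1.0 else 1.0 - q)
            err_row.append(lr * (t - p))
        err.append(err_row)

    dx = matmul(err, y)                          # B x D update for input vectors
    dy = matmul(transpose(err), x)               # (K+1) x D update for output vectors
    for i, g in zip(context_ids, dx):
        w_in[i] = [w + d for w, d in zip(w_in[i], g)]
    for j, g in zip(out_ids, dy):
        w_out[j] = [w + d for w, d in zip(w_out[j], g)]
    return loss
```

Calling `hogbatch_step(w_in, w_out, [0, 1, 2], 3, [4, 5])` updates three context vectors against one target and two shared negatives with three matrix products, instead of nine separate vector-vector updates. In a multithreaded setting each thread would presumably apply such steps to the shared model without locks, in the "Hogwild" style the abstract mentions; that part is not shown here.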


Bibliographic record

Source
IEEE Transactions on Parallel and Distributed Systems | 2019, Issue 9 | pp. 2090-2100 | 11 pages
Authors
Ji Shihao; Satish Nadathur; Li Sheng; Dubey Pradeep K.
Author affiliations

Georgia State Univ Dept Comp Sci Atlanta GA 30303 USA;

Intel Labs Santa Clara CA 95054 USA;

Original format: PDF
Language: English
Keywords
Word2Vec; parallel algorithms; distributed computing; multi-core and many-core systems;


Similar literature

1. Exploiting Distributed-Memory and Shared-Memory Parallelism on Clusters of SMPs with Data Parallel Programs [J]. Siegfried Benkner, Viera Sipkova. International Journal of Parallel Programming. 2003, Issue 1
2. Parallel MP2-Energy Evaluation - Simulated Shared Memory Approach on Distributed Memory Parallel Machines [J]. Limaye AC. Journal of Computational Chemistry: Organic, Inorganic, Physical, Biological. 1997, Issue 4
3. Parallelizing with BDSC, a resource-constrained scheduling algorithm for shared and distributed memory systems [J]. Dounia Khaldi, Pierre Jouvelot, Corinne Ancourt. Parallel Computing. 2015, Issue jana
4. A Distributed-Memory Parallelization of a Shared-Memory Parallel Ensemble Kalman Filter [C]. Rostami M. Ali, Bucker H. Martin, Vogt Christian. International Symposium on Symbolic and Numeric Algorithms for Scientific Computing. 2014
5. Impact of shared memory and distributed memory platforms on the design and performance of parallel evolutionary algorithms [D]. James, Tabitha Lynn. 2002
6. Performance of parallel FDTD method for shared- and distributed-memory architectures: Application to bioelectromagnetics [O]. Miguel Ruiz-Cabello N., Maksims Abaļenkovs, Luis M. Diaz Angulo. 2020
7. Parallelizing Word2Vec in Shared and Distributed Memory [O]. Shihao Ji, Nadathur Satish, Sheng Li. 2019
8. Comparison of distributed memory and virtual shared memory parallel programming models [R]. Keane, J. A., Grant, A. J., Xu, M. Q. 1993
