
Large Scale Document Inversion using a Multi-threaded Computing System



Abstract

Current microprocessor architecture is moving towards multi-core, multi-threaded systems. This trend has led to a surge of interest in using multi-threaded computing devices, such as the Graphics Processing Unit (GPU), for general-purpose computing. Because the GPU consists of multiple cores, it can serve as a massively parallel coprocessor. The GPU is also an affordable, attractive, and user-programmable commodity. Meanwhile, a flood of information is entering the digital domain worldwide. Huge volumes of data, such as digital libraries, social networking services, e-commerce product data, and reviews, are produced or collected every moment, and their size is growing dramatically. The inverted index is a useful data structure for full-text search and document retrieval, but creating the index for a large number of documents takes a tremendous amount of time. The performance of document inversion can be improved by using a multi-threaded, multi-core GPU. Our approach is to implement a linear-time, hash-based, single program multiple data (SPMD) document inversion algorithm on the NVIDIA GPU/CUDA programming platform, using the computational power of the GPU to develop high-performance solutions for document indexing. Our proposed parallel document inversion system runs 2-3 times faster than a sequential system on two different test datasets: PubMed abstracts and e-commerce product reviews.

CCS Concepts • Information systems ➝ Information retrieval • Computing methodologies ➝ Massively parallel and high-performance simulations.
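To make the idea of hash-based document inversion concrete, the following is a minimal sequential sketch in Python. It is not the authors' CUDA implementation. The function names (`invert_partition`, `merge_indexes`, `invert`), the tokenization rule, and the sample documents are all assumptions introduced for illustration. Each partition is processed by the same routine, loosely mirroring the SPMD pattern, and the partial hash tables are then merged.

```python
import re
from collections import defaultdict


def invert_partition(docs):
    """Build a term -> set of doc ids hash table for one slice of documents.

    Each token costs one expected O(1) hash-table update, so the work is
    linear in the number of tokens (assumption: simple alphanumeric tokens).
    """
    index = defaultdict(set)
    for doc_id, text in docs:
        for term in re.findall(r"[a-z0-9]+", text.lower()):
            index[term].add(doc_id)
    return index


def merge_indexes(partials):
    """Union the per-partition tables into one index with sorted posting lists."""
    merged = defaultdict(set)
    for partial in partials:
        for term, ids in partial.items():
            merged[term] |= ids
    return {term: sorted(ids) for term, ids in merged.items()}


def invert(docs, num_workers=2):
    """Run the same routine on every partition (SPMD-style), then merge.

    The partitions are processed one after another here; on a GPU each
    partition would instead be handled by its own group of threads.
    """
    partitions = [docs[i::num_workers] for i in range(num_workers)]
    return merge_indexes(invert_partition(p) for p in partitions)


# Hypothetical sample documents as (doc_id, text) pairs.
docs = [
    (0, "GPU computing for document indexing"),
    (1, "Inverted index for document retrieval"),
    (2, "GPU hash table"),
]
index = invert(docs)
```

Looking up `index["gpu"]` returns the posting list of documents that contain the term. Because the merge step is a set union, the result does not depend on how the documents are partitioned, which is what allows the per-partition work to run in parallel.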
