Source: U.S. National Institutes of Health literature, Frontiers in Neuroinformatics

NeuroPigPen: A Scalable Toolkit for Processing Electrophysiological Signal Data in Neuroscience Applications Using Apache Pig


Abstract

Recent advances in neurological imaging and sensing technologies have led to a rapid increase in the volume, rate of generation, and variety of neuroscience data. This "neuroscience Big data" represents a significant opportunity for the biomedical research community to design experiments using data with longer timescales, a large number of attributes, and a statistically significant data size. The results from these new data-driven research techniques can advance our understanding of complex neurological disorders, help model the long-term effects of brain injuries, and provide new insights into the dynamics of brain networks. However, many existing neuroinformatics data processing and analysis tools were not built to manage large volumes of data, which makes it difficult for researchers to use the available data effectively to advance their research. We introduce a new toolkit called NeuroPigPen, developed using Apache Hadoop and the Pig data flow language, to address the challenges posed by large-scale electrophysiological signal data. NeuroPigPen is a modular toolkit that can process large volumes of electrophysiological signal data, such as electroencephalogram (EEG), electrocardiogram (ECG), and blood oxygen level (SpO2) recordings, using a new distributed storage model called Cloudwave Signal Format (CSF) that supports easy partitioning and storage of signal data on commodity hardware. NeuroPigPen was developed with three design principles: (a) Scalability: the ability to efficiently process increasing volumes of data; (b) Adaptability: the toolkit can be deployed across different computing configurations; and (c) Ease of programming: the toolkit can be easily used to compose multi-step data processing pipelines using high-level programming constructs. The NeuroPigPen toolkit was evaluated using 750 GB of electrophysiological signal data over a variety of Hadoop cluster configurations ranging from 3 to 30 data nodes. The evaluation results demonstrate that the toolkit is highly scalable and adaptable, which makes it suitable for use in neuroscience applications as a scalable data processing toolkit. As part of the ongoing extension of NeuroPigPen, we are developing new modules to support statistical functions for analyzing signal data in brain connectivity research. In addition, the toolkit is being extended to allow integration with scientific workflow systems. NeuroPigPen is released under the BSD license at: .
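
The abstract says that signal data are partitioned for distributed storage and processing, but it does not describe the CSF layout itself. The Python sketch below illustrates only the general idea of cutting a multi-channel recording into fixed-duration partitions that could be handled independently. It is not the CSF format and not NeuroPigPen code; the function names, the channel dictionary, and the segment length are all assumptions made for illustration.

```python
from typing import Dict, Iterator, List, Tuple


def partition_signal(
    samples: List[float], sampling_rate_hz: int, segment_seconds: int
) -> Iterator[Tuple[int, List[float]]]:
    """Yield (segment_index, samples) pairs of fixed duration.

    Hypothetical helper: the last segment may be shorter than the others.
    """
    segment_len = sampling_rate_hz * segment_seconds
    if segment_len <= 0:
        raise ValueError("sampling rate and segment length must be positive")
    for index, start in enumerate(range(0, len(samples), segment_len)):
        yield index, samples[start:start + segment_len]


def partition_recording(
    channels: Dict[str, List[float]], sampling_rate_hz: int, segment_seconds: int
) -> Dict[int, Dict[str, List[float]]]:
    """Group segments by index so each partition holds every channel
    for one time window (an assumed layout, not the actual CSF one)."""
    partitions: Dict[int, Dict[str, List[float]]] = {}
    for name, samples in channels.items():
        for index, segment in partition_signal(
            samples, sampling_rate_hz, segment_seconds
        ):
            partitions.setdefault(index, {})[name] = segment
    return partitions


if __name__ == "__main__":
    # Toy recording: two channels, 10 samples each, sampled at 2 Hz.
    recording = {
        "EEG": [float(v) for v in range(10)],
        "ECG": [float(v) for v in range(10, 20)],
    }
    for index, window in partition_recording(recording, 2, 2).items():
        print(index, window)
```

Each partition is self-contained for its time window, which is the property that lets independent workers process partitions in parallel; how NeuroPigPen actually achieves this is described in the full paper rather than in this abstract.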
