

NeuroPigPen: A Scalable Toolkit for Processing Electrophysiological Signal Data in Neuroscience Applications Using Apache Pig



Abstract

Recent advances in neurological imaging and sensing technologies have led to a rapid increase in the volume, rate of generation, and variety of neuroscience data. This “neuroscience Big Data” represents a significant opportunity for the biomedical research community to design experiments using data that span longer timescales, include a large number of attributes, and are large enough to be statistically significant. The results from these new data-driven research techniques can advance our understanding of complex neurological disorders, help model the long-term effects of brain injuries, and provide new insights into the dynamics of brain networks. However, many existing neuroinformatics data processing and analysis tools were not built to manage large volumes of data, which makes it difficult for researchers to leverage the available data effectively. We introduce a new toolkit called NeuroPigPen, developed using Apache Hadoop and the Pig data flow language, to address the challenges posed by large-scale electrophysiological signal data. NeuroPigPen is a modular toolkit that can process large volumes of electrophysiological signal data, such as electroencephalogram (EEG), electrocardiogram (ECG), and blood oxygen level (SpO2) recordings, using a new distributed storage model called Cloudwave Signal Format (CSF) that supports easy partitioning and storage of signal data on commodity hardware. NeuroPigPen was developed with three design principles: (a) Scalability: the ability to efficiently process increasing volumes of data; (b) Adaptability: the toolkit can be deployed across different computing configurations; and (c) Ease of programming: the toolkit can be easily used to compose multi-step data processing pipelines using high-level programming constructs. The NeuroPigPen toolkit was evaluated using 750 GB of electrophysiological signal data over a variety of Hadoop cluster configurations ranging from 3 to 30 Data nodes. The evaluation results demonstrate that the toolkit is highly scalable and adaptable, which makes it suitable for use in neuroscience applications as a scalable data processing toolkit. As part of the ongoing extension of NeuroPigPen, we are developing new modules that support statistical functions for analyzing signal data in brain connectivity research. In addition, the toolkit is being extended to allow integration with scientific workflow systems. NeuroPigPen is released under a BSD license at: .


Bibliographic details

Journal: Frontiers in Neuroinformatics
Authors: Satya S. Sahoo; Annan Wei; Joshua Valdez; Li Wang; Bilal Zonjy; Curtis Tatsuoka; Kenneth A. Loparo; Samden D. Lhatoo
Year (volume): 2016 (10)
Page: 18
Total pages: 12
Original format: PDF
Chinese Library Classification: Neuroscience
Keywords: data flow language; Apache Pig; electrophysiological signal data; neuroscience; MapReduce


