In this paper we present and analyze a modular implementation of the Multilayer Perceptron (MLP) network in view of the recent paradigm shift known as the multicore era. The implementation parallelizes the forward pass of an input through the network by distributing the neurons of each layer among modules executed in parallel. Each module is fully connected among its local neurons and has an adjustable number of remote connections to other modules. We analyzed the parallel speedup and parallel efficiency for different total numbers of synapses and different numbers of remote connections per module. The fully connected case showed weak scalability on future parallel architectures with non-uniform memory access. The results for the proposed implementation showed that it is quite scalable when the number of remote connections per module is small.
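The layer decomposition described above can be sketched as follows. This is a minimal single-layer illustration, not the paper's implementation: the module count, tanh activation, and random weights are all assumptions made for the sketch; each module applies a dense weight block to its local inputs plus a small, adjustable number of remote inputs owned by other modules.

```python
import numpy as np

rng = np.random.default_rng(42)

def forward_layer(x, n_modules, n_remote):
    """One layer pass: the layer's neurons are split among n_modules modules.
    Each module is fully connected to its local inputs and draws n_remote
    additional connections from inputs held by other modules."""
    chunks = np.array_split(x, n_modules)
    outputs = []
    offset = 0
    for local in chunks:
        n_local = local.size
        # Full local connectivity: dense weight block (hypothetical random weights).
        W_local = rng.standard_normal((n_local, n_local))
        z = W_local @ local
        # Adjustable number of remote connections: sample indices outside
        # this module's own slice of the input vector.
        local_ids = np.arange(offset, offset + n_local)
        candidates = np.setdiff1d(np.arange(x.size), local_ids)
        remote_idx = rng.choice(candidates, size=n_remote, replace=False)
        w_remote = rng.standard_normal((n_local, n_remote))
        z += w_remote @ x[remote_idx]
        outputs.append(np.tanh(z))  # activation is an assumption for the sketch
        offset += n_local
    return np.concatenate(outputs)

y = forward_layer(rng.standard_normal(16), n_modules=4, n_remote=2)
```

In a parallel implementation each module's loop body would run on its own core; only the `n_remote` cross-module reads touch non-local memory, which is why a small number of remote connections keeps the scheme scalable on NUMA hardware.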