A Novel Approach for Efficient Training of Deep Neural Networks

D.T.V. Dharmajee Rao, K.V. Ramana


Deep Neural Network training algorithms consume long training times, especially when the number of hidden layers and nodes is large. Matrix multiplication is the key operation, carried out at every node of each layer hundreds of thousands of times during the training of a Deep Neural Network. Blocking is a well-proven optimization technique for improving the performance of matrix multiplication, and blocked matrix multiplication algorithms can easily be parallelized to accelerate performance further. This paper proposes a novel approach that implements parallel blocked matrix multiplication algorithms to reduce the long training time. The proposed approach was implemented using the OpenMP parallel programming model with the collapse() clause for the multiplication of the input and weight matrices in the Backpropagation and Boltzmann Machine algorithms for training Deep Neural Networks, and was tested on a multi-core processor system. Experimental results showed that the proposed approach achieved a speedup of approximately two times over the classic algorithms.
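
The idea of a parallel blocked matrix multiplication with the OpenMP collapse() clause can be illustrated with a minimal sketch. This is not the authors' code, which the abstract does not reproduce: the function name blocked_matmul, the tile size BLOCK, the row-major layout, and the choice of collapse(2) over the two outer tile loops are all assumptions made for illustration.

```c
#include <assert.h>

/* Assumed tile size; a real implementation would tune this to the cache. */
#define BLOCK 32

/* Hypothetical helper: smaller of two ints. */
static int min_int(int a, int b) { return a < b ? a : b; }

/*
 * Computes C = A * B for row-major matrices:
 *   A is n x m (e.g. inputs), B is m x p (e.g. weights), C is n x p.
 * The two outer tile loops are fused by collapse(2), so each thread owns
 * whole tiles of C and no two threads write the same element.
 * Without OpenMP enabled, the pragma is ignored and the code runs serially.
 */
void blocked_matmul(const double *A, const double *B, double *C,
                    int n, int m, int p)
{
    assert(A != 0 && B != 0 && C != 0);

    /* Clear the output before accumulating partial products. */
    for (int i = 0; i < n * p; ++i)
        C[i] = 0.0;

    #pragma omp parallel for collapse(2) schedule(static)
    for (int ii = 0; ii < n; ii += BLOCK) {
        for (int jj = 0; jj < p; jj += BLOCK) {
            /* Walk the shared dimension tile by tile for this C tile. */
            for (int kk = 0; kk < m; kk += BLOCK) {
                int i_end = min_int(ii + BLOCK, n);
                int j_end = min_int(jj + BLOCK, p);
                int k_end = min_int(kk + BLOCK, m);
                for (int i = ii; i < i_end; ++i) {
                    for (int k = kk; k < k_end; ++k) {
                        double a = A[i * m + k];
                        /* Innermost loop runs along rows of B and C. */
                        for (int j = jj; j < j_end; ++j)
                            C[i * p + j] += a * B[k * p + j];
                    }
                }
            }
        }
    }
}
```

With GCC or Clang such a sketch would typically be compiled with -fopenmp. Collapsing the two outer tile loops gives the runtime more independent iterations to distribute across cores than parallelizing the outermost loop alone, which matters when one matrix dimension is small relative to the core count.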


Deep Neural Network; Parallel Blocked Matrix multiplication; Backpropagation and Boltzmann Machine algorithms; OpenMP; Multi-core processor system


DOI: http://doi.org/10.11591/ijeecs.v11.i3.pp954-961



This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.
