Deep Learning Based Static Hand Gesture Recognition

Dina Satybaldina, Gulzia Kalymova


Hand gesture recognition has become a popular topic in deep learning. It has many applications in bridging the human–computer barrier and has a positive impact on daily life. The primary idea of our project is to acquire static gestures from a depth camera and to process the input images for training a deep convolutional neural network pre-trained on the ImageNet dataset. The proposed system consists of a gesture capture device (Intel® RealSense™ depth camera D435), pre-processing and image segmentation algorithms, a feature extraction algorithm, and object classification. Pre-processing and image segmentation use computer vision methods from the OpenCV and Intel RealSense libraries. The subsystem for feature extraction and gesture classification is based on a modified VGG-16 implemented with the TensorFlow and Keras deep learning framework. The performance of the static gesture recognition system is evaluated using machine learning metrics. Experimental results show that the proposed model, trained on a database of 2000 images, provides high recognition accuracy at both the training and testing stages.
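The abstract does not name the specific machine learning metrics used. As a minimal sketch, assuming the usual choices (overall accuracy plus per-class precision, recall, and F1 derived from a confusion matrix), the evaluation step might look like the following. The function name `per_class_metrics` and the three-class counts are hypothetical and serve only as an illustration; they are not results from the paper.

```python
def per_class_metrics(confusion):
    """Compute overall accuracy and per-class precision, recall, and F1.

    confusion[i][j] counts samples whose true class is i and whose
    predicted class is j (rows: ground truth, columns: predictions).
    """
    n = len(confusion)
    total = sum(sum(row) for row in confusion)
    correct = sum(confusion[i][i] for i in range(n))
    accuracy = correct / total if total else 0.0

    metrics = []
    for k in range(n):
        tp = confusion[k][k]
        predicted_k = sum(confusion[i][k] for i in range(n))  # column sum
        actual_k = sum(confusion[k])                          # row sum
        precision = tp / predicted_k if predicted_k else 0.0
        recall = tp / actual_k if actual_k else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        metrics.append({"precision": precision, "recall": recall, "f1": f1})
    return accuracy, metrics


# Hypothetical counts for three gesture classes (illustration only,
# not data from the paper).
example = [
    [8, 1, 1],
    [0, 9, 1],
    [2, 0, 8],
]
accuracy, metrics = per_class_metrics(example)
print(f"accuracy = {accuracy:.3f}")
for label, m in enumerate(metrics):
    print(label, {name: round(value, 3) for name, value in m.items()})
```

Reporting per-class values alongside overall accuracy helps show whether particular gestures are confused with one another, which a single accuracy figure can hide.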


Gesture Recognition; Deep Learning; Computer Vision; Machine Learning; Convolutional Neural Network; VGG-16




Creative Commons License
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.
