
Chinese Sign Language Recognition Based on Two-Stream CNN and LSTM Network


Affiliations
1 School of Computer and Communication, Hunan Institute of Engineering, Xiangtan 411104, China
 

Abstract

Sign language recognition uses computer technology to convert sign language into text or speech, facilitating communication between deaf people and hearing people. This paper takes Chinese sign language words as its research object and proposes a new sign language recognition method based on a two-stream 3D-CNN and an LSTM network. First, a key frame extraction algorithm removes redundant frames from the original data. A two-stream 3D-CNN then learns local hand-change features and global trajectory features simultaneously, and the two are aggregated into a clip-level feature that is fed to an LSTM encoder-decoder network. To focus on the video frames that carry the meaning of the sign, a temporal attention mechanism is introduced into the LSTM encoder-decoder network. On the DEVISIGN-D sign language dataset, the method is compared experimentally with three sign language recognition algorithms; the results show that it recognizes isolated Chinese sign language words with an accuracy of 98.4%.
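The pipeline described in the abstract can be illustrated with a minimal, dependency-free Python sketch. It is not the authors' implementation: the abstract does not specify the key frame criterion, the aggregation rule, or the attention scoring function, so the frame-difference threshold, the concatenation of the two stream features, and the dot-product scoring below are all assumptions, and every function name is hypothetical.

```python
import math


def select_key_frames(frames, threshold):
    """Return indices of frames kept as key frames.

    Assumption: a frame is kept when its mean absolute pixel difference
    from the most recently kept frame exceeds `threshold`.
    `frames` is a list of flattened frames (equal-length lists of numbers).
    The first frame is always kept.
    """
    if not frames:
        return []
    kept = [0]
    for i in range(1, len(frames)):
        ref = frames[kept[-1]]
        diff = sum(abs(a - b) for a, b in zip(frames[i], ref)) / len(ref)
        if diff > threshold:
            kept.append(i)
    return kept


def fuse_streams(local_feat, global_feat):
    """Aggregate the two stream features of one clip.

    Assumption: aggregation is plain concatenation of the local hand
    feature and the global trajectory feature.
    """
    return list(local_feat) + list(global_feat)


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def temporal_attention(clip_features, query):
    """Weight clip features over time and return (weights, context).

    Assumption: the score of each clip is the dot product between its
    feature vector and `query` (standing in for a decoder state).
    The context vector is the attention-weighted sum of clip features.
    """
    scores = [sum(f * q for f, q in zip(feat, query)) for feat in clip_features]
    weights = softmax(scores)
    dim = len(clip_features[0])
    context = [
        sum(w * feat[d] for w, feat in zip(weights, clip_features))
        for d in range(dim)
    ]
    return weights, context


if __name__ == "__main__":
    # Toy video: five flattened two-pixel frames.
    frames = [[0, 0], [0, 0], [10, 10], [10, 11], [0, 0]]
    print(select_key_frames(frames, threshold=5))

    # Two clips, each with a one-dimensional local and global feature.
    clips = [fuse_streams([1.0], [0.0]), fuse_streams([0.0], [1.0])]
    weights, context = temporal_attention(clips, query=[10.0, 0.0])
    print(weights, context)
```

In a full system the clip features would come from the two 3D-CNN streams and the query from the LSTM decoder state; the sketch only shows how redundant frames can be dropped and how attention weights concentrate on the clips most relevant to the current decoding step.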

Keywords

3D Convolutional Neural Network, Attention Mechanism, Long Short-Term Memory Network, Sign Language Recognition, Key Frame.



Authors

LUO Yin
School of Computer and Communication, Hunan Institute of Engineering, Xiangtan 411104, China
HU Ying
School of Computer and Communication, Hunan Institute of Engineering, Xiangtan 411104, China
LIU Di-kun
School of Computer and Communication, Hunan Institute of Engineering, Xiangtan 411104, China
LI Rui
School of Computer and Communication, Hunan Institute of Engineering, Xiangtan 411104, China
YANG Meng-hao
School of Computer and Communication, Hunan Institute of Engineering, Xiangtan 411104, China

