
Sign Language Recognition Using Deep CNN with Normalised Keyframe Extraction and Prediction Using LSTM


Affiliations
1 Department of Computer Technology, MIT, Anna University, Chennai 600 044, Tamilnadu, India
 

Sign Language Recognition (SLR) aims to interpret signs so as to facilitate communication between people with hearing or speech impairments and those without, making communication between signers and non-signers effective and seamless. The limited key information available about the gestures is essential for recognising the signs. To implement continuous sign language gesture recognition, gestures are identified from the video using a Deep Convolutional Neural Network (CNN). A Recurrent Neural Network with Long Short-Term Memory (RNN-LSTM) verifies the semantics of the gesture sequence, which is eventually converted into speech. The problem of constructing meaningful sentences from continuous gestures motivated the proposed model. The model is designed to increase the effectiveness of classification by processing only the principal elements: the keyframes are identified and only these are processed for classification. Validation of sentences can be done in O(N) time. The sentences are converted into a voiceover to support smooth communication between signers and non-signers. The model obtained an accuracy of 89.24% when training the CNN to detect gestures, performing better than other pre-trained models, and an accuracy of 89.99% when training the RNN-LSTM to predict the next word using grammar phrases. This keyframe-to-voice conversion, which forms proper sentences, enables harmonious communication.
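The abstract does not say how keyframes are chosen, so the following is only a minimal illustrative sketch of one common approach, not the authors' method: pixel values are normalised to [0, 1], and a frame is kept as a keyframe when its mean absolute difference from the last kept keyframe exceeds a threshold. The function names, the flattened grayscale frame representation, and the threshold value of 0.1 are all assumptions made for illustration.

```python
from typing import List, Sequence

# Assumption: each frame is a flattened grayscale image with values in [0, 255].
Frame = Sequence[float]


def normalise(frame: Frame) -> List[float]:
    """Scale pixel values from [0, 255] to [0, 1]."""
    return [p / 255.0 for p in frame]


def mean_abs_diff(a: Sequence[float], b: Sequence[float]) -> float:
    """Mean absolute per-pixel difference between two equal-sized frames."""
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


def select_keyframes(frames: Sequence[Frame], threshold: float = 0.1) -> List[int]:
    """Return indices of frames that differ enough from the last kept keyframe.

    Hypothetical helper: the threshold is an illustrative value, not one
    reported in the paper.
    """
    if not frames:
        return []
    keyframes = [0]  # always keep the first frame as a reference
    last = normalise(frames[0])
    for i in range(1, len(frames)):
        current = normalise(frames[i])
        if mean_abs_diff(current, last) > threshold:
            keyframes.append(i)
            last = current  # later frames are compared against this keyframe
    return keyframes


if __name__ == "__main__":
    # Toy "video": five 4-pixel frames; frames 1 and 3 barely change.
    video = [
        [0, 0, 0, 0],
        [5, 5, 5, 5],
        [200, 200, 200, 200],
        [205, 200, 200, 200],
        [0, 0, 255, 255],
    ]
    print(select_keyframes(video))
```

In a pipeline like the one described, only the selected frames would then be passed to the gesture classifier, which is how processing "only the principal elements" could reduce the classification workload.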

Keywords

Deaf-Mute People, Gesture Recognition, Indian Sign Language, Relationship Signs, Signer.



Authors

Jayanthi P
Department of Computer Technology, MIT, Anna University, Chennai 600 044, Tamilnadu, India
Ponsy R K Sathia Bhama
Department of Computer Technology, MIT, Anna University, Chennai 600 044, Tamilnadu, India
B Madhubalasri
Department of Computer Technology, MIT, Anna University, Chennai 600 044, Tamilnadu, India

