
A Two-Stage Image Frame Extraction Model (ISLKE) for Live Gesture Analysis on Indian Sign Language


Affiliations
1 CSE, GIT, GITAM University (Deemed to be), Gandhi Nagar, Rushikonda, Visakhapatnam, Andhra Pradesh 530 045, India
 

The new industrial revolution centres on smart, interconnected technologies, applying robotics, artificial intelligence, machine learning and data analytics to real-time data to produce value-added products. The way goods are produced is increasingly aligned with people's lifestyles, as witnessed in wearable smart devices, digital assistants, self-driving cars and the like. Over the last few years, the true potential of Industry 4.0 has also become evident in the health-service domain. In the same context, Sign Language Recognition, a breakthrough in the live video processing domain that serves the deaf and mute communities, has drawn the attention of many researchers. Research insights make clear that the precise extraction and interpretation of gesture data, while addressing the prevailing limitations, is a crucial task. This has driven the present work to propose a unique keyframe extraction model focused on preciseness of interpretation. The proposed model, ISLKE, performs a clustering-based two-stage keyframe extraction process. It was experimented on the daily-usage vocabulary of Indian Sign Language (ISL) and attained an average accuracy of 96% against ground-truth facts. It is also observed that, with the two-stage approach, filtering out uninformative frames reduces complexity and computational effort. These findings support the further development of commercial communication applications that reach communities with speech and hearing disorders.

Keywords

Classification, Clustering, Feature Learning, Image Processing, Region of Interest, Video Summarization.
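The abstract describes a clustering-based two-stage keyframe extraction process but does not give the features or clustering method used. The sketch below is a minimal illustration of that general idea, not the authors' implementation: it assumes grayscale histogram features, a hand-rolled k-means for stage one, and a simple variance threshold as the stage-two filter for uninformative frames. All function names and parameters are hypothetical.

```python
import numpy as np

def frame_histograms(frames, bins=16):
    """Per-frame feature: a normalized intensity histogram (assumed feature)."""
    return np.stack([
        np.histogram(f, bins=bins, range=(0, 256))[0] / f.size
        for f in frames
    ])

def kmeans(X, k, iters=20, seed=0):
    """Minimal k-means over feature rows; returns labels and centroids."""
    rng = np.random.default_rng(seed)
    centroids = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(iters):
        dists = np.linalg.norm(X[:, None] - centroids[None], axis=2)
        labels = dists.argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):
                centroids[j] = X[labels == j].mean(axis=0)
    return labels, centroids

def extract_keyframes(frames, k=4, min_std=1.0):
    """Two-stage sketch: (1) cluster frames and take the frame nearest each
    centroid; (2) drop near-uniform candidates as uninformative."""
    X = frame_histograms(frames)
    labels, centroids = kmeans(X, k)
    keyframes = []
    for j in range(k):
        members = np.flatnonzero(labels == j)
        if members.size == 0:
            continue
        dists = np.linalg.norm(X[members] - centroids[j], axis=1)
        idx = members[dists.argmin()]
        if frames[idx].std() >= min_std:  # stage 2: information filter
            keyframes.append(int(idx))
    return sorted(keyframes)
```

In a real pipeline the frames would come from a video capture stream, and the stage-two filter would likely be a learned or region-of-interest-based criterion rather than a raw variance threshold.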



Authors

Hyma J
CSE, GIT, GITAM University (Deemed to be), Gandhi Nagar, Rushikonda, Visakhapatnam, Andhra Pradesh 530 045, India
Rajamani P
CSE, GIT, GITAM University (Deemed to be), Gandhi Nagar, Rushikonda, Visakhapatnam, Andhra Pradesh 530 045, India

