Assamese Connected Digit Recognition System
In this work, we present the development of a connected digit recognition system for the Assamese language. Assamese is an under-resourced language of North-East India that is widely spoken in the state of Assam. The text corpus used in this work consists of sequences of 7 digits spoken in a continuous manner. To capture the variations in phonetic context, the digit sequences were arranged so that each digit occurs in all 7 positions. The speech corpus was collected from 11 native Assamese speakers, of whom 5 were female and 6 were male. Mel Frequency Cepstral Coefficient (MFCC) features are used as front-end features. We explore the Subspace Gaussian Mixture Model (SGMM) based acoustic modeling approach in addition to the Gaussian Mixture Model (GMM), both within the Hidden Markov Model (HMM) framework. Accuracies of 95.7% and 95.9% are achieved with the GMM-HMM and SGMM-HMM systems, respectively.
Keywords
Assamese Language, Digit Recognition, SGMM-HMM.
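The positional balancing described in the abstract can be illustrated with a small sketch. The abstract does not give the actual prompt sheet, so the digit vocabulary (0 to 9), the rotation scheme, and the helper names `cyclic_prompts` and `covers_all_positions` are assumptions made for illustration only. Rotating through the vocabulary is one simple way to make every digit appear in every one of the 7 positions.

```python
def cyclic_prompts(vocab, length):
    """Build len(vocab) prompts of `length` digits by rotating through vocab.

    Prompt number `start` begins at vocab[start] and continues cyclically,
    so every digit lands in every position exactly once across the set.
    This rotation scheme is an assumption, not the authors' documented design.
    """
    n = len(vocab)
    return [
        [vocab[(start + pos) % n] for pos in range(length)]
        for start in range(n)
    ]


def covers_all_positions(prompts, vocab):
    """Check that each position sees every digit in vocab at least once."""
    length = len(prompts[0])
    return all(
        {prompt[pos] for prompt in prompts} == set(vocab)
        for pos in range(length)
    )


# Hypothetical setup: ten digits, seven positions per utterance.
vocab = list(range(10))
prompts = cyclic_prompts(vocab, length=7)

for prompt in prompts:
    print(" ".join(str(d) for d in prompt))

print("every digit in every position:", covers_all_positions(prompts, vocab))
```

A design like this yields ten prompts per speaker in which each digit is preceded and followed by different neighbours depending on its position, which is the kind of phonetic-context variation the abstract refers to.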