
Voice Text Concurrent Transmission Based on Locale


Affiliations
Deenbandhu Chhotu Ram University of Science & Technology, Murthal, Haryana, India
     



Among human beings, speech is considered the principal mode of communication, as it is a natural and efficient way of exchanging one's views, thoughts, and information with others. This paper presents an ASR system in which the user can type text on a computer screen not with a keyboard but by providing voice input through an Android mobile phone.
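The abstract describes typing text on a computer screen via voice input from an Android phone, but the transport between the two devices is not specified. Below is a minimal sketch of the computer-side receiver, under the assumption (not stated in the paper) that the phone sends each recognized utterance as a newline-delimited UTF-8 line over TCP; `start_receiver` and `send_text` are illustrative names, not the authors' implementation.

```python
import socket
import threading

def start_receiver(host="127.0.0.1"):
    """Listen for one phone connection and collect newline-delimited
    UTF-8 utterances; returns (port, received_list, server_thread)."""
    srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    srv.bind((host, 0))            # port 0: let the OS pick a free port
    srv.listen(1)
    port = srv.getsockname()[1]
    received = []

    def serve():
        conn, _ = srv.accept()
        with conn, srv:
            buf = b""
            while True:
                chunk = conn.recv(1024)
                if not chunk:      # client closed the connection
                    break
                buf += chunk
                # Each complete line is one recognized utterance.
                while b"\n" in buf:
                    line, buf = buf.split(b"\n", 1)
                    received.append(line.decode("utf-8"))

    t = threading.Thread(target=serve, daemon=True)
    t.start()
    return port, received, t

def send_text(port, lines, host="127.0.0.1"):
    """Stand-in for the phone side: send each utterance as one line."""
    with socket.create_connection((host, port)) as sock:
        for line in lines:
            sock.sendall(line.encode("utf-8") + b"\n")
```

In the real system the text passed to `send_text` would come from the phone's speech recognizer rather than a fixed list; the receiver would then render each line on screen instead of only collecting it.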

Keywords

Speech Recognition System, MFCC, HMM, N-Gram Dataset, LPC, ASR.



Abstract Views: 310

PDF Views: 0





Authors

Jyoti Madan
Deenbandhu Chhotu Ram University of Science & Technology, Murthal, Haryana, India
Ajmer Singh
Deenbandhu Chhotu Ram University of Science & Technology, Murthal, Haryana, India

