Affective Model Based Speech Emotion Recognition Using Deep Learning Techniques
Human beings express emotions in multiple ways, most commonly through writing, speech, facial expression, body language, and gesture. In general, emotions are understood first and foremost as internal feelings and experiences. Speech is a powerful form of communication that carries the speaker's emotional state: specific prosodic cues, such as pitch variation, frequency, speaking rate, rhythm, and voice quality, allow speakers to express, and listeners to interpret and decode, the full spoken message. This paper establishes an affective model based speech emotion recognition system using deep learning techniques, specifically an RNN with LSTM, evaluated on German- and English-language datasets.
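The full method is not given on this page, but as a rough sketch of the pipeline the abstract describes, the Python snippet below pairs frame-level acoustic features with a two-layer LSTM classifier. The emotion label set, sampling rate, MFCC dimensions, padding length, and layer sizes are illustrative assumptions, not the authors' reported configuration.

```python
# Minimal sketch of an LSTM-based speech emotion classifier.
# Assumes WAV utterances with integer emotion labels; the label set,
# feature choices, and hyperparameters below are assumptions for
# illustration, not the paper's exact setup.
import numpy as np
import librosa
import tensorflow as tf

EMOTIONS = ["anger", "boredom", "disgust", "fear",
            "happiness", "sadness", "neutral"]  # assumed label set
N_MFCC = 40        # MFCC coefficients per frame
MAX_FRAMES = 200   # pad/truncate each clip to a fixed frame count

def extract_features(wav_path):
    """Return a (MAX_FRAMES, N_MFCC) MFCC matrix for one utterance."""
    signal, sr = librosa.load(wav_path, sr=16000)
    mfcc = librosa.feature.mfcc(y=signal, sr=sr, n_mfcc=N_MFCC).T  # (frames, coeffs)
    if len(mfcc) < MAX_FRAMES:
        mfcc = np.pad(mfcc, ((0, MAX_FRAMES - len(mfcc)), (0, 0)))  # zero-pad short clips
    return mfcc[:MAX_FRAMES]

def build_model():
    """Stacked LSTM over frame-level features with a softmax over emotions."""
    return tf.keras.Sequential([
        tf.keras.layers.Input(shape=(MAX_FRAMES, N_MFCC)),
        tf.keras.layers.LSTM(128, return_sequences=True),
        tf.keras.layers.LSTM(64),
        tf.keras.layers.Dense(len(EMOTIONS), activation="softmax"),
    ])

model = build_model()
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
# model.fit(X_train, y_train, validation_split=0.1, epochs=30, batch_size=32)
```

In practice the frame count and the balance between spectral features such as MFCCs and the prosodic cues named above (pitch, rate, rhythm, voice quality) would be tuned per dataset; the sketch only shows the overall RNN-with-LSTM structure.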
Keywords
Emotion recognition, RNN, Speech, Neural Network.