
Efficient Feature Extraction for Fear State Analysis from Human Voice


Affiliations
1 Siksha ‘O’ Anusandhan University, Near PNB Bank Jagmohan Nagar, Khandagiri, Bhubaneswar - 751030, Odisha, India
 

Background/Objectives: Emotion analysis from human speech has a long research history. Because recognizing emotion in speech benefits society in many respects, we analyze emotions that are similar to one another. Methods/Statistical Analysis: ‘Fear’ and ‘nervousness’ are analyzed in comparison with the normal voice. These two emotions are found to be very closely correlated. The voice samples are in the Oriya language. Mel-frequency cepstral coefficients (MFCCs), a popular speech feature, are used. Because the fundamental frequency differs from voice to voice, it is a suitable feature for separating similar voice signals. Findings: The combination of these two features outperformed classification based on a single feature. A Gaussian mixture model (GMM) was selected for recognition and tested with these features, and performance was measured using the log-likelihood ratio. Novelty/Improvement: MFCCs alone give 81.33% accuracy, whereas the combined features give 86.01%. This is evidenced in the results section.
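As a rough illustration of the recognition step summarized above, the sketch below scores a short sequence of combined feature vectors under one diagonal-covariance GMM per class and reports the log-likelihood ratio between two classes. It is not the authors' implementation. The model parameters, the three-dimensional feature layout (two MFCC-like values plus a fundamental frequency expressed in hundreds of Hz), and all function names are assumptions made up for this example; a real system would train the GMMs on extracted MFCC and fundamental-frequency features.

```python
import math


def log_gaussian_diag(x, mean, var):
    """Log density of a diagonal-covariance Gaussian evaluated at vector x."""
    return sum(
        -0.5 * (math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v)
        for xi, m, v in zip(x, mean, var)
    )


def gmm_log_likelihood(frames, gmm):
    """Average per-frame log-likelihood of `frames` under a GMM.

    `gmm` is a list of (weight, mean, var) tuples whose weights sum to 1.
    """
    total = 0.0
    for x in frames:
        comps = [math.log(w) + log_gaussian_diag(x, m, v) for w, m, v in gmm]
        peak = max(comps)  # log-sum-exp for numerical stability
        total += peak + math.log(sum(math.exp(c - peak) for c in comps))
    return total / len(frames)


def log_likelihood_ratio(frames, target_gmm, reference_gmm):
    """Positive values favor the target model over the reference model."""
    return gmm_log_likelihood(frames, target_gmm) - gmm_log_likelihood(
        frames, reference_gmm
    )


def classify(frames, models):
    """Return the label of the best-scoring model and all per-class scores."""
    scores = {label: gmm_log_likelihood(frames, g) for label, g in models.items()}
    return max(scores, key=scores.get), scores


# Hypothetical, hand-picked parameters (NOT values from the paper).
# Each vector is [mfcc_a, mfcc_b, f0 in hundreds of Hz].
MODELS = {
    "normal": [
        (0.6, [0.0, 0.0, 1.2], [1.0, 1.0, 0.04]),
        (0.4, [0.5, -0.5, 1.4], [1.0, 1.0, 0.04]),
    ],
    "fear": [
        (0.5, [1.0, 1.0, 2.2], [1.0, 1.0, 0.09]),
        (0.5, [1.5, 0.5, 2.6], [1.0, 1.0, 0.09]),
    ],
}

# Made-up test utterance: three frames of combined features.
frames = [[1.1, 0.9, 2.3], [1.4, 0.6, 2.5], [0.9, 1.2, 2.1]]

label, scores = classify(frames, MODELS)
llr = log_likelihood_ratio(frames, MODELS["fear"], MODELS["normal"])
print(label, scores, llr)
```

Dropping the third dimension from both the frames and the model parameters gives an MFCC-only variant of the same scoring, which is one simple way to compare single-feature and combined-feature classification in this kind of setup.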

Keywords

Correlation Coefficient, Feature Extraction, Fear State, Gaussian Mixture Model, Human Voice, Log-likelihood Ratio, Mel-frequency Cepstral Coefficient.

Authors

Palo Hemanta Kumar
Siksha ‘O’ Anusandhan University, Near PNB Bank Jagmohan Nagar, Khandagiri, Bhubaneswar - 751030, Odisha, India
Mihir N. Mohanty
Siksha ‘O’ Anusandhan University, Near PNB Bank Jagmohan Nagar, Khandagiri, Bhubaneswar - 751030, Odisha, India




DOI: https://doi.org/10.17485/ijst%2F2016%2Fv9i38%2F126930