
Text-Independent Speaker Identification Using Residual Feature Extraction Technique


Affiliations
1 Mepco Schlenk Engineering College, Sivakasi-626005, India
     



Mel Frequency Cepstral Coefficient (MFCC) parameters are derived mainly to represent the spectral envelope, or formant structure, of the vocal tract system. In this paper, a new feature extraction technique, Wavelet Octave Coefficients Of Residues (WOCOR), is proposed to capture the spectro-temporal source excitation characteristics embedded in the linear predictive (LP) residual signal. WOCOR carries the pitch frequency and phase information contained in the residual signal. WOCOR features are called vocal source features because they depend on the source of the speech, namely the pitch generated by the vocal folds. They are generated by applying a pitch-synchronous wavelet transform to the residual signal, which captures the spectro-temporal characteristics of the excitation signal. Experimental evaluation is carried out on the TIMIT database with 630 speakers using a Gaussian Mixture Model (GMM) and a Naive Bayesian classifier. The results show that GMM-based speaker identification outperforms Naive Bayesian classifier-based speaker identification: with WOCOR features, GMM modeling achieves an increase of 6.69% in speaker identification efficiency.
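The abstract does not give implementation details, so the following is only a minimal sketch of the first stage it mentions: obtaining the LP residual from a speech frame. It assumes the autocorrelation method with the Levinson-Durbin recursion, which is a common choice but is not confirmed by the source. The function names, the prediction order of 10, and the synthetic test frame are all hypothetical. The pitch-synchronous wavelet transform and the classifiers are not shown.

```python
def autocorrelation(frame, max_lag):
    """Autocorrelation r[0..max_lag] of a finite frame."""
    n = len(frame)
    return [sum(frame[i] * frame[i + lag] for i in range(n - lag))
            for lag in range(max_lag + 1)]


def levinson_durbin(r, order):
    """Predictor coefficients a[1..order] with s[n] ~ sum_k a[k] * s[n-k]."""
    a = [0.0] * (order + 1)
    err = r[0]
    for i in range(1, order + 1):
        if err <= 0.0:  # degenerate frame (e.g. all zeros): stop early
            break
        acc = r[i] - sum(a[j] * r[i - j] for j in range(1, i))
        k = acc / err  # reflection coefficient for this order
        new_a = a[:]
        new_a[i] = k
        for j in range(1, i):
            new_a[j] = a[j] - k * a[i - j]
        a = new_a
        err *= 1.0 - k * k
    return a[1:]


def lp_residual(frame, order):
    """Return (coefficients, residual), residual[n] = s[n] - predicted s[n]."""
    coeffs = levinson_durbin(autocorrelation(frame, order), order)
    residual = []
    for n in range(len(frame)):
        pred = sum(coeffs[k] * frame[n - 1 - k]
                   for k in range(order) if n - 1 - k >= 0)
        residual.append(frame[n] - pred)
    return coeffs, residual


def synthetic_voiced_frame(length, pitch_period):
    """Hypothetical test signal: impulse train through a stable AR(2) filter."""
    s = []
    for n in range(length):
        excitation = 1.0 if n % pitch_period == 0 else 0.0
        prev1 = s[n - 1] if n >= 1 else 0.0
        prev2 = s[n - 2] if n >= 2 else 0.0
        s.append(1.3 * prev1 - 0.6 * prev2 + excitation)
    return s


if __name__ == "__main__":
    frame = synthetic_voiced_frame(400, 80)
    coeffs, residual = lp_residual(frame, 10)
    print("signal energy:  ", sum(x * x for x in frame))
    print("residual energy:", sum(e * e for e in residual))
```

In this sketch the residual keeps the excitation (the pitch pulses) while most of the vocal tract resonance is removed, which is why the abstract treats the residual as the carrier of vocal source information.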

Keywords

Naive Bayesian Classifier, Feature Extraction, GMM, Speaker Identification, WOCOR.

Authors

S. Selva Nidhyananthan
Mepco Schlenk Engineering College, Sivakasi-626005, India
R. Shantha Selva Kumari
Mepco Schlenk Engineering College, Sivakasi-626005, India
G. Jaffino
Mepco Schlenk Engineering College, Sivakasi-626005, India
