
An Automatic Tool for High Quality Arabic Speech Synthesis


Affiliations
1 Department of Physics, Laboratory of Signal Processing, Tunis-1060, Tunisia
     



Text-to-speech (TTS) synthesis is the process of converting written text into machine-generated synthetic speech. Concatenative speech synthesis systems render speech by concatenating pre-recorded speech units. This work describes an Arabic TTS synthesis system. The system uses an automatic tool based on the concatenation of Arabic diphones with the MBROLA synthesizer. The quality of the synthesized speech is improved by analyzing in detail the spectral features of the voice source across various F0 ranges and timbres. Speech is generated by estimating and optimizing the Arabic prosody and by classifying the voice source into different types. The developed model enhances the naturalness and intelligibility of the synthesized speech in various speaking environments.

Keywords

Analysis, Synthesis, Diphone, Prosody, Formant, Pitch, Timbre, MBROLA, Arabic Speech.

Authors

Abdelkader Chabchoub
Department of Physics, Laboratory of Signal Processing, Tunis-1060, Tunisia
Adnen Cherif
Department of Physics, Laboratory of Signal Processing, Tunis-1060, Tunisia

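
The abstract describes diphone concatenation driven through the MBROLA synthesizer. As a rough illustration only, the sketch below shows how a phoneme sequence with durations and pitch targets could be written in MBROLA's plain-text .pho input format (one phoneme per line: symbol, duration in milliseconds, then optional pairs of position in percent and pitch in Hz). The function names `to_pho` and `diphones`, the example phoneme symbols, and all duration and pitch values are assumptions made for illustration; they are not taken from the authors' tool or from any particular Arabic diphone database.

```python
from typing import List, Sequence, Tuple

# (position within the phoneme in percent, pitch in Hz)
PitchPoint = Tuple[int, int]
# (phoneme symbol, duration in ms, pitch targets)
Segment = Tuple[str, int, Sequence[PitchPoint]]


def to_pho(segments: Sequence[Segment]) -> str:
    """Serialize segments as .pho text, one phoneme per line."""
    lines: List[str] = []
    for symbol, duration_ms, pitch_points in segments:
        if duration_ms <= 0:
            raise ValueError(f"duration must be positive for {symbol!r}")
        fields = [symbol, str(duration_ms)]
        for position, pitch_hz in pitch_points:
            if not 0 <= position <= 100:
                raise ValueError(f"pitch position out of range for {symbol!r}")
            fields += [str(position), str(pitch_hz)]
        lines.append(" ".join(fields))
    return "\n".join(lines) + "\n"


def diphones(segments: Sequence[Segment]) -> List[str]:
    """List the diphone units (adjacent phoneme pairs) the sequence needs."""
    symbols = [symbol for symbol, _, _ in segments]
    return [f"{a}-{b}" for a, b in zip(symbols, symbols[1:])]


# Hypothetical transcription; "_" is the usual MBROLA silence symbol,
# the other symbols and all numeric values are illustrative assumptions.
EXAMPLE: List[Segment] = [
    ("_", 100, []),
    ("s", 90, []),
    ("a", 80, [(50, 120)]),
    ("l", 70, []),
    ("a:", 160, [(0, 125), (100, 105)]),
    ("m", 90, []),
    ("_", 100, []),
]

if __name__ == "__main__":
    print(to_pho(EXAMPLE), end="")
    print(diphones(EXAMPLE))
```

In a setup like this, the resulting text would typically be saved to a .pho file and passed to the MBROLA command-line tool together with a diphone database and an output audio file name. Prosody estimation of the kind the abstract mentions would amount to choosing the duration and pitch values, which this sketch simply hard-codes.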