
Deep Bidirectional RNNs Using Gated Recurrent Units & Long Short-Term Memory Units for Building Acoustic Models for Automatic Speech Recognition


Affiliations
1 Sardar Patel Institute of Technology, Andheri, Mumbai, Maharashtra, India
     



Abstract

Deep neural networks are increasingly used to train acoustic models on speech datasets for speech recognition. Much work has been done with a range of neural network models, from conventional convolutional neural networks to deep recurrent neural networks (RNNs). Prior research indicates that bidirectional RNNs are well suited to speech recognition and provide greater accuracy than deep RNNs and unidirectional RNNs. The units used in bidirectional RNNs are usually Long Short-Term Memory (LSTM) units, which have their own advantages and disadvantages. Gated Recurrent Units (GRUs) can be used instead. In this paper, we experiment with deep bidirectional models built from GRU units and from LSTM units and compare the two.

Keywords

Acoustic Modeling, Automatic Speech Recognition, Bidirectional RNN, Convolutional Neural Networks, Deep Recurrent Neural Networks, Gated Recurrent Unit, Keras, Long Short-Term Memory (LSTM), MFCC, Recurrent Neural Networks, TimeDistributed Dense, TensorFlow, Spectrogram.
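
One concrete difference between the two unit types compared in the abstract is model size. The sketch below is not taken from the paper: it counts trainable parameters in a stack of bidirectional layers using the standard textbook formulations, in which an LSTM has four weight blocks per unit and a GRU has three. The function names and the layer sizes (13 MFCC features, 128 hidden units, 2 layers) are assumptions chosen only for illustration, and some library implementations add extra bias terms that change the totals slightly.

```python
def recurrent_params(unit, input_dim, hidden_dim):
    """Trainable parameters in one unidirectional recurrent layer.

    Each weight block has an input matrix (hidden_dim x input_dim), a
    recurrent matrix (hidden_dim x hidden_dim), and a bias (hidden_dim).
    An LSTM has 4 blocks (input, forget, and output gates plus the cell
    candidate); a classic GRU has 3 (update and reset gates plus the
    candidate state).
    """
    blocks = {"lstm": 4, "gru": 3}[unit]
    return blocks * (hidden_dim * (input_dim + hidden_dim) + hidden_dim)


def deep_bidirectional_params(unit, input_dim, hidden_dim, num_layers):
    """Parameters in a stack of bidirectional layers with concatenated outputs."""
    total = 0
    layer_input = input_dim
    for _ in range(num_layers):
        # The forward and backward directions each have their own weights.
        total += 2 * recurrent_params(unit, layer_input, hidden_dim)
        # Concatenating both directions doubles the next layer's input width.
        layer_input = 2 * hidden_dim
    return total


if __name__ == "__main__":
    # Hypothetical sizes, not from the paper: 13 MFCCs, 128 units, 2 layers.
    for unit in ("lstm", "gru"):
        print(unit, deep_bidirectional_params(unit, 13, 128, 2))
```

Under these assumptions the GRU stack has three quarters of the parameters of the LSTM stack at the same width and depth, which is one reason GRUs are often considered when training cost matters. Parameter count alone says nothing about recognition accuracy.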




Authors

Madhuri Jain
Sardar Patel Institute of Technology, Andheri, Mumbai, Maharashtra, India
Nishita Dutta
Sardar Patel Institute of Technology, Andheri, Mumbai, Maharashtra, India
Dnyaneshwari Bhirud
Sardar Patel Institute of Technology, Andheri, Mumbai, Maharashtra, India
Nikahat Mulla
Sardar Patel Institute of Technology, Andheri, Mumbai, Maharashtra, India

