
Phoneme-Based Imagined Vowel Identification from Electroencephalographic Sub-Band Oscillations during Speech Imagery Procedures


Authors

Anandha Sree Retnapandian 1
Kavitha Anandan 1

Affiliations

1 Centre for Healthcare Technologies, Department of Biomedical Engineering, Sri Sivasubramaniya Nadar College of Engineering, Chennai 603 110, Tamil Nadu, India

Abstract

Speech Imagery (SI) refers to imagining speaking an intended utterance or a segment of speech. Decoding the SI process aids in building speech-based neural prosthetic devices. Although SI-based research has worked on decoding imagined speech for more than a decade, it still lags behind the naturalness of spoken language. In any natural language, words are built from combinations of phonemes, yet the research so far has involved the SI of vowels only. Hence, this work focuses on identifying vowels from EEG signals acquired while the corresponding phonemes are imagined. The acquisition process was repeated for multiple trials. The EEG signals were decomposed into five sub-band frequencies to analyze the activity during SI tasks. The energy coefficients extracted from the sub-bands were used to train a Recurrent Neural Network to classify the English vowels. Further, to emphasize the importance of training the classifier with multi-trial data, the results were compared with those of single-trial data acquired from the same set of participants; accuracies of 84.5% and 88.9% were achieved for the single-trial and multi-trial protocols, respectively, i.e., the multi-trial analysis achieved 4.4% higher accuracy than the single-trial analysis. Higher activation in the theta band during the speech imagery tasks, together with higher classification accuracy when theta-band features were applied, demonstrates the suitability of theta-band features for imagined speech decoding.
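The pipeline summarized above (discrete wavelet decomposition into sub-bands, energy coefficients as features, and a recurrent classifier) can be sketched in a few lines of Python. The sketch below is illustrative only: the 128 Hz sampling rate, db4 wavelet, 14-channel montage, window length, and network sizes are assumptions made for the example, not values reported in the paper.

```python
# A minimal, illustrative sketch of the described pipeline: discrete
# wavelet decomposition into sub-bands, relative energy features, and an
# Elman-style RNN classifier. All parameters below (fs = 128 Hz, db4
# wavelet, 14 channels, 2 s windows, hidden size) are assumptions for
# the example, not values from the paper.
import numpy as np
import pywt
import torch
import torch.nn as nn

def subband_energies(signal, wavelet="db4", level=5):
    """Relative wavelet energy per sub-band for one EEG channel.

    At fs = 128 Hz and level = 5, the detail coefficients roughly cover
    gamma (32-64 Hz), beta (16-32 Hz), alpha (8-16 Hz), theta (4-8 Hz)
    and delta (<4 Hz, split over the last detail and the approximation).
    """
    coeffs = pywt.wavedec(signal, wavelet, level=level)  # [cA5, cD5..cD1]
    energies = np.array([np.sum(c ** 2) for c in coeffs])
    return energies / energies.sum()                     # relative energy

class VowelRNN(nn.Module):
    """Elman-style RNN over the per-channel energy feature sequence."""
    def __init__(self, n_features=6, hidden=32, n_classes=5):
        super().__init__()
        self.rnn = nn.RNN(n_features, hidden, batch_first=True)
        self.fc = nn.Linear(hidden, n_classes)           # 5 English vowels

    def forward(self, x):                  # x: (batch, channels, features)
        _, h = self.rnn(x)                 # h: (1, batch, hidden)
        return self.fc(h.squeeze(0))       # (batch, n_classes) scores

# Toy usage: 8 trials x 14 channels x 2-second windows at 128 Hz.
trials = np.random.randn(8, 14, 256)
feats = torch.tensor(
    np.stack([[subband_energies(ch) for ch in trial] for trial in trials]),
    dtype=torch.float32,
)                                          # shape (8, 14, 6)
logits = VowelRNN()(feats)                 # shape (8, 5)
```

Feeding the RNN the per-channel energy vectors as a sequence over channels is just one plausible arrangement; a sequence of energy vectors over successive time windows would be an equally reasonable reading of the method.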

Keywords

Electroencephalography, Imagined Vowel Identification, Phoneme, Recurrent Neural Network, Speech Imagery.
References

  • Curley W H, Forgacs P B, Voss H U, Conte M M & Schiff N D, Characterization of EEG signals revealing covert cognition in the injured brain, Brain, 141(5) (2018) 1404–1421.
  • Tian X & Poeppel D, Mental imagery of speech: linking motor and perceptual systems through internal simulation and estimation, Front Hum Neurosci, 6 (2012) 314.
  • Conti E, Calderoni S, Marchi V, Muratori F, Cioni G & Guzzetta A, The first 1000 days of the autistic brain: a systematic review of diffusion imaging studies, Front Hum Neurosci, 9 (2015) 159.
  • Guenther F H, Brumberg J S, Wright E J, Nieto-Castanon A, Tourville J A, Panko M, Law R, Siebert S A, Bartels J L, Andreasen D S, Ehirim P, Mao H & Kennedy P R, A wireless brain-machine interface for real-time speech synthesis, PLoS One, 4(12) (2009) e8218.
  • DaSalla C S, Kambara H, Sato M & Koike Y, Single-trial classification of vowel speech imagery using common spatial patterns, Neural Netw, 22(9) (2009) 1334–1339.
  • Nuñez A I R, Yue Q, Pasalar S & Martin R C, The role of left vs. right superior temporal gyrus in speech perception: An fMRI-guided TMS study, Brain Lang, 209 (2020) 104838.
  • Guy V, Soriani M H, Bruno M, Papadopoulo T, Desnuelle C & Clerc M, Brain computer interface with the P300 speller: usability for disabled people with amyotrophic lateral sclerosis, Ann Phys Rehabil Med, 61(1) (2018) 5–11.
  • Uzawa S, Takiguchi T, Ariki Y & Nakagawa S, Spatiotemporal properties of magnetic fields induced by auditory speech sound imagery and perception, in 39th Annual International Conference of the IEEE Engineering in Medicine and Biology Society (IEEE) 2017, 2542–2545.
  • Min B, Kim J, Park H J & Lee B, Vowel imagery decoding toward silent speech BCI using extreme learning machine with electroencephalogram, Biomed Res Int, 2016.
  • Zhao S & Rudzicz F, Classifying phonological categories in imagined and articulated speech, in IEEE Int Conf Acoust Speech Signal Process (IEEE) 2015, 992–996.
  • Sun P & Qin J, Neural networks based EEG-speech models, arXiv preprint arXiv:1612.05369 (2016).
  • Saha P, Abdul-Mageed M & Fels S, Speak your mind! towards imagined speech recognition with hierarchical deep learning, arXiv preprint arXiv:1904.05746 (2019).
  • Islam M M & Shuvo M M H, DenseNet based speech imagery EEG signal classification using Gramian angular field, in 5th Int Conf Adv Electr Eng (IEEE) 2019, 149–154.
  • Nguyen C H, Karavas G K & Artemiadis P, Inferring imagined speech using EEG signals: a new approach using Riemannian manifold features, J Neural Eng, 15(1) (2018) 016002.
  • González-Castañeda E F, Torres-García A A, Reyes-García C A & Villaseñor-Pineda L, Sonification and textification: Proposing methods for classifying unspoken words from EEG signals, Biomed Signal Process Control, 37 (2017) 82–91.
  • Ramirez-Quintana J A, Macias-Macias J M, Ramirez-Alonso G, Chacon-Murguia M I, & Corral-Martinez L F, A novel deep capsule neural network for vowel imagery patterns from EEG signals, Biomed Signal Process Control, 81 (2023) 104500.
  • Macías-Macías J M, Ramírez-Quintana J A, Chacón-Murguía M I, Torres-García A A, & Corral-Martínez L F, Interpretation of a deep analysis of speech imagery features extracted by a capsule neural network, Comput Biol Med, (2023) 106909.
  • Pan H, Li Z, Tian C, Wang L, Fu Y, Qin X, & Liu F, The LightGBM-based classification algorithm for Chinese characters speech imagery BCI system, Cogn Neurodyn, (2022) 1–12.
  • Sandhya C, Sree R A & Kavitha A, Analysis of speech imagery using consonant-vowel syllable speech pairs and brain connectivity estimators, in 2nd Int Conf Biomed Syst Signals Images, February 2015.
  • Sree R A & Kavitha A, Vowel classification from imagined speech using sub-band EEG frequencies and deep belief networks, in 4th Int Conf Signal Process Commun Network (IEEE) 2017, 1–4.
  • Sandhya C & Kavitha A, Analysis of speech imagery using brain connectivity estimators on consonant-vowel-consonant words, Int J Biomed Eng Technol, 30(4) (2019) 329–343.
  • Chengaiyan S, Retnapandian A S & Anandan K, Identification of vowels in consonant–vowel–consonant words from speech imagery based EEG signals, Cogn Neurodyn, 14(1) (2020) 1–19.
  • Chen J, Jiang D, Zhang Y & Zhang P, Emotion recognition from spatiotemporal EEG representations with hybrid convolutional recurrent neural networks via wearable multichannel headset, Comput Commun, 154 (2020) 58–65.
  • Yu W, Kim I Y & Mechefske C, Analysis of different RNN autoencoder variants for time series classification and machine prognostics, Mech Syst Signal Process, 149 (2021) 107322.
  • Browarska N, Kawala-Sterniuk A, Zygarlicki J, Podpora M, Pelc M, Martinek R & Gorzelańczyk E J, Comparison of smoothing filters’ influence on quality of data recorded with the Emotiv EPOC Flex brain–computer interface headset during audio stimulation, Brain Sci, 11(1) (2021) 98.
  • Reichert C, Dürschmid S, Bartsch M V, Hopf J M, Heinze H J & Hinrichs H, Decoding the covert shift of spatial attention from electroencephalographic signals permits reliable control of a brain-computer interface, J Neural Eng, 17(5) (2020) 056012.
  • Xiong Q, Zhang X, Wang W F & Gu Y, A parallel algorithm framework for feature extraction of EEG signals on MPI, Comput Math Methods Med, 2020.
  • Shajil N, Mohan S, Srinivasan P, Arivudaiyanambi J & Murrugesan A A, Multiclass classification of spatially filtered motor imagery EEG signals using convolutional neural network for BCI based applications, J Med Biol Eng, 40(5) (2020) 663–672.
  • Sandhya C, Srinidhi G, Vaishali R, Visali M & Kavitha A, Analysis of speech imagery using brain connectivity estimators, in IEEE 14th Int Conf Cognit Info Cognit Comput (IEEE) 2015, 352–359.
  • Van Dijk H, Schoffelen J M, Oostenveld R & Jensen O, Prestimulus oscillatory activity in the alpha band predicts visual discrimination ability, J Neurosci, 28(8) (2008) 1816–1823.
  • Summerfield C & Mangels J A, Coherent theta-band EEG activity predicts item-context binding during encoding, Neuroimage, 24(3) (2005) 692–703.
  • Wang Z, Tong Y & Heng X, Phase-locking value based graph convolutional neural networks for emotion recognition, IEEE Access, 7 (2019) 93711–93722.
  • Tsai F F, Fan S Z, Lin Y S, Huang N E & Yeh J R, Investigating power density and the degree of nonlinearity in intrinsic components of anesthesia EEG by the Hilbert-Huang transform: an example using ketamine and alfentanil, PLoS One, 11(12) (2016) e0168108.
  • Yi W, Qiu S, Wang K, Qi H, Zhang L, Zhou P & Ming D, Evaluation of EEG oscillatory patterns and cognitive process during simple and compound limb motor imagery, PLoS One, 9(12) (2014) e114853.
  • Daubechies I, The wavelet transform, time-frequency localization and signal analysis, IEEE Trans Inf Theory, 36(5) (1990) 961–1005.
  • Alomari M H, Awada E A, Samaha A & Alkamha K, Wavelet-based feature extraction for the analysis of EEG signals associated with imagined fists and feet movements, Comput Inf Sci, 7(2) (2014) 17.
  • Elman J L, Finding structure in time, Cogn Sci, 14(2) (1990) 179–211.
  • Hoshi I, Shimobaba T, Kakue T & Ito T, Single-pixel imaging using a recurrent neural network combined with convolutional layers, Opt Express, 28(23) (2020) 34069–34078.
