Gregory Clell Burnett
March 1999
Department of Applied Science
The physiological basis of Glottal Electromagnetic Micropower Sensors (GEMS) and their use in defining an excitation function for the human vocal tract
Abstract
The definition, use, and physiological basis of Glottal Electromagnetic Micropower Sensors (GEMS) are presented. These sensors are a new type of low-power (< 20 milliwatts radiated), microwave-regime (900 MHz to 2.5 GHz), multi-purpose motion sensor developed at the Lawrence Livermore National Laboratory. The GEMS are sensitive to movement in an adjustable field of view (FOV) surrounding the antennae. In this thesis, the GEMS are used for speech research and are targeted to receive motion signals from the subglottal region of the trachea. The GEMS signal is analyzed to determine its physiological source, and this information is used to calculate the subglottal pressure, which is effectively an excitation function for the human vocal tract. For the first time, an excitation function may be calculated in near real time using a noninvasive procedure.
Several experiments and models are presented to demonstrate that the GEMS signal represents the motion of the subglottal posterior wall of the trachea as it vibrates in response to the pressure changes caused by the vocal folds as they modulate the airflow supplied by the lungs. The vibrational properties of the tracheal wall are modeled using a lumped-element circuit model.
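As a rough illustration of what a lumped-element description can look like, the sketch below treats the wall as a single driven mass-spring-damper, which maps onto a series RLC circuit under the usual analogy (force to voltage, velocity to current, mass to inductance, damping to resistance, compliance to capacitance). The single-resonator topology, the function names, and every numerical value are assumptions made for illustration; they are not taken from the thesis and are not measured tracheal properties.

```python
import numpy as np

def natural_frequency(mass, stiffness):
    """Undamped resonant frequency in Hz of a mass-spring element."""
    return np.sqrt(stiffness / mass) / (2.0 * np.pi)

def displacement_response(freq_hz, mass, damping, stiffness):
    """Complex displacement per unit driving force at each frequency.

    Steady-state solution of m*x'' + r*x' + k*x = F for a sinusoidal force,
    i.e. X/F = 1 / (k - m*w^2 + j*w*r).
    """
    w = 2.0 * np.pi * np.asarray(freq_hz, dtype=float)
    return 1.0 / (stiffness - mass * w**2 + 1j * w * damping)

# Hypothetical element values, chosen only so the resonance sits at 100 Hz.
mass = 1e-3                                   # kg (placeholder)
stiffness = (2.0 * np.pi * 100.0) ** 2 * mass  # N/m (placeholder)
damping = 0.1                                 # N*s/m (placeholder)

freqs = np.linspace(0.0, 400.0, 401)          # 1 Hz grid
gain = np.abs(displacement_response(freqs, mass, damping, stiffness))
print("resonance (Hz):", natural_frequency(mass, stiffness))
print("peak of |X/F| near (Hz):", freqs[np.argmax(gain)])
```

With light damping the response peaks close to the undamped resonance, so a model of this kind predicts which components of the pressure variation the wall motion emphasizes.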
Taking the output of the vocal tract to be the audio pressure captured by a microphone and the input to be the subglottal pressure, the transfer function of the vocal tract (including the nasal cavities) can be approximated every 10-30 milliseconds using an autoregressive moving-average model. Unlike the current method of transfer function approximation, this new method involves only noninvasive GEMS measurements and digital signal processing. It does not require the difficult task of obtaining precise physical measurements of the tract and then estimating the transfer function from its cross-sectional area.
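When both the input and the output of a system are known over a short frame, a pole-zero (autoregressive moving-average) model can be fitted by ordinary least squares. The sketch below is a minimal illustration of that idea on synthetic data, not the procedure used in the thesis: the function names, the model orders, the 240-sample frame (30 ms at an assumed 8 kHz sampling rate), and the test filter are all assumptions.

```python
import numpy as np

def simulate(x, b, a):
    """Run x through y[n] = sum_k b[k]*x[n-k] - sum_{k>=1} a[k]*y[n-k], with a[0] = 1."""
    y = np.zeros(len(x))
    for n in range(len(x)):
        acc = sum(b[k] * x[n - k] for k in range(len(b)) if n - k >= 0)
        acc -= sum(a[k] * y[n - k] for k in range(1, len(a)) if n - k >= 0)
        y[n] = acc
    return y

def fit_pole_zero(x, y, n_poles, n_zeros):
    """Least-squares estimate of (b, a) from one frame of input x and output y."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    start = max(n_poles, n_zeros)
    rows = []
    for n in range(start, len(y)):
        past_outputs = [-y[n - k] for k in range(1, n_poles + 1)]
        inputs = [x[n - k] for k in range(0, n_zeros + 1)]
        rows.append(past_outputs + inputs)
    theta, *_ = np.linalg.lstsq(np.array(rows), y[start:], rcond=None)
    a = np.concatenate(([1.0], theta[:n_poles]))  # denominator (poles)
    b = theta[n_poles:]                           # numerator (zeros)
    return b, a

# Synthetic frame: a stand-in excitation and a hypothetical "tract" filter.
rng = np.random.default_rng(0)
excitation = rng.standard_normal(240)
true_b, true_a = [1.0, 0.4], [1.0, -0.5, 0.2]
audio = simulate(excitation, true_b, true_a)

est_b, est_a = fit_pole_zero(excitation, audio, n_poles=2, n_zeros=1)
print("estimated b:", est_b)
print("estimated a:", est_a)
```

On noise-free synthetic data the fit recovers the generating coefficients; with real signals the frame length and model orders would have to be chosen to suit the data, and repeating the fit frame by frame gives a time-varying estimate of the transfer function.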
The ability to measure the physical motion of the trachea enables a significant number of potential applications, ranging from very accurate pitch detection to speech synthesis, speaker verification, and speech recognition.