Speech synthesis by glottal excited linear prediction.
MedLine Citation:
PMID:  7963019     Owner:  NLM     Status:  MEDLINE    
Abstract/OtherAbstract:
This paper describes a linear predictive (LP) speech synthesis procedure that resynthesizes speech using a 6th-order polynomial waveform to model the glottal excitation. The coefficients of the polynomial model form a vector that represents the glottal excitation waveform for one pitch period. A glottal excitation codebook with 32 entries for voiced excitation is designed and trained using two sentences spoken by different speakers. The purpose of this approach is to demonstrate that quantization of the glottal excitation waveform does not significantly degrade the quality of speech synthesized with a glottal excitation linear predictive (GELP) synthesizer. This implementation of the LP synthesizer is patterned after both a pitch-excited LP speech synthesizer and a code-excited linear predictive (CELP) speech coder. In addition to the glottal excitation codebook, we use a stochastic codebook with 256 entries for unvoiced noise excitation. Analysis techniques are described for constructing both codebooks. The GELP synthesizer, which resynthesizes speech with high quality, provides the speech scientist with a simple speech synthesis procedure that uses established analysis techniques, is able to reproduce all speech sounds, and has an excitation model waveform that is related to the derivative of the glottal flow and the integral of the residue. It is conjectured that the glottal excitation codebook approach could provide a mechanism for quantitatively comparing glottal excitation codebooks for male and female speakers, for speakers with vocal disorders, and for speakers with different voice types such as breathy and vocal fry voices. Conceivably, one could also convert the voice of a speaker with one voice type, e.g., breathy, to the voice of a speaker with another voice type, e.g., vocal fry, by synthesizing speech using the vocal tract LP parameters for the speaker with the breathy voice excited by the glottal excitation codebook trained for vocal fry.
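The voiced path outlined in the abstract (polynomial coefficients for one pitch period, quantization against a 32-entry codebook, excitation of an LP filter) can be illustrated with a minimal Python sketch. This is not the authors' implementation. The function names, the random codebook, the pitch period length, the LP coefficients, and the sign convention of the all-pole filter are all assumptions made for illustration. Polynomial fitting, codebook training, and the 256-entry stochastic codebook for unvoiced excitation are omitted.

```python
import random

POLY_ORDER = 6        # 6th-order polynomial, so 7 coefficients per pitch period
VOICED_ENTRIES = 32   # voiced codebook size quoted in the abstract


def eval_excitation(coeffs, period_len):
    """Evaluate the polynomial over one pitch period, with t in [0, 1)."""
    samples = []
    for n in range(period_len):
        t = n / period_len
        acc = 0.0
        for c in coeffs:              # Horner's rule, highest power first
            acc = acc * t + c
        samples.append(acc)
    return samples


def nearest_entry(coeffs, codebook):
    """Index of the codebook vector closest to coeffs (squared Euclidean distance)."""
    def dist(entry):
        return sum((a - b) ** 2 for a, b in zip(coeffs, entry))
    return min(range(len(codebook)), key=lambda i: dist(codebook[i]))


def lp_synthesize(excitation, lp_coeffs):
    """All-pole filter, assumed convention: s[n] = e[n] + sum_k a[k] * s[n-k]."""
    out = []
    for n, e in enumerate(excitation):
        s = e
        for k, a in enumerate(lp_coeffs, start=1):
            if n - k >= 0:
                s += a * out[n - k]
        out.append(s)
    return out


# Toy data (assumption): a random codebook stands in for a trained one.
rng = random.Random(0)
codebook = [[rng.gauss(0.0, 1.0) for _ in range(POLY_ORDER + 1)]
            for _ in range(VOICED_ENTRIES)]

# Pretend analysis produced coefficients close to entry 5, then quantize them.
target = [c + 0.001 for c in codebook[5]]
index = nearest_entry(target, codebook)

# Rebuild one pitch period from the quantized entry and filter it.
# The period length and LP coefficients are made up (poles at 0.5 and 0.4).
excitation = eval_excitation(codebook[index], period_len=80)
speech = lp_synthesize(excitation, lp_coeffs=[0.9, -0.2])
```

Only the codebook index and the LP parameters would need to be stored per pitch period in such a scheme, which is the sense in which the excitation waveform is quantized.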
Authors:
D G Childers; H T Hu
Publication Detail:
Type:  Clinical Trial; Comparative Study; Journal Article; Randomized Controlled Trial; Research Support, U.S. Gov't, Non-P.H.S.; Research Support, U.S. Gov't, P.H.S.    
Journal Detail:
Title:  The Journal of the Acoustical Society of America     Volume:  96     ISSN:  0001-4966     ISO Abbreviation:  J. Acoust. Soc. Am.     Publication Date:  1994 Oct 
Date Detail:
Created Date:  1994-12-20     Completed Date:  1994-12-20     Revised Date:  2007-11-14    
Medline Journal Info:
Nlm Unique ID:  7503051     Medline TA:  J Acoust Soc Am     Country:  UNITED STATES    
Other Details:
Languages:  eng     Pagination:  2026-36     Citation Subset:  IM    
Affiliation:
Department of Electrical Engineering, University of Florida, Gainesville 32611-2024.
MeSH Terms
Descriptor/Qualifier:
Communication Aids for Disabled*
Equipment Design
Female
Glottis*
Humans
Male
Models, Anatomic
Sex Factors
Sound Spectrography
Speech*
Speech Acoustics
Speech Perception*
Stochastic Processes
Voice Quality
Grant Support
ID/Acronym/Agency:
NIDCD DC 00577/DC/NIDCD NIH HHS

From MEDLINE®/PubMed®, a database of the U.S. National Library of Medicine