Developing a bilingual communication aid for a Japanese ALS patient using voice conversion technique.
MedLine Citation:
PMID:  18532521     Owner:  NLM     Status:  In-Data-Review    
Abstract/OtherAbstract:
A bilingual communication aid for a Japanese amyotrophic lateral sclerosis (ALS) patient has been developed. Our previous research showed that a corpus-based speech synthesis method is ideal for synthesizing speech with voice quality identifiable as the patient's own. However, such a system requires recording a large amount of speech, which is a burden for the patient. In this study, a voice conversion technique was applied so that fewer recordings are needed for synthesis. An English speech synthesis system with the patient's voice was developed using Festival, a corpus-based speech synthesizer, with a voice conversion technique. Two methods for Japanese speech synthesis were attempted using the HTS toolkit. The first used an acoustic model built from all 503 recordings of the patient. The second used an acoustic model built from 503 wave files in which a native speaker's voice had been converted to the patient's. The latter method requires fewer recordings of the patient. A perceptual experiment showed that speech synthesized with the latter method was perceived as closer in voice quality to the patient's natural speech. Finally, a GUI for Windows was developed so that the patient can synthesize speech by typing in text.
Authors:
Akemi Iida; Shimpei Kajima; Keiichi Yasu; John M Kominek; Yasuhiro Aikawa; Takayuki Arai
Publication Detail:
Type:  Journal Article    
Journal Detail:
Title:  The Journal of the Acoustical Society of America     Volume:  123     ISSN:  1520-8524     ISO Abbreviation:  J. Acoust. Soc. Am.     Publication Date:  2008 May 
Date Detail:
Created Date:  2008-06-05     Completed Date:  -     Revised Date:  -    
Medline Journal Info:
Nlm Unique ID:  7503051     Medline TA:  J Acoust Soc Am     Country:  United States    
Other Details:
Languages:  eng     Pagination:  3884     Citation Subset:  IM    
Affiliation:
School of Media Science, Tokyo University of Technology, 1404-1, Katakura-cho, Hachiouji, 192-0982 Tokyo, Japan, ake@media.teu.ac.jp.
From MEDLINE®/PubMed®, a database of the U.S. National Library of Medicine

