This paper describes a waveform-based Japanese speech synthesis method that has been successfully incorporated into the ANSER (Automatic answer Network System for Electronic Request) system, which is widely used for banking services in Japan. The method produces highly intelligible speech comparable to natural voice. Its key features are a waveform dictionary containing waveforms chosen for efficient pitch control, waveform units based on Japanese CV (consonant-vowel) syllables, accurate accent control, and efficient waveform concatenation based on signal interpolation. For 500 Japanese family names used in actual service, the method attained an intelligibility of 90%, compared with 79% for the current LSP-CVC method.
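As an illustration of what concatenation based on signal interpolation can look like, the following minimal sketch joins two waveform units by linearly interpolating between them over a short overlap region. It is not the method used in ANSER, whose details are not given here; the function name `crossfade_concat`, the `overlap` parameter, and the choice of linear weights are all assumptions made for this example.

```python
def crossfade_concat(left, right, overlap):
    """Join two waveform units (lists of float samples).

    Hypothetical helper: the last `overlap` samples of `left` and the first
    `overlap` samples of `right` are blended by linear interpolation, so the
    joined signal moves smoothly from one unit to the next.
    """
    if overlap < 0 or overlap > min(len(left), len(right)):
        raise ValueError("overlap must fit inside both units")

    start = len(left) - overlap
    head = left[:start]       # part of the left unit kept unchanged
    tail = right[overlap:]    # part of the right unit kept unchanged

    blended = []
    for i in range(overlap):
        # Weight rises from near 0 to near 1 across the overlap region.
        w = (i + 1) / (overlap + 1)
        blended.append((1 - w) * left[start + i] + w * right[i])

    return head + blended + tail
```

For example, `crossfade_concat([1.0] * 4, [0.0] * 4, 3)` returns `[1.0, 0.75, 0.5, 0.25, 0.0]`: the joined signal has `len(left) + len(right) - overlap` samples and steps evenly from the level of the first unit to that of the second.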