A conventional HMM-based speech synthesis system for Hanoi Vietnamese often suffers from hoarse quality due to incomplete F0 parameterization of glottalized tones. Since estimating F0 from glottalized waveform is rather problematic for usual F0 extractors, we propose a pitch marking algorithm where pitch marks are propagated from regular regions of a speech signal to glottalized ones, from which complete F0 contours for the glottalized tones are derived. The proposed F0 parameterization scheme was confirmed to significantly reduce the hoarseness whilst slightly improving the tone naturalness of synthetic speech by both objective and listening tests. The pitch marking algorithm works as a refinement step based on the results of an F0 extractor. Therefore, the proposed scheme can be combined with any F0 extractor.
展开▼