In this paper, we investigate emotionally charged hot-spot jVj-words from a corpus based on recordings of puppet plays in Slovak. We tested the potential of these hot-spot words for detecting emotion in larger utterances. More specifically, we tested how prosodic and voice quality characteristics, and the presence or absence of lexical context, affect the perception of the emotions that the jVj-words convey. We found that the lexical cues present in the context are better predictors of the perception of emotions than the prosodic and voice quality features of the jVj-words themselves. Nevertheless, both prosodic and voice quality features are useful and complementary in detecting the emotion of individual words from the speech signal as well as of larger utterances. Finally, we argue that a corpus based on recordings of puppet plays presents a novel and advantageous approach to collecting data for emotional speech research.