Unsupervised language model adaptation for speech recognition is challenging, particularly for complicated tasks such as the transcription of broadcast news (BN) data. This paper presents an unsupervised adaptation method for language modeling based on information retrieval techniques. The method is designed for the broadcast news transcription task, where the topics of the audio data cannot be predicted in advance. Experiments are carried out using the LIMSI American English BN transcription system and the NIST 1999 BN evaluation sets. The unsupervised adaptation method reduces perplexity by 7% relative to the baseline LM and yields a 2% relative improvement for a 10xRT system.