Journal: EURASIP Journal on Audio, Speech, and Music Processing

Statistical analysis of orthographic and phonemic language corpus for word-based and phoneme-based Polish language modelling


Abstract

This article presents original results of a statistical analysis of the Polish language, based on an orthographic and a phonemic language corpus. The phonemic corpus for Polish was developed by automatic grapheme-to-phoneme conversion of the source orthographic corpus, which was obtained from the National Corpus of Polish (NCP). The corpus contains the most frequently used Polish words, written in phonemic notation. The statistical analysis covers the frequency of occurrence of orthographic and phonemic language components, as well as of their sequences. The resulting statistical data make it possible to develop statistical word-based and phoneme-based language models for Polish. Applying these language models can help improve the efficiency of automatic speech recognition for Polish.
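
The frequency counts described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the boundary markers, and the three toy symbol sequences are assumptions made for illustration only, and none of the data comes from the NCP. The sketch counts single symbols and symbol pairs, then derives a maximum-likelihood estimate of one symbol following another, which is the basic ingredient of a bigram language model.

```python
from collections import Counter

# Hypothetical boundary markers, used only when padding is requested.
START, END = "<s>", "</s>"

# Toy symbol sequences for illustration; NOT data from the corpus.
TOY_WORDS = [
    ["k", "o", "t"],
    ["k", "t", "o"],
    ["o", "k", "o"],
]


def ngram_counts(sequences, n, pad=False):
    """Count n-grams (as tuples) over an iterable of symbol sequences."""
    counts = Counter()
    for seq in sequences:
        symbols = list(seq)
        if pad:
            # Mark sequence boundaries so that edge symbols get a context.
            symbols = [START] * (n - 1) + symbols + [END]
        for i in range(len(symbols) - n + 1):
            counts[tuple(symbols[i:i + n])] += 1
    return counts


def conditional_probability(bigrams, prev, cur):
    """Maximum-likelihood estimate of P(cur | prev) from bigram counts."""
    # Total count of all bigrams whose first symbol is `prev`.
    history_total = sum(c for (first, _), c in bigrams.items() if first == prev)
    if history_total == 0:
        return 0.0  # unseen history; a real model would apply smoothing
    return bigrams[(prev, cur)] / history_total


unigrams = ngram_counts(TOY_WORDS, 1)
bigrams = ngram_counts(TOY_WORDS, 2, pad=True)

print(unigrams.most_common())
print(conditional_probability(bigrams, "k", "o"))
```

The same functions work unchanged on word sequences: passing lists of words instead of lists of phoneme symbols gives word-level counts. A practical model would also need smoothing for unseen sequences, which this sketch omits.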
