Conference on Speech Technology and Human-Computer Dialogue
A study on the statistical structure of words and of word digrams in a literary Romanian corpus

Abstract

Resuming and extending an original method for verifying natural language stationarity, the paper presents a study of the statistical structure of words and of word digrams (groups of two successive words) in printed Romanian. The paper also evaluates natural language redundancy based on word digrams. The experimental study was carried out on a literary corpus of novels and short stories totaling over 12.5 million words, with orthography and punctuation marks included.
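As an illustration of the quantities the abstract names, the following minimal sketch counts word digrams in a token stream and derives a rough redundancy estimate from them. It is not the authors' method: the tokenizer, the function names, and the redundancy formula (one minus the ratio of the empirical conditional entropy of the second word given the first to the logarithm of the vocabulary size) are all assumptions chosen for illustration, and the abstract does not state which definition the paper uses.

```python
import math
import re
from collections import Counter


def tokenize(text):
    # Hypothetical tokenizer (an assumption, not from the paper):
    # lowercase the text, keep words and punctuation marks as separate tokens.
    return re.findall(r"\w+|[^\w\s]", text.lower())


def entropy(counter):
    # Shannon entropy in bits of the empirical distribution given by the counts.
    total = sum(counter.values())
    return -sum((c / total) * math.log2(c / total) for c in counter.values())


def digram_counts(tokens):
    # A word digram is a pair of two successive tokens.
    return Counter(zip(tokens, tokens[1:]))


def redundancy_from_digrams(tokens):
    # Assumed definition: R = 1 - H(W2 | W1) / log2(V), where V is the
    # vocabulary size and H(W2 | W1) = H(W1, W2) - H(W1) is computed from
    # the empirical digram counts.
    digrams = digram_counts(tokens)
    first_words = Counter(tokens[:-1])  # marginal of the first digram word
    h_cond = entropy(digrams) - entropy(first_words)
    h_max = math.log2(len(set(tokens)))
    if h_max == 0:
        return 0.0
    return 1.0 - h_cond / h_max


if __name__ == "__main__":
    sample = "Ana are mere. Ana are pere."
    toks = tokenize(sample)
    print(digram_counts(toks).most_common(3))
    print(redundancy_from_digrams(toks))
```

On a corpus of realistic size, estimates of this kind depend on how sparse the digram counts are, so the value from a short sample should not be read as a property of the language.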