Conference: International Conference on Asia-Pacific Digital Libraries

Word Segmentation for Text in Japanese Ancient Writings Based on Probability of Character N-Grams


Abstract

Few tools are currently available for segmenting ancient Japanese sentences into words, which makes it difficult to extract archaic Japanese words from ancient Japanese writings. We propose a word segmentation method for such writings. For each character n-gram, we calculate the likelihood that it is a word, and we extract the n-grams with higher likelihood as archaic Japanese words. We conducted word segmentation experiments using this term likelihood.
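The abstract does not give the likelihood formula, so the following is only a minimal sketch of the general idea and not the authors' method. It assumes a hypothetical score (the relative frequency of an n-gram among all n-grams of the same length) and a dynamic-programming split that maximizes the summed log scores. The function names, the `floor` value for unseen n-grams, and the toy input are all illustrative assumptions.

```python
from collections import Counter
import math


def ngram_counts(corpus, max_n):
    """Count every character n-gram of length 1..max_n in each line."""
    counts = Counter()
    for line in corpus:
        for n in range(1, max_n + 1):
            for i in range(len(line) - n + 1):
                counts[line[i:i + n]] += 1
    return counts


def term_likelihood(counts):
    """Assumed score: frequency of an n-gram relative to all n-grams of equal length."""
    totals = Counter()
    for gram, c in counts.items():
        totals[len(gram)] += c
    return {gram: c / totals[len(gram)] for gram, c in counts.items()}


def segment(text, scores, max_n, floor=1e-9):
    """Split text into n-grams so that the sum of log scores is maximal."""
    best = [0.0] + [-math.inf] * len(text)  # best[i]: best score for text[:i]
    back = [0] * (len(text) + 1)            # back[i]: start of the last piece
    for end in range(1, len(text) + 1):
        for start in range(max(0, end - max_n), end):
            # Unseen n-grams get the small floor score (an assumption).
            piece_score = math.log(scores.get(text[start:end], floor))
            if best[start] + piece_score > best[end]:
                best[end] = best[start] + piece_score
                back[end] = start
    # Follow the back-pointers from the end of the text to recover the pieces.
    words = []
    end = len(text)
    while end > 0:
        words.append(text[back[end]:end])
        end = back[end]
    return words[::-1]


# Toy usage on a placeholder string, not real archaic Japanese text.
counts = ngram_counts(["abab"], max_n=2)
scores = term_likelihood(counts)
print(segment("abab", scores, max_n=2))
```

A real system would estimate the scores from a large corpus of ancient writings and would likely use a more discriminative likelihood than raw relative frequency.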
