Word Segmentation for Text in Japanese Ancient Writings Based on Probability of Character N-Grams


Abstract

Currently, few tools are available for segmenting ancient Japanese sentences into words, which makes it difficult to extract archaic Japanese words from ancient Japanese writings. We propose a word segmentation method for such writings: we calculate the likelihood that each character n-gram is a word and extract the n-grams with higher likelihood as archaic Japanese words. We conducted word segmentation experiments using this term likelihood.
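
The general idea of scoring character n-grams and keeping the highest-scoring ones as word candidates can be sketched in Python as follows. The abstract does not give the paper's likelihood formula, so this sketch assumes a simple stand-in: the relative frequency of an n-gram among all n-grams of the same length. The function names, the `max_n`, `min_n`, and `top_k` parameters, and the sample string are hypothetical and chosen only for illustration.

```python
from collections import Counter


def char_ngram_counts(text, max_n=3):
    """Count every character n-gram of length 1..max_n in the text."""
    counts = Counter()
    for n in range(1, max_n + 1):
        for i in range(len(text) - n + 1):
            counts[text[i:i + n]] += 1
    return counts


def word_likelihood(ngram, counts, totals):
    """Assumed score (not the paper's formula): relative frequency of the
    n-gram among all n-grams of the same length."""
    return counts[ngram] / totals[len(ngram)]


def extract_candidates(text, max_n=3, min_n=2, top_k=5):
    """Return the top_k n-grams (length >= min_n) ranked by the assumed score."""
    counts = char_ngram_counts(text, max_n)
    # Total number of n-gram occurrences for each length n.
    totals = Counter()
    for gram, c in counts.items():
        totals[len(gram)] += c
    scored = [
        (word_likelihood(gram, counts, totals), gram)
        for gram in counts
        if len(gram) >= min_n
    ]
    # Highest score first; ties are broken alphabetically for determinism.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [gram for _, gram in scored[:top_k]]


if __name__ == "__main__":
    # Illustrative sample only; a real run would use a large corpus.
    sample = "はるはあけぼのはるはあけぼの"
    print(extract_candidates(sample, max_n=3, top_k=5))
```

A frequency-only score is the simplest possible choice and tends to favor short, common character sequences. A practical system would likely combine it with other evidence, such as a dictionary or boundary statistics.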

Bibliographic record

Source
International Conference on Asia-Pacific Digital Libraries | 2012 | 4 pages
Authors
Mamoru Yoshimura; Fuminori Kimura; Akira Maeda
Original format: PDF
Chinese Library Classification: G250.76-53

