International Journal of Computer Processing of Oriental Languages

Transliteration Using a Network of Phoneme Chunks



Abstract

In this paper, we present methods for transliteration and back-transliteration. In Korean technical and web documents, many English and Japanese words are transliterated into Korean. These transliterated words are usually technical terms and proper nouns, so they are hard to find in a dictionary, and an automatic transliteration system is therefore needed. Previous transliteration models restrict the context information used for each letter to two or three letters. However, most transliteration phenomena cannot be explained by a single standard rule, especially in Korean. Different rules apply to each transliteration, depending on factors such as the origin of the word and the profession of the users. Restricting the information length may therefore lose the information that discriminates between these rules. We propose methods that find similar words having the longest overlap with an input word. To find similar words without losing the information specific to each transliteration rule, we use phoneme chunks that have no length limit. An input word is transliterated by merging phoneme chunks. With the proposed method, we obtained 86% character accuracy and 53% word accuracy in an English-to-Korean transliteration test.
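
The idea of preferring the longest overlapping chunk can be illustrated with a small sketch. The abstract does not describe how the network of phoneme chunks is built or searched, so the code below is only a simplified stand-in: it repeatedly takes the longest substring of the input that occurs in a toy aligned lexicon, copies the aligned target units, and recurses on the remainders. The names `LEXICON`, `longest_chunk` and `transliterate`, the letter-level alignments, and the romanized placeholder outputs (used instead of Hangul) are all assumptions made for illustration.

```python
from typing import List, Optional, Tuple

# Hypothetical toy lexicon. Each entry aligns single source letters with
# romanized placeholder target units; "" marks a letter with no output.
Aligned = List[Tuple[str, str]]

LEXICON: List[Aligned] = [
    [("n", "n"), ("i", "ai"), ("g", ""), ("h", ""), ("t", "t")],  # "night"
    [("t", "t"), ("i", "i"), ("n", "n")],                         # "tin"
]


def longest_chunk(word: str, lexicon: List[Aligned]) -> Optional[Tuple[int, int, List[str]]]:
    """Find the longest substring of `word` that occurs in some lexicon entry.

    Returns (start, length, target_units), or None if no letter matches.
    Source units are single letters, so string offsets equal entry indices.
    """
    for length in range(len(word), 0, -1):          # longest candidates first
        for start in range(len(word) - length + 1):
            piece = word[start:start + length]
            for entry in lexicon:
                source = "".join(s for s, _ in entry)
                pos = source.find(piece)
                if pos != -1:
                    units = [t for _, t in entry[pos:pos + length]]
                    return start, length, units
    return None


def transliterate(word: str, lexicon: List[Aligned]) -> List[str]:
    """Cover `word` with chunks, longest chunk first, then recurse on both sides."""
    if not word:
        return []
    hit = longest_chunk(word, lexicon)
    if hit is None:
        return ["?"] * len(word)                    # no known letter at all
    start, length, units = hit
    left = transliterate(word[:start], lexicon)
    right = transliterate(word[start + length:], lexicon)
    return left + units + right


if __name__ == "__main__":
    print("".join(transliterate("tight", LEXICON)))
```

For the input "tight", the four-letter chunk "ight" is taken from "night", so "i" keeps the output it has in that context. A model limited to a two-letter window could just as well pick the "ti" of "tin" and assign "i" a different output, which is the kind of lost discriminative information the abstract refers to.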
