Conference: International Conference on Asian Language Processing

Enhancing the Quality of Phrase-Table in Statistical Machine Translation for Less-Common and Low-Resource Languages

Abstract

The phrase-table plays an important role in a traditional phrase-based statistical machine translation (SMT) system. During translation, a phrase-based SMT system relies heavily on the phrase-table to generate its output. In this paper, we propose two methods for enhancing the quality of the phrase-table. The first method recomputes phrase-table weights using the similarity of vector representations. The second method enriches the phrase-table by integrating new phrase pairs drawn from an extended dictionary and from projections of word vector representations onto the target-language space. Our methods yield gains of up to 0.21 and 0.44 BLEU on in-domain and cross-domain (Asian Language Treebank, ALT) English-Vietnamese datasets, respectively.
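The abstract does not say how phrase vectors are built, how the projection is learned, or how the similarity is combined with the existing phrase-table weights. The sketch below is therefore only an illustration under stated assumptions: phrase vectors are averaged word vectors, the source-to-target projection is a least-squares linear map fitted on a seed dictionary, and similarity is cosine. All function names and the toy two-dimensional embeddings are hypothetical and are not taken from the paper.

```python
import numpy as np

def phrase_vector(phrase, embeddings):
    """Average the vectors of known words in a phrase (an assumed composition rule)."""
    vectors = [embeddings[w] for w in phrase.split() if w in embeddings]
    return np.mean(vectors, axis=0) if vectors else None

def cosine(u, v):
    """Cosine similarity, returning 0.0 when either vector has zero norm."""
    denom = np.linalg.norm(u) * np.linalg.norm(v)
    return float(u @ v / denom) if denom > 0 else 0.0

def fit_projection(src_matrix, tgt_matrix):
    """Least-squares matrix W such that src_matrix @ W approximates tgt_matrix."""
    W, *_ = np.linalg.lstsq(src_matrix, tgt_matrix, rcond=None)
    return W

def similarity_feature(src_phrase, tgt_phrase, src_emb, tgt_emb, W):
    """Cosine between the projected source phrase vector and the target phrase vector."""
    s = phrase_vector(src_phrase, src_emb)
    t = phrase_vector(tgt_phrase, tgt_emb)
    if s is None or t is None:
        return 0.0
    return cosine(s @ W, t)

def nearest_target(src_word, src_emb, tgt_emb, W):
    """Target word closest to the projection of src_word (a candidate new phrase pair)."""
    projected = src_emb[src_word] @ W
    return max(tgt_emb, key=lambda t: cosine(projected, tgt_emb[t]))

# Toy embeddings (hypothetical): the target space is the source space with axes swapped.
src_emb = {"red": np.array([1.0, 0.0]), "house": np.array([0.0, 1.0])}
tgt_emb = {"đỏ": np.array([0.0, 1.0]), "nhà": np.array([1.0, 0.0])}

# Seed dictionary used to fit the projection (hypothetical).
seed = [("red", "đỏ"), ("house", "nhà")]
W = fit_projection(
    np.stack([src_emb[s] for s, _ in seed]),
    np.stack([tgt_emb[t] for _, t in seed]),
)

good = similarity_feature("red house", "nhà đỏ", src_emb, tgt_emb, W)
bad = similarity_feature("red", "nhà", src_emb, tgt_emb, W)
candidate = nearest_target("house", src_emb, tgt_emb, W)
```

In this toy setup the matching pair scores higher than the mismatched one, and the projected vector for "house" lands nearest to "nhà". A score of this kind could serve as an extra signal when reweighting phrase pairs, and a nearest-neighbour lookup of this kind could propose new entries, but the actual formulation used in the paper may differ.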
