Journal: Computer Speech and Language

Parallel fragments: Measuring their impact on translation performance



Abstract

The lack of parallel corpora has steered research toward other resources to fill the gap. Comparable corpora have proved to be a valuable resource in this regard. Interestingly, beyond the parallel sentences extracted from comparable corpora, parallel phrase fragments have also proved beneficial for statistical machine translation. We present a novel approach, based on an efficient framework, for extracting parallel fragments from comparable corpora. Using the fragments as an additional corpus for translation, we obtain improvements of 0.88 and 0.89 BLEU points on test data for the Arabic-English and French-English systems, respectively. We also conduct a detailed analysis of the impact of fragments extracted from related versus non-related corpora. A comparison of the impact of parallel fragments versus parallel sentences is also presented, highlighting the significance of parallel segments for statistical machine translation. The article concludes with a rough comparative analysis of our approach against an existing fragment extraction technique at various stages of the fragment extraction pipeline.
