Selecting Syntactic, Non-redundant Segments in Active Learning for Machine Translation

Abstract

Active learning is a framework that makes it possible to efficiently train statistical models by selecting informative examples from a pool of unlabeled data. Previous work has found this framework effective for machine translation (MT), making it possible to train better translation models with less effort, particularly when annotators translate short phrases instead of full sentences. However, previous methods for phrase-based active learning in MT fail to consider whether the selected units are coherent and easy for human translators to translate, and also have problems with selecting redundant phrases with similar content. In this paper, we tackle these problems by proposing two new methods for selecting more syntactically coherent and less redundant segments in active learning for MT. Experiments using both simulation and extensive manual translation by professional translators find the proposed method effective, achieving both greater gain of BLEU score for the same number of translated words, and allowing translators to be more confident in their translations.
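
The abstract does not spell out the two proposed selection methods, so the following is only a generic illustration of the problem it describes, not the authors' algorithm. It is a minimal sketch that assumes a plain frequency-based n-gram selector over a whitespace-tokenized pool, plus a simple containment filter so that a phrase is skipped when a longer phrase that contains it has already been chosen. The names `select_segments`, `contains`, `covered`, `max_n`, and `budget` are hypothetical and introduced for this example only.

```python
from collections import Counter


def ngrams(tokens, max_n):
    """Yield every contiguous n-gram (as a tuple) of length 1..max_n."""
    for n in range(1, max_n + 1):
        for i in range(len(tokens) - n + 1):
            yield tuple(tokens[i:i + n])


def contains(longer, shorter):
    """Return True if `shorter` occurs contiguously inside `longer`."""
    n = len(shorter)
    return any(longer[i:i + n] == shorter for i in range(len(longer) - n + 1))


def select_segments(pool, covered, max_n=3, budget=5):
    """Pick up to `budget` frequent n-grams that are not yet translated.

    Assumption for illustration: informativeness is approximated by raw
    frequency in the unlabeled pool, and redundancy by containment in an
    already selected segment.
    """
    counts = Counter()
    for sentence in pool:
        for gram in ngrams(sentence.split(), max_n):
            if gram not in covered:
                counts[gram] += 1

    # Most frequent first, then longer phrases, then alphabetical order
    # so that the result is deterministic.
    ranked = sorted(counts, key=lambda g: (-counts[g], -len(g), g))

    selected = []
    for gram in ranked:
        if len(selected) >= budget:
            break
        # Skip candidates already covered by a selected longer segment.
        if any(contains(chosen, gram) for chosen in selected):
            continue
        selected.append(gram)
    return selected


if __name__ == "__main__":
    pool = [
        "the cat sat on the mat",
        "the cat ate the fish",
        "a dog sat on the mat",
    ]
    # ("the",) is treated as already covered by existing training data.
    for segment in select_segments(pool, covered={("the",)}, max_n=2, budget=3):
        print(" ".join(segment))
```

A selector like this still ignores whether a segment is a syntactically coherent unit, which is one of the two gaps the abstract says the paper addresses.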

Bibliographic information

Source
Conference on the North American Chapter of the Association for Computational Linguistics: Human Language Technologies | 2016 | lviii 777 p. | 10 pages
Authors
Akiva Miura; Graham Neubig; Michael Paul; Satoshi Nakamura
Original format: PDF
Chinese Library Classification: Programming; Software engineering

