Venue: Annual Conference on Neural Information Processing Systems

HM-BiTAM: Bilingual Topic Exploration, Word Alignment, and Translation

Abstract

We present a novel paradigm for statistical machine translation (SMT) based on joint modeling of word alignment and the topical aspects underlying bilingual document-pairs, via a hidden Markov Bilingual Topic AdMixture (HM-BiTAM). In this paradigm, parallel sentence-pairs from a parallel document-pair are coupled via a certain semantic flow to ensure coherence of topical context in the alignment of mapping words between languages, in the likelihood-based training of topic-dependent translation lexicons, and in the inference of topic representations in each language. The learned HM-BiTAM not only reveals topic patterns, as methods such as LDA [1] do, but does so for bilingual corpora; it also offers a principled way of inferring an optimal translation using document context. Our method integrates the conventional HMM, a key component of most state-of-the-art SMT systems, with the recently proposed BiTAM model [10]. We report an extensive empirical analysis of our method (in many ways complementary to the description-oriented [10]) in three aspects: bilingual topic representation, word alignment, and translation.
