The Geometry of Statistical Machine Translation

Abstract

Most modern statistical machine translation systems are based on linear statistical models. One extremely effective method for estimating the model parameters is minimum error rate training (MERT), which is an efficient form of line optimisation adapted to the highly nonlinear objective functions used in machine translation. We describe a polynomial-time generalisation of line optimisation that computes the error surface over a plane embedded in parameter space. The description of this algorithm relies on convex geometry, which is the mathematics of polytopes and their faces. Using this geometric representation of MERT, we investigate whether the optimisation of linear models is tractable in general. Previous work on finding optimal solutions in MERT (Galley and Quirk, 2011) established a worst-case complexity that was exponential in the number of sentences; in contrast, we show that the exponential dependence in the worst-case complexity is mainly in the number of features. Although our work is framed with respect to MERT, the convex geometric description is also applicable to other error-based training methods for linear models. We believe our analysis has important ramifications because it suggests that the current trend in building statistical machine translation systems by introducing a very large number of sparse features is inherently not robust.
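To make the line optimisation mentioned in the abstract concrete, here is a minimal sketch of the one-dimensional case only, not the plane generalisation the paper describes. It assumes a linear model in which each hypothesis has a feature vector h and is scored as w · h. Along a search line w + γ·d, every hypothesis score is then a straight line in γ, and the highest-scoring hypothesis changes only where the upper envelope of those lines changes. The function names, the data layout, and the toy numbers are illustrative assumptions and are not taken from the paper.

```python
def dot(u, v):
    """Inner product of two equal-length sequences."""
    return sum(a * b for a, b in zip(u, v))


def upper_envelope(lines):
    """Upper envelope of lines given as (slope, intercept, label).

    Returns [(start_gamma, label), ...] ordered left to right: each label
    is the maximal line from its start_gamma up to the next entry's start.
    """
    # Sort by slope; among equal slopes only the highest intercept can win.
    ordered = sorted(lines, key=lambda l: (l[0], l[1]))
    dedup = []
    for line in ordered:
        if dedup and dedup[-1][0] == line[0]:
            dedup[-1] = line
        else:
            dedup.append(line)

    hull = []  # stack of (start_gamma, (slope, intercept, label))
    for slope, intercept, label in dedup:
        start = float("-inf")
        while hull:
            prev_start, (p_slope, p_intercept, _) = hull[-1]
            # gamma at which the new (steeper) line overtakes the previous one
            x = (p_intercept - intercept) / (slope - p_slope)
            if x <= prev_start:
                hull.pop()  # previous line is never on top
            else:
                start = x
                break
        hull.append((start, (slope, intercept, label)))
    return [(start, line[2]) for start, line in hull]


def line_search_segments(features, w, d):
    """Intervals of gamma on which each hypothesis is the model's best.

    features: one feature vector per hypothesis (hypothetical toy input)
    w: current weight vector, d: search direction
    """
    lines = [(dot(d, h), dot(w, h), idx) for idx, h in enumerate(features)]
    return upper_envelope(lines)


# Toy n-best list with two features (made-up numbers for illustration).
features = [[1.0, 0.0], [0.0, 1.0], [0.4, 0.4]]
segments = line_search_segments(features, w=[1.0, 1.0], d=[1.0, -1.0])
print(segments)
```

In this toy example hypothesis 1 is best for γ < 0 and hypothesis 0 for γ > 0, while hypothesis 2 never reaches the envelope. In MERT, the error count would be evaluated once per interval, so the piecewise-constant error along the line can be minimised exactly.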
