Conference on the North American Chapter of the Association for Computational Linguistics: Human Language Technologies

Towards Unsupervised and Language-independent Compound Splitting using Inflectional Morphological Transformations

Abstract

In this paper, we address the task of language-independent, knowledge-lean and unsupervised compound splitting, which is an essential component for many natural language processing tasks such as machine translation. Previous methods on statistical compound splitting either include language-specific knowledge (e.g., linking elements) or rely on parallel data, which results in limited applicability. We aim to overcome these limitations by learning compounding morphology from inflectional information derived from lemmatized monolingual corpora. In experiments for Germanic languages, we show that our approach significantly outperforms language-dependent state-of-the-art methods in finding the correct split point and that word inflection is a good approximation for compounding morphology.
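The abstract does not spell out the algorithm, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes that suffix rewrite rules can be read off (inflected form, lemma) pairs from a lemmatized corpus and then reused to map a compound's modifier back to a known lemma. The function names, the toy German data, and the frequency-based ranking are all hypothetical choices made for illustration.

```python
from collections import Counter


def learn_transformations(pairs):
    """Count suffix rewrite rules (form_suffix, lemma_suffix) from (form, lemma) pairs.

    Assumption: inflection is approximated by a suffix change after the
    longest common prefix of the word form and its lemma.
    """
    rules = Counter()
    for form, lemma in pairs:
        i = 0
        while i < min(len(form), len(lemma)) and form[i] == lemma[i]:
            i += 1
        rules[(form[i:], lemma[i:])] += 1
    return rules


def split_compound(word, lexicon, rules, min_len=3):
    """Return (modifier_lemma, head) candidates, most frequent rule first.

    Every split point is tried; the head must be a known lemma, and the
    modifier must map to a known lemma through one learned rule.
    """
    candidates = []
    for k in range(min_len, len(word) - min_len + 1):
        left, head = word[:k], word[k:]
        if head not in lexicon:
            continue
        for (form_suffix, lemma_suffix), count in rules.items():
            if not left.endswith(form_suffix):
                continue
            # Undo the inflection-like ending, then restore the lemma ending.
            stem = left[:len(left) - len(form_suffix)]
            lemma = stem + lemma_suffix
            if lemma in lexicon:
                candidates.append((count, lemma, head))
    candidates.sort(reverse=True)
    return [(lemma, head) for _, lemma, head in candidates]


# Toy, hand-made data (hypothetical; a real system would use a lemmatized corpus).
PAIRS = [("kinder", "kind"), ("tages", "tag"), ("hunde", "hund"),
         ("frauen", "frau"), ("tag", "tag")]
LEXICON = {"kind", "garten", "tag", "licht", "hund", "frau"}
RULES = learn_transformations(PAIRS)

print(split_compound("kindergarten", LEXICON, RULES))
print(split_compound("tageslicht", LEXICON, RULES))
```

In this toy setting the plural ending "-er" and the genitive ending "-es" double as linking elements, which illustrates the abstract's claim that word inflection can approximate compounding morphology without hand-written, language-specific rules.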
