Conference: 9th International Conference on Language Resources and Evaluation

Automatic Extraction of Synonyms for German Particle Verbs from Parallel Data with Distributional Similarity as a Re-Ranking Feature

Abstract

We present a method for the extraction of synonyms for German particle verbs based on a word-aligned German-English parallel corpus: by translating the particle verb to a pivot, which is then translated back, a set of synonym candidates can be extracted and ranked according to the respective translation probabilities. In order to deal with separated particle verbs, we apply re-ordering rules to the German part of the data. In our evaluation against a gold standard, we compare different pre-processing strategies (lemmatized vs. inflected forms) and introduce language model scores of synonym candidates in the context of the input particle verb as well as distributional similarity as additional re-ranking criteria. Our evaluation shows that distributional similarity as a re-ranking feature is more robust than language model scores and leads to an improved ranking of the synonym candidates. In addition to evaluating against a gold standard, we also present a small-scale manual evaluation.
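
To make the pivot idea concrete, the following is a minimal sketch, not the authors' implementation. It scores each synonym candidate by summing, over all English pivots, the product of the forward and backward translation probabilities, and then re-ranks the candidates with a cosine similarity over context-count vectors. The function names, the toy probability tables, the toy context vectors, and the linear interpolation with `weight` are all assumptions made for illustration; the abstract does not state how the similarity score is combined with the translation-based score.

```python
from collections import defaultdict
from math import sqrt


def pivot_synonyms(source, p_en_given_de, p_de_given_en):
    """Score German candidates reachable from `source` through English pivots.

    score(cand) = sum over pivots of p(pivot | source) * p(cand | pivot)
    """
    scores = defaultdict(float)
    for pivot, p_forward in p_en_given_de.get(source, {}).items():
        for cand, p_back in p_de_given_en.get(pivot, {}).items():
            if cand != source:  # the input verb is not its own synonym
                scores[cand] += p_forward * p_back
    return dict(scores)


def cosine(u, v):
    """Cosine similarity between two sparse count vectors stored as dicts."""
    dot = sum(u[k] * v[k] for k in u.keys() & v.keys())
    norm = sqrt(sum(x * x for x in u.values())) * sqrt(sum(x * x for x in v.values()))
    return dot / norm if norm else 0.0


def rerank(source, scores, vectors, weight=0.5):
    """Re-rank candidates by interpolating pivot score and cosine similarity.

    The linear interpolation is an assumption of this sketch.
    """
    src_vec = vectors.get(source, {})
    combined = {
        cand: (1 - weight) * score + weight * cosine(src_vec, vectors.get(cand, {}))
        for cand, score in scores.items()
    }
    return sorted(combined, key=combined.get, reverse=True)


# Toy, hand-made tables (hypothetical values, lemmatized forms).
p_en_given_de = {"anfangen": {"start": 0.6, "begin": 0.4}}
p_de_given_en = {
    "start": {"starten": 0.6, "anfangen": 0.3, "beginnen": 0.1},
    "begin": {"beginnen": 0.5, "anfangen": 0.5},
}
# Toy context-count vectors (hypothetical co-occurrence counts).
vectors = {
    "anfangen": {"Arbeit": 2, "Kurs": 1},
    "beginnen": {"Arbeit": 2, "Kurs": 1},
    "starten": {"Motor": 3, "Kurs": 1},
}

scores = pivot_synonyms("anfangen", p_en_given_de, p_de_given_en)
pivot_ranking = sorted(scores, key=scores.get, reverse=True)
final_ranking = rerank("anfangen", scores, vectors)
```

In this toy setup the pivot score alone ranks `starten` above `beginnen`, while the similarity term reverses that order, which is the kind of correction a re-ranking feature is meant to provide.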
