Venue: Workshop on Natural Language Processing for Indigenous Languages of the Americas

Restoring the Sister: Reconstructing a Lexicon from Sister Languages using Neural Machine Translation

Abstract

The historical comparative method has a long history in historical linguistics. It describes a process by which historical linguists aim to reverse-engineer the historical developments of language families in order to reconstruct proto-forms and familial relations between languages. In recent years, there have been multiple attempts to replicate this process through machine learning, especially in the realm of cognate detection (List et al., 2016; Ciobanu and Dinu, 2014; Rama et al., 2018). So far, most of the experiments aimed at actual reconstruction have attempted to predict a proto-form from the forms of the daughter languages (Ciobanu and Dinu, 2018; Meloni et al., 2019). Here, we propose a reimplementation that instead uses modern related languages, or sisters, to reconstruct the vocabulary of a target language. In particular, we show that we can reconstruct the vocabulary of a target language from a fairly small data set of parallel cognates from different sister languages, using a neural machine translation (NMT) architecture with a standard encoder-decoder setup. This effort directly furthers the goal of using machine learning tools to help under-served language communities in their efforts at reclaiming, preserving, or reconstructing their own languages.
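As a rough illustration only, and not the authors' actual pipeline, the task can be framed as character-level sequence-to-sequence data: the known sister-language forms make up the source sequence and the target-language form is the output to predict. The sketch below assumes a hypothetical `make_example` helper and a language-tag convention (`<es>`, `<fr>`, ...) that the abstract does not specify. The sample words are the familiar Romance cognates for "night", chosen purely for illustration.

```python
from typing import Dict, List, Tuple


def to_char_tokens(word: str) -> List[str]:
    """Split a word into single-character tokens."""
    return list(word)


def make_example(
    cognates: Dict[str, str], target_lang: str
) -> Tuple[List[str], List[str]]:
    """Build one (source, target) pair from a cognate set.

    Assumption: each sister-language form is prefixed with a language tag
    and all forms are concatenated into one source sequence. This layout
    is hypothetical, not taken from the paper.
    """
    source: List[str] = []
    for lang in sorted(cognates):  # fixed order keeps examples consistent
        if lang == target_lang:
            continue  # the target form must not leak into the input
        source.append(f"<{lang}>")
        source.extend(to_char_tokens(cognates[lang]))
    target = to_char_tokens(cognates[target_lang])
    return source, target


# Illustrative cognate set for "night" (Spanish, Portuguese, French, Italian).
cognate_set = {"es": "noche", "pt": "noite", "fr": "nuit", "it": "notte"}
src, tgt = make_example(cognate_set, target_lang="it")
print(" ".join(src))
print(" ".join(tgt))
```

Pairs built this way could then be fed to any standard encoder-decoder toolkit. The choice of tags, ordering, and tokenization here is a design assumption rather than something the abstract states.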
