Conference: First Workshop on multilingual modeling 2012

The Study of Effect of Length in Morphological Segmentation of Agglutinative Languages

Abstract

Morph length is one of the indicative features that help in learning the morphology of languages, in particular agglutinative languages. In this paper, we introduce a simple unsupervised model for morphological segmentation and study how knowledge of morph length affects the performance of the segmentation task under the Bayesian framework. The model is based on the unigram word segmentation model of Goldwater et al. (2006) and assumes a simple prior distribution over morph length. We evaluate this model on two closely related agglutinative languages, Tamil and Telugu, and compare our results with the state-of-the-art Morfessor system. We show that knowledge of morph length has a positive impact and provides competitive results in terms of overall performance.
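
To make the idea of a length prior concrete, the following is a minimal sketch of how a unigram Bayesian segmentation model might score candidate segmentations. It is not the authors' implementation. The abstract does not say which prior is used, so the shifted Poisson length prior, the uniform character distribution, the Chinese-restaurant-process predictive rule, and all names and parameter values (`base_prob`, `segmentation_logprob`, `alpha`, `lam`, `alphabet_size`) are assumptions chosen for illustration, and the example words are toy English strings rather than Tamil or Telugu data.

```python
import math
from collections import Counter


def poisson_pmf(k, lam):
    """Poisson probability of the non-negative integer k with rate lam."""
    return math.exp(-lam) * lam ** k / math.factorial(k)


def base_prob(morph, lam=3.0, alphabet_size=30):
    """Base probability of a morph: length prior times uniform characters.

    Assumption: length - 1 follows a Poisson(lam) distribution, so every
    morph has at least one character. The source only says "a simple
    prior distribution over morph length".
    """
    n = len(morph)
    return poisson_pmf(n - 1, lam) * (1.0 / alphabet_size) ** n


def segmentation_logprob(morphs, counts, alpha=1.0, lam=3.0, alphabet_size=30):
    """Log-probability of a morph sequence under a unigram model.

    Each morph is scored with the Chinese restaurant process predictive
    rule, (count + alpha * base) / (total + alpha), and the counts are
    updated after every morph so later morphs see the earlier ones.
    """
    counts = Counter(counts)  # work on a copy; the caller's counts stay unchanged
    total = sum(counts.values())
    logp = 0.0
    for m in morphs:
        p = (counts[m] + alpha * base_prob(m, lam, alphabet_size)) / (total + alpha)
        logp += math.log(p)
        counts[m] += 1
        total += 1
    return logp


# Toy usage: morph counts taken from a hypothetical current analysis.
counts = Counter({"walk": 5, "ed": 5})
split_score = segmentation_logprob(["walk", "ed"], counts)
whole_score = segmentation_logprob(["walked"], counts)
print(split_score > whole_score)
```

In this sketch the length prior matters only for morphs that have not been seen before: a long unseen string is penalized both by the Poisson term and by the per-character term, which pushes the model toward reusing short, frequent morphs.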

Bibliographic details

  • Conference location: Jeju Island (KR)
  • Author affiliations:

    Institute of Formal and Applied Linguistics, Faculty of Mathematics and Physics, Charles University in Prague;

    Institute of Formal and Applied Linguistics, Faculty of Mathematics and Physics, Charles University in Prague;

    Seminar fuer Sprachwissenschaft, Universitaet Tuebingen;

  • Original format: PDF
  • Text language: English (eng)