Conference: China National Conference on Computational Linguistics

Chinese Long and Short Form Choice Exploiting Neural Network Language Modeling Approaches

Abstract

Lexicalisation is one of the most challenging tasks in Natural Language Generation (NLG). This paper presents our work on choosing between the long and short forms of elastic words in Chinese, a key aspect of lexicalisation. Long and short forms are a highly frequent linguistic phenomenon in Chinese, for example 老虎-虎 (laohu-hu, tiger). The long and short form choice task aims to select the appropriate form for a given context in order to produce high-quality Chinese. We treat long and short form choice as a word prediction problem and address it with neural network language modeling approaches because of their powerful language representation capability. In this work, long and short form choice models based on state-of-the-art Neural Network Language Models (NNLMs) have been built, and a classical n-gram Language Model (LM) is constructed as a baseline system. A carefully designed test set is used to evaluate our models, and the results show that the NNLM-based models perform significantly better than the baseline system.
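To make the word prediction framing concrete, the following is a minimal sketch of how a language model could be used to choose between a long and a short form: each candidate is substituted into the context, and the higher-scoring sentence wins. It is not the authors' implementation. The character-level bigram model, the add-one smoothing, the length normalisation, the four-sentence toy corpus, and the names `BigramLM` and `choose_form` are all assumptions made for illustration; they stand in loosely for the n-gram baseline, and an NNLM could be swapped in for the scoring function.

```python
import math
from collections import Counter


class BigramLM:
    """Character-level bigram LM with add-one smoothing (illustrative only)."""

    def __init__(self, sentences):
        self.context_counts = Counter()  # how often each token appears as a left context
        self.bigram_counts = Counter()   # how often each (previous, current) pair appears
        self.vocab = set()
        for sentence in sentences:
            tokens = ["<s>"] + list(sentence) + ["</s>"]
            self.vocab.update(tokens[1:])  # every token that can be predicted
            for prev, cur in zip(tokens, tokens[1:]):
                self.context_counts[prev] += 1
                self.bigram_counts[(prev, cur)] += 1

    def avg_log_prob(self, sentence):
        """Average per-token log probability of a sentence.

        Averaging (an assumption of this sketch) keeps the long form from
        being penalised merely for containing more characters.
        """
        tokens = ["<s>"] + list(sentence) + ["</s>"]
        vocab_size = len(self.vocab)
        total = 0.0
        for prev, cur in zip(tokens, tokens[1:]):
            numerator = self.bigram_counts[(prev, cur)] + 1
            denominator = self.context_counts[prev] + vocab_size
            total += math.log(numerator / denominator)
        return total / (len(tokens) - 1)


def choose_form(lm, left, right, forms):
    """Return the candidate form whose full sentence scores highest under lm."""
    return max(forms, key=lambda form: lm.avg_log_prob(left + form + right))


# Toy corpus invented for this example; it is not the paper's data.
toy_corpus = ["一只老虎在山上", "老虎吃肉", "东北虎很大", "虎年快乐"]
lm = BigramLM(toy_corpus)

choice_after_classifier = choose_form(lm, "一只", "在山上", ["老虎", "虎"])
choice_in_compound = choose_form(lm, "东北", "很大", ["老虎", "虎"])
print(choice_after_classifier, choice_in_compound)
```

On this toy corpus the sketch selects 老虎 for the first context and 虎 for the second. Without the length normalisation, the shorter candidate would win the first case simply because it contributes fewer probability terms, which is why the per-token average is used here.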
