Conference: Mexican International Conference on Artificial Intelligence

Surface Realisation Using Factored Language Models and Input Seed Features

Abstract

The Natural Language Generation research field needs to move towards the design and development of flexible, adaptive techniques and approaches capable of producing language automatically for any domain, language and purpose. In light of this, the aim of this paper is to study the appropriateness of factored language models for the surface realisation stage, thus presenting an almost fully language-independent statistical approach. Its main novelty is that it can be adapted to generate texts for different purposes or domains thanks to an input seed feature that guides the whole generation process. In the context of this research, the seed input is a phoneme, and our goal is to generate a full, meaningful sentence that maximises the number of words containing that phoneme. We experimented with different factors, including lemmas or part-of-speech tags, based on a trigram language model. The analysis carried out with several configurations of our proposed approach showed an improvement of 47% for English and 40% for Spanish in the total number of meaningful generated sentences, with respect to traditional language models.
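The idea of a seed feature steering a trigram model can be illustrated with a minimal sketch. This is not the authors' implementation: it uses a plain word-level trigram model rather than a factored one, greedy decoding, a toy corpus, and a letter sequence as a stand-in for the phoneme. The names `train_trigrams`, `generate` and the `boost` parameter are hypothetical and introduced only for illustration.

```python
from collections import Counter, defaultdict

BOS, EOS = "<s>", "</s>"

def train_trigrams(sentences):
    """Count trigram continuations: (w1, w2) -> Counter of next words."""
    counts = defaultdict(Counter)
    for sentence in sentences:
        tokens = [BOS, BOS] + sentence.lower().split() + [EOS]
        for w1, w2, w3 in zip(tokens, tokens[1:], tokens[2:]):
            counts[(w1, w2)][w3] += 1
    return counts

def contains_seed(word, seed):
    # Assumption: a letter sequence stands in for the seed phoneme.
    # The end-of-sentence marker is never treated as a seed word.
    return word != EOS and seed in word

def generate(counts, seed, boost=3.0, max_len=12):
    """Greedily pick the next word, favouring words that contain the seed."""
    history = (BOS, BOS)
    words = []
    while len(words) < max_len:
        candidates = counts.get(history)
        if not candidates:
            break
        total = sum(candidates.values())

        def score(word):
            # Relative trigram frequency, multiplied by `boost` for seed words.
            prob = candidates[word] / total
            return prob * (boost if contains_seed(word, seed) else 1.0)

        # Sorting first makes tie-breaking deterministic (alphabetical).
        best = max(sorted(candidates), key=score)
        if best == EOS:
            break
        words.append(best)
        history = (history[1], best)
    return words

corpus = [
    "she sells sea shells",
    "she sees the sea",
    "he sells old books",
    "he sees the dog",
]
model = train_trigrams(corpus)
print(" ".join(generate(model, "s")))             # she sees the sea
print(" ".join(generate(model, "s", boost=1.0)))  # he sees the dog
```

With the boost disabled (`boost=1.0`) the sketch behaves like an unguided trigram model, so comparing the two outputs shows how the seed shifts the choice towards words containing it. A factored model would additionally condition on factors such as lemmas or part-of-speech tags, which this sketch leaves out.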
