
Bridge the gap between statistical and hand-crafted grammars


Abstract

Lexicalized tree adjoining grammar (LTAG) is a rich formalism for performing NLP tasks such as semantic interpretation, parsing, machine translation and information retrieval. Depending on the specific NLP task, different kinds of LTAGs may be developed for a language. Each of these LTAGs is enriched with specific features, such as semantic representation or statistical information, that make it suitable for that task. Because these capabilities are distributed among different LTAGs, it is difficult to benefit from all of them in a single NLP application. This paper discusses a statistical model that bridges two kinds of LTAGs for a natural language in order to benefit from the capabilities of both. To do so, a hidden Markov model (HMM) was trained that maps an elementary tree sequence of a source LTAG onto an elementary tree sequence of a target LTAG. Training was performed with the standard HMM training algorithm, Baum-Welch. To lead the training algorithm to a better solution, the initial state of the HMM was also trained by a novel EM-based semi-supervised bootstrapping algorithm. The model was tested on two English LTAGs, XTAG (XTAG-Group, 2001) and MICA's grammar (Bangalore et al., 2009), as the target and source LTAGs, respectively. The empirical results confirm that the model provides a satisfactory way of linking these LTAGs so that they can share their capabilities.
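To make the mapping idea concrete, here is a minimal sketch of how an already-trained HMM could decode a target tree sequence from a source tree sequence using the Viterbi algorithm. It is not the paper's implementation. It assumes that hidden states stand for target-grammar elementary trees and observations for source-grammar elementary trees. The tree names (`T_NP`, `T_VT`, `T_VI`, `s_np`, `s_v`) and all probabilities are invented for illustration; in the setting described above, the parameters would come from Baum-Welch training rather than being written by hand.

```python
import math


def viterbi(observations, states, start_p, trans_p, emit_p):
    """Return the most probable hidden-state sequence for the observations."""
    if not observations:
        return []

    def log(p):
        # Log space avoids numeric underflow on long sequences.
        return math.log(p) if p > 0 else float("-inf")

    # best[t][s]: best log-probability of any path that ends in state s at step t.
    best = [{s: log(start_p[s]) + log(emit_p[s].get(observations[0], 0.0))
             for s in states}]
    back = [{}]  # back[t][s]: predecessor of s on that best path

    for t in range(1, len(observations)):
        best.append({})
        back.append({})
        for s in states:
            prev, score = max(
                ((p, best[t - 1][p] + log(trans_p[p].get(s, 0.0))) for p in states),
                key=lambda pair: pair[1],
            )
            best[t][s] = score + log(emit_p[s].get(observations[t], 0.0))
            back[t][s] = prev

    # Follow the back-pointers from the best final state.
    path = [max(states, key=lambda s: best[-1][s])]
    for t in range(len(observations) - 1, 0, -1):
        path.append(back[t][path[-1]])
    path.reverse()
    return path


# Hypothetical target-grammar trees (hidden states); names are invented.
states = ["T_NP", "T_VT", "T_VI"]
# Hand-written toy parameters, standing in for trained ones.
start_p = {"T_NP": 0.8, "T_VT": 0.1, "T_VI": 0.1}
trans_p = {
    "T_NP": {"T_NP": 0.2, "T_VT": 0.5, "T_VI": 0.3},
    "T_VT": {"T_NP": 0.9, "T_VT": 0.05, "T_VI": 0.05},
    "T_VI": {"T_NP": 0.3, "T_VT": 0.1, "T_VI": 0.6},
}
# Hypothetical source-grammar trees (observations): "s_np" and "s_v".
emit_p = {
    "T_NP": {"s_np": 0.9, "s_v": 0.1},
    "T_VT": {"s_np": 0.05, "s_v": 0.95},
    "T_VI": {"s_np": 0.05, "s_v": 0.95},
}

source_sequence = ["s_np", "s_v", "s_np"]
decoded = viterbi(source_sequence, states, start_p, trans_p, emit_p)
print(decoded)  # ['T_NP', 'T_VT', 'T_NP']
```

In this toy setup the source tree `s_v` is equally likely under `T_VT` and `T_VI`, so the emission probabilities alone cannot choose between them. The transition probabilities settle it: a following noun-phrase tree is far more likely after `T_VT`, which illustrates why a sequence model can help where a tree-by-tree lookup table would be ambiguous.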

Bibliographic record

  • Source
    Computer Speech and Language | 2013, Issue 5 | pp. 1085-1104 | 20 pages
  • Authors

    Ali Basirat; Heshaam Faili;

  • Author affiliations

    Laboratory of Natural Language and Text Processing, School of Electrical & Computer Engineering, College of Engineering, University of Tehran, Tehran, Iran;

    Laboratory of Natural Language and Text Processing, School of Electrical & Computer Engineering, College of Engineering, University of Tehran, Tehran, Iran;

  • Indexed in: Science Citation Index (SCI, USA); Engineering Index (EI, USA)
  • Original format: PDF
  • Language of text: English
  • Keywords

    Tree adjoining grammar; LTAG; Hidden Markov model; XTAG; MICA;
