Journal: ACM Transactions on Asian Language Information Processing

Adaptive Bayesian HMM for Fully Unsupervised Chinese Part-of-Speech Induction

Abstract

We propose an adaptive Bayesian hidden Markov model (HMM) for fully unsupervised part-of-speech (POS) induction. The proposed model and its inference algorithm extend the first-order Bayesian HMM with Dirichlet priors in two ways. First, our algorithm infers the optimal number of hidden states from the training corpus rather than fixing the dimensionality of the state space beforehand. Second, a Chinese unknown-word processing module measures similarity using both morphological properties and context distributions. Experimental results show that both extensions help to find the optimal categories for Chinese, in terms of both unsupervised clustering metrics and grammar induction accuracy on the Chinese Treebank.
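For orientation, the baseline that the abstract builds on (a first-order Bayesian HMM with Dirichlet priors) can be sketched roughly as below. This is not the authors' adaptive algorithm: the number of states is fixed in advance, there is no unknown-word module, and the sampler is a plain pointwise Gibbs sampler chosen for brevity. The function names, the hyperparameters `alpha` and `beta`, and the uniform treatment of the first state are all assumptions made for illustration.

```python
import random


def sample_dirichlet(params, rng):
    """Draw one sample from Dirichlet(params) via normalized Gamma draws."""
    draws = [rng.gammavariate(a, 1.0) for a in params]
    total = sum(draws)
    return [d / total for d in draws]


def gibbs_hmm(words, num_states, vocab_size, alpha=0.1, beta=0.1,
              iters=50, seed=0):
    """Pointwise Gibbs sampling for a first-order HMM with symmetric
    Dirichlet priors (hypothetical baseline, fixed number of states).

    words: list of integer word ids in [0, vocab_size).
    Returns one sampled state id per word.
    """
    rng = random.Random(seed)
    K, V, n = num_states, vocab_size, len(words)
    tags = [rng.randrange(K) for _ in words]  # random initial tagging

    for _ in range(iters):
        # Count transitions and emissions under the current tagging.
        trans = [[0] * K for _ in range(K)]
        emit = [[0] * V for _ in range(K)]
        for i, (w, t) in enumerate(zip(words, tags)):
            emit[t][w] += 1
            if i > 0:
                trans[tags[i - 1]][t] += 1

        # Resample parameters from their Dirichlet posteriors.
        theta = [sample_dirichlet([alpha + c for c in row], rng)
                 for row in trans]
        phi = [sample_dirichlet([beta + c for c in row], rng)
               for row in emit]

        # Resample each tag given its neighbours and the parameters.
        # The first state is treated as uniform (an assumption).
        for i, w in enumerate(words):
            weights = []
            for k in range(K):
                p = phi[k][w]
                if i > 0:
                    p *= theta[tags[i - 1]][k]
                if i + 1 < n:
                    p *= theta[k][tags[i + 1]]
                weights.append(p)
            if sum(weights) <= 0.0:  # guard against numerical underflow
                weights = [1.0] * K
            tags[i] = rng.choices(range(K), weights=weights)[0]
    return tags
```

Calling `gibbs_hmm([0, 1, 2, 0, 1, 2], num_states=3, vocab_size=3)` returns one induced state id per token. An adaptive variant in the spirit of the abstract would additionally have to let the number of states change during inference, which this sketch does not attempt.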
