Adaptive Bayesian HMM for Fully Unsupervised Chinese Part-of-Speech Induction

LIDAN ZHANG; KWOP-PING CHAN

Abstract

We propose an adaptive Bayesian hidden Markov model (HMM) for fully unsupervised part-of-speech (POS) induction. The proposed model and its inference algorithm extend the first-order Bayesian HMM with Dirichlet priors in two ways. First, our algorithm infers the optimal number of hidden states from the training corpus rather than fixing the dimensionality of the state space beforehand. Second, a Chinese unknown-word processing module measures similarity using both morphological properties and context distributions. Experimental results show that both extensions help to find the optimal categories for Chinese, in terms of both unsupervised clustering metrics and grammar induction accuracy on the Chinese Treebank.
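
The abstract does not spell out the inference procedure, so the following is only a minimal sketch of one standard building block that matches the listed keywords (variational inference, Dirichlet distribution): the variational Bayes update for a single multinomial row of an HMM under a symmetric Dirichlet prior. With a small prior parameter, states that collect almost no expected count receive a weight close to zero, which is one common way an over-provisioned state space can shrink in practice. The function names, the prior value, and the pruning threshold are assumptions made for illustration and are not taken from the paper.

```python
import math


def digamma(x):
    """Digamma function for x > 0 (recurrence plus asymptotic series)."""
    result = 0.0
    # Shift x upward until the asymptotic expansion is accurate.
    while x < 6.0:
        result -= 1.0 / x
        x += 1.0
    inv = 1.0 / x
    inv2 = inv * inv
    # psi(x) ~ ln x - 1/(2x) - 1/(12x^2) + 1/(120x^4) - 1/(252x^6)
    result += math.log(x) - 0.5 * inv
    result -= inv2 * (1.0 / 12 - inv2 * (1.0 / 120 - inv2 / 252))
    return result


def vb_weights(expected_counts, alpha):
    """Variational Bayes update for one multinomial row.

    expected_counts: expected counts from the E-step (hypothetical input).
    alpha: symmetric Dirichlet prior parameter (assumed value).
    Returns exp(E[log theta_k]); these weights sum to less than 1.
    """
    total = sum(expected_counts) + alpha * len(expected_counts)
    norm = digamma(total)
    return [math.exp(digamma(c + alpha) - norm) for c in expected_counts]


def active_states(occupancy, threshold=1.0):
    """Indices of states whose expected occupancy exceeds a threshold.

    The pruning rule and the threshold are illustrative assumptions only.
    """
    return [k for k, occ in enumerate(occupancy) if occ > threshold]


if __name__ == "__main__":
    # Three candidate states; the third receives no expected count.
    weights = vb_weights([50.0, 30.0, 0.0], alpha=0.01)
    print(weights)
    print(active_states([50.0, 30.0, 0.0]))
```

Compared with add-alpha smoothing, the exponentiated digamma weights penalize near-empty states far more sharply, so a sparse prior tends to leave only the states the data actually supports. Whether the paper uses this mechanism or a different one to choose the number of states cannot be determined from the abstract alone.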

Bibliographic details

Source
ACM Transactions on Asian Language Information Processing | 2012, Issue 3 | pp. 9.1-9.22 | 22 pages
Authors
LIDAN ZHANG; KWOP-PING CHAN
Author affiliations

Department of Computer Science, The University of Hong Kong, Hong Kong (both authors)

Original format: PDF
Language of text: English
Keywords
part-of-speech induction; Chinese language model; Bayesian HMM; variational inference; Dirichlet distribution
