Unsupervised Part-of-Speech Tagging in Noisy and Esoteric Domains with a Syntactic-Semantic Bayesian HMM

Abstract

Unsupervised part-of-speech (POS) tagging has recently been shown to benefit greatly from Bayesian approaches in which HMM parameters are integrated out, leading to significant increases in tagging accuracy. These improvements in unsupervised methods are especially important in specialized social media domains such as Twitter, where little training data is available. Here, we take the Bayesian approach one step further by integrating semantic information from an LDA-like topic model with an HMM. Specifically, we present Part-of-Speech LDA (POSLDA), a syntactically and semantically consistent generative probabilistic model. This model discovers POS-specific topics from an unlabelled corpus. We show that this model consistently achieves improvements in unsupervised POS tagging and language modeling over the Bayesian HMM approach with varying amounts of side information in the noisy and esoteric domain of Twitter.
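
The abstract does not spell out the generative process, so the following is only a minimal sketch of how an HMM over word classes might be combined with an LDA-like topic model. It assumes, purely for illustration, that some classes are "content" classes whose word distributions depend on both class and topic, while the remaining "function" classes have a single word distribution each. All function names, hyperparameters, and sizes here are hypothetical and are not taken from the paper.

```python
import random


def dirichlet(rng, alpha, k):
    """Draw a symmetric Dirichlet(alpha) sample of dimension k via normalized gammas."""
    draws = [rng.gammavariate(alpha, 1.0) for _ in range(k)]
    total = sum(draws)
    return [d / total for d in draws]


def categorical(rng, probs):
    """Sample an index with probability proportional to probs."""
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]


def generate_corpus(n_docs=3, doc_len=8, n_topics=2, n_classes=4,
                    n_content=2, vocab_size=20, seed=0):
    """Toy HMM + topic-model generative process (illustrative assumption only).

    Classes 0..n_content-1 are treated as content classes; the rest are
    function classes. Returns a list of documents, each a list of
    (word_id, class_id, topic_id) triples.
    """
    rng = random.Random(seed)
    # One transition row per previous class, plus a start row at index n_classes.
    trans = [dirichlet(rng, 1.0, n_classes) for _ in range(n_classes + 1)]
    # Content classes: one word distribution per (class, topic) pair.
    phi_content = {(c, z): dirichlet(rng, 0.5, vocab_size)
                   for c in range(n_content) for z in range(n_topics)}
    # Function classes: one word distribution per class, shared across topics.
    phi_function = {c: dirichlet(rng, 0.5, vocab_size)
                    for c in range(n_content, n_classes)}
    corpus = []
    for _ in range(n_docs):
        theta = dirichlet(rng, 0.5, n_topics)  # per-document topic mixture
        prev = n_classes                       # start state
        doc = []
        for _ in range(doc_len):
            c = categorical(rng, trans[prev])  # syntactic step (HMM transition)
            z = categorical(rng, theta)        # semantic step (topic draw)
            if c < n_content:
                w = categorical(rng, phi_content[(c, z)])
            else:
                w = categorical(rng, phi_function[c])
            doc.append((w, c, z))
            prev = c
        corpus.append(doc)
    return corpus


if __name__ == "__main__":
    for doc in generate_corpus():
        print(doc)
```

In this sketch the class sequence carries the syntactic signal and the per-document topic mixture carries the semantic signal. Unsupervised tagging would then amount to inferring the hidden class (and topic) of each token from the words alone, which this sketch does not attempt.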

Publication details

Source
EACL Workshop on Semantic Analysis in Social Media 2012 | 2012 | pp. 1-9 | 9 pages
Conference location: Avignon (FR)
Authors
William M. Darling; Michael J. Paul; Fei Song
Author affiliations

School of Computer Science, University of Guelph;

Dept. of Computer Science, Johns Hopkins University;

School of Computer Science, University of Guelph

Original format: PDF
Text language: English
