Chinese Word Segmentation as POC-NLW Tagging

Abstract

Disambiguation and unknown-word identification remain the two key open issues in Chinese word segmentation. To handle both in a uniform way, this paper presents a language tagging template, named POC-NLW, that explores the word-formation mechanisms of Chinese at the character level. Based on this template, a tagger built on a Hidden Markov Model implements word segmentation as character tagging. In this method, basic word segmentation, disambiguation, and unknown-word identification are integrated and accomplished in one unified process. Experimental results on the SIGHAN Bakeoff 2005 corpus show that the method achieves high accuracy on word segmentation, especially on unknown-word identification, with appreciable processing efficiency. The method offers good interoperability and extensibility across different kinds of words, which makes it applicable to practical Chinese information processing applications.
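
To make "word segmentation as character tagging" concrete, the following is a minimal sketch of an HMM character tagger with Viterbi decoding. The abstract does not spell out the POC-NLW tag set, so the sketch substitutes the common four-tag position scheme (B = word begin, M = middle, E = end, S = single-character word); it is not the authors' template or implementation. The function names (`train`, `viterbi`, `segment`), the add-one smoothing, and the three-sentence toy corpus are all illustrative assumptions.

```python
import math
from collections import Counter

# ASSUMPTION: generic position-in-word tags, not the paper's POC-NLW tag set.
TAGS = ("B", "M", "E", "S")
# Legal transitions: a word must be closed (E or S) before the next one starts.
ALLOWED = {"B": ("M", "E"), "M": ("M", "E"), "E": ("B", "S"), "S": ("B", "S")}
START = ("B", "S")  # tags that may open a sentence
END = ("E", "S")    # tags that may close a sentence

def word_to_tags(word):
    """Map one non-empty word to its per-character tags."""
    if len(word) == 1:
        return ["S"]
    return ["B"] + ["M"] * (len(word) - 2) + ["E"]

def train(corpus):
    """Count transitions and emissions from sentences given as lists of words."""
    trans, emit, tag_count, vocab = Counter(), Counter(), Counter(), set()
    for sentence in corpus:
        tags = [t for word in sentence for t in word_to_tags(word)]
        chars = [c for word in sentence for c in word]
        for c, t in zip(chars, tags):
            emit[(t, c)] += 1
            tag_count[t] += 1
            vocab.add(c)
        for a, b in zip(tags, tags[1:]):
            trans[(a, b)] += 1
    # One extra vocabulary slot reserves probability mass for unseen characters.
    return trans, emit, tag_count, len(vocab) + 1

def viterbi(chars, model):
    """Return the most probable legal tag sequence for a non-empty character list."""
    trans, emit, tag_count, vocab_size = model

    def log_emit(tag, char):  # add-one smoothed emission probability
        return math.log((emit[(tag, char)] + 1) / (tag_count[tag] + vocab_size))

    def log_trans(a, b):  # add-one smoothed over the legal successors of a
        total = sum(trans[(a, x)] for x in ALLOWED[a])
        return math.log((trans[(a, b)] + 1) / (total + len(ALLOWED[a])))

    score = {t: log_emit(t, chars[0]) for t in START}
    back = []
    for c in chars[1:]:
        new_score, pointers = {}, {}
        for t in TAGS:
            cands = [(score[p] + log_trans(p, t), p) for p in score if t in ALLOWED[p]]
            if cands:
                best, prev = max(cands)
                new_score[t] = best + log_emit(t, c)
                pointers[t] = prev
        score = new_score
        back.append(pointers)
    # The sentence must end on a word boundary.
    tags = [max((t for t in score if t in END), key=score.get)]
    for pointers in reversed(back):
        tags.append(pointers[tags[-1]])
    return tags[::-1]

def segment(text, model):
    """Split text into words by cutting after every E or S tag."""
    if not text:
        return []
    words, current = [], ""
    for c, t in zip(text, viterbi(list(text), model)):
        current += c
        if t in END:
            words.append(current)
            current = ""
    return words

# Toy training data (hypothetical, for illustration only).
corpus = [["我", "喜欢", "北京"], ["我", "爱", "北京"], ["他", "喜欢", "上海"]]
model = train(corpus)
print(segment("我喜欢北京", model))
```

Because every character receives exactly one tag and decoding is a single Viterbi pass, segmentation, ambiguity resolution, and the handling of unseen characters all happen in the same step, which is the kind of unified process the abstract describes. A realistic tagger would be trained on a full segmented corpus such as the SIGHAN Bakeoff data rather than on three sentences.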

Bibliographic record

Source: International Conference on Signal Processing (ICSP'06), November 16-20, 2006, Guilin (CN); 2006; pp. 1770-1774; 5 pages
Conference location: Guilin (CN)
Authors: Bo Chen; Hui He; Jun Guo; Weiran Xu
Affiliation: School of Information Engineering, Beijing University of Posts and Telecommunications, Beijing 100876, P. R. China
Original format: PDF
Language: English
Chinese Library Classification: Communication theory

Similar literature

1. Joint Chinese Word Segmentation and POS Tagging Using an Error-Driven Word-Character Hybrid Model [J]. Canasai KRUENGKRAI, Kiyotaka UCHIMOTO, Junichi KAZAMA. IEICE Transactions on Information and Systems. 2009, No. 12
2. Encoding multi-granularity structural information for joint Chinese word segmentation and POS tagging [J]. Zhao Ling, Zhang Ailian, Liu Ying. Pattern Recognition Letters. 2020, Issue Octa
3. A fine-grained Chinese word segmentation and part-of-speech tagging corpus for clinical text [J]. Ying Xiong, Zhongmin Wang, Dehuan Jiang. BMC Medical Informatics and Decision Making. 2019, No. 2
4. POC-NLW Template Based Tagging Method for Chinese Word Segmentation [C]. Bo Chen, Hui He, Weiran Xu. Proceedings of the 2006 International Conference on Computational Intelligence and Security (CIS 2006). 2006
5. Stages in Chinese children's reading of English words [D]. Li Yin. 2005
6. A fine-grained Chinese word segmentation and part-of-speech tagging corpus for clinical text [O]. Ying Xiong, Zhongmin Wang, Dehuan Jiang. 2019
7. Reduce Meaningless Words for Joint Chinese Word Segmentation and Part-of-speech Tagging [O]. Kaixu Zhang, Maosong Sun. 2013
