Chinese Word Ordering Errors Detection and Correction for Non-Native Chinese Language Learners

Abstract

Word Ordering Errors (WOEs) are the most frequent type of sentence-level grammatical error made by non-native Chinese language learners. Learners of Chinese as a foreign language often misplace characters within a sentence, producing wrong words or ungrammatical sentences. Moreover, Chinese sentences have no explicit word boundaries, which makes WOE detection and correction more challenging. In this paper, we propose methods to detect and correct WOEs in Chinese sentences. WOE detection models based on conditional random fields (CRFs) identify the sentence segments containing WOEs. Segment point-wise mutual information (PMI), inter-segment PMI difference, language model, tag of the previous segment, and CRF bigram template are explored. Words in the segments containing WOEs are reordered to generate candidates that may have the correct word ordering. Ranking SVM based models rank the candidates and suggest the most suitable corrections. Training and testing sets are selected from the HSK dynamic composition corpus created by Beijing Language and Culture University. Besides the HSK WOE dataset, the Google Chinese Web 5-gram corpus is used to learn features for WOE detection and correction. The best model achieves an accuracy of 0.834 for detecting WOEs in sentence segments. On average, the correct word ordering is ranked 4.8 among 184.48 candidates.
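The reorder-and-rank idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the paper detects errors with CRFs and ranks candidates with Ranking SVM, whereas the sketch below simply enumerates all permutations of a short segment and scores each by the sum of adjacent-word PMI. The counts in `UNIGRAMS` and `BIGRAMS`, the corpus size `TOTAL`, the `floor` smoothing value, and the function names are all hypothetical and chosen only for illustration.

```python
import math
from itertools import permutations

# Hypothetical toy counts (assumption). The paper learns its statistics
# from the Google Chinese Web 5-gram corpus instead.
UNIGRAMS = {"我": 50, "今天": 30, "很": 40, "高兴": 20}
BIGRAMS = {
    ("我", "今天"): 12,
    ("今天", "很"): 10,
    ("很", "高兴"): 15,
    ("我", "很"): 8,
}
TOTAL = 1000  # assumed corpus size for the toy example


def pmi(w1, w2, floor=0.5):
    """PMI = log(p(w1, w2) / (p(w1) * p(w2))).

    Unseen pairs fall back to a small pseudo-count (`floor`, an assumption)
    so that the logarithm stays defined.
    """
    joint = BIGRAMS.get((w1, w2), floor) / TOTAL
    independent = (UNIGRAMS[w1] / TOTAL) * (UNIGRAMS[w2] / TOTAL)
    return math.log(joint / independent)


def score(words):
    """Sum of PMI over adjacent word pairs in a candidate ordering."""
    return sum(pmi(a, b) for a, b in zip(words, words[1:]))


def rank_candidates(words):
    """Generate every reordering of `words` and sort by score, best first."""
    candidates = set(permutations(words))
    return sorted(candidates, key=score, reverse=True)


if __name__ == "__main__":
    scrambled = ["今天", "我", "高兴", "很"]
    for candidate in rank_candidates(scrambled)[:3]:
        print(" ".join(candidate), round(score(candidate), 3))
```

Exhaustive permutation grows factorially with segment length, so it is only workable for short segments. A learned ranker over richer features, as the abstract describes, replaces the single PMI score used here.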

Bibliographic details

Source
International Conference on Computational Linguistics | 2014 | pp. 279-289 | 11 pages
Authors
Shuk-Man Cheng; Chi-Hsin Yu; Hsin-Hsi Chen
Original format: PDF
