
Automatic recognition of Chinese place names: a statistical and rule-based combined approach


Abstract

The automatic recognition of Chinese place names, a special case of the recognition of Chinese special nouns, is an important task in Chinese information processing. In this paper, we propose an approach that combines statistical and rule-based techniques. The approach discovers candidates in Chinese texts based on the probability of a character being part of a Chinese place name, then confirms or eliminates the candidates by applying rules obtained through human summarization and transformation-based machine learning. We employ a statistical measure, weight of likelihood (WOL), to estimate the likelihood of a character being part of a Chinese place name in real corpora. To the authors' knowledge, this is the first time WOL has been used to capture the capability of a character to form Chinese place names in real corpora. We evaluate the approach on a real data set; recall and precision are 97% and 90.92%, respectively.
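
The two-stage idea in the abstract (statistical candidate discovery, then rule-based confirmation or elimination) might be sketched roughly as follows. This is only an illustration under stated assumptions: the abstract does not define WOL, so the score below is a plain relative frequency standing in for it, and the function names, the threshold, and the suffix rule are hypothetical rather than the authors' method.

```python
from collections import Counter


def char_scores(corpus, place_names):
    """Score each character by the fraction of its occurrences that fall
    inside a known place name (an assumed stand-in for WOL)."""
    total, inside = Counter(), Counter()
    for text in corpus:
        total.update(text)
        covered = [False] * len(text)
        for name in place_names:
            start = text.find(name)
            while start != -1:
                for i in range(start, start + len(name)):
                    covered[i] = True
                start = text.find(name, start + 1)
        inside.update(ch for ch, hit in zip(text, covered) if hit)
    return {ch: inside[ch] / total[ch] for ch in total}


def find_candidates(text, scores, threshold=0.5, min_len=2):
    """Stage 1: collect maximal runs of characters scoring at or above
    the (hypothetical) threshold."""
    candidates, run = [], ""
    for ch in text + "\0":  # sentinel character flushes the final run
        if scores.get(ch, 0.0) >= threshold:
            run += ch
        else:
            if len(run) >= min_len:
                candidates.append(run)
            run = ""
    return candidates


def apply_rules(candidates, suffixes=("市", "省", "县")):
    """Stage 2: a single hypothetical rule that keeps candidates ending in
    an administrative suffix; the paper's actual rules are not given."""
    return [c for c in candidates if c.endswith(suffixes)]


corpus = ["我住在北京市", "他去了上海市", "我去了学校"]
scores = char_scores(corpus, ["北京市", "上海市"])
candidates = find_candidates("她在上海市工作", scores)
confirmed = apply_rules(candidates)
```

On this toy corpus the run "上海市" is found as a candidate and passes the suffix rule, whereas a run such as "北京上海" would be proposed by the statistical stage and then eliminated by the rule stage.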
