Journal: International Journal of Synthetic Emotions

Development of Part of Speech Tagger for Assamese Using HMM

Abstract

This article presents a Part-of-Speech (POS) tagger for Assamese based on the Hidden Markov Model (HMM). Over the years, many language processing tasks have been addressed for Western and South Asian languages, but very little work has been done for Assamese. To address this gap, a POS tagger for Assamese using a stochastic approach is being developed. Assamese is a free word-order, highly agglutinative, and morphologically rich language, so a POS tagger with good accuracy will help in developing other NLP tasks for Assamese. For this work, an annotated corpus of 271,890 words with a BIS tagset consisting of 38 tag labels is used. The model is trained on 256,690 words, and the remaining 15,200 words are used for testing. The system obtained an accuracy of 89.21%, and it is being compared with other existing stochastic models.
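
The abstract does not give implementation details, so the following is only a minimal sketch of how a bigram HMM tagger of this general kind might be trained and evaluated. It assumes add-one smoothing and Viterbi decoding, neither of which is confirmed by the source. The toy English sentences and the `DET`/`NOUN`/`VERB` labels are hypothetical placeholders, not the Assamese corpus or the BIS tagset, and the function names are illustrative.

```python
import math
from collections import defaultdict

def train_hmm(tagged_sentences):
    """Count tag-transition and word-emission frequencies."""
    trans = defaultdict(lambda: defaultdict(int))
    emit = defaultdict(lambda: defaultdict(int))
    vocab = set()
    for sentence in tagged_sentences:
        prev = "<s>"  # sentence-start marker
        for word, tag in sentence:
            trans[prev][tag] += 1
            emit[tag][word] += 1
            vocab.add(word)
            prev = tag
        trans[prev]["</s>"] += 1  # sentence-end marker
    return trans, emit, sorted(emit), vocab

def log_prob(counts, key, num_outcomes):
    """Add-one smoothed log probability (smoothing choice is an assumption)."""
    total = sum(counts.values())
    return math.log((counts.get(key, 0) + 1) / (total + num_outcomes))

def viterbi(words, model):
    """Return the most probable tag sequence for `words` under the HMM."""
    if not words:
        return []
    trans, emit, tags, vocab = model
    n_trans = len(tags) + 1   # any tag, or the end marker
    n_emit = len(vocab) + 1   # known words plus one slot for unknown words
    # best[i][t] = (log score of best path ending in tag t at i, previous tag)
    best = [{t: (log_prob(trans["<s>"], t, n_trans)
                 + log_prob(emit[t], words[0], n_emit), None) for t in tags}]
    for i in range(1, len(words)):
        column = {}
        for t in tags:
            e = log_prob(emit[t], words[i], n_emit)
            column[t] = max(
                (best[i - 1][p][0] + log_prob(trans[p], t, n_trans) + e, p)
                for p in tags)
        best.append(column)
    # Pick the best final tag, including the transition to the end marker.
    last = max(tags, key=lambda t: best[-1][t][0]
               + log_prob(trans[t], "</s>", n_trans))
    path = [last]
    for i in range(len(words) - 1, 0, -1):
        path.append(best[i][path[-1]][1])  # follow backpointers
    return path[::-1]

def accuracy(model, gold_sentences):
    """Fraction of test tokens whose predicted tag matches the gold tag."""
    correct = total = 0
    for sentence in gold_sentences:
        predicted = viterbi([w for w, _ in sentence], model)
        correct += sum(p == t for p, (_, t) in zip(predicted, sentence))
        total += len(sentence)
    return correct / total

# Hypothetical toy data standing in for the train/test split.
train = [
    [("the", "DET"), ("dog", "NOUN"), ("barks", "VERB")],
    [("the", "DET"), ("cat", "NOUN"), ("sleeps", "VERB")],
    [("a", "DET"), ("dog", "NOUN"), ("sleeps", "VERB")],
]
test = [[("a", "DET"), ("cat", "NOUN"), ("barks", "VERB")]]

model = train_hmm(train)
print(viterbi(["a", "cat", "barks"], model))  # ['DET', 'NOUN', 'VERB']
print(accuracy(model, test))                  # 1.0
```

Token-level accuracy as computed by `accuracy` is the usual way a figure such as the reported 89.21% is measured, though the abstract does not state the exact evaluation procedure.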
