International Conference on Contemporary Computing

A lexicon pool augmented Naive Bayes Classifier for Nepali Text

Abstract

This paper presents our experimental work on machine classification of Nepali texts. We implemented a Naive Bayes classifier for the task and then augmented it through multinomial lexicon pooling. The lexicon-pooled Naive Bayes classifier obtains better results on the classification task than a standard Naive Bayes implementation. This hybrid approach also helps compensate for the unavailability of linguistic resources for Nepali (such as a stemmer, a stop word list, and an accurate POS tagger). The proposed lexicon-pooled Naive Bayes approach is evaluated on a sufficiently large dataset of Nepali news stories. The experimental results demonstrate the higher classification accuracy and usefulness of the method for Nepali text classification. The paper also contributes resources to Nepali language processing, in the form of a Nepali news stories corpus and a domain-specific lexicon for Nepali news stories.
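The abstract does not give the pooling formula, so the following is only a minimal sketch of one plausible reading of "lexicon pooling": a multinomial Naive Bayes model trained on labeled documents is linearly combined with a second word distribution derived from a class lexicon. The function names, the mixing weight `alpha`, the `strength` parameter, and the toy English tokens (standing in for Nepali words) are all assumptions made for illustration and are not taken from the paper.

```python
import math
from collections import Counter, defaultdict


def train_nb(docs, labels, vocab):
    """Multinomial Naive Bayes with Laplace smoothing.

    Returns (priors, cond) where cond[c][w] estimates P(w | c).
    """
    class_counts = Counter(labels)
    word_counts = defaultdict(Counter)
    for tokens, label in zip(docs, labels):
        word_counts[label].update(t for t in tokens if t in vocab)
    priors = {c: n / len(labels) for c, n in class_counts.items()}
    cond = {}
    for c in class_counts:
        total = sum(word_counts[c].values()) + len(vocab)
        cond[c] = {w: (word_counts[c][w] + 1) / total for w in vocab}
    return priors, cond


def lexicon_model(lexicon, classes, vocab, strength=10.0):
    """Turn a hypothetical class -> word-set lexicon into P(w | c).

    Words listed for a class get weight `strength`; all others get 1.
    """
    cond = {}
    for c in classes:
        listed = lexicon.get(c, set())
        weights = {w: (strength if w in listed else 1.0) for w in vocab}
        total = sum(weights.values())
        cond[c] = {w: weights[w] / total for w in vocab}
    return cond


def pool(cond_nb, cond_lex, alpha=0.7):
    """Linear pooling (assumed): alpha * trained + (1 - alpha) * lexicon."""
    return {
        c: {w: alpha * cond_nb[c][w] + (1 - alpha) * cond_lex[c][w]
            for w in cond_nb[c]}
        for c in cond_nb
    }


def classify(tokens, priors, cond):
    """Return the class with the highest log posterior; unknown tokens are skipped."""
    scores = {
        c: math.log(priors[c])
        + sum(math.log(cond[c][t]) for t in tokens if t in cond[c])
        for c in priors
    }
    return max(scores, key=scores.get)


# Toy data: placeholder tokens, not drawn from the paper's corpus.
docs = [["match", "goal", "team"], ["election", "vote", "party"]]
labels = ["sports", "politics"]
lexicon = {"sports": {"cricket"}, "politics": {"minister"}}
vocab = {t for d in docs for t in d} | {w for ws in lexicon.values() for w in ws}

priors, cond_nb = train_nb(docs, labels, vocab)
cond_lex = lexicon_model(lexicon, priors.keys(), vocab)
cond_pooled = pool(cond_nb, cond_lex)

# "minister" never occurs in the training documents, so only the lexicon
# component can separate the two classes for this input.
print(classify(["minister"], priors, cond_pooled))
```

In this sketch the lexicon matters most for words that are rare or absent in the training data, which is one way such a hybrid could offset missing resources like a stemmer or stop word list. The weight `alpha` would normally be tuned on held-out data.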
