Journal: Procedia Computer Science

Word-Level vs Sentence-Level Language Identification: Application to Algerian and Arabic Dialects


Abstract

In this paper, we investigate a set of methods for textual Arabic dialect identification, considering both word-level and sentence-level approaches. We use three classifiers: a Linear Support Vector Machine (L-SVM), Bernoulli Naive Bayes (BNB), and Multinomial Naive Bayes (MNB), and then combine them with a voting procedure. We carried out experiments on two sets of dialects: the first, PADIC, consists of parallel sentences in Maghrebi and Middle Eastern dialects; the second is a set of Algerian dialects only, which we built manually. For the Arabic dialects, we obtained an average accuracy of 92%. For the Algerian dialects, our approach yielded an average accuracy of about 76%.
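
The abstract does not say how the voting procedure works or how word-level decisions relate to sentence-level ones. The sketch below shows one plausible reading: a hard majority vote over the three classifiers' labels, with a word-level variant that votes once per word and then once more across the words of a sentence. The function names, the tie-breaking rule, and the dialect labels (`ALG`, `TUN`, `SYR`) are illustrative assumptions, not details taken from the paper.

```python
from collections import Counter


def majority_vote(labels):
    """Return the most frequent label; ties go to the label seen first.

    Assumption: plain hard voting. The abstract does not specify the
    voting rule or how ties are resolved.
    """
    if not labels:
        raise ValueError("majority_vote needs at least one label")
    counts = Counter(labels)
    best = max(counts.values())
    for label in labels:
        if counts[label] == best:
            return label


def sentence_label_from_words(word_predictions):
    """Word-level route: vote per word, then vote across the words.

    word_predictions holds, for each word, the labels proposed by the
    individual classifiers (here imagined as L-SVM, BNB, MNB).
    """
    word_labels = [majority_vote(votes) for votes in word_predictions]
    return majority_vote(word_labels)


# Hypothetical classifier outputs for a three-word sentence.
word_predictions = [
    ["ALG", "ALG", "TUN"],  # word 1
    ["TUN", "ALG", "ALG"],  # word 2
    ["SYR", "SYR", "ALG"],  # word 3
]

# Sentence-level route: each classifier labels the whole sentence once.
sentence_votes = ["ALG", "TUN", "ALG"]

word_level_result = sentence_label_from_words(word_predictions)
sentence_level_result = majority_vote(sentence_votes)
```

In this toy input both routes agree, but they need not: a sentence-level classifier sees all features at once, while the word-level route can be swayed by a few strongly dialect-specific words.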
