Journal: PeerJ Computer Science

Natural language inference for Malayalam language using language agnostic sentence representation



Abstract

Natural language inference (NLI) is an essential subtask in many natural language processing (NLP) applications. It is a directional relationship from a premise to a hypothesis: a pair of texts is said to be entailed if the meaning of one text can be inferred from the other. NLI is also known as textual entailment recognition, and it is used to identify entailed and contradictory sentences in NLP systems such as question answering, summarization, and information retrieval. This paper addresses NLI for Malayalam, a low-resource Indian language that is the regional language of Kerala and is spoken by more than 30 million people. It presents a Malayalam NLI dataset, named MaNLI, and applies NLI to Malayalam using four models: Doc2Vec (paragraph vector), fastText, BERT (Bidirectional Encoder Representations from Transformers), and LASER (Language Agnostic Sentence Representation). Our work approaches NLI in two ways, as binary classification and as multiclass classification. In both settings, LASER outperformed the other techniques. For multiclass classification, the NLI system using LASER-based sentence embeddings outperformed the other techniques by a significant margin of 12% in accuracy. For binary classification, the LASER-based NLI system improved accuracy by 9% over the other techniques.
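The abstract does not say how premise and hypothesis embeddings are combined or which classes each task uses, so the following is only a minimal sketch under stated assumptions. It uses a common pair-feature recipe, [u, v, |u - v|, u * v], and an assumed three-way label set collapsed into an assumed two-way scheme. The names `pair_features`, `to_binary`, and `MULTICLASS_LABELS` are hypothetical, and the toy vectors stand in for real sentence embeddings such as those LASER would produce.

```python
from typing import List, Sequence

# Assumed label set; the abstract does not list the classes used in MaNLI.
MULTICLASS_LABELS = ("entailment", "contradiction", "neutral")


def pair_features(u: Sequence[float], v: Sequence[float]) -> List[float]:
    """Combine premise embedding u and hypothesis embedding v into one vector.

    Uses the common [u, v, |u - v|, u * v] recipe. This is an assumption,
    not necessarily the feature construction used in the paper.
    """
    if len(u) != len(v):
        raise ValueError("premise and hypothesis embeddings must have equal length")
    abs_diff = [abs(a - b) for a, b in zip(u, v)]
    product = [a * b for a, b in zip(u, v)]
    return list(u) + list(v) + abs_diff + product


def to_binary(label: str) -> str:
    """Collapse a three-way label into an assumed two-way scheme."""
    if label not in MULTICLASS_LABELS:
        raise ValueError(f"unknown label: {label}")
    return "entailment" if label == "entailment" else "not_entailment"


# Toy 3-dimensional vectors standing in for real sentence embeddings.
premise_vec = [0.5, -1.0, 2.0]
hypothesis_vec = [0.25, 1.0, 2.0]
features = pair_features(premise_vec, hypothesis_vec)
```

The resulting feature vector is four times the embedding dimension and could be passed to any standard classifier. Keeping u and v in premise-then-hypothesis order preserves the directional nature of the task.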
