Journal of Educational Data Mining

A Joint Probabilistic Classification Model of Relevant and Irrelevant Sentences in Mathematical Word Problems

Abstract

Estimating the difficulty level of math word problems is an important task for many educational applications, and identifying the relevant and irrelevant sentences in a problem is an important step in calculating that difficulty. This paper addresses a novel application of text categorization: identifying two types of sentences in mathematical word problems, namely relevant and irrelevant sentences. A novel joint probabilistic classification model is proposed that estimates the joint probability of the classification decisions for all sentences of a math word problem. It utilizes the correlation among all sentences, the correlation between the question sentence and the other sentences, and the sentence text. The proposed model is compared with i) an SVM classifier that makes independent classification decisions for individual sentences using only the sentence text and ii) a novel SVM classifier that considers the correlation between the question sentence and the other sentences along with the sentence text. An extensive set of experiments demonstrates the effectiveness of both the joint probabilistic classification model and the novel SVM classifier for identifying relevant and irrelevant sentences. Furthermore, empirical results and analysis show that i) it is highly beneficial not to remove stopwords and ii) part-of-speech tagging does not yield a significant improvement, although it has been shown to be effective for the related task of math word problem type classification.
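The abstract does not give the model's functional form, so the following Python sketch is only an illustration of what a joint decision over all sentences of one problem could look like. It assumes a toy log-linear score built from the three signals the abstract names: sentence text, similarity to the question sentence, and similarity among sentences. The function names, the weights (`w_question`, `w_pair`, `bias`), the cosine-similarity features, and the example problem are all hypothetical and are not taken from the paper. Stopwords are kept, in line with the reported finding.

```python
import re
from collections import Counter
from itertools import product
from math import exp, sqrt


def bag_of_words(sentence):
    # Lowercased word tokens; stopwords are deliberately NOT removed.
    return Counter(re.findall(r"\w+", sentence.lower()))


def cosine(a, b):
    # Cosine similarity between two bag-of-words vectors.
    dot = sum(a[w] * b[w] for w in a)
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def joint_distribution(sentences, question, w_question=4.0, w_pair=1.0, bias=-1.0):
    """Toy joint distribution over relevance labels (1 = relevant, 0 = irrelevant).

    All weights are illustrative assumptions, not values from the paper.
    """
    q = bag_of_words(question)
    bows = [bag_of_words(s) for s in sentences]
    q_sim = [cosine(b, q) for b in bows]
    n = len(sentences)
    unnormalized = {}
    # Brute-force enumeration of all 2^n labelings; fine for short problems.
    for labels in product((0, 1), repeat=n):
        # Unary term: sentences similar to the question lean toward "relevant".
        score = sum(y * (bias + w_question * s) for y, s in zip(labels, q_sim))
        # Pairwise term: similar sentences are encouraged to share a label.
        for i in range(n):
            for j in range(i + 1, n):
                agree = 1.0 if labels[i] == labels[j] else -1.0
                score += w_pair * agree * cosine(bows[i], bows[j])
        unnormalized[labels] = exp(score)
    z = sum(unnormalized.values())
    return {labels: value / z for labels, value in unnormalized.items()}


# Hypothetical word problem used only to exercise the sketch.
sentences = [
    "Tom has 5 apples.",
    "His sister likes the color red.",
    "Tom buys 3 more apples.",
]
question = "How many apples does Tom have?"

dist = joint_distribution(sentences, question)
best = max(dist, key=dist.get)
print(best)  # most probable joint labeling of the three sentences
```

Scoring whole labelings rather than single sentences is what lets the pairwise term influence the result: a sentence with weak evidence of its own can be pulled toward the label of a similar sentence. An independent per-sentence classifier, such as the first SVM baseline described above, has no such mechanism.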