Journal: Journal of Intelligent Learning Systems and Applications

Knowledge Discovery for Query Formulation for Validation of a Bayesian Belief Network


Abstract

This paper proposes machine learning techniques to discover knowledge in a dataset, in the form of if-then rules, for the purpose of formulating queries to validate a Bayesian belief network model of the same data. Although domain expertise is often available, formulating queries by hand is tedious and laborious, so automating the task is desirable. To that end, a machine learning algorithm is used to discover if-then rules in the same data from which the Bayesian belief network under validation was induced. The resulting set of if-then rules is processed and filtered through domain expertise to identify a subset of “interesting” and “significant” rules. Each rule in this subset is then formulated into a corresponding query, which is posed to the Bayesian belief network for validation purposes. The promise of the proposed methodology was assessed through an empirical study on a real-life dataset, the National Crime Victimization Survey, which has over 250 attributes and well over 200,000 data points. The study demonstrated that the proposed approach is feasible and partially automates the query formulation process for validating a complex probabilistic model, which substantially reduces the need for human expert involvement and investment.
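The abstract does not say which rule learner, network, or comparison criterion the authors used, so the following is only a minimal sketch of the general idea: an if-then rule becomes a conditional probability query, and the network's answer is compared with the rule's confidence. Everything in it is an assumption made for illustration: the three binary variables `A`, `B`, `C`, the probability tables, the rule confidences, the `Rule` class, the `validate` helper, and the tolerance of 0.05. None of it comes from the paper or from the National Crime Victimization Survey.

```python
from dataclasses import dataclass
from itertools import product

# Toy network (hypothetical): A -> C <- B, all variables binary.
VARS = ("A", "B", "C")
P_A = {1: 0.3, 0: 0.7}
P_B = {1: 0.6, 0: 0.4}
# P(C=1 | A, B), keyed by (A, B)
P_C1 = {(1, 1): 0.9, (1, 0): 0.7, (0, 1): 0.4, (0, 0): 0.1}


@dataclass
class Rule:
    """An if-then rule: IF antecedent THEN consequent, with a confidence."""
    antecedent: dict   # e.g. {"A": 1, "B": 1}
    consequent: tuple  # e.g. ("C", 1)
    confidence: float  # fraction of matching records where the rule held


def joint(assignment):
    """Joint probability of a full assignment under the toy network."""
    p_c1 = P_C1[(assignment["A"], assignment["B"])]
    p_c = p_c1 if assignment["C"] == 1 else 1.0 - p_c1
    return P_A[assignment["A"]] * P_B[assignment["B"]] * p_c


def query(target, evidence):
    """P(target | evidence) by brute-force enumeration of the joint."""
    var, val = target
    numerator = denominator = 0.0
    for values in product((0, 1), repeat=len(VARS)):
        assignment = dict(zip(VARS, values))
        if any(assignment[k] != v for k, v in evidence.items()):
            continue  # inconsistent with the evidence
        p = joint(assignment)
        denominator += p
        if assignment[var] == val:
            numerator += p
    return numerator / denominator


def validate(rule, tol=0.05):
    """Pose the rule as a query; report the posterior and whether it agrees."""
    posterior = query(rule.consequent, rule.antecedent)
    return posterior, abs(posterior - rule.confidence) <= tol


if __name__ == "__main__":
    rules = [
        Rule({"A": 1, "B": 1}, ("C", 1), confidence=0.88),
        Rule({"A": 1}, ("C", 1), confidence=0.60),
    ]
    for r in rules:
        posterior, ok = validate(r)
        print(r.antecedent, "=>", r.consequent,
              f"network={posterior:.2f} rule={r.confidence:.2f} agree={ok}")
```

In this toy setting the first rule agrees with the network and the second does not; a disagreement of that kind is what would be flagged for an expert to review. Enumeration is used only because the example is tiny. A network with hundreds of attributes would need a proper inference engine.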
