Journal: Arabian Journal for Science and Engineering. Section A, Sciences

Imputation of Discrete and Continuous Missing Values in Large Datasets Using Bayesian Based Ant Colony Optimization


Abstract

When preparing large databases, obtaining quality data for analysis without any missing values is almost impossible in many cases. Integrating raw data from multiple heterogeneous sources often leaves some values missing, leading to loss of valuable information. Although researchers have introduced many methods, little effort has been spent on handling missing values in heterogeneous attributes (both discrete and continuous) under the Missing At Random pattern, the common scenario in which missing values depend on covariates in the dataset. Also, only a few techniques are capable of dealing with missing values in large databases, and this demands the immediate attention of researchers. This paper addresses both problems by introducing a single technique called Bayesian Ant Colony Optimization (BACO), which combines the searching capability of Ant Colony Optimization with the probabilistic nature of Bayesian principles. The algorithm is designed so that missing values in both discrete and continuous attributes in large datasets are efficiently imputed. BACO is implemented on six large real datasets, and its imputation accuracy is observed to exceed that of existing standard techniques. The statistical tests conducted also prove the superior results of BACO in the imputation process.
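The Missing At Random pattern mentioned in the abstract can be made concrete with a small simulation. The sketch below is not BACO and does not impute anything; it only generates a toy dataset in which one continuous and one discrete attribute go missing with a probability that depends on an observed covariate. The column names (`age`, `income`, `segment`), the missingness probabilities, and the helper functions are all hypothetical choices made for illustration.

```python
import random

def make_mar_dataset(n=2000, seed=0):
    """Build toy records and blank out values under a MAR mechanism."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        age = rng.randint(18, 80)                        # fully observed covariate
        income = 20000 + 500 * age + rng.gauss(0, 5000)  # continuous attribute
        segment = rng.choice(["A", "B", "C"])            # discrete attribute
        # MAR: the chance that a value is missing depends only on the
        # observed covariate `age`, not on the hidden value itself.
        # The 0.4 / 0.1 rates are assumed for illustration.
        p_missing = 0.4 if age >= 50 else 0.1
        if rng.random() < p_missing:
            income = None
        if rng.random() < p_missing:
            segment = None
        rows.append({"age": age, "income": income, "segment": segment})
    return rows

def missing_rate(rows, column, keep):
    """Fraction of the rows selected by `keep` whose `column` is missing."""
    selected = [r for r in rows if keep(r)]
    return sum(r[column] is None for r in selected) / len(selected)

rows = make_mar_dataset()
older = missing_rate(rows, "income", lambda r: r["age"] >= 50)
younger = missing_rate(rows, "income", lambda r: r["age"] < 50)
print(f"income missing: age>=50 {older:.2f}, age<50 {younger:.2f}")
```

Because missingness here is driven by `age`, which is always observed, an imputation method that conditions on covariates has information to work with; that is the setting the abstract says BACO targets.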
