Journal: Information and Knowledge Management

Computational Efficiency Analysis of Customer Churn Prediction Using Spark and Caret Random Forest Classifier

Abstract

Today’s businesses are buying into technological advancement for productivity, profit maximization and better service delivery. Meanwhile, technology has also brought data arriving at an alarming rate, so businesses need to re-strategize how these data are handled if they are to retain the ability to turn them into value. Traditional data mining techniques have proved beyond doubt that data can be harnessed and turned into value for business growth. But the era of large-scale data poses a challenge of computational efficiency to this traditional approach. This paper therefore addresses the issue by studying a big data analytics tool, Spark, alongside a data mining technique, Caret. A telecom churn dataset was used to analyse both the computational and performance metrics of the two approaches using their Random Forest (RF) classifiers. The classifiers were trained with the same training-set partitioning and tuning parameters. The results show that Spark-RF is computationally efficient, with an execution time of 50.25 secs compared to 847.20 secs for Caret-RF. The variable importance plot and churn-rate counts indicate that the customer churn rate could be minimized if proper management attention and policy are directed at tenure (ShortTenure), Contract, InternetService and PaymentMethod. The classifier accuracy was approximately 80% for both implementations.
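
For scale, the two reported execution times can be expressed as a single speedup ratio. The figure below is derived here from the numbers in the abstract and is not stated by the authors; the symbols $S$, $T_{\text{Caret-RF}}$ and $T_{\text{Spark-RF}}$ are introduced only for illustration.

```latex
% Speedup implied by the execution times reported in the abstract.
% S is a label introduced here, not a quantity defined in the paper.
S = \frac{T_{\text{Caret-RF}}}{T_{\text{Spark-RF}}}
  = \frac{847.20\ \text{s}}{50.25\ \text{s}}
  \approx 16.9
```

In other words, on this dataset and configuration the Spark-RF run took about one seventeenth of the time of the Caret-RF run, while both reached roughly the same accuracy.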