Journal: 《计算机应用研究》 (Application Research of Computers)

Research and Implementation of Feature Extraction Methods in Internet Advertising Click-Through Rate Prediction Models

         

Abstract

Internet advertising is a market worth more than 100 billion yuan, and the click-through rate (CTR) of an advertisement is an important indicator of how effective an ad placement is. In a CTR prediction model, feature extraction is a key factor: the quality of the features directly affects the performance of the final model. To improve the efficiency of CTR prediction, this paper proposes a multidimensional feature extraction method based on the gradient boosting decision tree (GBDT) model, running on the Hadoop big data platform. The method builds a multidimensional basic feature library from the raw data and feeds every feature in that library except the ID-type features into a GBDT model for feature selection, producing high-level features that are then used for classification. The method not only reduces the labor and time costs of feature extraction but also substantially improves the accuracy of the model.
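The abstract does not say how the GBDT output becomes "high-level features" or which classifier follows it. The sketch below illustrates one common reading, and every specific in it is an assumption rather than the paper's implementation: each sample is represented by the index of the leaf it reaches in every tree, those indices are one-hot encoded, and the result is concatenated with one-hot ID features and passed to a logistic regression. The data is synthetic, the column meanings are hypothetical, and the code runs on a single machine with scikit-learn rather than on Hadoop.

```python
import numpy as np
from scipy.sparse import hstack
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.preprocessing import OneHotEncoder

rng = np.random.default_rng(0)
n = 2000

# Synthetic stand-in for the basic feature library (all columns are hypothetical).
X_dense = rng.normal(size=(n, 5))        # non-ID features, e.g. counts or rates
X_id = rng.integers(0, 50, size=(n, 2))  # ID-type features, e.g. ad ID and user ID
logits = X_dense[:, 0] * X_dense[:, 1] + 0.5 * X_dense[:, 2] - 1.0
y = (rng.random(n) < 1.0 / (1.0 + np.exp(-logits))).astype(int)  # click labels

train, test = slice(0, 1500), slice(1500, n)

# Step 1: fit the GBDT on the non-ID features only.
gbdt = GradientBoostingClassifier(n_estimators=30, max_depth=3, random_state=0)
gbdt.fit(X_dense[train], y[train])


def leaf_indices(model, X):
    """Return the leaf reached in each tree, shape (n_samples, n_estimators)."""
    # apply() returns (n_samples, n_estimators, 1) for a binary target.
    return model.apply(X)[:, :, 0]


# Step 2: one-hot encode the leaf indices to obtain the high-level features.
leaf_enc = OneHotEncoder(handle_unknown="ignore")
H_train = leaf_enc.fit_transform(leaf_indices(gbdt, X_dense[train]))
H_test = leaf_enc.transform(leaf_indices(gbdt, X_dense[test]))

# Step 3: one-hot encode the ID features, which bypass the GBDT
# (combining them with the leaf features is an assumption of this sketch).
id_enc = OneHotEncoder(handle_unknown="ignore")
I_train = id_enc.fit_transform(X_id[train])
I_test = id_enc.transform(X_id[test])

# Step 4: train the downstream classifier on the combined sparse features.
clf = LogisticRegression(max_iter=1000)
clf.fit(hstack([H_train, I_train]).tocsr(), y[train])
ctr_pred = clf.predict_proba(hstack([H_test, I_test]).tocsr())[:, 1]
```

Here `ctr_pred` holds one predicted click probability per held-out sample. For brevity the GBDT and the logistic regression are fitted on the same training rows; fitting them on separate subsets is a common way to reduce overfitting.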
