InPrivate Digging: Enabling Tree-based Distributed Data Mining with Differential Privacy

Abstract

Data mining has heralded a major breakthrough in data analysis, serving as a “super cruncher” that discovers hidden information and valuable knowledge in big data systems. In many applications, big data is collected by several parties who want to pool their private data sets and jointly train machine-learning models that yield more accurate predictions. However, data owners may be unwilling to disclose their own data because of privacy concerns, which makes privacy guarantees imperative in collaborative data mining over distributed data sets. In this paper, we focus on tree-based data mining. To begin with, we design novel privacy-preserving schemes for the two most common tasks, regression and binary classification, in which individual data owners can perform training locally in a differentially private manner. Then, for the first time, we design and implement a privacy-preserving system for gradient boosting decision tree (GBDT), in which different regression trees trained by multiple data owners can be securely aggregated into an ensemble. We conduct extensive experiments to evaluate the performance of our system on multiple real-world data sets. The results demonstrate that our system can provide strong privacy protection for individual data owners while maintaining the prediction accuracy of the original trained model.
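The abstract does not spell out how local training is made differentially private. As a rough illustration only, and not the authors' scheme, the sketch below applies the standard Laplace mechanism to the value of a single regression-tree leaf. The function names `laplace_noise` and `dp_leaf_value`, the clipping bounds `lo` and `hi`, and the even split of the privacy budget `epsilon` between the sum and the count are all assumptions made for this example.

```python
import random


def laplace_noise(scale, rng):
    # The difference of two i.i.d. exponential variables with mean `scale`
    # follows a Laplace(0, scale) distribution.
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def dp_leaf_value(labels, epsilon, lo, hi, rng):
    """Noisy mean of the labels that fall into one leaf (illustrative only).

    Labels are clipped to [lo, hi] so that adding or removing one record
    changes the sum by at most max(|lo|, |hi|) and the count by at most 1.
    Half of the budget goes to the sum and half to the count.
    """
    clipped = [min(max(y, lo), hi) for y in labels]
    bound = max(abs(lo), abs(hi))
    noisy_sum = sum(clipped) + laplace_noise(2.0 * bound / epsilon, rng)
    noisy_count = len(clipped) + laplace_noise(2.0 / epsilon, rng)
    mean = noisy_sum / max(noisy_count, 1.0)
    # Clamping is post-processing and does not weaken the guarantee.
    return min(max(mean, lo), hi)


if __name__ == "__main__":
    rng = random.Random(0)
    print(dp_leaf_value([1.0, 2.0, 3.0, 4.0], epsilon=1.0, lo=0.0, hi=5.0, rng=rng))
```

A smaller `epsilon` gives stronger privacy and a noisier leaf value. Because the leaves of one tree partition the training records, each leaf could in principle use the same budget, though how the paper allocates budget across trees and splits is not stated here.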

Bibliographic record

Source
IEEE Conference on Computer Communications Workshops | 2018 | pp. 2087-2095 | 9 pages
Authors
Lingchen Zhao; Lihao Ni; Shengshan Hu; Yanjiao Chen; Pan Zhou; Fu Xiao; Libing Wu
Original format: PDF
Keywords
Decision trees; Boosting; Privacy; Data models

Similar literature

1. A Tree-Based Data Perturbation Approach for Privacy-Preserving Data Mining [J]. Xiao-Bai Li, Sarkar S. IEEE Transactions on Knowledge and Data Engineering. 2006, No. 9
2. Enabling Multilevel Trust in Privacy Preserving Data Mining [J]. Li Yaping, Chen Minghua, Li Qiwei. IEEE Transactions on Knowledge and Data Engineering. 2012, No. 9
3. Privacy Preserving Data Mining of Vertically Partitioned Data in Distributed Environment: An Experimental Analysis [J]. Preeti Gulia, Hemlata. Journal of Theoretical and Applied Information Technology. 2018, No. 10
4. InPrivate Digging: Enabling Tree-based Distributed Data Mining with Differential Privacy [C]. Lingchen Zhao, Lihao Ni, Shengshan Hu. IEEE Conference on Computer Communications. 2018
5. A Utility-Aware Privacy Preserving Framework for Distributed Data Mining with Worst Case Privacy Guarantee [D]. Banerjee, Madhushri. 2011
6. A Distributed Ensemble Approach for Mining Healthcare Data under Privacy Constraints [O]. Yan Li, Changxin Bai, Chandan K. Reddy
7. Enabling Multi-level Trust in Privacy Preserving Data Mining [O]. Li, Yaping; Chen, Minghua; Li, Qiwei. 2011
