Conference: 2017 IEEE International Conference on Big Knowledge

Privacy-Preserving Pattern Mining on Online Density Estimates

Abstract

Traditional pattern mining algorithms require access to the data, either as a complete data set, as in batch data mining, or as a window of recent data, as in stream mining. In stream mining, this brings a number of disadvantages, among them the possibly unbounded growth of relevant instances, drift, possibly changing data mining tasks, and privacy issues. Therefore, an approach has recently been proposed that extracts patterns solely from statistical information about the stream, more precisely from an online density estimate inferred from it. Because this approach is mainly based on sampling from the density estimates, it still struggles with itemsets of medium to low frequency. To resolve this issue, we pursue an alternative strategy in this paper and directly exploit the structure of the density estimates to extract frequent itemsets. Additionally, we address the important matter of privacy-preserving data mining by ensuring that the density estimate fulfills privacy-related properties. To show the effectiveness of the proposed methods, we provide proofs and evaluate their performance on synthetic and real-world data.
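
The contrast the abstract draws between sampling from a density estimate and exploiting its structure directly can be illustrated with a small sketch. It is not the method of the paper: the model class (a mixture of independent Bernoulli components over three hypothetical items `a`, `b`, `c`), the names `MODEL`, `exact_support`, `sampled_support` and `frequent_itemsets`, and all numbers are assumptions chosen only for illustration. Under these assumptions, the support of an itemset can be read off the model in closed form, whereas a sampled estimate of a low-frequency itemset rests on very few hits.

```python
import random
from itertools import combinations
from math import prod

# Toy "density estimate": a mixture of independent-Bernoulli components,
# given as (weight, {item: probability the item is present}).
# ASSUMPTION: this model class and its numbers are illustrative only.
MODEL = [
    (0.6, {"a": 0.9, "b": 0.8, "c": 0.05}),
    (0.4, {"a": 0.1, "b": 0.2, "c": 0.30}),
]


def exact_support(model, itemset):
    """Probability that all items of `itemset` occur, read off the model."""
    return sum(w * prod(p[i] for i in itemset) for w, p in model)


def sample_transaction(model, rng):
    """Draw one synthetic transaction (a set of items) from the model."""
    weights = [w for w, _ in model]
    _, probs = rng.choices(model, weights=weights, k=1)[0]
    return {item for item, p in probs.items() if rng.random() < p}


def sampled_support(model, itemset, n, rng):
    """Estimate the support of `itemset` from n sampled transactions."""
    items = set(itemset)
    hits = sum(items <= sample_transaction(model, rng) for _ in range(n))
    return hits / n


def frequent_itemsets(model, min_support):
    """Level-wise (Apriori-style) search using exact model supports."""
    items = sorted(model[0][1])
    result = {}
    level = [(i,) for i in items]
    while level:
        kept = [c for c in level if exact_support(model, c) >= min_support]
        for c in kept:
            result[c] = exact_support(model, c)
        kept_set = set(kept)
        next_level = set()
        for c in kept:
            for i in items:
                if i > c[-1]:
                    new = c + (i,)
                    # Keep a candidate only if all its subsets were frequent.
                    if all(s in kept_set for s in combinations(new, len(c))):
                        next_level.add(new)
        level = sorted(next_level)
    return result


if __name__ == "__main__":
    rng = random.Random(0)
    for itemset in [("a", "b"), ("a", "b", "c")]:
        print(itemset,
              "exact:", round(exact_support(MODEL, itemset), 4),
              "sampled (n=200):", sampled_support(MODEL, itemset, 200, rng))
    print(sorted(frequent_itemsets(MODEL, 0.1)))
```

In this toy model the itemset `("a", "b", "c")` has an exact support of about 0.024, so 200 samples are expected to contain it only about five times and the sampled estimate is correspondingly noisy, while the closed-form value costs nothing extra. The level-wise search relies on the fact that, in this model class, adding an item can never increase the support.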
