Efficient Algorithms for Masking and Finding Quasi-Identifiers

Abstract

A quasi-identifier refers to a subset of attributes that can uniquely identify most tuples in a table. Incautious publication of quasi-identifiers will lead to privacy leakage. In this paper we consider the problems of finding and masking quasi-identifiers. Both problems are provably hard with severe time and space requirements. We focus on designing efficient approximation algorithms for large data sets. We first propose two natural measures for quantifying quasi-identifiers: distinct ratio and separation ratio. We develop efficient algorithms that find small quasi-identifiers with provable size and separation/distinct ratio guarantees, with space and time requirements sublinear in the number of tuples. We also propose efficient algorithms for masking quasi-identifiers, where we use a random sampling technique to greatly reduce the space and time requirements, without much sacrifice in the quality of the results. Our algorithms for masking and finding quasi-identifiers naturally apply to stream databases. Extensive experimental results on real world data sets confirm efficiency and accuracy of our algorithms.
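
The abstract names the two measures without defining them. The following is a minimal sketch under assumed definitions: the distinct ratio is taken to be the number of distinct value combinations on the chosen attributes divided by the number of tuples, and the separation ratio is taken to be the fraction of tuple pairs that differ on at least one chosen attribute. The function names, the toy table, and the pair-sampling estimator are illustrative assumptions and are not the authors' algorithms; the estimator only shows how random sampling can avoid a full pass over all tuple pairs.

```python
from collections import Counter
import random


def project(row, attrs):
    """Restrict a row (a dict) to the chosen attributes, as a hashable tuple."""
    return tuple(row[a] for a in attrs)


def distinct_ratio(rows, attrs):
    """Assumed definition: distinct projected values divided by tuple count."""
    return len({project(r, attrs) for r in rows}) / len(rows)


def separation_ratio(rows, attrs):
    """Assumed definition: fraction of tuple pairs that differ on some attribute."""
    n = len(rows)
    total_pairs = n * (n - 1) // 2
    counts = Counter(project(r, attrs) for r in rows)
    # Two tuples are NOT separated exactly when they share a projected value,
    # so unseparated pairs can be counted group by group.
    unseparated = sum(c * (c - 1) // 2 for c in counts.values())
    return (total_pairs - unseparated) / total_pairs


def sampled_separation_ratio(rows, attrs, num_pairs, seed=0):
    """Monte Carlo estimate of the separation ratio from random tuple pairs."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(num_pairs):
        a, b = rng.sample(rows, 2)  # two different row positions
        hits += project(a, attrs) != project(b, attrs)
    return hits / num_pairs


# Hypothetical toy table, used only for illustration.
table = [
    {"zip": "94305", "age": 34, "sex": "F"},
    {"zip": "94305", "age": 34, "sex": "M"},
    {"zip": "94301", "age": 51, "sex": "F"},
    {"zip": "94301", "age": 29, "sex": "M"},
]

if __name__ == "__main__":
    for attrs in [("zip",), ("zip", "age"), ("zip", "age", "sex")]:
        print(attrs, distinct_ratio(table, attrs), separation_ratio(table, attrs))
```

On this toy table, the attribute set ("zip", "age") yields three distinct combinations among four tuples, so its distinct ratio is 3/4, and it leaves one of the six tuple pairs unseparated, so its separation ratio is 5/6. Adding "sex" makes both ratios equal to 1, which is the sense in which that attribute set behaves as a quasi-identifier here.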

Bibliographic details

Source: SIAM International Conference on Data Mining | 2008 | 869 p. | 10 pages
Authors: Rajeev Motwani; Ying Xu
Original format: PDF
Chinese Library Classification: TP274.2-53

Similar literature

1. Finding Quasi-identifiers for K-Anonymity Model by the Set of Cut-vertex [J]. Yan Yan, Wanjun Wang, Xiaohong Hao. Engineering Letters. 2018, No. 1
2. On efficient algorithms for finding efficient salvo policies [J]. van Ee, Martijn. Naval Research Logistics. 2020, No. 2
3. Efficient Algorithms for Masking and Finding Quasi-Identifiers [C]. Rajeev Motwani, Ying Xu. SIAM International Conference on Data Mining. 2008
4. Efficient Algorithms for Frequent Path Finding and Similarity Join in Big Multidimensional Data [D]. Luo, Wuman. 2012
5. Efficient sequential and parallel algorithms for finding edit distance based motifs [O]. Soumitra Pal, Peng Xiao, Sanguthevar Rajasekaran. 2016
6. Variable selection procedures and efficient suboptimal mask search algorithms in fuzzy inductive reasoning [O]. Mirats-Tur, Josep M., Cellier, François E., Huber Garrido, Rafael. 2002
