Information preserving regression-based tools for statistical disclosure control

Langsrud Oyvind


Abstract

This paper presents a unified framework for regression-based statistical disclosure control for microdata. A basic method, known as information preserving statistical obfuscation (IPSO), produces synthetic data that preserve variances, covariances and fitted values. The data are then generated conditionally according to the multivariate normal distribution. Generalizations of the IPSO method are described in the literature, and these methods aim to generate data more similar to the original data. This paper describes these methods in a concise and interpretable way that is close to an efficient implementation. Decomposing the residual data into orthogonal scores and corresponding loadings is an essential part of the framework. Both QR decomposition (Gram-Schmidt orthogonalization) and singular value decomposition (principal components) may be used. Within this framework, new and generalized methods are presented. In particular, a method is described by which the correlations to the original principal component scores can be controlled exactly. It is shown that a suggested method of random orthogonal matrix masking can be implemented without generating an orthogonal matrix. Generalized methodology for hierarchical categories is presented within the context of microaggregation. Some information can then be preserved at the lowest level and more information at higher levels. The presented methodology is also applicable to tabular data. One possibility is to replace the content of primary and secondary suppressed cells with generated values. The paper proposes replacing suppressed cell frequencies with decimal numbers and argues that this can be a useful method.
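The basic IPSO idea summarized in the abstract can be illustrated with a minimal sketch. This is not the paper's code: the function name `ipso_sketch`, the added intercept column, and the use of NumPy's QR routine are assumptions made for illustration, and the sketch assumes more rows than regressors plus response columns. It keeps the fitted values from a regression of the confidential variables `y` on the non-confidential variables `x`, and replaces the residuals with simulated normal residuals that are orthogonal to `x` and carry the original residual cross-product matrix.

```python
import numpy as np


def ipso_sketch(y, x, rng):
    """Hypothetical IPSO-style generator (illustrative, not the paper's code).

    y   : (n, k) confidential variables
    x   : (n, p) non-confidential regressors (an intercept is added here)
    rng : numpy.random.Generator
    """
    n = y.shape[0]
    x1 = np.column_stack([np.ones(n), x])

    # Orthonormal basis for the column space of the regressors.
    qx, _ = np.linalg.qr(x1)

    # Fitted values and residuals of the original regression.
    fitted = qx @ (qx.T @ y)
    resid = y - fitted

    # Loadings of the original residuals: resid = Q_e @ r_e.
    _, r_e = np.linalg.qr(resid)

    # Simulated normal data, made orthogonal to the regressors.
    z = rng.standard_normal(y.shape)
    z -= qx @ (qx.T @ z)

    # Orthonormal simulated scores, still orthogonal to the regressors.
    q_z, _ = np.linalg.qr(z)

    # New residuals have the same cross-product matrix as the originals.
    return fitted + q_z @ r_e
```

Under these assumptions, regressing the synthetic data on `x` returns the original coefficients and fitted values, and the sample covariance matrix of the synthetic data equals that of `y` up to floating-point error, while the individual records differ.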


Bibliographic details

Source: Statistics and Computing, 2019, Issue 5, pp. 965-976 (12 pages)
Author: Langsrud Oyvind
Author affiliation: Stat Norway, POB 2633, N-0131 Oslo, Norway
Original format: PDF
Language: English
Keywords: Microdata anonymization; Synthetic data; Microaggregation; Hybrid microdata; Cell suppression; Official statistics

Similar literature

1. Information preserving regression-based tools for statistical disclosure control [J]. Langsrud Oyvind. Statistics and Computing. 2019, Issue 5
2. On the connections between statistical disclosure control for microdata and some artificial intelligence tools [J]. Domingo-Ferrer J., Torra V. Information Sciences: An International Journal. 2003
3. Automated regression-based statistical downscaling tool [J]. Masoud Hessami, Philippe Gachon, Taha B.M.J. Ouarda. Environmental Modelling & Software. 2008, Issue 6
4. Understanding Microaggregation - A technique of Statistical Disclosure Control for Privacy Preserving and Data Publishing in Inter-Cloud [C]. Veena Gadad, Sowmyarani C N. International Conference on Advances in Electronics, Computers and Communications. 2018
5. Statistical tools for disclosure limitation in multi-way contingency tables [D]. Dobra, Adrian. 2002
6. IPUMS-International Statistical Disclosure Controls: 159 Census Microdata Samples in Dissemination 100+ in Preparation [O]. Robert McCaa, Steven Ruggles, Matt Sobek
7. Preserving edits when perturbing microdata for statistical disclosure control [O]. Shlomo Natalie, De Waal Ton. 2005
8. Distribution-Preserving Statistical Disclosure Limitation: Technical paper [R]. Woodcock, S. D. 2006
