Published in: AMIA Annual Symposium Proceedings

Effects of data anonymization by cell suppression on descriptive statistics and predictive modeling performance.



Abstract

Protecting individual data in disclosed databases is essential. Data anonymization strategies can produce table ambiguation by suppressing selected cells. Using table ambiguation, different degrees of anonymization can be achieved, depending on the number of individuals from whom a particular case must become indistinguishable. This number defines the level of anonymization. Anonymization by cell suppression does not necessarily prevent inferences from being made from the disclosed data, and preventing inferences may be important to preserve confidentiality. We show that anonymized data sets can preserve descriptive characteristics of the data, but might also be used for making inferences about particular individuals, which may not be desirable. The degradation of predictive performance is directly proportional to the degree of anonymity. As an example, we report the effect of anonymization on the predictive performance of a model constructed to estimate the probability of disease given clinical findings.
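To make the notion of an anonymization level concrete, the following is a minimal Python sketch, not the authors' algorithm. It assumes that a suppressed cell is stored as `None` and acts as a wildcard matching any value, and that the level of a table is the smallest number of other rows that any single row cannot be told apart from. The names `SUPPRESSED`, `compatible`, and `anonymity_level`, and the toy table, are hypothetical and introduced only for illustration.

```python
from typing import Optional, Sequence

# Assumption of this sketch: a suppressed cell is represented by None.
SUPPRESSED = None

Row = Sequence[Optional[str]]


def compatible(a: Row, b: Row) -> bool:
    """True if two rows cannot be told apart: every column either
    matches or is suppressed in at least one of the two rows."""
    return all(
        x is SUPPRESSED or y is SUPPRESSED or x == y
        for x, y in zip(a, b)
    )


def anonymity_level(table: Sequence[Row]) -> int:
    """Smallest number of *other* rows that any row is indistinguishable
    from. Returns 0 for an empty table."""
    return min(
        (
            sum(compatible(row, other)
                for j, other in enumerate(table) if j != i)
            for i, row in enumerate(table)
        ),
        default=0,
    )


# Toy table (illustrative values only): sex, age band, clinical finding.
table = [
    ("F", "60-69", "chest pain"),
    ("F", "60-69", None),          # finding suppressed
    ("M", None, "chest pain"),     # age band suppressed
    ("M", "50-59", "chest pain"),
]

level = anonymity_level(table)
print(level)
```

Under these assumptions, suppressing more cells can only keep or raise the level, since each suppressed cell matches more rows, and it removes information that a predictive model could otherwise have used.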
