Apparatus and method for probabilistic population size and overlap determination, remote processing of private data and probabilistic population size and overlap determination for three or more data sets

Abstract

The invention determines population size and population overlap in data sets whose records describe unique entities but carry no unique identifiers for those entities. The data sets must share at least one common type of information with a known distribution of finite expectation, and the determination is made by decomposing probabilistic calculations. The computer determines the overlap of unique entities between the data sets by subtracting the probabilistic incremental number of unique entities needed to account for a larger total number of values of the information with the known distribution in the data sets. The invention can also maintain the security of private data: a remote computer where the original data is stored downloads diagnostic and aggregation procedures from another computer over a network, runs those procedures on the data, and forwards the results over the network to an estimate processor computer. The estimate processor determines population size and overlap from the aggregate results and sends this information back to the remote computer over the network. The invention also determines the overlap of three or more data sets by concatenating all combinations of the data sets and determining estimates for all subsets of those combinations. These operations involve the cancellation of equivalent terms that have opposite signs.
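
The abstract does not give formulas, so the following is only a minimal sketch of how such an estimate could be computed, not the patented method itself. It assumes (this is not stated in the source) that every entity contributes one value drawn uniformly from a domain of known size, such as a day of the year, so that the expected number of distinct values after n entities is D(1 - (1 - 1/D)^n). The names `estimate_population`, `estimate_overlap` and `domain_size` are hypothetical. The overlap of two or more data sets is then obtained by estimating the population of every concatenated combination and combining the estimates with alternating signs.

```python
import math
from itertools import combinations


def estimate_population(distinct_values: int, domain_size: int) -> float:
    """Estimate how many unique entities produced `distinct_values` distinct values.

    Assumption (not from the source): each entity holds one value drawn
    uniformly from `domain_size` possibilities.
    """
    if not 0 <= distinct_values < domain_size:
        # A saturated domain (every value seen) gives no finite estimate.
        raise ValueError("distinct_values must be in [0, domain_size)")
    # E[distinct] = D * (1 - (1 - 1/D)**n); solve this for n.
    return math.log(1 - distinct_values / domain_size) / math.log(1 - 1 / domain_size)


def estimate_overlap(datasets, domain_size: int) -> float:
    """Estimate the number of entities common to all data sets.

    Every combination of data sets is concatenated, its population is
    estimated, and the estimates are added with alternating signs
    (inclusion-exclusion expressed over unions).
    """
    total = 0.0
    for r in range(1, len(datasets) + 1):
        sign = 1 if r % 2 == 1 else -1
        for combo in combinations(datasets, r):
            merged = set().union(*combo)  # distinct values in the concatenation
            total += sign * estimate_population(len(merged), domain_size)
    return total
```

For two data sets this reduces to estimate(A) + estimate(B) - estimate(A and B concatenated). Only the distinct-value counts of each combination enter the calculation, so under these assumptions a remote site would need to send aggregate counts rather than raw records. The estimate becomes unstable as the number of distinct values approaches the domain size.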
