Journal: BMC Medical Genomics

Logistic regression over encrypted data from fully homomorphic encryption

Abstract

One of the tasks in the 2017 iDASH secure genome analysis competition was to enable training of logistic regression models over encrypted genomic data. More precisely, given a list of approximately 1500 patient records, each with 18 binary features containing information on specific mutations, the idea was for the data holder to encrypt the records using homomorphic encryption, and send them to an untrusted cloud for storage. The cloud could then homomorphically apply a training algorithm on the encrypted data to obtain an encrypted logistic regression model, which can be sent to the data holder for decryption. In this way, the data holder could successfully outsource the training process without revealing either her sensitive data, or the trained model, to the cloud. Our solution to this problem has several novelties: we use a multi-bit plaintext space in fully homomorphic encryption together with fixed point number encoding; we combine bootstrapping in fully homomorphic encryption with a scaling operation in fixed point arithmetic; we use a minimax polynomial approximation to the sigmoid function and the 1-bit gradient descent method to reduce the plaintext growth in the training process. Our algorithm for training over encrypted data takes 0.4–3.2 hours per iteration of gradient descent. We demonstrate the feasibility but high computational cost of training over encrypted data. On the other hand, our method can guarantee the highest level of data privacy in critical applications.
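
To make the ingredients named in the abstract concrete, here is a minimal plaintext sketch of fixed-point encoding with a rescaling step after each multiplication, a low-degree polynomial in place of the sigmoid, and a 1-bit (sign-only) gradient update. It runs on ordinary integers, with no encryption and no bootstrapping. Several things here are assumptions for illustration and are not taken from the paper: the scale `SCALE`, the learning rate, the function names, and in particular the polynomial, which is the degree-3 Taylor expansion 1/2 + z/4 - z^3/48 standing in for the minimax polynomial the authors use (their coefficients are not given in the abstract).

```python
SCALE = 1 << 10  # assumed fixed-point scale: 10 fractional bits


def encode(x):
    """Map a real number to a scaled integer (fixed-point encoding)."""
    return round(x * SCALE)


def decode(n):
    """Map a scaled integer back to a real number."""
    return n / SCALE


def fx_mul(a, b):
    """Multiply two fixed-point values, then rescale by SCALE with rounding."""
    return (a * b + SCALE // 2) // SCALE


def sigmoid_poly(z):
    """Taylor stand-in for the sigmoid: 1/2 + z/4 - z^3/48 (not minimax)."""
    z3 = fx_mul(fx_mul(z, z), z)
    return encode(0.5) + fx_mul(encode(0.25), z) - fx_mul(encode(1 / 48), z3)


def sign(n):
    """Return -1, 0 or 1: the only gradient information the update uses."""
    return (n > 0) - (n < 0)


def train(X, y, iters=20, lr=0.05):
    """Sign-only gradient descent on binary features, in fixed point."""
    d = len(X[0])
    w = [0] * d          # fixed-point weights
    step = encode(lr)    # fixed-point step size
    for _ in range(iters):
        grad = [0] * d
        for row, label in zip(X, y):
            # Features are 0/1, so the inner product is a sum of weights.
            z = sum(w[j] for j in range(d) if row[j])
            err = sigmoid_poly(z) - encode(label)
            for j in range(d):
                grad[j] += err * row[j]
        # Each weight moves by a fixed step in the direction of -sign(grad).
        w = [w[j] - step * sign(grad[j]) for j in range(d)]
    return [decode(v) for v in w]


if __name__ == "__main__":
    # Toy data: column 0 is a constant bias feature, column 1 equals the label.
    X = [[1, 0], [1, 1], [1, 0], [1, 1]]
    y = [0, 1, 0, 1]
    print(train(X, y))
```

Because the update uses only the sign of each gradient coordinate, every weight changes by exactly one fixed step per iteration, which keeps the magnitude of the numbers bounded and predictable; this is the kind of plaintext-growth control the abstract refers to. The Taylor polynomial is only accurate near zero, so this sketch should not be run for many iterations or on inputs where the inner product grows large.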
