A privacy-preserving distributed filtering framework for NLP artifacts

Nazmus Sadat; Momin Al Aziz; Noman Mohammed; Serguei Pakhomov; Hongfang Liu; Xiaoqian Jiang

Abstract

Sharing medical data is a major challenge in biomedicine and often hinders collaborative research. Because of privacy concerns, clinical notes cannot be shared directly. Considerable effort has been dedicated to de-identifying clinical notes, but it remains very challenging to locate and scrub all sensitive elements from notes accurately and automatically. An alternative approach is to remove sentences that might contain sensitive terms related to personal information. A previous study introduced a frequency-based filtering approach that removes sentences containing low-frequency bigrams to improve privacy protection without significantly decreasing utility. Our work extends this method to clinical notes from distributed sources, taking security and privacy into account. We developed a novel secure protocol based on private set intersection and secure thresholding to identify uncommon and low-frequency terms, which can be used to guide sentence filtering. Because the computational cost of the proposed framework depends mostly on the cardinality of the intersection of the sets and on the number of data owners, we evaluated the framework with respect to these two factors. Experimental results demonstrate that the proposed method is scalable in various experimental settings. We also evaluated the framework in terms of data utility; this evaluation shows that the proposed method retains enough information for data analysis. This work demonstrates the feasibility of using homomorphic encryption to develop a secure and efficient multi-party protocol.
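To make the filtering idea concrete, here is a minimal plaintext sketch of frequency-based sentence filtering with bigram counts pooled across several data owners. It is an illustration only, not the paper's protocol: the function names, the naive sentence splitter, the tokenizer, and the threshold value are all assumptions. The secure parts described in the abstract (private set intersection, secure thresholding, homomorphic encryption) are deliberately left out, so the counts here are pooled in the clear.

```python
import re
from collections import Counter


def split_sentences(note):
    # Naive splitter (assumption): break after ., ! or ? followed by whitespace.
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", note) if s.strip()]


def bigrams(sentence):
    # Lowercase alphanumeric tokens, then adjacent token pairs.
    tokens = re.findall(r"[a-z0-9]+", sentence.lower())
    return list(zip(tokens, tokens[1:]))


def pooled_bigram_counts(sites):
    # sites: one list of notes per data owner.
    # In the paper's setting this aggregation would happen under a secure
    # protocol; here it is done in plaintext purely for illustration.
    counts = Counter()
    for notes in sites:
        for note in notes:
            for sentence in split_sentences(note):
                counts.update(bigrams(sentence))
    return counts


def filter_note(note, counts, threshold):
    # Keep a sentence only if every bigram in it occurs at least `threshold`
    # times in the pooled counts. Sentences with no bigrams (a single token)
    # are kept by this rule.
    kept = [
        s for s in split_sentences(note)
        if all(counts[b] >= threshold for b in bigrams(s))
    ]
    return " ".join(kept)


# Hypothetical toy data for two data owners.
site_a = ["Patient denies chest pain. John Smith lives nearby."]
site_b = ["Patient denies chest pain. Follow up in two weeks."]
counts = pooled_bigram_counts([site_a, site_b])
filtered = filter_note(site_a[0], counts, threshold=2)
```

With this toy data, the sentence containing the name is dropped because its bigrams occur only once across both sites, while the sentence whose bigrams occur at both sites is kept. A real deployment would need a proper clinical tokenizer and a threshold tuned to the corpus.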

Bibliographic details

Source: BMC Medical Informatics and Decision Making, 2019, Issue 1, 10 pages
Authors: Nazmus Sadat; Momin Al Aziz; Noman Mohammed; Serguei Pakhomov; Hongfang Liu; Xiaoqian Jiang
Original format: PDF
Classification: Medicine and health
Keywords: Biomedical data security and privacy; Clinical notes de-identification; Homomorphic encryption
