An efficient record linkage scheme using graphical analysis for identifier error detection

John M Finney; A S Walker; Tim EA Peto; David H Wyllie

Abstract

Background: Integration of information on individuals (record linkage) is a key problem in healthcare delivery, epidemiology, and "business intelligence" applications. It is now common to be required to link very large numbers of records, often containing various combinations of theoretically unique identifiers, such as NHS numbers, which are both incomplete and error-prone.

Methods: We describe a two-step record linkage algorithm in which identifiers with high cardinality are identified or generated, and used to perform an initial linkage based on exact matching. Subsequently, the resulting clusters are studied and, if appropriate, partitioned using a graph-based algorithm that detects erroneous identifiers.

Results: The system was used to cluster over 250 million health records from five data sources within a large UK hospital group. Linkage, which was completed in about 30 minutes, yielded 3.6 million clusters, of which about 99.8% contain, with high likelihood, records from one patient. Although the algorithm is computationally efficient, its requirement that at least one identifier of each record exactly match that of another record for cluster formation may be a limitation in some databases containing records of low identifier quality.

Conclusions: The technique described offers a simple, fast and highly efficient two-step method for large-scale initial linkage of records commonly found in the UK's National Health Service.
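The two-step idea in the Methods can be illustrated with a small Python sketch. It is not the authors' implementation: the record layout, the field names (`nhs`, `mrn`, and a generated `name_dob` key), the function names, and the brute-force test for "bridging" identifiers are all assumptions made for illustration. Step 1 groups records that share any exact identifier value using union-find. Step 2 flags identifier values whose removal would split a cluster, as candidates for review rather than proven errors.

```python
from collections import defaultdict


def link_exact(records):
    """Step 1: cluster records that share any exact (id_type, value) pair."""
    parent = list(range(len(records)))

    def find(i):
        # Union-find lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    first_seen = {}  # (id_type, value) -> index of first record carrying it
    for idx, rec in enumerate(records):
        for id_type, value in rec.items():
            if value is None:  # missing identifiers never link records
                continue
            key = (id_type, value)
            if key in first_seen:
                a, b = find(idx), find(first_seen[key])
                if a != b:
                    parent[a] = b
            else:
                first_seen[key] = idx

    clusters = defaultdict(list)
    for idx in range(len(records)):
        clusters[find(idx)].append(idx)
    return sorted(clusters.values())


def suspect_identifiers(records, cluster):
    """Step 2 (illustrative): identifier values that alone hold a cluster together.

    Brute force: drop each identifier value in turn and re-link the cluster.
    If the cluster falls apart, that value bridges otherwise separate groups.
    """
    keys = {kv for i in cluster for kv in records[i].items() if kv[1] is not None}
    suspects = []
    for removed in keys:
        pruned = [
            {k: v for k, v in records[i].items() if (k, v) != removed}
            for i in cluster
        ]
        if len(link_exact(pruned)) > 1:
            suspects.append(removed)
    return sorted(suspects)


# Hypothetical toy data: record 3 carries a mistyped hospital number ("A1").
records = [
    {"nhs": "111", "mrn": "A1", "name_dob": "X"},
    {"nhs": "111", "mrn": "A1", "name_dob": "X"},
    {"nhs": "222", "mrn": "B7", "name_dob": "Y"},
    {"nhs": "222", "mrn": "A1", "name_dob": "Y"},
    {"nhs": "333", "mrn": None, "name_dob": "Z"},
]

clusters = link_exact(records)
suspects = suspect_identifiers(records, clusters[0])
```

In this toy data, exact matching merges records 0 to 3 into one cluster because record 3 shares `mrn` "A1" with the first patient, and `("mrn", "A1")` is the only value whose removal separates the two patients. The brute-force re-linking is chosen for readability; at the scale reported in the Results, a linear-time cut-vertex search on the record-identifier graph would be a more plausible choice.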


Bibliographic details

Source: BMC Medical Informatics and Decision Making, 2011, Issue 1
Authors: John M Finney; A S Walker; Tim EA Peto; David H Wyllie
Original format: PDF
Chinese Library Classification: Medicine and health
