Journal: JMIR Medical Informatics

Temporal Expression Classification and Normalization From Chinese Narrative Clinical Texts: Pattern Learning Approach


Abstract

Background: Temporal information appears frequently in narrative clinical text, in descriptions of disease progress, prescriptions, medication, surgery progress, and discharge summaries. Accurate extraction and normalization of temporal expressions can improve the analysis and understanding of narrative clinical texts and thereby support clinical research and practice.

Objective: The goal of the study was to propose a novel approach for extracting and normalizing temporal expressions from Chinese narrative clinical text.

Methods: TNorm, a rule-based and pattern learning-based approach, was developed for automatic temporal expression extraction and normalization from unstructured Chinese clinical text data. TNorm consists of three stages: extraction, classification, and normalization. First, it applies a set of heuristic rules and automatically generated patterns to identify and extract temporal expressions from clinical texts. It then collects features of the extracted expressions and uses machine learning algorithms to predict and classify their temporal types. Finally, it combines these features with the rule-based and pattern learning-based approaches to normalize the extracted temporal expressions.

Results: The evaluation dataset consists of Chinese narrative clinical texts: 1459 discharge summaries from a domestic Grade A Class 3 hospital. TNorm, combined with temporal expression extraction and temporal type prediction, achieves a precision of 0.8491, a recall of 0.8328, and an F1 score of 0.8409 in temporal expression normalization.

Conclusions: This study presents an automatic approach, TNorm, that extracts and normalizes temporal expressions from Chinese narrative clinical texts. TNorm was evaluated on discharge summary data, and the results demonstrate its effectiveness for temporal expression normalization.
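The three stages named in the abstract (extraction, classification, normalization) can be illustrated with a minimal Python sketch. It is not TNorm's implementation: the two regular expressions, the `UNIT_DAYS` table, the type labels `DATE` and `RELATIVE`, and all function names are assumptions made for illustration, and a simple pattern lookup stands in for the machine learning classifier. Relative expressions are resolved against an anchor date (for example, an admission date) supplied by the caller.

```python
import re
from datetime import date, timedelta

# Hypothetical hand-written patterns; TNorm's heuristic rules and learned
# patterns are not given in the abstract.
ABSOLUTE = re.compile(r"(\d{4})年(\d{1,2})月(\d{1,2})日")  # e.g. 2019年3月5日
RELATIVE = re.compile(r"(\d+)(天|周|月)前")                 # e.g. 3天前 (3 days ago)
UNIT_DAYS = {"天": 1, "周": 7, "月": 30}  # assumption: a month is treated as 30 days


def extract(text):
    """Stage 1: find candidate temporal expressions, in text order."""
    matches = []
    for pattern in (ABSOLUTE, RELATIVE):
        matches.extend(pattern.finditer(text))
    return sorted(matches, key=lambda m: m.start())


def classify(match):
    """Stage 2: assign a temporal type (placeholder for a trained classifier)."""
    return "DATE" if match.re is ABSOLUTE else "RELATIVE"


def normalize(match, kind, anchor):
    """Stage 3: map the expression to an ISO 8601 date string."""
    if kind == "DATE":
        year, month, day = (int(g) for g in match.groups())
        return date(year, month, day).isoformat()
    amount, unit = int(match.group(1)), match.group(2)
    return (anchor - timedelta(days=amount * UNIT_DAYS[unit])).isoformat()


def tnorm_sketch(text, anchor):
    """Run the three stages and return (expression, type, normalized) tuples."""
    results = []
    for match in extract(text):
        kind = classify(match)
        results.append((match.group(0), kind, normalize(match, kind, anchor)))
    return results


if __name__ == "__main__":
    # Invented example sentence: "Admitted on 5 March 2019; fever began 3 days ago."
    note = "患者于2019年3月5日入院,3天前出现发热。"
    for row in tnorm_sketch(note, anchor=date(2019, 3, 5)):
        print(row)
```

Keeping the stages as separate functions mirrors the pipeline structure in the abstract, so the placeholder classifier could be swapped for a trained model without touching extraction or normalization.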
