JMIR Medical Informatics

Predicting Semantic Similarity Between Clinical Sentence Pairs Using Transformer Models: Evaluation and Representational Analysis



Abstract

Background: Semantic textual similarity (STS) is a natural language processing (NLP) task that involves assigning a similarity score to 2 snippets of text based on their meaning. This task is particularly difficult in the domain of clinical text, which often features specialized language and the frequent use of abbreviations.

Objective: We created an NLP system to predict similarity scores for sentence pairs as part of the Clinical Semantic Textual Similarity track in the 2019 n2c2/OHNLP Shared Task on Challenges in Natural Language Processing for Clinical Data. We subsequently sought to analyze the intermediary token vectors extracted from our models while processing a pair of clinical sentences, to identify where and how representations of semantic similarity are built in transformer models.

Methods: Given a clinical sentence pair, we take the average predicted similarity score across several independently fine-tuned transformers. In our model analysis, we investigated the relationship between the final model's loss and surface features of the sentence pairs, and assessed the decodability and representational similarity of the token vectors generated by each model.

Results: Our model achieved a correlation of 0.87 with the ground-truth similarity score, reaching 6th place out of 33 teams (the first-place score was 0.90). In detailed qualitative and quantitative analyses of the model's loss, we identified the system's failure to correctly model semantic similarity when both sentences in a pair contain details of medical prescriptions, as well as its general tendency to overpredict semantic similarity given significant token overlap. The token vector analysis revealed divergent representational strategies for predicting textual similarity between bidirectional encoder representations from transformers (BERT)-style models and XLNet.
We also found that a large amount of information relevant to predicting STS can be captured using a combination of a classification token and the cosine distance between sentence-pair representations in the first layer of a transformer model that did not produce the best predictions on the test set.

Conclusions: We designed and trained a system that uses state-of-the-art NLP models to achieve very competitive results on a new clinical STS data set. As our approach uses no hand-crafted rules, it serves as a strong deep learning baseline for this task. Our key contribution is a detailed analysis of the model's outputs and an investigation of the heuristic biases learned by transformer models; we suggest future improvements based on these findings. In our representational analysis, we explore how different transformer models converge or diverge in their representation of semantic signals as the tokens of the sentences are augmented by successive layers. This analysis sheds light on how these "black box" models integrate semantic similarity information in intermediate layers, and points to new research directions in model distillation and sentence embedding extraction for applications in clinical NLP.
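The ensembling step described in Methods, averaging the scores of several independently fine-tuned transformers, can be sketched as follows. This is a minimal illustration only: the lambda "models" below are hypothetical stand-ins for the paper's fine-tuned transformers, and the function name is invented for this sketch.

```python
def ensemble_similarity(sentence_pair, models):
    """Average the similarity scores predicted by several
    independently fine-tuned models for one sentence pair."""
    scores = [model(sentence_pair) for model in models]
    return sum(scores) / len(scores)

# Hypothetical stand-ins for fine-tuned transformer models,
# each mapping a sentence pair to a similarity score.
models = [lambda pair: 3.2, lambda pair: 3.6, lambda pair: 3.4]

pair = ("Take 1 tablet daily.", "Take one tablet every day.")
print(round(ensemble_similarity(pair, models), 2))  # -> 3.4
```

In practice each callable would wrap a fine-tuned transformer's regression head; averaging independent predictions is a standard way to reduce variance across fine-tuning runs.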
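The first-layer feature combination reported in Results (a classification token plus the cosine distance between sentence representations) might be assembled roughly as below. Mean pooling over token vectors and the toy low-dimensional vectors are assumptions for illustration; the abstract does not specify the paper's exact pooling or probe setup.

```python
import math

def cosine_similarity(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def first_layer_features(cls_vec, sent1_vecs, sent2_vecs):
    """Combine the classification-token vector with the cosine
    distance between mean-pooled first-layer token vectors of
    each sentence (mean pooling is an assumption here)."""
    mean1 = [sum(col) / len(sent1_vecs) for col in zip(*sent1_vecs)]
    mean2 = [sum(col) / len(sent2_vecs) for col in zip(*sent2_vecs)]
    cos_dist = 1.0 - cosine_similarity(mean1, mean2)
    return cls_vec + [cos_dist]

# Toy example: identical sentences yield a cosine distance near 0.
feats = first_layer_features([0.1, 0.2],
                             [[1.0, 2.0], [3.0, 4.0]],
                             [[1.0, 2.0], [3.0, 4.0]])
print(feats)
```

A feature vector like this could then be fed to a simple regressor to probe how much STS-relevant signal the first transformer layer already carries.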
