MMBERT: Multimodal BERT Pretraining for Improved Medical VQA

Abstract

Images in the medical domain are fundamentally different from general-domain images. Consequently, it is infeasible to directly employ general-domain Visual Question Answering (VQA) models in the medical domain. Additionally, medical image annotation is a costly and time-consuming process. To overcome these limitations, we propose a solution inspired by self-supervised pretraining of Transformer-style architectures for NLP, Vision, and Language tasks. Our method learns richer semantic representations of medical images and text, using Masked Vision-Language Modeling as the pretext task on a large dataset of medical images with captions. The proposed solution achieves new state-of-the-art performance on two VQA datasets for radiology images (VQA-Med 2019 and VQA-RAD), outperforming even the ensemble models of the previous best solutions. Moreover, our solution provides attention maps that aid model interpretability.
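
The abstract names Masked Vision-Language Modeling as the pretext task but does not describe the masking procedure. The following is a minimal sketch of the general idea only, under stated assumptions: caption tokens are masked at random with a BERT-style `[MASK]` placeholder while image features are left untouched. The function names, the 15% default rate, the toy caption, and the toy feature vector are all hypothetical and are not taken from the paper.

```python
import random

MASK = "[MASK]"  # placeholder token; name follows BERT convention (assumption)


def mask_caption(tokens, mask_prob=0.15, seed=None):
    """Return (masked_tokens, labels) for one caption.

    labels[i] holds the original token where a mask was applied and
    None elsewhere, so a loss would be computed on masked positions only.
    """
    rng = random.Random(seed)
    masked, labels = [], []
    for tok in tokens:
        if rng.random() < mask_prob:
            masked.append(MASK)
            labels.append(tok)
        else:
            masked.append(tok)
            labels.append(None)
    return masked, labels


def build_example(image_features, tokens, mask_prob=0.15, seed=None):
    """Pair untouched image features with a masked caption (hypothetical layout)."""
    masked, labels = mask_caption(tokens, mask_prob, seed)
    return {"image": image_features, "text": masked, "labels": labels}


# Made-up caption and feature vector, for illustration only.
caption = "chest x-ray showing right lower lobe pneumonia".split()
example = build_example([0.1, 0.7, 0.3], caption, mask_prob=0.3, seed=0)
print(example["text"])
```

A model trained on such examples would have to predict each masked word from the remaining text together with the image, which is what pushes it toward joint image and text representations.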

Bibliographic details

Source
IEEE International Symposium on Biomedical Imaging | 2021 | pp. 1033-1036 | 4 pages
Authors
Yash Khare; Viraj Bagal; Minesh Mathew; Adithi Devi; U Deva Priyakumar; CV Jawahar
Original format: PDF
Keywords
Visualization; Biological system modeling; Bit error rate; Semantics; Biological systems; Predictive models; Radiology;

Similar literature

1. HUNER: improving biomedical NER with pretraining [J]. Weber Leon, Munchmeyer Jannes, Rocktaschel Tim. Bioinformatics, 2020, Issue 1
2. Deep ensemble multitask classification of emergency medical call incidents combining multimodal data improves emergency medical dispatch [J]. Ferri Pablo, Saez Carlos, Felix-De Castro Antonio. Artificial intelligence in medicine, 2021, Issue Jula
3. Improving clinical outcome predictions using convolution over medical entities with multimodal learning [J]. Bardak Batuhan, Tan Mehmet. Artificial intelligence in medicine, 2021, Issue Jula
4. Improving Biomedical Pretrained Language Models with Knowledge [C]. Zheng Yuan, Yijia Liu, Chuanqi Tan. SIGBioMed Workshop on Biomedical Language Processing, 2021
5. Multimodal Sensing and Data Processing for Speaker and Emotion Recognition Using Deep Learning Models with Audio, Video and Biomedical Sensors [D]. Abtahi, Farnaz. 2018
6. Unified Medical Language System resources improve sieve-based generation and Bidirectional Encoder Representations from Transformers (BERT)–based ranking for concept normalization [O]. Dongfang Xu, Manoj Gopale, Jiacheng Zhang. 2020
7. A Multimodal Curriculum With Patient Feedback to Improve Medical Student Communication: Pilot Study [O]. Nicole Dubosh, Matthew Hall, Victor Novack. 2019
