Abstract
Recently, self-supervised pretraining of transformers has gained considerable attention in the analysis of electronic medical records. However, a systematic evaluation of different pretraining tasks in radiology applications that use both images and radiology reports is still lacking. We propose PreRadE, a simple proof-of-concept framework that enables novel evaluation of pretraining tasks in a controlled environment. We investigated the three most commonly used pretraining tasks (MLM: Masked Language Modelling, MFR: Masked Feature Regression, and ITM: Image to Text Matching) and their combinations against downstream radiology classification on MIMIC-CXR, a medical chest X-ray imaging and radiology text report dataset. Our experiments in the multimodal setting show that (1) pretraining with MLM yields the greatest benefit to classification performance, largely due to the task-relevant information learned from the radiology reports, and (2) pretraining with only a single task can introduce variation in classification performance across different fine-tuning episodes, suggesting that composite task objectives incorporating both image and text modalities are better suited to generating reliably performant models.
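To make the three objectives concrete, the sketch below shows one plausible way a composite multimodal pretraining loss combining MLM, MFR, and ITM could be wired up in PyTorch. This is an illustrative assumption, not the authors' implementation: the `encoder` interface, the head names, the label conventions, and the equal loss weighting are all hypothetical.

```python
import torch.nn as nn
import torch.nn.functional as F


class MultimodalPretrainer(nn.Module):
    """Hypothetical composite pretraining head: MLM + MFR + ITM.

    `encoder` is assumed to be a joint image-text transformer returning
    one hidden state per text token, one per image region, and a pooled
    representation of the whole pair.
    """

    def __init__(self, encoder, hidden_dim, vocab_size, region_feat_dim):
        super().__init__()
        self.encoder = encoder
        self.mlm_head = nn.Linear(hidden_dim, vocab_size)        # predict masked report tokens
        self.mfr_head = nn.Linear(hidden_dim, region_feat_dim)   # regress masked image-region features
        self.itm_head = nn.Linear(hidden_dim, 2)                 # matched vs. mismatched image-report pair

    def forward(self, tokens, regions, mlm_labels, mfr_targets, mfr_mask, itm_labels):
        text_h, img_h, pooled = self.encoder(tokens, regions)

        # MLM: cross-entropy on masked token positions (-100 marks unmasked tokens).
        mlm_loss = F.cross_entropy(
            self.mlm_head(text_h).transpose(1, 2), mlm_labels, ignore_index=-100
        )

        # MFR: L2 regression toward the original visual features of masked regions.
        pred = self.mfr_head(img_h)
        mfr_loss = F.mse_loss(pred[mfr_mask], mfr_targets[mfr_mask])

        # ITM: binary classification of whether the report matches the X-ray.
        itm_loss = F.cross_entropy(self.itm_head(pooled), itm_labels)

        # Composite objective; equal weighting is an assumption here.
        return mlm_loss + mfr_loss + itm_loss
```

Single-task pretraining would correspond to keeping only one of the three loss terms; the paper's finding is that such single-objective variants fine-tune less consistently than composite ones.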
| Original language | English |
| --- | --- |
| Article number | 4661 |
| Number of pages | 14 |
| Journal | Mathematics |
| Volume | 10 |
| Issue number | 24 |
| DOIs | |
| Publication status | Published - 8 Dec 2022 |
Keywords
- computational radiology
- deep learning applications
- machine learning
- masked language modelling
- model evaluation
- multimodal
- pathoanatomical classification
- self-supervised learning
- X-ray analysis