Text Analysis

LLMs for radiological report summarization, 05/14/2024

GeMTeX - LLM Workshop, Leipzig, Germany


Martina Langhals, David Männle, Máté E. Maros 

We (the MIDorAI subgroup) were happy to participate in the GeMTeX LLM Workshop in Leipzig, Germany, and to present our project on neuroradiological report summarization using LLMs and SLMs.

About: GeMTeX is an MII project to create and perform automated indexing of medical texts for research.

In everyday clinical practice, numerous texts are produced, such as doctors' letters and reports, which contain valuable information about the development, course, and treatment of a disease. These texts could be used by natural language processing (NLP) tools to assist doctors and researchers in their work. However, the full potential of clinical documents cannot be realised due to a lack of standardisation. The GeMTeX (German Medical Text Corpus) methodology platform aims to fill this gap and make medical texts from patient care available for research projects. The goal is to create the largest medical text corpus in the German language.

How to cite: Maros, M.E.; Cho, C.G.; Junge, A.G.; Kämpgen, B.; Saase, V.; Siegel, F.; Trinkmann, F.; Ganslandt, T.; Wenz, H. Comparative Analysis of Machine Learning Algorithms for Computer-Assisted Reporting Based on Fully Automated Cross-Lingual RadLex® Mappings. Preprints 2020, 2020040354 (doi: 10.20944/preprints202004.0354.v1).

Comparative Analysis of Machine Learning Algorithms for Computer-Assisted Reporting Based on Fully Automated Cross-Lingual RadLex® Mappings


Máté E. Maros, Chang Gyu Cho, Andreas G. Junge, Benedikt Kämpgen, Victor Saase, Fabian Siegel, Frederik Trinkmann, Thomas Ganslandt, Holger Wenz

Objectives: Studies evaluating machine learning (ML) algorithms on cross-lingual RadLex® mappings for developing context-sensitive radiological reporting tools are lacking. Therefore, we investigated whether ML-based approaches can be utilized to assist radiologists in providing key imaging biomarkers, such as the Alberta Stroke Programme Early CT Score (ASPECTS).

Material and Methods: A stratified random sample (age, gender, year) of CT reports (n=206) with suspected ischemic stroke was generated out of 3997 reports signed off between 2015 and 2019. Three independent, blinded readers assessed these reports and manually annotated clinico-radiologically relevant key features. The primary outcome was whether ASPECTS should have been provided (yes/no: 154/52). For all reports, both the findings and impressions underwent cross-lingual (German to English) RadLex® mappings using natural language processing. Well-established ML algorithms including classification trees, random forests, elastic net, support vector machines (SVMs) and boosted trees were evaluated in a 5 x 5-fold nested cross-validation framework. Further, a linear classifier (fastText) was directly fitted on the German reports. Ensemble learning was used to provide robust importance rankings of these ML algorithms. Performance was evaluated using derivatives of the confusion matrix and metrics of calibration including AUC, Brier score and log loss, as well as visually by calibration plots.
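The two calibration metrics named above can be illustrated with a minimal sketch. This is not the authors' code: the function names and the toy labels and probabilities are hypothetical and serve only to show how the Brier score and log loss are computed for a binary outcome such as "ASPECTS should have been provided" (1) versus not (0).

```python
import math


def brier_score(y_true, p_pred):
    """Mean squared difference between predicted probability and the 0/1 outcome."""
    return sum((p - y) ** 2 for y, p in zip(y_true, p_pred)) / len(y_true)


def log_loss(y_true, p_pred, eps=1e-15):
    """Mean negative log-likelihood of the observed 0/1 outcomes.

    Probabilities are clipped to [eps, 1 - eps] so that log(0) never occurs.
    """
    total = 0.0
    for y, p in zip(y_true, p_pred):
        p = min(max(p, eps), 1 - eps)
        total -= y * math.log(p) + (1 - y) * math.log(1 - p)
    return total / len(y_true)


# Hypothetical toy data, not taken from the study.
y_true = [1, 1, 0, 1]
p_pred = [0.9, 0.6, 0.2, 0.8]

print("Brier score:", brier_score(y_true, p_pred))
print("Log loss:", log_loss(y_true, p_pred))
```

Both metrics are lower for better-calibrated predictions; unlike accuracy, they penalize confident but wrong probabilities, which matters on an imbalanced task like this one.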

Results: On this imbalanced classification task, SVMs showed the highest accuracies both on human-extracted (87%) and fully automated RadLex® features (findings: 82.5%; impressions: 85.4%). FastText without a pre-trained language model showed the highest accuracy (89.3%) and AUC (92%) on the impressions. The ensemble learner revealed that boosted trees, fastText and SVMs are the most important ML classifiers. Boosted trees fitted on the findings showed the best overall calibration curve.

Conclusions: Contextual ML-based assistance suggesting ASPECTS while reporting neuroradiological emergencies is feasible, even if ML-models are restricted to be developed on limited and highly imbalanced data sets. 

Quality assessment of structured multi-parametric MRI reports of the prostate based on RadLex mapping of urosurgical key information content 

ME Maros, F Siegel, B Kämpgen, P Sodmann, W Sommer, SO Schönberg, T Henzler, C Groden, H Wenz 
Oral Presentation: 7788, Imaging Informatics - SS 205 - Intelligent dose and quality management, 02/27/2019.

Purpose: The Prostate Imaging Reporting and Data System (PI-RADS v2) was developed to provide imaging and reporting standards for multi-parametric prostate MRI (mpMRI). Urologists rely heavily on radiological reports when planning prostate biopsies. We investigated 1) whether mapping urosurgically relevant key content to RadLex® terms is feasible for assessing radiological report quality with regard to clinical usability, and 2) how this compares to a fully automated guideline-based quality assessment.


Conclusion: Biopsy-relevant key RadLex® content could serve as an important quality measure of mpMRI reports of the prostate to improve communication with urologists and to support their planning of invasive interventions.

Try the related software:

RASP - RadLex® Annotation and Scoring Pipeline

Bibtex Citation: 
@article{maros2018objective,
  title={Objective Comparison Using Guideline-based Query of Conventional Radiological Reports and Structured Reports},
  author={Maros, Mate E and Wenz, Ralf and Foerster, Alex and Froelich, Matthias F and Groden, Christoph and Sommer, Wieland H and Schoenberg, Stefan O and Henzler, Thomas and Wenz, Holger},
  journal={in vivo},
  volume={32},
  number={4},
  pages={843--849},
  year={2018},
  publisher={International Institute of Anticancer Research}
}

Objective Comparison Using Guideline-based Query of Conventional Radiological Reports and Structured Reports 

¹Department of Neuroradiology and ⁵Institute of Clinical Radiology and Nuclear Medicine, University Medical Center Mannheim, Medical Faculty Mannheim, Heidelberg University, Mannheim, Germany; ²Department of Life Sciences, Faculty of Natural Sciences, Imperial College London, London, U.K.; ³Smart-Radiology, Smart Reporting GmbH, Munich, Germany; ⁴Institute for Clinical Radiology, Ludwig Maximilian University Hospital, Munich, Germany

Background: This feasibility study of a text-mining-based scoring algorithm provides an objective comparison of structured reports (SR) and conventional free-text reports (cFTR) by means of guideline-based key terms. Furthermore, an open-source online version of this ranking algorithm was provided with a multilingual text-retrieval pipeline, customizable query and real-time scoring.

Materials and methods: Twenty-five patients with suspected stroke and magnetic resonance imaging were re-assessed by two independent, blinded readers (inexperienced: 3 years; experienced: >6 years, board-certified). SR and cFTR were compared with the guideline query using the cosine similarity score (CSS) and the Wilcoxon signed-rank test.
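As a rough illustration of how a cosine similarity score between a report and a guideline query can be computed, the sketch below compares raw term-count vectors. It is a simplified assumption, not the published pipeline: the tokenizer, the use of unweighted term counts, the function names and the example report and query terms are all hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine_similarity_score(report, query_terms):
    """Cosine of the angle between the term-count vectors of report and query.

    Returns 0.0 if either the report or the query contains no tokens.
    """
    report_counts = Counter(tokenize(report))
    query_counts = Counter(tokenize(" ".join(query_terms)))
    dot = sum(report_counts[t] * query_counts[t] for t in query_counts)
    norm_report = math.sqrt(sum(c * c for c in report_counts.values()))
    norm_query = math.sqrt(sum(c * c for c in query_counts.values()))
    if norm_report == 0 or norm_query == 0:
        return 0.0
    return dot / (norm_report * norm_query)


# Hypothetical example; neither the report text nor the query is from the study.
report = "Acute infarct in the left MCA territory with restricted diffusion."
query = ["infarct", "diffusion", "hemorrhage", "territory"]
print(cosine_similarity_score(report, query))
```

The score ranges from 0 (no shared terms) to 1 (identical term distributions), so reports that cover more of the guideline key terms with less unrelated text score higher.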

Results: All pathological findings (18/18) were identified by SR and cFTR. The impressions section of the SRs of the inexperienced reader had the highest median (0.145) and maximal (0.214) CSS and were rated significantly higher (p=2.21×10⁻⁵ and p=1.4×10⁻⁴, respectively) than cFTR (median=0.102). CSS was robust to variations of query.

Conclusion: Objective guideline-based comparison of SRs and cFTRs using the CSS is feasible and provides a scalable quality measure which can facilitate the adoption of structured reports in all fields of radiology. 

Try the related software: