CT-AGRG: Automated Abnormality-Guided Report Generation from 3D Chest CT Volumes

ISBI 2025 - Oral

Theo Di Piazza (1,2), Carole Lazarus (3), Olivier Nempont (3), Loic Boussel (1,2)
(1) INSA Lyon, (2) Hospices Civils de Lyon, (3) Philips Clinical Informatics

Abstract

The rapid increase in Computed Tomography examinations has created a need for robust automated analysis techniques in clinical settings to assist radiologists in managing their growing workload. Existing methods generate entire reports directly from 3D CT images, without explicitly focusing on observed abnormalities. This unguided approach can result in repetitive content or incomplete reports. We propose a new anomaly-guided report generation model, which first predicts abnormalities and then generates a targeted description for each. Evaluation on a public dataset demonstrates significant improvements in report quality and clinical relevance. We extend our work by conducting an ablation study to demonstrate its effectiveness.

Method

We employ a two-stage approach for anomaly detection and description generation. Initially, we use a visual feature extractor pre-trained on a multi-label classification task. In the first stage, we perform multi-task learning with one classification head per anomaly. If an anomaly is detected, its associated vector representation is then passed to the second stage. Here, a pre-trained GPT-2 model generates a descriptive text of the identified anomaly.
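The control flow of the two stages can be sketched in a few lines of plain Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the names `guided_report`, `project`, and `stub_decoder`, the head layout (`proj`, `cls`, `bias`), the 0.5 threshold, and the two abnormality labels are all hypothetical. The visual feature extractor is replaced by a fixed feature vector, and the GPT-2 decoder by a stub that returns a template sentence.

```python
import math
from typing import Callable, Dict, List

Vector = List[float]


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def project(features: Vector, weights: List[Vector]) -> Vector:
    """Linear map: one output value per weight row."""
    return [sum(w * f for w, f in zip(row, features)) for row in weights]


def guided_report(
    features: Vector,
    heads: Dict[str, dict],
    describe: Callable[[str, Vector], str],
    threshold: float = 0.5,  # assumed decision threshold
) -> List[str]:
    """Run one head per abnormality, then describe only the detected ones."""
    sentences = []
    for name, head in heads.items():
        # Stage 1: abnormality-specific vector and its classification score.
        vec = project(features, head["proj"])
        logit = sum(w * v for w, v in zip(head["cls"], vec)) + head["bias"]
        # Stage 2: only detected abnormalities are passed to the text decoder.
        if sigmoid(logit) >= threshold:
            sentences.append(describe(name, vec))
    return sentences


def stub_decoder(name: str, vec: Vector) -> str:
    """Stand-in for the language model; ignores the vector content."""
    return f"Finding consistent with {name}."


# Hypothetical heads with hand-picked weights, for illustration only.
identity = [[1.0, 0.0], [0.0, 1.0]]
heads = {
    "nodule": {"proj": identity, "cls": [2.0, 0.0], "bias": 0.0},
    "effusion": {"proj": identity, "cls": [-2.0, 0.0], "bias": 0.0},
}
report = guided_report([1.5, 0.2], heads, stub_decoder)
print(report)  # ['Finding consistent with nodule.']
```

Because generation is gated by the classification outcome, the report contains one description per detected abnormality and none for those the first stage rejects.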

Main figure

Experiments

Models are trained and evaluated on CT-RATE, using train/validation/test splits across five independent runs.

Quantitative results

CT-AGRG is compared against CT2Rep, the first report generation framework for 3D CT scans. CT2Rep introduces an encoder-decoder architecture that generates the entire report without an intermediate abnormality classification task. For CT-AGRG, we report performance using an attention-based backbone (CT-ViT), as well as a 2.5D convolutional neural network (CT-Net). We report natural language generation metrics (METEOR, ROUGE, BERTScore, BARTScore, BLEU) and Clinical Efficacy metrics (F1-Score, Precision, Recall) extracted with the Rad-BERT labeler.
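To make the Clinical Efficacy metrics concrete, the sketch below computes precision, recall, and F1 from binary abnormality labels, such as those a labeler would extract from reference and generated reports. It assumes micro-averaging over all (report, abnormality) pairs, which may differ from the averaging used in the paper. The function name `micro_prf` and the toy label matrices are hypothetical.

```python
from typing import List, Tuple


def micro_prf(
    y_true: List[List[int]], y_pred: List[List[int]]
) -> Tuple[float, float, float]:
    """Micro-averaged precision, recall, and F1 for multi-label 0/1 matrices.

    Rows are reports; columns are abnormality labels.
    """
    tp = fp = fn = 0
    for true_row, pred_row in zip(y_true, y_pred):
        for t, p in zip(true_row, pred_row):
            tp += int(t == 1 and p == 1)  # abnormality present and reported
            fp += int(t == 0 and p == 1)  # reported but absent in reference
            fn += int(t == 1 and p == 0)  # present in reference but missed
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1


# Toy labels (made up): 2 reports, 3 abnormalities.
reference = [[1, 0, 1], [0, 1, 0]]
generated = [[1, 1, 0], [0, 1, 0]]
precision, recall, f1 = micro_prf(reference, generated)
```

In this toy case there are two true positives, one false positive, and one false negative, so precision, recall, and F1 all equal 2/3.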

Key findings:

CT-AGRG outperforms CT2Rep on both report generation and clinical efficacy metrics. With CT-Net as the visual backbone, CT-AGRG improves Recall by 64% and F1-score by 50%, demonstrating that an intermediate abnormality classification stage significantly enhances pathology detection while producing semantically more accurate reports.

Qualitative results

The qualitative example below compares CT-AGRG with the CT2Rep baseline and the ground-truth radiology report. Color-coded annotations highlight clinically relevant findings, showing that CT-AGRG more accurately identifies pathologies and generates reports using terminology that closely matches radiologist-written reports.

Qualitative results

Related Links

This academic research runs its experiments on a public Computed Tomography dataset. We thank the contributors of CT-RATE [1] for releasing the dataset to the research community.

[1] Generalist foundation models from a multimodal dataset for 3D CT. Hamamci et al. 2026.

BibTeX

@inproceedings{dipiazza_2026_unict,
  author    = {Di Piazza, Theo and Lazarus, Carole and Nempont, Olivier and Boussel, Loic},
  title     = {CT-AGRG: Automated Abnormality-Guided Report Generation from 3D Chest CT Volumes},
  booktitle = {IEEE 22nd International Symposium on Biomedical Imaging (ISBI)},
  year      = {2025},
}

More research

Explore additional recent work in medical image analysis related to this project.

CT-Scroll
MIDL 2025
2.5D Representation learning

CT-SSG
MELBA Journal 2026
Graph representation learning

UniCT
MICCAI 2026
Multi-task learning