Title:
Medformer: A Multitask Multimodal Foundational Model for Medical Imaging.
Authors:
Simionescu, Cristian (AUTHOR) cristian@nexusmedia.ro
Source:
Procedia Computer Science. 2025, Vol. 270, p446-455. 10p.
Database:
Supplemental Index

Medical imaging datasets vary widely in modality, dimensionality, and clinical task. This diversity typically necessitates a separate single-purpose deep learning model for each domain. We propose Medformer, a unified foundation model for multitask, multimodal medical imaging built on transformer architectures. Medformer uses two specialized modules called Adaptformers (one for input adaptation, one for output adaptation) along with learnable latent embeddings that encode dimensionality (2D vs. 3D), imaging modality (CT, X-ray, microscopy), anatomical region, and task requirements (classification, ordinal regression, etc.). These embeddings guide a shared transformer backbone, enabling broad parameter sharing across heterogeneous tasks. Experiments on the MedMNIST collection of 18 diverse 2D and 3D datasets demonstrate that Medformer can match specialized baselines, particularly on data-scarce tasks, through both multi-task training and self-supervised pretraining. The results suggest that Medformer can serve as a flexible foundation model that unifies disparate medical imaging domains within a single architecture. [ABSTRACT FROM AUTHOR]
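
The abstract names the building blocks (an input Adaptformer, an output Adaptformer, and learnable latent embeddings that condition a shared transformer backbone) but not their internals. The PyTorch sketch below shows one plausible way such conditioning could be wired. It is an illustration only: the class names, the choice to prepend the embeddings as conditioning tokens, the linear task head, and all dimensions are assumptions made for demonstration, not the authors' implementation.

    import torch
    import torch.nn as nn

    class InputAdaptformer(nn.Module):
        # Hypothetical input-adaptation module: projects flattened
        # patches from any modality into the shared token space.
        def __init__(self, patch_dim, d_model):
            super().__init__()
            self.proj = nn.Linear(patch_dim, d_model)

        def forward(self, patches):          # (batch, tokens, patch_dim)
            return self.proj(patches)        # (batch, tokens, d_model)

    class MedformerSketch(nn.Module):
        def __init__(self, patch_dim=256, d_model=384, n_layers=6,
                     n_dims=2, n_modalities=3, n_regions=10,
                     n_tasks=4, n_classes=10):
            super().__init__()
            self.input_adapter = InputAdaptformer(patch_dim, d_model)
            # Learnable latent embeddings, one table per metadata axis;
            # prepending them as conditioning tokens is one plausible
            # reading of "guide a shared transformer backbone".
            self.dim_emb = nn.Embedding(n_dims, d_model)        # 2D vs. 3D
            self.mod_emb = nn.Embedding(n_modalities, d_model)  # CT, X-ray, microscopy
            self.reg_emb = nn.Embedding(n_regions, d_model)     # anatomical region
            self.task_emb = nn.Embedding(n_tasks, d_model)      # task type
            layer = nn.TransformerEncoderLayer(d_model, nhead=8,
                                               batch_first=True)
            self.backbone = nn.TransformerEncoder(layer, num_layers=n_layers)
            # Stand-in for the output Adaptformer: a task head that
            # reads the final state of the task-conditioning token.
            self.output_adapter = nn.Linear(d_model, n_classes)

        def forward(self, patches, dim_id, mod_id, reg_id, task_id):
            tokens = self.input_adapter(patches)
            cond = torch.stack([self.dim_emb(dim_id), self.mod_emb(mod_id),
                                self.reg_emb(reg_id), self.task_emb(task_id)],
                               dim=1)                      # (batch, 4, d_model)
            hidden = self.backbone(torch.cat([cond, tokens], dim=1))
            return self.output_adapter(hidden[:, 3])       # logits at task token

Under these assumptions, a batch of 2D chest X-ray classification inputs would be run as model(patches, dim_id, mod_id, reg_id, task_id), where each id is a LongTensor of shape (batch,) selecting the dataset's metadata; only the id values change between tasks, while the backbone parameters are shared.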