MedRAX: Medical Reasoning Agent for Chest X-ray

Authors: Adibvafa Fallahpour, Jun Ma, Alif Munim, Hongwei Lyu, Bo Wang

ICML 2025

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Our experiments demonstrate that MedRAX achieves state-of-the-art performance compared to both open-source and proprietary models, representing a significant step toward the practical deployment of automated CXR interpretation systems. ... 5. Experiments
Researcher Affiliation | Collaboration | 1 Department of Computer Science, University of Toronto, Toronto, Canada; 2 Vector Institute, Toronto, Canada; 3 University Health Network, Toronto, Canada; 4 Cohere, Toronto, Canada; 5 Cohere Labs, Toronto, Canada; 6 Department of Laboratory Medicine and Pathobiology, University of Toronto, Toronto, Canada.
Pseudocode | Yes | Algorithm 1 MedRAX ReAct Framework
Open Source Code | Yes | Data and code have been publicly available at https://github.com/bowang-lab/MedRAX.
Open Datasets | Yes | To rigorously evaluate its capabilities, we introduce ChestAgentBench, a comprehensive benchmark containing 2,500 complex medical queries across 7 diverse categories. ... We utilize Eurorad, the largest peer-reviewed radiological case report database maintained by the European Society of Radiology (ESR). ... MIMIC-CXR Radiology Report Generation... SLAKE VQA, which evaluates medical visual question answering...
Dataset Splits | No | The paper primarily describes evaluation on established test sets for benchmarks such as MIMIC-CXR (test set) and SLAKE VQA (test samples), and introduces a new evaluation benchmark (ChestAgentBench). It does not provide explicit training, validation, and test splits for a model trained by the authors. MedRAX is an agent framework that integrates pre-trained models.
Hardware Specification | Yes | MedRAX uses GPT-4o as its backbone LLM, and we deploy it on a single NVIDIA RTX 6000 GPU using the same configuration as described in Section 3.
Software Dependencies | No | The paper states: "MedRAX is built on the LangChain and LangGraph frameworks." and "MedRAX uses GPT-4o as its backbone LLM". However, it does not provide specific version numbers for these frameworks or for any other software libraries or programming languages used.
Experiment Setup | Yes | The algorithm implements a ReAct (Reasoning and Acting) loop... Input: ...t_max: Maximum allowed time. ...MedRAX employs the following system prompt to guide the reasoning engine: You are an expert medical AI assistant who can answer any medical questions and analyze medical images similar to a doctor. Solve using your own vision and reasoning and use tools to complement your reasoning...
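
The Experiment Setup entry mentions a ReAct loop bounded by a maximum allowed time. The following is a minimal sketch of that general pattern, not the paper's Algorithm 1 and not the MedRAX code. The names `react_loop`, `AgentState`, `stub_reason`, and the `classifier` tool are hypothetical, the reasoning step is a stub rather than a GPT-4o call, and the time check is assumed to happen once per iteration. The system prompt is the portion quoted above, which is truncated in the source.

```python
import time
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple

# Portion of the system prompt quoted in the summary (truncated in the source).
SYSTEM_PROMPT = (
    "You are an expert medical AI assistant who can answer any medical "
    "questions and analyze medical images similar to a doctor. Solve using "
    "your own vision and reasoning and use tools to complement your reasoning"
)

# Hypothetical action encoding: ("tool", tool_name, tool_input)
# or ("final", answer_text, "").
Action = Tuple[str, str, str]


@dataclass
class AgentState:
    """Query plus the observations gathered from tool calls so far."""
    query: str
    observations: List[str] = field(default_factory=list)


def react_loop(
    query: str,
    reason: Callable[[str, AgentState], Action],
    tools: Dict[str, Callable[[str], str]],
    t_max: float,
    clock: Callable[[], float] = time.monotonic,
) -> str:
    """Alternate reasoning and tool use until an answer or t_max seconds."""
    state = AgentState(query)
    start = clock()
    while clock() - start < t_max:
        kind, name, payload = reason(SYSTEM_PROMPT, state)
        if kind == "final":
            return name  # for a final action, the second field is the answer
        tool = tools.get(name)
        if tool is None:
            state.observations.append(f"unknown tool: {name}")
        else:
            state.observations.append(f"{name}: {tool(payload)}")
    return "No answer within the time budget."


def stub_reason(system_prompt: str, state: AgentState) -> Action:
    """Stand-in for the LLM: call one tool, then answer from its output."""
    if not state.observations:
        return ("tool", "classifier", state.query)
    return ("final", f"Based on {state.observations[-1]}", "")


# Hypothetical tool registry with a single stubbed tool.
tools = {"classifier": lambda text: "no acute findings (stub)"}

answer = react_loop("Is there a pneumothorax?", stub_reason, tools, t_max=5.0)
timed_out = react_loop("Is there a pneumothorax?", stub_reason, tools, t_max=0.0)
```

In this sketch the time budget is only checked between iterations, so a slow tool call can overrun `t_max`; a real deployment would need a timeout on the tool call itself.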