MedRAX: Medical Reasoning Agent for Chest X-ray
Authors: Adibvafa Fallahpour, Jun Ma, Alif Munim, Hongwei Lyu, Bo Wang
ICML 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our experiments demonstrate that MedRAX achieves state-of-the-art performance compared to both open-source and proprietary models, representing a significant step toward the practical deployment of automated CXR interpretation systems. ... 5. Experiments |
| Researcher Affiliation | Collaboration | (1) Department of Computer Science, University of Toronto, Toronto, Canada; (2) Vector Institute, Toronto, Canada; (3) University Health Network, Toronto, Canada; (4) Cohere, Toronto, Canada; (5) Cohere Labs, Toronto, Canada; (6) Department of Laboratory Medicine and Pathobiology, University of Toronto, Toronto, Canada. |
| Pseudocode | Yes | Algorithm 1: MedRAX ReAct Framework |
| Open Source Code | Yes | Data and code have been publicly available at https://github.com/bowang-lab/MedRAX. |
| Open Datasets | Yes | To rigorously evaluate its capabilities, we introduce ChestAgentBench, a comprehensive benchmark containing 2,500 complex medical queries across 7 diverse categories. ... We utilize Eurorad, the largest peer-reviewed radiological case report database maintained by the European Society of Radiology (ESR). ... MIMIC-CXR Radiology Report Generation... SLAKE VQA, which evaluates medical visual question answering... |
| Dataset Splits | No | The paper primarily describes evaluation on established test sets for benchmarks like MIMIC-CXR (test set) and SLAKE VQA (test samples), and introduces a new evaluation benchmark (ChestAgentBench) without providing explicit training, validation, and test splits for a model trained by the authors. MedRAX is an agent framework that integrates pre-trained models. |
| Hardware Specification | Yes | MedRAX uses GPT-4o as its backbone LLM, and we deploy it on a single NVIDIA RTX 6000 GPU using the same configuration as described in Section 3. |
| Software Dependencies | No | The paper states: "MedRAX is built on the LangChain and LangGraph frameworks." and "MedRAX uses GPT-4o as its backbone LLM". However, it does not provide specific version numbers for these frameworks or any other software libraries or programming languages used. |
| Experiment Setup | Yes | The algorithm implements a ReAct (Reasoning and Acting) loop... Input: ... t_max: Maximum allowed time. ... MedRAX employs the following system prompt to guide the reasoning engine: You are an expert medical AI assistant who can answer any medical questions and analyze medical images similar to a doctor. Solve using your own vision and reasoning and use tools to complement your reasoning... |
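
The Experiment Setup row mentions a ReAct (Reasoning and Acting) loop bounded by a maximum allowed time. The Python sketch below illustrates that general pattern only; it is not the authors' Algorithm 1 or the MedRAX code. Every name in it (`react_loop`, `stub_reason`, `stub_classifier`, the decision tuple layout) is hypothetical, and the reasoner and tool are stand-in stubs rather than GPT-4o or a real CXR model.

```python
import time
from typing import Callable, Dict, List, Optional, Tuple

# Hypothetical shapes, chosen only for this illustration.
Memory = List[Tuple[str, str]]               # (role, text) pairs
Decision = Tuple[str, Optional[str], str]    # (kind, tool_name, payload)


def react_loop(
    query: str,
    reason: Callable[[Memory], Decision],
    tools: Dict[str, Callable[[str], str]],
    t_max: float,
    clock: Callable[[], float] = time.monotonic,
) -> str:
    """Alternate reasoning and tool calls until an answer or the time budget ends."""
    memory: Memory = [("user", query)]
    start = clock()
    while clock() - start < t_max:
        # Reason: the engine either answers or asks for a tool call.
        kind, tool_name, payload = reason(memory)
        if kind == "answer":
            return payload
        # Act: run the requested tool and store its observation in memory.
        if tool_name in tools:
            observation = tools[tool_name](payload)
        else:
            observation = f"unknown tool: {tool_name}"
        memory.append((f"tool:{tool_name}", observation))
    return "Time budget exhausted before a final answer was produced."


def stub_reason(memory: Memory) -> Decision:
    """Stub reasoner: call one tool first, then answer from its observation."""
    observations = [text for role, text in memory if role.startswith("tool:")]
    if not observations:
        return ("act", "stub_classifier", "image_001.png")
    return ("answer", None, f"Based on tool output: {observations[-1]}")


def stub_classifier(image_path: str) -> str:
    """Stub tool: returns a placeholder string, not a real model prediction."""
    return f"stub output for {image_path}"


result = react_loop(
    "Is there an abnormality?",
    stub_reason,
    {"stub_classifier": stub_classifier},
    t_max=5.0,
)
print(result)
```

Passing the clock as a parameter is a design choice made for this sketch so the time limit can be tested deterministically; with `t_max=0.0` the loop body never runs and the fallback message is returned.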