To the Editor: Following the release of the generative pre‐trained transformer (GPT) chatbot ChatGPT in November 2022, a wide range of large language models (LLMs), including ChatGPT‐3.5 (GPT‐3‐derived) and ChatGPT‐4 and New Bing (GPT‐4‐derived), have been made publicly available. It has been suggested that ChatGPT‐4 outperforms ChatGPT‐3.5 in answering medical examination questions,1 but it is unknown whether GPT‐4‐derived LLMs consistently outperform GPT‐3‐derived LLMs.
- 1. Nori H, King N, McKinney SM, et al. Capabilities of GPT‐4 on medical challenge problems [preprint]. arXiv 2303.13375; 20 Mar 2023. https://doi.org/10.48550/arXiv.2303.13375 (viewed Mar 2023).
- 2. OpenAI. GPT‐4 technical report [preprint]. arXiv 2303.08774; 15 Mar 2023. https://doi.org/10.48550/arXiv.2303.08774 (viewed Mar 2023).
- 3. Australian Medical Council Limited. MCQ trial examination [website]. Canberra: AMC, 2022. https://www.amc.org.au/assessment/mcq/mcq‐trial/ (viewed Mar 2023).
- 4. Kung TH, Cheatham M, Medenilla A, et al. Performance of ChatGPT on USMLE: potential for AI‐assisted medical education using large language models. PLOS Digit Health 2023; 2: e0000198.
- 5. Gilson A, Safranek CW, Huang T, et al. How does ChatGPT perform on the United States Medical Licensing Examination? The implications of large language models for medical education and knowledge assessment. JMIR Med Educ 2023; 9: e45312.


We thank Joshua Kovoor for providing editorial and statistical support for this piece.
No relevant disclosures.