Original Article

ATJMED. 2026; 6(1): 81-6


Benchmarking four large language models on emergency rheumatology scenarios: evaluating AI in acute rheumatologic scenarios

Nese Cabuk Celik, Elif Altunel Kilinc, Fatih Albayrak.



Abstract

Aim: Large language models (LLMs) are increasingly integrated into medical education and decision support systems. However, their capabilities in acute care settings, such as emergency rheumatology, require systematic evaluation. This study aimed to compare the educational performance of four LLMs (ChatGPT-4o, DeepSeek v3.2, Gemini 2.5 Pro, and Perplexity Academic) across five domains: clinical accuracy, safety, diagnostic reasoning, realism, and use of evidence.
Materials and Methods: Each model generated responses to 20 standardized emergency rheumatology scenarios. Two board-certified rheumatologists independently evaluated the outputs using a 10-point scoring rubric. Inter-rater reliability was calculated using the intraclass correlation coefficient (ICC). Differences among models were assessed using Friedman tests with post-hoc Wilcoxon signed-rank tests corrected for multiple comparisons. A Likert scale was also used to assess scenario complexity and educational utility.
Results: ChatGPT-4o and DeepSeek v3.2 achieved the highest average total scores (mean: 7.75 each), significantly outperforming Perplexity Academic (mean: 6.70; p 0.80).
Conclusion: ChatGPT-4o and DeepSeek v3.2 demonstrated superior clinical reasoning, safety, and educational utility in emergency rheumatology scenarios. These findings support their potential role as adjunctive tools in medical training, provided expert oversight and validation mechanisms are in place.
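
The Materials and Methods describe a Friedman test across the four models followed by pairwise post-hoc comparisons corrected for multiple testing. The sketch below illustrates that kind of workflow under stated assumptions; it is not the authors' analysis code. The scores are hypothetical toy values, the model labels (`model_a` to `model_d`) are placeholders, and Bonferroni is assumed as the correction because the abstract does not name the method used.

```python
from itertools import combinations


def average_ranks(values):
    """Rank values from 1..k, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # positions are 0-based, ranks are 1-based
        for pos in range(i, j + 1):
            ranks[order[pos]] = mean_rank
        i = j + 1
    return ranks


def friedman_statistic(rows):
    """Tie-corrected Friedman chi-square for n rows of k related scores.

    Each row holds the scores that the k models received on one scenario.
    Undefined (division by zero) if every row is completely tied.
    """
    n, k = len(rows), len(rows[0])
    rank_sums = [0.0] * k
    tie_term = 0.0
    for row in rows:
        for col, rank in enumerate(average_ranks(row)):
            rank_sums[col] += rank
        for value in set(row):
            t = row.count(value)
            tie_term += t**3 - t
    chi2 = 12 / (n * k * (k + 1)) * sum(s * s for s in rank_sums) - 3 * n * (k + 1)
    correction = 1 - tie_term / (n * (k**3 - k))
    return chi2 / correction


# Hypothetical toy scores (NOT study data): 5 scenarios x 4 placeholder models.
models = ["model_a", "model_b", "model_c", "model_d"]
toy_scores = [
    [8, 8, 7, 6],
    [7, 8, 7, 7],
    [9, 8, 8, 6],
    [8, 7, 7, 7],
    [7, 8, 6, 6],
]

statistic = friedman_statistic(toy_scores)

# Post-hoc step: every pair of models is compared, so the per-test
# significance threshold is divided by the number of pairs (Bonferroni,
# an assumed choice of correction).
pairs = list(combinations(models, 2))
bonferroni_alpha = 0.05 / len(pairs)

print(f"Friedman chi-square: {statistic:.3f} (df = {len(models) - 1})")
print(f"{len(pairs)} pairwise tests, per-test alpha = {bonferroni_alpha:.4f}")
```

The sketch stops at the test statistic to stay dependency-free. In practice, p-values for the omnibus and pairwise tests would typically come from a statistics library, for example `scipy.stats.friedmanchisquare` and `scipy.stats.wilcoxon`.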

Key words: Artificial intelligence, education, validation, descriptive studies, digital health








The articles in Bibliomed are open access articles licensed under Creative Commons Attribution 4.0 International License (CC BY), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.