Original Article



Evaluation of a guideline-anchored retrieval-augmented generation (RAG) system for headache diagnosis and management: A case-vignette study

Gulcin Babaoglu



Abstract

Headache disorders are frequently misdiagnosed because of the complexity of their diagnostic criteria. This study aimed to evaluate the diagnostic accuracy, safety, and usability of a guideline-anchored retrieval-augmented generation (RAG) clinical decision support workflow for headache diagnosis and management. Thirty-five fictional case vignettes representing primary headache disorders, secondary headache disorders, and neuropathic or facial pain syndromes were developed. NotebookLM was indexed with the International Classification of Headache Disorders, 3rd edition (ICHD-3) and major headache treatment guidelines. Two independent pain specialists assigned gold-standard diagnoses and evaluated the model outputs. Outcomes included diagnostic accuracy, treatment concordance (5-point Likert scale), red-flag detection using the SNNOOP10 criteria, usability using the System Usability Scale (SUS) and the Healthcare Systems Usability Scale (HSUS), and readability using Flesch Reading Ease (FRE) and Flesch–Kincaid Grade Level (FKGL). The system correctly classified secondary versus non-secondary headache categories in all vignettes (35/35, 100%). Subtype-level diagnostic agreement with the expert evaluation was also perfect (κ=1.00). Acute treatment recommendations were rated as fully concordant in 62.9% of cases and mostly concordant in 37.1%, with no unsafe recommendations identified. Preventive therapy suggestions were applicable in 20 cases, all of which achieved high concordance (Likert score ≥4). Red-flag detection sensitivity was 87.5% overall and 93.3% for secondary headache cases, with 100% specificity. Usability scores were favorable (mean SUS 77.5; HSUS 4.5/5). However, readability analysis indicated highly technical output, with very low FRE scores. A guideline-constrained RAG workflow demonstrated high diagnostic agreement and strong guideline concordance in headache case vignettes.
While the system shows promise as a supervised clinical decision support tool, further validation in real-world clinical settings is required before routine implementation.
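
For readers less familiar with the outcome measures named in the abstract, the following is a minimal Python sketch of how such metrics are conventionally computed. It is not the study's analysis code. The function names and all example inputs are hypothetical and chosen only for illustration. The formulas are the standard published definitions of sensitivity, specificity, Cohen's kappa, SUS scoring, FRE, and FKGL.

```python
from collections import Counter


def sensitivity(true_pos, false_neg):
    """Share of truly positive cases (e.g. red flags present) that were flagged."""
    return true_pos / (true_pos + false_neg)


def specificity(true_neg, false_pos):
    """Share of truly negative cases that were correctly left unflagged."""
    return true_neg / (true_neg + false_pos)


def cohen_kappa(labels_a, labels_b):
    """Chance-corrected agreement between two label sequences of equal length."""
    n = len(labels_a)
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    freq_a, freq_b = Counter(labels_a), Counter(labels_b)
    expected = sum(freq_a[k] * freq_b[k] for k in freq_a) / (n * n)
    if expected == 1.0:
        # Both sequences use one identical label throughout; kappa is undefined.
        raise ValueError("kappa is undefined when expected agreement is 1")
    return (observed - expected) / (1.0 - expected)


def sus_score(responses):
    """Standard SUS scoring: ten items rated 1-5, result on a 0-100 scale."""
    if len(responses) != 10:
        raise ValueError("SUS requires exactly 10 item responses")
    total = 0
    for item_number, rating in enumerate(responses, start=1):
        # Odd-numbered items are positively worded, even-numbered negatively.
        total += (rating - 1) if item_number % 2 == 1 else (5 - rating)
    return total * 2.5


def flesch_reading_ease(words, sentences, syllables):
    """Higher values mean easier text; technical prose scores low."""
    return 206.835 - 1.015 * (words / sentences) - 84.6 * (syllables / words)


def flesch_kincaid_grade(words, sentences, syllables):
    """Approximate US school grade level needed to read the text."""
    return 0.39 * (words / sentences) + 11.8 * (syllables / words) - 15.59


if __name__ == "__main__":
    # Hypothetical counts, NOT taken from the study.
    print(sensitivity(true_pos=9, false_neg=1))
    print(specificity(true_neg=10, false_pos=0))
    # Hypothetical subtype labels from a system and an expert panel.
    system = ["migraine", "cluster", "tension-type", "migraine"]
    expert = ["migraine", "cluster", "tension-type", "migraine"]
    print(cohen_kappa(system, expert))
    # Hypothetical questionnaire responses and text counts.
    print(sus_score([4, 2, 4, 2, 4, 2, 4, 2, 4, 2]))
    print(flesch_reading_ease(words=100, sentences=5, syllables=150))
    print(flesch_kincaid_grade(words=100, sentences=5, syllables=150))
```

As a simple arithmetic check, and assuming all 35 vignettes received an acute-treatment rating, the reported 62.9% and 37.1% are consistent with 22 and 13 of 35 cases, respectively (22/35 ≈ 0.629, 13/35 ≈ 0.371).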

Key words: Headache disorders, clinical decision support systems, artificial intelligence, natural language processing, diagnostic accuracy









The articles in Bibliomed are open access articles licensed under Creative Commons Attribution 4.0 International License (CC BY), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.