Headache disorders are frequently misdiagnosed because of the complexity of their diagnostic criteria. This study aimed to evaluate the diagnostic accuracy, safety, and usability of a guideline-anchored retrieval-augmented generation (RAG) clinical decision support workflow for headache diagnosis and management. Thirty-five fictional case vignettes representing primary headache disorders, secondary headache disorders, and neuropathic or facial pain syndromes were developed. NotebookLM was indexed with the International Classification of Headache Disorders, 3rd edition (ICHD-3) and major headache treatment guidelines. Two independent pain specialists assigned gold-standard diagnoses and evaluated model outputs. Outcomes included diagnostic accuracy, treatment concordance (5-point Likert scale), red-flag detection using SNNOOP10 criteria, usability using the System Usability Scale (SUS) and Healthcare Systems Usability Scale (HSUS), and readability using Flesch Reading Ease (FRE) and Flesch–Kincaid Grade Level (FKGL). The system correctly classified secondary versus non-secondary headache categories in all vignettes (35/35, 100%). Subtype-level diagnostic agreement with expert evaluation was also perfect (κ=1.00). Acute treatment recommendations were rated as fully concordant in 62.9% of cases and mostly concordant in 37.1%, with no unsafe recommendations identified. Preventive therapy suggestions were applicable in 20 cases, all of which achieved high concordance (Likert score ≥4 in 100%). Red-flag detection sensitivity was 87.5% overall and 93.3% for secondary headache cases, with 100% specificity. Usability scores were favorable (mean SUS 77.5; HSUS 4.5/5). However, readability analysis indicated highly technical output with very low FRE scores. A guideline-constrained RAG workflow demonstrated high diagnostic agreement and strong guideline concordance in headache case vignettes. While the system shows promise as a supervised clinical decision support tool, further validation in real-world clinical settings is required before routine implementation.
Key words: Headache disorders, clinical decision support systems, artificial intelligence, natural language processing, diagnostic accuracy
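For readers unfamiliar with the outcome measures named in the abstract, the following is a minimal Python sketch of how such metrics are conventionally computed. It uses the standard published formulas for sensitivity, specificity, FRE, FKGL, and SUS; it is not the study's analysis code. All function names and every input value are hypothetical and chosen only for illustration; none of them are taken from the study data.

```python
# Illustrative sketch only: standard formulas, hypothetical inputs.

def sensitivity(true_pos: int, false_neg: int) -> float:
    """Proportion of truly positive cases (e.g. red flags present) that were detected."""
    return true_pos / (true_pos + false_neg)


def specificity(true_neg: int, false_pos: int) -> float:
    """Proportion of truly negative cases that were correctly not flagged."""
    return true_neg / (true_neg + false_pos)


def flesch_reading_ease(words: int, sentences: int, syllables: int) -> float:
    """Flesch Reading Ease: higher scores mean easier text."""
    return 206.835 - 1.015 * (words / sentences) - 84.6 * (syllables / words)


def flesch_kincaid_grade(words: int, sentences: int, syllables: int) -> float:
    """Flesch-Kincaid Grade Level: approximate US school grade needed to read the text."""
    return 0.39 * (words / sentences) + 11.8 * (syllables / words) - 15.59


def sus_score(responses: list[int]) -> float:
    """System Usability Scale score (0-100) from ten 1-5 Likert responses.

    Odd-numbered items contribute (response - 1), even-numbered items
    contribute (5 - response); the sum is multiplied by 2.5.
    """
    if len(responses) != 10:
        raise ValueError("SUS requires exactly 10 item responses")
    total = 0
    for item_number, response in enumerate(responses, start=1):
        total += (response - 1) if item_number % 2 == 1 else (5 - response)
    return total * 2.5


if __name__ == "__main__":
    # Hypothetical counts and text statistics, not study data.
    print(sensitivity(true_pos=9, false_neg=1))
    print(specificity(true_neg=10, false_pos=0))
    print(flesch_reading_ease(words=100, sentences=5, syllables=150))
    print(flesch_kincaid_grade(words=100, sentences=5, syllables=150))
    print(sus_score([3] * 10))
```

Because FRE falls as both sentence length and syllables per word increase, guideline-style text dense with polysyllabic clinical terms would be expected to score low, which is consistent with the readability finding reported above.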