LLMs for Cardiovascular Risk Prediction from Structured Clinical Data
Quick Take
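A hybrid framework links structured clinical data with LLM-generated clinical narratives for coronary artery disease (CAD) prediction, using GPT and Gemini. Clinical variables extracted back from the narratives match the original records with an average fidelity of 94.61%. Random Forest achieves the highest accuracy, but the authors argue that LLM-based classification remains useful because it works on natural language patient descriptions rather than exact numerical values.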
Key Points
- Developed a hybrid framework for CAD prediction using 1,190 patient records.
- Achieved 94.61% fidelity in reverse extraction of clinical variables.
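- Random Forest achieved the highest accuracy among the four conventional models and the LLM-based classifiers (GPT and Gemini, zero-shot and few-shot).
- LLM-based classification works on natural language descriptions, which the authors say keeps exact lab values, blood pressure readings, and diagnostic codes private.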
- Combining structured data with LLM narratives opens new clinical prediction avenues.
Article Content
From source RSS / original summary: arXiv:2606.00031v1
Abstract: Coronary artery disease (CAD) remains one of the leading causes of death globally, highlighting the need for reliable predictive systems to support early diagnosis and risk assessment. While traditional machine learning models perform well on structured clinical data, large language models (LLMs) present new possibilities to interpret medical information expressed in natural language.
In this work, we develop a hybrid framework that bridges structured clinical data and natural-language representations for CAD prediction. Using a publicly available dataset of 1,190 patient records with 11 clinical attributes, structured variables are converted into interpretable feature representations and synthetic clinical narratives using LLMs. A validation pipeline performs reverse extraction of clinical variables and computes a consistency score with the original records, achieving an average fidelity of 94.61%.
We then evaluate four conventional machine learning models and compare their performance with LLM-based classification under zero-shot and few-shot prompting settings. We use two LLMs here, GPT and Gemini. Experimental results show that Random Forest achieves the highest accuracy. Despite this advantage, LLM-based classification remains beneficial in real-world clinical settings.
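The abstract does not define how the consistency score is computed. One plausible reading is a per-record field match rate, averaged over all records. The sketch below follows that assumption; the function names, the field names, and the matching rule (numeric tolerance, case-insensitive strings) are all hypothetical and not taken from the paper.

```python
from typing import Any, Mapping, Sequence, Tuple


def _matches(a: Any, b: Any, tol: float = 1e-6) -> bool:
    """Compare two field values: numbers within tol, everything else as normalized text."""
    if a is None or b is None:
        return False
    if isinstance(a, (int, float)) and isinstance(b, (int, float)):
        return abs(float(a) - float(b)) <= tol
    return str(a).strip().lower() == str(b).strip().lower()


def consistency_score(original: Mapping[str, Any],
                      extracted: Mapping[str, Any],
                      tol: float = 1e-6) -> float:
    """Fraction of original fields that the reverse extraction recovered (assumed definition)."""
    if not original:
        raise ValueError("original record has no fields")
    hits = sum(_matches(value, extracted.get(key), tol) for key, value in original.items())
    return hits / len(original)


def average_fidelity(pairs: Sequence[Tuple[Mapping[str, Any], Mapping[str, Any]]]) -> float:
    """Mean consistency score over (original, extracted) pairs, as a percentage."""
    if not pairs:
        raise ValueError("no record pairs given")
    scores = [consistency_score(orig, extr) for orig, extr in pairs]
    return 100.0 * sum(scores) / len(scores)


# Hypothetical record and a reverse extraction that gets one of four fields wrong.
original = {"age": 54, "sex": "male", "resting_bp": 140, "cholesterol": 239}
extracted = {"age": 54, "sex": "Male", "resting_bp": 145, "cholesterol": 239}

print(consistency_score(original, extracted))  # 3 of 4 fields match
```

Under this reading, a reported average such as 94.61% would mean that most, but not all, attributes survive the round trip from record to narrative and back.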
This is because LLMs operate directly on natural language patient descriptions, meaning that sensitive numerical patient data such as exact lab values, blood pressure readings, and diagnostic codes are kept private. Findings suggest that combining structured clinical data with LLM-generated narratives can enable new directions for hybrid clinical prediction systems.
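To illustrate what classifying from a natural language patient description could look like in the zero-shot and few-shot settings, here is a minimal prompt-building sketch. The instruction wording, the label strings, and the `build_prompt` helper are assumptions made for illustration; the abstract does not give the prompts actually used, and no model call is shown.

```python
from typing import Sequence, Tuple

# Hypothetical instruction text; the paper's actual prompt is not given in the abstract.
INSTRUCTION = (
    "You are assisting with coronary artery disease (CAD) risk screening. "
    "Read the patient description and answer with exactly one label: "
    "'CAD' or 'No CAD'."
)


def build_prompt(narrative: str, examples: Sequence[Tuple[str, str]] = ()) -> str:
    """Build a classification prompt from a patient narrative.

    With no examples this is a zero-shot prompt; passing labeled
    (narrative, label) pairs turns it into a few-shot prompt.
    """
    parts = [INSTRUCTION, ""]
    for i, (example_text, label) in enumerate(examples, start=1):
        parts += [f"Example {i}:", f"Patient: {example_text}", f"Answer: {label}", ""]
    parts += [f"Patient: {narrative}", "Answer:"]
    return "\n".join(parts)


# Hypothetical narratives that describe findings qualitatively instead of giving exact values.
query = "A middle-aged man with elevated blood pressure and chest pain on exertion."
shots = [
    ("An older woman with high cholesterol and exercise-induced angina.", "CAD"),
    ("A young man with normal blood pressure and no chest pain.", "No CAD"),
]

zero_shot_prompt = build_prompt(query)
few_shot_prompt = build_prompt(query, shots)
print(few_shot_prompt)
```

The same narrative feeds both settings; only the labeled examples differ, which makes it straightforward to compare zero-shot and few-shot behavior on identical inputs.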