Foresight — Deep Generative Modelling of Patient Timelines using Electronic Health Records. (arXiv:2212.08072v1 [cs.CL])
Electronic Health Records (EHRs) hold detailed longitudinal information about
each patient’s health status and general clinical history, a large portion of
which is stored as unstructured text. Temporal modelling of this
medical history, which considers the sequence of events, can be used to
forecast and simulate future events, estimate risk, suggest alternative
diagnoses or forecast complications. While most prediction approaches rely
mainly on structured data or on a subset of single-domain forecasts and outcomes, we
processed the entire free-text portion of EHRs for longitudinal modelling. We
present Foresight, a novel GPT3-based pipeline that uses named entity
recognition and linking (NER+L) tools (i.e. MedCAT) to convert document text
into structured, coded concepts and then provides probabilistic forecasts for
future medical events such as disorders, medications, symptoms and
interventions. Since large portions of EHR data are
in text form, such an approach benefits from a granular and detailed view of a
patient while introducing modest additional noise. In tests on two large UK
hospitals (King’s College Hospital, South London and Maudsley) and the US
MIMIC-III dataset, Foresight achieved precision@10 of 0.80, 0.81 and 0.91 for
forecasting the next biomedical concept. Foresight was also validated on 34
synthetic patient timelines by 5 clinicians and achieved relevancy of 97% for
the top forecasted candidate disorder. Foresight can be easily trained and
deployed locally, as it requires only free-text data at a minimum. As a
generative model, it can simulate follow-on disorders, medications and
interventions for as many steps as required. Foresight is a general-purpose
model for biomedical concept modelling that can be used for real-world risk
estimation, virtual trials and clinical research to study the progression of
diseases, simulate interventions and counterfactuals, and for educational
purposes.
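
For readers unfamiliar with the precision@10 figures quoted above, the following is a minimal sketch of how a precision@k score for next-concept forecasting could be computed. It assumes a forecast counts as correct when the true next concept appears anywhere among the top k ranked candidates; the abstract does not spell out the exact matching rule, so this is an illustration and not the authors' evaluation code. The function name `precision_at_k` and the toy concept labels are hypothetical.

```python
from typing import Sequence


def precision_at_k(
    ranked_forecasts: Sequence[Sequence[str]],
    true_next: Sequence[str],
    k: int = 10,
) -> float:
    """Fraction of forecasts whose true next concept is in the top-k candidates.

    ranked_forecasts[i] holds candidate concepts for position i, ordered from
    most to least probable; true_next[i] is the concept that actually followed.
    Assumption: a single hit anywhere in the top k counts as correct.
    """
    if len(ranked_forecasts) != len(true_next):
        raise ValueError("forecasts and ground truth must have the same length")
    if not true_next:
        raise ValueError("at least one forecast is required")
    if k < 1:
        raise ValueError("k must be a positive integer")

    hits = sum(
        1
        for candidates, truth in zip(ranked_forecasts, true_next)
        if truth in candidates[:k]
    )
    return hits / len(true_next)


# Toy data (hypothetical concept labels, not taken from any dataset).
forecasts = [
    ["hypertension", "type 2 diabetes", "chest pain"],
    ["metformin", "insulin", "statin"],
    ["fever", "cough", "headache"],
    ["asthma", "copd", "pneumonia"],
]
observed = ["type 2 diabetes", "metformin", "nausea", "pneumonia"]

score_at_3 = precision_at_k(forecasts, observed, k=3)
score_at_1 = precision_at_k(forecasts, observed, k=1)
print(f"precision@3 = {score_at_3:.2f}, precision@1 = {score_at_1:.2f}")
```

Under this definition the score can only stay the same or rise as k grows, which is why results are reported at a fixed k such as 10.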
Source: https://arxiv.org/abs/2212.08072