Natural language processing and machine learning for development of a Fontan Failure risk prediction model from electronic health records
National Heart Lung and Blood Institute
Description
Fontan palliation for rare single ventricle heart defects is lifesaving but creates deranged cardiovascular physiology with eventual premature multi-organ circulatory failure. Circulatory failure after Fontan palliation may be related to a number of physiologic states that may change over time, are associated with variable prognoses, and require development of physiology-specific treatment options. Fontan research is limited by heterogeneity of native anatomy, post-Fontan anatomy, and physiologic states, and by small sample sizes due to rarity. Adverse outcomes in the Fontan population begin in childhood and are common and diverse, often affecting multiple organ systems. We have previously described Fontan Failure physiologic phenotypes based on (1) Systolic Heart Failure, (2) Diastolic Heart Failure, (3) Hepatic and Pulmonary phenotype (normal cardiac output), and (4) Lymphatic Abnormalities. Despite the broad range of complications, treatments for Fontan patients are generally consensus based and may not address the underlying physiologic derangement. Heart transplantation can be lifesaving for this population; however, heart transplantation creates a different disease state with its own related late morbidity and mortality, and optimal timing is unknown. Using two electronic health record systems (pediatric and adult), including free-text notes, for a diverse population with Fontan anatomy across the age spectrum, we propose to use natural language processing (NLP) and machine learning (ML) techniques to improve detection of multi-organ comorbid conditions in this population to define anatomic and physiologic phenotypes, and to develop an annualized risk score applicable across age, sex, race, and ethnicity. Our proposed work builds on a rigorous pilot study in which we developed an NLP-based ML model for automatically identifying Fontan patients from two hospital systems representing a racially diverse cohort across the lifespan.
Our pilot system achieved significantly better performance than ICD code-based classification of Fontan cases. In the proposed work, we will (i) advance the state of the art in biomedical NLP to improve the automatic classification of Fontan phenotypes in the cohort so that it is closer to human-level performance; (ii) develop a generalizable and interpretable pipeline so that NLP/ML outputs can be traced by domain experts from the final decision to the initial data point; and (iii) implement data-driven methods to develop a risk prediction model for adverse outcomes in Fontan patients. Our innovative approach can facilitate the development of physiology-based treatments and risk stratification for advanced therapies. Public, open-source release of the code associated with our technological innovations will benefit the research community as a whole by accelerating rare disease research at lower cost and with greater inclusivity.
Project Number: 1R21HL181630-01 | Fiscal Year: 2025 | NIH Institute/Center: National Heart Lung and Blood Institute (NHLBI) | Principal Investigator: Wendy Book (+1 co-PI) | Institution: EMORY UNIVERSITY, ATLANTA, GA | Award Amount: $232,498 | Activity Code: R21 | Study Section: Clinical Data Management and Analysis Study Section [CDMA]
View on NIH RePORTER: https://reporter.nih.gov/project-details/1R21HL18163001
Grant Details
$232,498
July 31, 2027
ATLANTA, GA