Abstract
Background and objectives Left ventricular ejection fraction is frequently impaired in patients on maintenance hemodialysis and can be estimated using deep learning models applied to electrocardiograms. The smaller sample sizes available within this population may be mitigated using transfer learning.
Design, setting, participants, & measurements We identified patients on hemodialysis with transthoracic echocardiograms within 7 days of electrocardiogram using diagnostic/procedure codes. We developed four models: (1) trained from scratch in patients on hemodialysis, (2) pretrained on a publicly available set of natural images (ImageNet), (3) pretrained on all patients not on hemodialysis, and (4) pretrained on patients not on hemodialysis and fine-tuned on patients on hemodialysis. We assessed the ability of the models to classify left ventricular ejection fraction into clinically relevant categories of ≤40%, >40% to ≤50%, and >50%. We compared performance by area under the receiver operating characteristic curve.
Results We extracted 705,075 electrocardiogram:echocardiogram pairs for 158,840 patients not on hemodialysis used for development of models 3 and 4 and n=18,626 electrocardiogram:echocardiogram pairs for 2168 patients on hemodialysis for models 1, 2, and 4. The transfer learning model achieved area under the receiver operating characteristic curves of 0.86, 0.63, and 0.83 in predicting left ventricular ejection fraction categories of ≤40% (n=461), 41%–50% (n=398), and >50% (n=1309), respectively. For the same tasks, model 1 achieved area under the receiver operating characteristic curves of 0.74, 0.55, and 0.71, respectively; model 2 achieved area under the receiver operating characteristic curves of 0.71, 0.55, and 0.69, respectively, and model 3 achieved area under the receiver operating characteristic curves of 0.80, 0.51, and 0.77, respectively. We found that predictions of left ventricular ejection fraction by the transfer learning model were associated with mortality in a Cox regression with an adjusted hazard ratio of 1.29 (95% confidence interval, 1.04 to 1.59).
Conclusion A deep learning model can determine left ventricular ejection fraction for patients on hemodialysis following pretraining on electrocardiograms of patients not on hemodialysis. Predictions of low ejection fraction from this model were associated with mortality over a 5-year follow-up period.
Podcast This article contains a podcast at https://www.asn-online.org/media/podcast/CJASN/2022_06_06_CJN16481221.mp3
Introduction
Patients on maintenance hemodialysis (HD) have high risk for cardiovascular mortality (1), with cardiovascular disease contributing to 53% of all deaths with a known cause (2). Prior to the onset of cardiovascular disease, abnormalities, such as increased left ventricular cavity size, thickened left ventricular posterior wall, thickened interventricular septum, and decreased left ventricular compliance, are common in patients on HD (3–5), eventually manifesting as changes in left ventricular systolic function.
Left ventricular ejection fraction (EF) is the primary metric of left ventricular systolic function. Deterioration of left ventricular EF in patients on HD has been independently associated with greater risk of cardiovascular and all-cause mortality, highlighting the importance of regular cardiac monitoring (3,6,7). Although traditionally used to quantify left ventricular EF, transthoracic echocardiography remains challenging to implement in outpatient or resource-limited settings due to personnel training requirements in addition to time and cost constraints. Additionally, echocardiogram (echo) measurement of left ventricular EF is subject to significant intrareader and inter-reader variability (8).
Electrocardiogram (ECG) detects cardiac abnormalities by measuring the electrical activity generated by the myocardium. It is an inexpensive, widely available investigation that is often obtained bedside in the routine care of patients on HD. Human physicians interpret the ECG on the basis of their knowledge of waveform morphology. Consequently, even experienced physicians may find it difficult to diagnose subtle patterns suggestive of subclinical or even overt pathology.
Machine learning (ML) algorithms excel at derivation of patterns within diverse data without needing explicit instructions about the nature of either the data or the pattern. Deep learning is a subset of ML and describes the utilization of multilayered neural networks to extract patterns within large-scale data. Deep learning represents a powerful set of tools that can extend to or be used to elegantly combine waveform data, imaging modalities, and text patient notes. Recently, deep learning has been successfully applied to ECG waveforms to qualify and quantify several cardiac pathologies and metrics, including the measurement of left ventricular EF (9–12). Given the high burden of cardiovascular morbidity and mortality in patients on HD, an algorithm to predict low left ventricular EF using ECGs may provide for an easily accessible and less time-consuming option compared with transthoracic echocardiography. This may also be useful for adjusting dialysis prescriptions and the institution of potentially disease-modifying agents earlier in the disease course in patients with low left ventricular EF.
However, these algorithms may not generalize to the HD patient population due to subacute structural cardiovascular and hemodynamic changes that occur in the setting of dialysis (3–5). Not only are patients on HD physiologically different from patients not on HD, but they also represent a much smaller population size and dataset that may be inadequate for training de novo algorithms. This lack of data and sample size for training new models may be mitigated by transfer learning, a technique leveraging the expertise of a trained neural network at one task and utilizing it for another task (13). Transfer learning allows for faster model training and accurate performance even with decreased sample sizes.
We aimed to derive and validate deep learning algorithms for automated determination of left ventricular EF from ECG data of patients on maintenance HD using both de novo training as well as transfer learning pretrained models. We then associated model predictions of lower left ventricular EF with mortality in these patients.
Materials and Methods
Study Design
This was a multicenter retrospective study of a patient cohort between 2003 and 2020 from five hospitals in the Mount Sinai Health System located in New York City. The hospitals were the Mount Sinai Hospital, Mount Sinai Morningside, Mount Sinai West, Mount Sinai Brooklyn, and Mount Sinai Beth Israel.
Definition of the Primary Outcome
The primary outcome was estimation of left ventricular EF as measured by transthoracic echo. Extracted values of left ventricular EF were divided into three clinically relevant categories of ≤40%, >40% to ≤50%, and >50% as defined by the 2021 universal classification of heart failure (8).
Data Processing
We identified all patients over the age of 18 years who had received care at any of the five Mount Sinai facilities within New York City. We utilized procedure codes and ICD-10 codes for kidney failure to identify inpatients within the EHR on HD as well as to extract the dates and times of HD for these patients. Patients within the non-HD cohort were selected by confirming the absence of any dialysis-related procedure codes within the EHR. We also utilized the EHR to extract dates, times, and values of left ventricular EF associated with transthoracic echocardiography for both patients on HD and patients not on HD. We extracted ECG information from the GE MUSE system as XML files. These files contain raw waveforms stored as arrays of integers in addition to demographics and machine-extracted descriptors of waveform data. ECGs were paired by unique patient identifiers to left ventricular EF data to create ECG:echo pairs. In all cases, we only included ECGs performed within ±7 days of a transthoracic echo. For patients on HD, this was further restricted to ECGs performed within ±7 days of the last session of dialysis to account for potential acute changes in left ventricular EF due to hypervolemia or missed treatments.
We applied the Butterworth bandpass filter and a subsequent median filter (14) to remove recording artifacts within ECG waveforms. The Butterworth bandpass filter removes noise added to ECG waveforms secondary to electrical interference from outside sources, whereas the median filter corrects low-frequency, large-amplitude baseline wander due to patient movement. Processed ECGs were then plotted to an image. Additionally, tabular data were extracted from XML files concerning patient age, sex, PR interval, QTc interval, atrial rate, and ventricular rate.
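The two-stage filtering described above can be sketched in Python with SciPy. The sampling rate, passband cutoffs, and median-filter kernel below are illustrative assumptions, as the paper does not report them:

```python
import numpy as np
from scipy.signal import butter, filtfilt, medfilt

def preprocess_ecg(signal, fs=500, lowcut=0.5, highcut=40.0, kernel=5):
    """Butterworth bandpass followed by a median filter (assumed parameters)."""
    nyq = fs / 2.0
    # Order-3 bandpass; removes baseline drift below lowcut and
    # powerline-type interference above highcut.
    b, a = butter(N=3, Wn=[lowcut / nyq, highcut / nyq], btype="band")
    bandpassed = filtfilt(b, a, signal)  # zero-phase filtering
    return medfilt(bandpassed, kernel_size=kernel)

# Example: a synthetic 1 Hz "cardiac" component plus 60 Hz interference.
fs = 500
t = np.arange(0, 10, 1 / fs)
raw = np.sin(2 * np.pi * 1.0 * t) + 0.2 * np.sin(2 * np.pi * 60.0 * t)
clean = preprocess_ecg(raw, fs=fs)
```

With a 40 Hz upper cutoff, the 60 Hz interference component is strongly attenuated while the signal length and low-frequency content are preserved.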
Transfer Learning
Deep learning requires more up-front investment than linear or tree-based models in terms of requirements of computational power, run time, and, most importantly, data. Medical datasets are often restricted to one center or one health system. Models trained on sufficiently large populations may also fail to generalize to new populations with sufficiently different baseline characteristics that are not accounted for within the modeling process. Further, restriction to specific pathologies and demographics as necessitated by the clinical question leads to significantly smaller patient cohorts. It may be impossible to collect sufficient data for certain diseases of sufficient rarity (15), despite being amenable to ML-based diagnostic and prognostic approaches. Such low-data regimes represent a challenge for deep learning models because they are high variance (16) constructs that can easily memorize the structure of limited data, leading to poor performance when shown unseen testing data—a condition known as overfitting (17).
Such issues may be mitigated using transfer learning, a technique leveraging the expertise of a trained neural network at one task and utilizing it for an adjacent downstream task (13). In this approach, the neural network is first trained on data points and labels that are like the downstream task of interest—a process described as pretraining (18). Following the pretraining, the neural network is shown data points and labels that correspond to the downstream task, described as fine-tuning (18).
“Learning” in neural networks happens by adjustment of the inner state in response to data. This state is defined by a collection of decimal numbers known as tensors. The magnitude and direction of the adjustment are decided by the loss or the difference between model predictions and known ground truth. Training on large datasets helps the neural network find the configuration of tensors corresponding to the lowest value of this loss for a task. Transfer learning assists the neural network to start closer to this optimal configuration when utilized for a similar task, and therefore requires less time and computing power to achieve the best possible performance (19). Crucially, transfer learning can mitigate the issue of training a deep learning architecture on a limited sample size, such as in the case of patients on HD. By using transfer learning, algorithms can be pretrained on larger and more general datasets and then applied to more niche populations (20).
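A minimal PyTorch sketch of this pretrain-then-fine-tune pattern follows; the layer sizes and task dimensions are hypothetical, not the study's actual configuration:

```python
import torch
import torch.nn as nn

# Hypothetical pretrained network: a backbone learned on a large source task.
backbone = nn.Sequential(nn.Linear(100, 64), nn.ReLU(), nn.Linear(64, 32))
source_head = nn.Linear(32, 10)  # e.g., 10 source-task classes
pretrained = nn.Sequential(backbone, source_head)
# ... pretraining on the large source dataset would happen here ...

# Transfer: reuse the backbone's learned weights and swap in a new head
# for the downstream task (here, 3 EF categories).
target_head = nn.Linear(32, 3)
finetune_model = nn.Sequential(backbone, target_head)

# Optionally freeze the backbone so fine-tuning updates only the new head.
for p in backbone.parameters():
    p.requires_grad = False

optimizer = torch.optim.Adam(
    (p for p in finetune_model.parameters() if p.requires_grad), lr=1e-4
)
logits = finetune_model(torch.randn(8, 100))  # batch of 8 examples
```

Because the backbone starts from a configuration already close to a good solution, fewer updates (and fewer labeled examples) are needed on the downstream task.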
Model Architecture
We utilized a combination architecture consisting of a convolutional neural network backbone for the imaging data and a Multi-Layer Perceptron (MLP) for the tabular data. The convolutional neural network architecture in each case was the EfficientNet-B4. EfficientNets (21) achieve strong performance at image classification through balanced scaling of input images and neural network dimensions. The MLP consisted of three fully connected layers with ReLU activations. The fully connected output layer of the convolutional neural network in each case was replaced by another fully connected layer containing 64 neurons and attached to the MLP. In all cases, we utilized cross-entropy as the loss function and the Adam optimizer with a constant learning rate of 1e-4. Model architecture is shown in Supplemental Figure 1.
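A simplified sketch of this combination architecture, using a toy convolutional backbone as a stand-in for EfficientNet-B4 (the 64-neuron image embedding and three-layer tabular MLP follow the text; all other widths are illustrative):

```python
import torch
import torch.nn as nn

class ECGMultimodalNet(nn.Module):
    """Toy fusion model: CNN branch for the plotted ECG image, MLP branch
    for tabular features (age, sex, PR, QTc, atrial and ventricular rates)."""

    def __init__(self, n_tabular=6, n_classes=3):
        super().__init__()
        self.cnn = nn.Sequential(  # stand-in for the EfficientNet-B4 backbone
            nn.Conv2d(3, 8, kernel_size=3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(8, 64),      # replaces the CNN output layer (64 neurons)
        )
        self.mlp = nn.Sequential(  # three fully connected layers with ReLU
            nn.Linear(n_tabular, 32), nn.ReLU(),
            nn.Linear(32, 32), nn.ReLU(),
            nn.Linear(32, 16),
        )
        self.classifier = nn.Linear(64 + 16, n_classes)

    def forward(self, image, tabular):
        # Concatenate the image embedding with the tabular embedding.
        fused = torch.cat([self.cnn(image), self.mlp(tabular)], dim=1)
        return self.classifier(fused)

model = ECGMultimodalNet()
loss_fn = nn.CrossEntropyLoss()
out = model(torch.randn(4, 3, 64, 64), torch.randn(4, 6))
loss = loss_fn(out, torch.tensor([0, 1, 2, 0]))
```

The classifier emits one logit per EF category, matching the cross-entropy loss described above.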
Model Development and Performance Evaluation
We developed four differently initialized models. The first model was initialized with random parameters and then trained using data only from patients on HD. The second model’s convolutional neural network backbone was pretrained on approximately 1.3 million publicly available images from ImageNet and fine-tuned on a dataset consisting of only patients on HD (ImageNet transfer learning model). A third model was trained to classify left ventricular EF using data from all ECG:echo pairs, excluding those for patients on HD. In the fourth case, this model was fine-tuned on a dataset consisting only of patients on HD using transfer learning (ECG transfer learning model). A complete schematic of data selection and the model selection strategy is shown in Figure 1.
Transfer learning overview. ECG, electrocardiogram; HD, hemodialysis.
We implemented a group-stratified K-fold (K=5) cross-validation design to maximize data usage while eliminating data leakage during training and testing. Each patient was treated as a separate group, and group duplication between training and testing data was disallowed. For patients with more than one available ECG paired to a transthoracic echo, all ECGs were utilized for training purposes. For testing, we only utilized one ECG per echo closest to the date of the echo. This was done to closely emulate real-world conditions and so that performance metrics were not erroneously influenced by repeat testing on one patient. Day-wise time differences between ECG and paired transthoracic echo versus ECG and closest session of dialysis are shown in Supplemental Figure 2.
Within each cross-validation fold, we further subdivided training data into a training dataset proper (75%) and a validation dataset (25%). To prevent overfitting, models were trained on the training dataset until performance stopped improving on the validation dataset. At this point, the performance of the model was evaluated on the testing dataset. To demonstrate the utility of transfer learning when initial training and final deployment populations may be subtly different, we also tested the performance of the model trained on patients not on HD on ECG:echo pairs in the testing split of each fold.
We operationalized the prediction task as three binary classification problems. Predictions taken from the final layer of the neural network for one category of EF were compared with the sum of the predictions of the remaining two categories. Model performance for each of the three binary classification models was measured using the area under the receiver operating characteristic curve (AUROC) and the area under the precision-recall curve metrics. The precision-recall curve tracks model precision (positive predictive value) against model recall (sensitivity) at different thresholds, and the area under the precision-recall curve is a more useful performance metric in imbalanced datasets. Additionally, we utilized the Youden J Index (22) to calculate a classification threshold corresponding to optimal sensitivity and specificity given the receiver operating characteristic curve. To evaluate calibration of the model, we calculated Brier scores, where a lower score indicates that the model was better calibrated. Additionally, we generated calibration curves that look at the proportion of true positives across the levels of mean predicted probability. As downstream validation, we fit a Kaplan–Meier curve to model predictions as classified by this threshold. We also fit a Cox proportional hazards model to investigate the association of model-predicted left ventricular EF of <40% and patient survival while adjusting for age, sex, and comorbidities (alcoholism, asthma, atrial fibrillation, cancer, chronic obstructive pulmonary disease, diabetes mellitus, hypertension, liver disease, and stroke).
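Given binary labels and model scores, the Youden J index threshold reduces to a few lines with scikit-learn; the labels and scores here are synthetic stand-ins for one of the binary EF tasks:

```python
import numpy as np
from sklearn.metrics import roc_curve, roc_auc_score

# Synthetic binary labels and informative scores (illustrative only).
rng = np.random.default_rng(1)
y_true = rng.integers(0, 2, size=500)
scores = y_true * 0.5 + rng.normal(0.0, 0.4, size=500)

auroc = roc_auc_score(y_true, scores)
fpr, tpr, thresholds = roc_curve(y_true, scores)
j = tpr - fpr                       # Youden J = sensitivity + specificity - 1
best_threshold = thresholds[np.argmax(j)]
y_pred = scores >= best_threshold   # classification at the optimal threshold
```

The selected threshold is the operating point where sensitivity plus specificity is maximized on the ROC curve, which is how the downstream positive/negative groups are defined.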
Software and Hardware
We used the following packages: pandas (23), numpy (24), scipy (25), scikit-learn (26), PIL (27), torchvision (28), lifelines (29), and Pytorch (30) libraries. Plotting was performed using the matplotlib (31) and seaborn (32) libraries. All program code was written for the Python programming language (33) (3.9.x). Software was run within Docker (34) containers created from official PyTorch docker images. Models were trained on a HIPAA-compliant Azure Cloud virtual machine containing 4× NVIDIA v100 GPUs with 16 GB VRAM each. Code to train and evaluate neural networks is released under a GPLv3 license at https://github.com/akhilvaid/ECG_LVEF_HDPatients.
Results
We extracted 705,075 ECG:echo pairs for 158,840 patients not on HD and 18,626 ECG:echo pairs for 2168 patients on HD. Among patients on HD, the average age was 64 years, 41% were women, and 89% were non-White, including 45% who were of unknown race. The most common comorbidity was hypertension, which was prevalent in 79% of this sample. Characteristics of patients on HD are presented in Table 1, and characteristics of those not on HD are presented in Supplemental Table 1.
Characteristics of patients on hemodialysis in the Mount Sinai Health System in New York City with paired echocardiograms and electrocardiograms between 2003 and 2020
Classification of Left Ventricular Ejection Fraction
The ECG transfer learning model performed best at detecting patients with left ventricular EF ≤40% in comparison with the other two categories, achieving an AUROC of 0.86 (95% confidence interval [95% CI], 0.83 to 0.88) (Table 2). In contrast, models trained only on patients on HD had an AUROC of 0.74 (95% CI, 0.67 to 0.80), and models pretrained on ImageNet were slightly worse, with an AUROC of 0.71 (95% CI, 0.65 to 0.77).
Comparative performance of models for left ventricular ejection fraction prediction in patients on hemodialysis
Performance was lower for all models in detecting left ventricular EF >40% and ≤50% in comparison with the other categories. The ECG transfer learning model produced an AUROC of 0.68 (95% CI, 0.63 to 0.73) (Table 2). Models trained from scratch and pretrained on ImageNet each had AUROC values of 0.55 (95% CI, 0.49 to 0.61).
For classification of left ventricular EF >50%, models trained from scratch and pretrained on ImageNet achieved AUROC values of 0.71 (95% CI, 0.66 to 0.75) and 0.69 (95% CI, 0.62 to 0.77), respectively (Table 2). The ECG transfer learning model achieved an AUROC of 0.83 (95% CI, 0.80 to 0.85).
In each case, the ECG transfer learning model also outperformed the model trained on all available ECG:echo pairs for patients not on HD and tested on patients on HD. AUROC values achieved for each category of left ventricular EF were 0.80 (95% CI, 0.77 to 0.83), 0.51 (95% CI, 0.41 to 0.61), and 0.77 (95% CI, 0.73 to 0.80).
Receiver operating characteristic curves and precision-recall curves for each case are shown in Figure 2. Areas under the precision-recall curve are tabulated in Supplemental Table 2. Output probabilities were calibrated with isotonic regression with resulting Brier scores of 0.08, 0.10, and 0.12, where a lower Brier score indicates better calibration. Calibration curves for each outcome demonstrate that as mean predicted probability increases, the proportion of true positives also increases, consistent with good calibration of the model (Supplemental Figure 3).
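The isotonic calibration and Brier scoring reported above can be sketched with scikit-learn on synthetic probabilities; note that here the calibrator is fit and evaluated on the same data purely for illustration, whereas in practice calibration should use held-out data:

```python
import numpy as np
from sklearn.isotonic import IsotonicRegression
from sklearn.metrics import brier_score_loss

# Synthetic labels and deliberately miscalibrated raw probabilities:
# correct ordering, but compressed toward 0.5.
rng = np.random.default_rng(2)
y_true = rng.integers(0, 2, size=1000)
raw = np.clip(0.5 + 0.2 * (y_true - 0.5) + rng.normal(0, 0.1, 1000), 0, 1)

# Isotonic regression learns a monotone map from raw scores to probabilities.
iso = IsotonicRegression(out_of_bounds="clip")
calibrated = iso.fit_transform(raw, y_true)

brier_raw = brier_score_loss(y_true, raw)          # before calibration
brier_cal = brier_score_loss(y_true, calibrated)   # after calibration
```

Because isotonic regression minimizes squared error over monotone transforms, the calibrated Brier score cannot exceed the raw one on the fitting data, mirroring the "lower is better" interpretation in the text.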
Left ventricular ejection fraction (LVEF) prediction in patients on HD. Tested on patients on HD indicates that the model was trained on all non-HD ECG:echocardiogram pairs and tested on patients on HD. Pretrained on ECG data indicates that the ECG transfer learning model was trained on all non-HD ECG:echocardiogram pairs and fine-tuned on patients on HD. (Upper panels) Receiver operating characteristic (ROC) curves. (Lower panels) Precision-recall (PR) curves. P values were derived from a DeLong test for ROC curves generated from the model trained on patients not on HD and tested on patients on HD and the ECG transfer learning model; they are as follows: LVEF ≤40%, P<0.001; 40% <LVEF ≤50%, P=0.02; LVEF >50%, P<0.001.
We applied the Youden J Index to model predictions for patients with left ventricular EF ≤40% to ascertain an optimal threshold for our ECG transfer learning model. We found that the ECG transfer learning model achieved a sensitivity of 0.97 (95% CI, 0.95 to 0.99) against a specificity of 0.73 (95% CI, 0.70 to 0.76) for detection of left ventricular EF ≤40%.
Validation of Model Performance
We divided predictions of left ventricular EF ≤40% for all patients into true positives, false positives, false negatives, and true negatives on the basis of the threshold derived from the Youden J Index. A Kaplan–Meier curve was fit to the time of survival from the recorded time of the first prediction over a 5-year follow-up period for each group. We found that in-hospital mortality during this period was higher in the true-positive and false-positive groups compared with true negatives (Figure 3). Interestingly, mortality in the false-negative group tracked with the true-positive and false-positive groups for the first year of follow-up but dropped to rates seen in true-negative patients afterward.
Kaplan–Meier curve of in-hospital mortality in patients on HD with predicted LVEF ≤40%. True positives, false positives, false negatives, and true negatives were ascertained on the basis of the threshold derived from the Youden J Index. P=0.03. True positive denotes patients who were identified by the algorithm as having an LVEF ≤40% and had an LVEF ≤40%. False positive denotes patients who were identified by the algorithm as having an LVEF ≤40% and did not have an LVEF ≤40%. True negative denotes patients who were identified by the algorithm as having an LVEF >40% and had an LVEF >40%. False negative denotes patients who were identified by the algorithm as having an LVEF >40% and had an LVEF ≤40%.
Further, we fit a Cox proportional hazards model to a calibrated model probability prediction of low left ventricular EF, comorbidities, and demographics. We found that model prediction of low left ventricular EF (<40%) was significantly associated with mortality (hazard ratio, 1.29; 95% CI, 1.04 to 1.59).
Discussion
In summary, we utilized a variety of approaches to automatically derive left ventricular EF from raw ECG data in patients on HD. We found that using approximately 700,000 ECGs from a diverse cohort of 150,000 New York City patients for pretraining and then transfer learning to develop a model for 2168 patients on HD substantially outperformed de novo algorithms trained on only patients on HD, those pretrained on an existing dataset of publicly available images (ImageNet), and a model trained on a corpus of only patients not on HD without further fine-tuning. Finally, we validated our ECG transfer learning model’s predictions in the low left ventricular EF (≤40%) group against patient survival over a 5-year follow-up period.
Given the high prevalence of cardiovascular disease in patients on HD and its association with morbidity and mortality (2), early identification of patients with low left ventricular EF could facilitate close monitoring and the institution of disease-modifying treatments (6,7). Traditional approaches for evaluation of left ventricular systolic function, such as echocardiography, tend to be time and labor intensive (35). Routine ECGs for this purpose may streamline patient care and enable not only earlier determination of better care pathways for patients on HD but also more cost-effective use of time- and technology-intensive diagnostic procedures, like echocardiography.
As we show, models generated from patients not on HD cannot generalize well to patients on HD due to potential transient effects of the HD procedure on cardiac function and differences in the prevalence of comorbidities between patients on HD and patients not on HD, as well as a higher rate of baseline ECG abnormalities and left ventricular hypertrophy in patients on HD. This is best exemplified by considering the left ventricular EF classification for the >40% and ≤50% range. Echo operators find it easier to determine left ventricular EF correctly when it is either normal or very abnormal. Even if the model makes correct predictions by excluding the other ranges of left ventricular EF, overall performance will still suffer due to poorer quality labels in this left ventricular EF range. As such, performance for the 41%–50% range of left ventricular EF is seen to drop significantly. Further, the difference in AUROCs between the model trained on patients not on HD and the ECG transfer learning model increases to 0.17 in this range of left ventricular EF. We posit that poorer-quality labels amplified by the additional variability in left ventricular EF introduced by the HD procedure introduce changes in data distribution that the model trained on patients not on HD cannot account for. Finally, the large disparity in performance between the ImageNet transfer model and the ECG transfer model demonstrates that transfer learning starting from a corpus of extremely dissimilar natural images may not function as well as a starting point of similar images. We also validated predictions from the model trained using transfer learning against patient mortality over a 5-year follow-up period. Overall higher mortality in the true-positive and false-positive groups as opposed to the true-negative group is indicative of the model’s potential to detect and anticipate reductions in left ventricular EF that contribute to mortality.
Our work should be interpreted considering certain limitations. First, although this is the largest and most diverse study using transfer learning to analyze ECGs in patients on HD, this was a study conducted within a single health system (although the hospitals are very sociodemographically diverse), and external validation may be necessary before any application of the model can be done. Second, the Youden J Index optimizes for both sensitivity and specificity, and therefore, it may not be representative of real-world requirements. Such deployment will necessitate calibration of the threshold for negative/positive decisions in accordance with misclassification costs. Third, we do not have data regarding indications for echoes. Finally, the 40%–50% range of left ventricular EF is significant for patients on HD, but model performance in this range is relatively lower. This may be improved through obtaining better ground truth labels through more accurate modalities, such as cardiac MRI.
In conclusion, we demonstrate a use case for transfer learning to improve model performance in the prediction of left ventricular EF using routine ECG data.
Disclosures
L. Chan reports consultancy agreements with Vifor Pharma, Inc.; research funding from the National Institutes of Health; and honoraria from Fresenius and is supported in part by National Institute of Diabetes and Digestive and Kidney Diseases career development grant K23DK124645. D.M. Charytan reports consultancy agreements with Allena Pharmaceuticals (Data Safety and Monitoring Board), Amgen, AstraZeneca, CSL Behring, Eli Lilly/Boehringer Ingelheim, Fresenius, Gilead, GSK, Janssen (steering committee), Medtronic, Merck, Novo Nordisk, PLC Medical (clinical events committee), Renalytix, and Zogenix; research funding from Amgen, Bioporto for clinical trial support, Gilead, Medtronic for clinical trial support, and Novo Nordisk; expert witness fees related to proton pump inhibitors; and serves as an associate editor of CJASN. B.S. Glicksberg reports consultancy agreements with Anthem AI, GLG Research, and Prometheus Biosciences and honoraria from Virtual EP Connect. G.N. Nadkarni reports employment with Pensieve Health and Renalytix; consultancy agreements with AstraZeneca, BioVie, GLG Consulting, Pensieve Health, Reata, Renalytix AI, Siemens, and Variant Bio; research funding from Goldfinch Bio and Renalytix; honoraria from AstraZeneca, BioVie, Lexicon, and Reata; patents or royalties with Renalytix; owns equity and stock options in Pensieve Health as a cofounder and Renalytix; has received financial compensation as a scientific board member and advisor to Renalytix; serves on the advisory board of Neurona Health; and serves in an advisory or leadership role for Pensieve Health and Renalytix. K. Singh reports consultancy with Flatiron Health (as part of the scientific advisory board); research funding from Blue Cross Blue Shield of Michigan and Teva Pharmaceuticals; honoraria from Harvard University for education that K. Singh does in the Safety, Quality, Informatics, and Leadership program and their HMS Executive Education program; serves in an advisory or leadership role for Flatiron Health (paid member of the scientific advisory board); and reports other interests or relationships with Blue Cross Blue Shield of Michigan. K. Singh receives salary support through the University of Michigan for work done on the Michigan Urological Surgery Improvement Collaborative. All remaining authors have nothing to disclose.
Funding
G.N. Nadkarni is supported by National Heart, Lung, and Blood Institute grant R01HL155915 and National Institute of Diabetes and Digestive and Kidney Diseases grant R01DK127139. L. Chan is supported by National Institute of Diabetes and Digestive and Kidney Diseases K23DK124645.
Acknowledgments
Because David M. Charytan is an associate editor of CJASN, he was not involved in the peer review process for this manuscript. Another editor oversaw the peer review and decision-making process for this manuscript.
Author Contributions
D.M. Charytan and G.N. Nadkarni conceptualized the study; B.S. Glicksberg, J.J. Jiang, P. Kovatch, and A. Vaid were responsible for data curation; L. Chan, A.W. Charney, J.J. Jiang, G.N. Nadkarni, A. Sawant, and A. Vaid were responsible for investigation; L. Chan, J. Divers, B.S. Glicksberg, J.J. Jiang, G.N. Nadkarni, and A. Vaid were responsible for formal analysis; L. Chan, A.W. Charney, D.M. Charytan, J. Divers, B.S. Glicksberg, J.J. Jiang, G.N. Nadkarni, A. Sawant, K. Singh, and A. Vaid were responsible for methodology; J. Divers, P. Kovatch, G.N. Nadkarni, and A. Sawant were responsible for project administration; A.W. Charney, P. Kovatch, and G.N. Nadkarni were responsible for resources; P. Kovatch, A. Sawant, K. Singh, and A. Vaid were responsible for software; L. Chan and B.S. Glicksberg were responsible for validation; G.N. Nadkarni was responsible for visualization; A.W. Charney and G.N. Nadkarni were responsible for funding acquisition; L. Chan, A.W. Charney, D.M. Charytan, J. Divers, B.S. Glicksberg, G.N. Nadkarni, and K. Singh provided supervision; L. Chan, G.N. Nadkarni, and A. Vaid wrote the original draft; and L. Chan, A.W. Charney, D.M. Charytan, J. Divers, B.S. Glicksberg, J.J. Jiang, P. Kovatch, G.N. Nadkarni, A. Sawant, and K. Singh reviewed and edited the manuscript.
Data Sharing Statement
Data may potentially be shared with appropriate approvals by contacting the corresponding author. Code to train and evaluate neural networks is released under a GPLv3 license at https://github.com/akhilvaid/ECG_LVEF_HDPatients.
Supplemental Material
This article contains the following supplemental material online at http://cjasn.asnjournals.org/lookup/suppl/doi:10.2215/CJN.16481221/-/DCSupplemental.
Supplemental Table 1. Population metrics for patients not on maintenance hemodialysis.
Supplemental Table 2. Comparative performance of models for left ventricular ejection fraction prediction in patients on hemodialysis.
Supplemental Figure 1. Multimodal neural network architecture.
Supplemental Figure 2. Time difference between ECG and paired TTE versus time difference between ECG and closest session of dialysis.
Supplemental Figure 3. Calibration curves.
Footnotes
Published online ahead of print. Publication date available at www.cjasn.org.
- Received December 20, 2021.
- Accepted April 27, 2022.
- Copyright © 2022 by the American Society of Nephrology