Research reveals most accurate tool for predicting AKI, death in patients with COVID-19
For patients hospitalized with COVID-19, a boosted decision tree model predicted acute kidney injury (AKI) requiring dialysis and mortality more accurately than either standard models or other machine learning models.
The model, known as XGBoost, showed higher precision (defined by positive predictive value) and recall (defined by sensitivity) than the other models, which researchers contended is important because “a model with high positive predictive value minimizes false positives and thus can help avoid clinician fatigue and alert burnout ... [while] high sensitivity means the model will have fewer false negatives, thus maximizing the utility in identifying patients at need for dialysis or at risk for death.”
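To make the two metrics concrete, the following is a minimal sketch of how precision (positive predictive value) and recall (sensitivity) are computed from confusion-matrix counts. The function name and the counts are hypothetical and were invented for illustration; they are not taken from the study.

```python
def precision_recall(tp, fp, fn):
    """Return (precision, recall) from confusion-matrix counts.

    precision = TP / (TP + FP)  -> share of alerts that were correct
    recall    = TP / (TP + FN)  -> share of true cases that were caught
    """
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Hypothetical counts, invented for illustration only (not study data):
# 80 correct alerts, 20 false alarms, 10 missed patients.
precision, recall = precision_recall(tp=80, fp=20, fn=10)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

In this toy case, 20 of every 100 alerts are false alarms (the alert-fatigue concern), and 10 of 90 affected patients are missed (the false-negative concern).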
According to Akhil Vaid, MD, of Icahn School of Medicine at Mount Sinai, and colleagues, although several machine learning models have been published throughout the COVID-19 pandemic, none have specifically addressed AKI and dialysis.
“Machine learning models can harness the disparate data collected during clinical care in electronic health records for accurate outcome predictors,” they wrote. “ ... We aimed to develop and validate a machine learning model to predict a composite endpoint of AKI treated with dialysis or death in patients hospitalized with COVID-19 early in the hospital course.”
To this end, researchers included 6,093 patients who were hospitalized within the Mount Sinai Health System between March 2020 and December 2020. Researchers compared the performance of logistic regression, least absolute shrinkage and selection operator (LASSO), random forest and XGBoost for predicting dialysis or mortality at 1 to 7 days following hospital admission. All models considered demographics, comorbidities, and laboratory values and vital signs recorded within 12 hours of hospital admission.
Results demonstrated XGBoost (without imputation for prediction of a composite outcome of either death or dialysis) performed better than the other models, with the highest area under the receiver operating characteristic curve on internal validation (range of 0.93 to 0.98) and the highest area under the precision-recall curve (range of 0.78 to 0.82) for all time points. XGBoost without imputation also outperformed all other models in precision and recall (mean difference in the area under the receiver operating characteristic curve of 0.04 and mean difference in the area under the precision-recall curve of 0.15).
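For readers unfamiliar with the two summary metrics, here is a minimal sketch of how each can be computed from predicted risk scores. It uses the rank-based definition of the area under the receiver operating characteristic curve and the step-wise "average precision" estimate of the area under the precision-recall curve, which may differ from the estimator the researchers used. The function names, labels and scores are hypothetical toy values, not study data.

```python
def auroc(labels, scores):
    """Chance that a random positive outscores a random negative (ties count half)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def average_precision(labels, scores):
    """Step-wise area under the precision-recall curve (assumes no tied scores)."""
    total_pos = sum(labels)
    if total_pos == 0:
        raise ValueError("need at least one positive label")
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    true_pos = 0
    running = 0.0
    for rank, (_, label) in enumerate(ranked, start=1):
        if label == 1:
            true_pos += 1
            running += true_pos / rank  # precision at this recall step
    return running / total_pos


# Toy example, invented for illustration: 1 = dialysis or death, 0 = neither.
labels = [1, 0, 1, 0, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.4, 0.35, 0.3, 0.2, 0.1]
roc_area = auroc(labels, scores)
pr_area = average_precision(labels, scores)
print(f"AUROC={roc_area:.3f} AUPRC={pr_area:.3f}")
```

The precision-recall area is often reported alongside the receiver operating characteristic area when the outcome is uncommon, because a model with no discriminating ability scores roughly the outcome's prevalence on the former but 0.5 on the latter.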
Serum creatinine, blood urea nitrogen, systolic blood pressure, age and oxygen saturation were the “major drivers” of the model’s predictions.
“In conclusion, identification of patients at risk for acute dialysis and death in COVID-19 presents a variety of challenges,” Vaid and colleagues wrote. “One such difficulty pertains to resource allocation in potentially overcrowded hospitals. Our models may assist with this challenge and are being prospectively validated and deployed in a real-world setting to aid in management of hospitalized patients with COVID-19.”