Evaluating methods for risk prediction of Covid-19 mortality in nursing home residents before and after vaccine availability: a retrospective cohort study

Title: Evaluating methods for risk prediction of Covid-19 mortality in nursing home residents before and after vaccine availability: a retrospective cohort study
Publication Type: Journal Article
Year of Publication: 2024
Authors: Aryal K., Mowbray F.I., Miroshnychenko A., Strum R.P., Dash D., Hillmer M.P., Malikov K., Costa A.P., Jones A.
Journal: BMC Med Res Methodol
Volume: 24
Issue: 1
Pagination: 77
Date Published: Mar 27
ISBN Number: 1471-2288 (Electronic); 1471-2288 (Linking)
Accession Number: 38539074
Keywords: *COVID-19/prevention & control, Aged, cohort study, COVID-19, COVID-19 Vaccines, Electronic Health Record, Female, Humans, Long-Term Care, machine learning, Male, nursing home, Nursing Homes, Older adults, Ontario/epidemiology, Prediction modeling, Retrospective Studies, SARS-CoV-2
Abstract

BACKGROUND: SARS-CoV-2 vaccines are effective in reducing hospitalization, COVID-19 symptoms, and COVID-19 mortality for nursing home (NH) residents. We sought to compare the accuracy of various machine learning models, examine changes to model performance, and identify resident characteristics that have the strongest associations with 30-day COVID-19 mortality, before and after vaccine availability.
METHODS: We conducted a population-based retrospective cohort study analyzing data from all NH facilities across Ontario, Canada. We included all residents diagnosed with SARS-CoV-2 and living in NHs between March 2020 and July 2021. We employed five machine learning algorithms to predict COVID-19 mortality: logistic regression, LASSO regression, classification and regression trees (CART), random forests, and gradient boosted trees. The discriminative performance of each model was evaluated using the area under the receiver operating characteristic curve (AUC) with 10-fold cross-validation. Model calibration was determined through evaluation of calibration slopes. Variable importance was calculated by repeatedly and randomly permuting the values of each predictor in the dataset and re-evaluating the model's performance.
RESULTS: A total of 14,977 NH residents and 20 resident characteristics were included in the model. The cross-validated AUCs were similar across algorithms and ranged from 0.64 to 0.67. Gradient boosted trees and logistic regression had an AUC of 0.67 pre- and post-vaccine availability. CART had the lowest discrimination ability with an AUC of 0.64 pre-vaccine availability, and 0.65 post-vaccine availability. The most influential resident characteristics, irrespective of vaccine availability, included advanced age (≥75 years), health instability, functional and cognitive status, sex (male), and polypharmacy.
CONCLUSIONS: The predictive accuracy and discrimination exhibited by all five examined machine learning algorithms were similar. Both logistic regression and gradient boosted trees exhibit comparable performance and display slight superiority over other machine learning algorithms. We observed consistent model performance both before and after vaccine availability. The influence of resident characteristics on COVID-19 mortality remained consistent across time periods, suggesting that changes to pre-vaccination screening practices for high-risk individuals are effective in the post-vaccination era.
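
The two evaluation ideas named in the methods, AUC as a measure of discrimination and permutation-based variable importance, can be illustrated with a minimal sketch. This is not the authors' code: the data are synthetic, the fixed scoring rule `predict` stands in for a fitted model, and the names `auc`, `permutation_importance`, `X`, and `y` are hypothetical. Cross-validation and calibration slopes are omitted for brevity.

```python
import random


def auc(labels, scores):
    """Area under the ROC curve via pairwise comparison (Mann-Whitney form)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    # Fraction of (positive, negative) pairs ranked correctly; ties count half.
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def permutation_importance(predict, X, y, n_repeats=10, seed=0):
    """Mean drop in AUC when each predictor column is shuffled in turn."""
    rng = random.Random(seed)
    baseline = auc(y, [predict(row) for row in X])
    importances = []
    for j in range(len(X[0])):
        drops = []
        for _ in range(n_repeats):
            column = [row[j] for row in X]
            rng.shuffle(column)  # break the link between predictor j and outcome
            X_perm = [row[:j] + [v] + row[j + 1:] for row, v in zip(X, column)]
            permuted = auc(y, [predict(row) for row in X_perm])
            drops.append(baseline - permuted)
        importances.append(sum(drops) / n_repeats)
    return baseline, importances


# Synthetic data (assumption): column 0 drives the outcome, column 1 is noise.
data_rng = random.Random(1)
X = [[data_rng.random(), data_rng.random()] for _ in range(200)]
y = [1 if data_rng.random() < row[0] else 0 for row in X]


def predict(row):
    """Hypothetical stand-in for a fitted model: risk score equals column 0."""
    return row[0]


baseline, importances = permutation_importance(predict, X, y)
print(f"baseline AUC: {baseline:.2f}")
for j, imp in enumerate(importances):
    print(f"predictor {j}: mean AUC drop {imp:.3f}")
```

A predictor the scoring rule relies on loses discrimination when shuffled, while a predictor the rule ignores shows no change. In practice the same quantities would be computed on held-out folds rather than on the data used to fit the model.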

DOI: 10.1186/s12874-024-02189-3
Custom 1: The authors have no competing interests to declare.

PMCID: PMC10976701