Predicting COVID-19 related death using the OpenSAFELY platform
Williamson EJ., Tazare J., Bhaskaran K., McDonald HI., Walker AJ., Tomlinson L., Wing K., Bacon S., Bates C., Curtis HJ., Forbes H., Minassian C., Morton CE., Nightingale E., Mehrkar A., Evans D., Nicholson BD., Leon D., Inglesby P., MacKenna B., Davies NG., DeVito NJ., Drysdale H., Cockburn J., Hulme W., Morley J., Douglas I., Rentsch CT., Mathur R., Wong A., Schultze A., Croker R., Parry J., Hester F., Harper S., Grieve R., Harrison DA., Steyerberg EW., Eggo RM., Diaz-Ordaz K., Keogh R., Evans SJW., Smeeth L., Goldacre B.
Abstract

Objectives: To compare approaches for obtaining relative and absolute estimates of risk of 28-day COVID-19 mortality for adults in the general population of England in the context of changing levels of circulating infection.

Design: Three designs were compared: (A) case-cohort, which does not explicitly account for the time-changing prevalence of COVID-19 infection; (B) 28-day landmarking, a series of sequential overlapping sub-studies incorporating time-updating proxy measures of the prevalence of infection; and (C) daily landmarking. Regression models were fitted to predict 28-day COVID-19 mortality.

Setting: Working on behalf of NHS England, we used clinical data from adult patients from all regions of England held in the TPP SystmOne electronic health record system, linked to Office for National Statistics (ONS) mortality data, using the OpenSAFELY platform.

Participants: Eligible participants were adults aged 18 or over, registered at a general practice using TPP software on 1st March 2020 with recorded sex, postcode and ethnicity. 11,972,947 individuals were included, and 7,999 participants experienced a COVID-19 related death. The study period lasted 100 days, ending 8th June 2020.

Predictors: A range of demographic characteristics and comorbidities were used as potential predictors. Local infection prevalence was estimated with three proxies: modelled based on local prevalence and other key factors; rate of A&E COVID-19 related attendances; and rate of suspected COVID-19 cases in primary care.

Main outcome measures: COVID-19 related death.

Results: All models discriminated well between patients who did and did not experience COVID-19 related death, with C-statistics ranging from 0.92 to 0.94. Accurate estimates of absolute risk required data on local infection prevalence, with modelled estimates providing the best performance.

Conclusions: Reliable estimates of absolute risk need to incorporate changing local prevalence of infection. Simple models can provide very good discrimination and may simplify implementation of risk prediction tools in practice.
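To make the landmarking designs more concrete, the following is a minimal sketch of how stacked landmark sub-studies might be assembled, not the authors' implementation. The function name `build_landmark_rows`, the field `covid_death_date`, and the parameters `step_days` and `horizon_days` are hypothetical; the abstract does not state how the sub-studies are spaced, so a one-day step is assumed here purely for illustration. Censoring for other reasons (for example non-COVID death or deregistration) is ignored.

```python
from datetime import date, timedelta


def build_landmark_rows(people, first_landmark, last_landmark,
                        step_days=1, horizon_days=28):
    """Stack one row per person per landmark date.

    people: list of dicts with 'id' and 'covid_death_date' (date or None).
    A person enters a sub-study only if still at risk at its landmark date;
    the outcome is COVID-19 death within `horizon_days` of that landmark.
    All names and defaults are illustrative assumptions.
    """
    rows = []
    landmark = first_landmark
    while landmark <= last_landmark:
        window_end = landmark + timedelta(days=horizon_days)
        for person in people:
            death = person["covid_death_date"]
            if death is not None and death < landmark:
                continue  # died before this landmark: not at risk
            event = death is not None and death < window_end
            rows.append({
                "id": person["id"],
                "landmark": landmark,
                "event": int(event),
            })
        landmark += timedelta(days=step_days)
    return rows


# Toy data: one survivor and one person who died on 20 March 2020.
people = [
    {"id": 1, "covid_death_date": None},
    {"id": 2, "covid_death_date": date(2020, 3, 20)},
]
rows = build_landmark_rows(people, date(2020, 3, 1), date(2020, 3, 3))
```

In a sketch like this, a time-updated proxy for local infection prevalence could be joined to each row by landmark date and region before fitting a regression model to the stacked rows. Because the sub-studies overlap, the same person contributes several rows, which any variance estimate would need to account for.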