Cookies on this website

We use cookies to ensure that we give you the best experience on our website. If you click 'Accept all cookies' we'll assume that you are happy to receive all cookies and you won't see this message again. If you click 'Reject all non-essential cookies' only necessary cookies providing core functionality such as security, network management, and accessibility will be enabled. Click 'Find out more' for information on how to change your cookie settings.

BACKGROUND: Disease presentation and progression can vary greatly in heterogeneous diseases, such as COVID-19, with variability in patient outcomes, even within the hospital setting. This variability underscores the need for tailored treatment approaches based on distinct clinical subgroups. OBJECTIVES: This study aimed to identify COVID-19 patient subgroups with unique clinical characteristics using real-world data (RWD) from electronic health records (EHRs) to inform individualized treatment plans. MATERIALS AND METHODS: A Factor Analysis of Mixed Data (FAMD)-based agglomerative hierarchical clustering approach was employed to analyze the real-world data, enabling the identification of distinct patient subgroups. Statistical tests evaluated cluster differences, and machine learning models classified the identified subgroups. RESULTS: Three clusters of COVID-19 in patients with unique clinical characteristics were identified. The analysis revealed significant differences in hospital stay durations and survival rates among the clusters, with more severe clinical features correlating with worse prognoses and machine learning classifiers achieving high accuracy in subgroup identification. CONCLUSION: By leveraging RWD and advanced clustering techniques, the study provides insights into the heterogeneity of COVID-19 presentations. The findings support the development of classification models that can inform more individualized and effective treatment plans, improving patient outcomes in the future.

Original publication

DOI

10.3389/fpubh.2025.1544904

Type

Journal article

Journal

Front Public Health

Publication Date

2025

Volume

13

Keywords

classification, clustering analysis, critical care, factor analysis of mixed data, real-world data, Humans, COVID-19, Female, Male, Electronic Health Records, Middle Aged, Machine Learning, Critical Care, Aged, Cluster Analysis, SARS-CoV-2, Adult, Length of Stay, Factor Analysis, Statistical