Machine Learning based histology phenotyping to investigate the epidemiologic and genetic basis of adipocyte morphology and cardiometabolic traits.
Glastonbury CA., Pulit SL., Honecker J., Censin JC., Laber S., Yaghootkar H., Rahmioglu N., Pastel E., Kos K., Pitt A., Hudson M., Nellåker C., Beer NL., Hauner H., Becker CM., Zondervan KT., Frayling TM., Claussnitzer M., Lindgren CM.
Genetic studies have recently highlighted the importance of fat distribution, as well as overall adiposity, in the pathogenesis of obesity-associated diseases. Using a large study (n = 1,288) from 4 independent cohorts, we aimed to investigate the relationship between mean adipocyte area and obesity-related traits, and identify genetic factors associated with adipocyte cell size. To perform the first large-scale study of automatic adipocyte phenotyping using both histological and genetic data, we developed a deep learning-based method, the Adipocyte U-Net, to rapidly derive mean adipocyte area estimates from histology images. We validate our method using three state-of-the-art approaches; CellProfiler, Adiposoft and floating adipocytes fractions, all run blindly on two external cohorts. We observe high concordance between our method and the state-of-the-art approaches (Adipocyte U-net vs. CellProfiler: R2visceral = 0.94, P < 2.2 × 10-16, R2subcutaneous = 0.91, P < 2.2 × 10-16), and faster run times (10,000 images: 6mins vs 3.5hrs). We applied the Adipocyte U-Net to 4 cohorts with histology, genetic, and phenotypic data (total N = 820). After meta-analysis, we found that mean adipocyte area positively correlated with body mass index (BMI) (Psubq = 8.13 × 10-69, βsubq = 0.45; Pvisc = 2.5 × 10-55, βvisc = 0.49; average R2 across cohorts = 0.49) and that adipocytes in subcutaneous depots are larger than their visceral counterparts (Pmeta = 9.8 × 10-7). Lastly, we performed the largest GWAS and subsequent meta-analysis of mean adipocyte area and intra-individual adipocyte variation (N = 820). Despite having twice the number of samples than any similar study, we found no genome-wide significant associations, suggesting that larger sample sizes and a homogenous collection of adipose tissue are likely needed to identify robust genetic associations.