An Aggregated Statistical Analysis of the Diabetes Exposome through Technological Advances in Public Health Research

The Diabetes Epidemic

As of 2010, the United States spent over 17% of its Gross Domestic Product, or 2.8 trillion dollars ($9,000 per person), on health care. Strikingly, this number is projected to double in the next decade. Of those health care costs, 75% are related to chronic diseases, of which Diabetes Mellitus is the most common. In fact, one out of three Americans now falls under the category of “prediabetic”, meaning that they are likely to develop Type II Diabetes Mellitus within the next decade unless they seek intervention. Dr. Lawrence S. Phillips of the Veterans Administration and the Emory School of Medicine echoes this notion: “Diabetes is generally diagnosed and first treated about ten years later than it could be. We waste this critical opportunity to slow disease progression and the development of complications. If we find prediabetes and early diabetes, along with proper management, we can change natural history and improve the lives of our patients” (Phillips et al., 2014). Taking preventative measures one step further, we can address the multi-factorial exposures that put individuals at risk for prediabetes in the first place. We can do this through the growing field of exposure science, using a concept called the exposome.


The Exposome

The exposome is defined as “the cumulative measure of environmental influences and associated biological responses throughout the lifespan, including exposures from the environment, diet, behavior, and endogenous processes” (Miller and Jones, 2014). Exposures yield specific molecular, cellular, and physiological effects and responses that alter our biology, impact human health, and cause disease. To quantify risk for Type II Diabetes Mellitus, we can measure the chemical burden of these exposures, and then take action to mitigate its development. Using metabolomics (via mass spectrometry), spatial statistics (via GIS), and genetic epidemiology (via GWAS), I will assess this burden and offer an innovative perspective on Type II Diabetes Mellitus.

Metabolomics: Mass Spectrometry

Metabolomics is the measurement of low-molecular-mass chemicals (<2000 daltons). Its main tool, mass spectrometry, measures many different chemicals (with a practical limit of a few hundred) as gas-phase ions according to their mass-to-charge ratio, then uses ion dissociation and quantification to identify and standardize them.
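To make the mass-to-charge idea concrete, here is a minimal sketch of how the m/z value of a protonated ion follows from a molecule's neutral mass. The function names are hypothetical, and the glucose mass is a standard monoisotopic value used only for illustration; none of this comes from the studies discussed here.

```python
PROTON_MASS_DA = 1.007276  # mass of a proton in daltons
METABOLITE_CUTOFF_DA = 2000.0  # the "<2000 daltons" range quoted above


def mz_positive_ion(neutral_mass_da: float, charge: int = 1) -> float:
    """Return m/z for an ion formed by adding `charge` protons: [M + nH]^n+."""
    if charge < 1:
        raise ValueError("charge must be a positive integer")
    return (neutral_mass_da + charge * PROTON_MASS_DA) / charge


def is_small_molecule(neutral_mass_da: float) -> bool:
    """True if the mass falls inside the metabolomics range used in this article."""
    return neutral_mass_da < METABOLITE_CUTOFF_DA


# Illustrative input: monoisotopic mass of glucose (C6H12O6).
glucose_mass = 180.06339
glucose_mz = mz_positive_ion(glucose_mass)
print(f"glucose [M+H]+ m/z = {glucose_mz:.4f}")
print("within metabolomics range:", is_small_molecule(glucose_mass))
```

A real instrument reports thousands of such m/z peaks, and matching them back to chemicals is where the extraction algorithms come in.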

In the paper, Quantitative Metabolomics by 1H-NMR and LC-MS/MS Confirms Altered Metabolic Pathways in Diabetes, Lanza et al. used quantitative metabolomics to analyze metabolic pathways in type 1 diabetics and to characterize the abnormalities caused by insulin deficiency (Lanza et al., 2010). The researchers generated a “plasma metabolite profile that is characteristic of several underlying physiological processes that are known to be altered by short-term insulin deprivation in type 1 diabetic people (e.g., mitochondrial dysfunction, oxidative stress, protein synthesis, degradation, and oxidation, gluconeogenesis, and ketogenesis)”, showing that several metabolic pathways are perturbed in diabetics. They also report that “Twenty-one additional amino acid metabolites were detected and quantified, of which 5 significantly increased and 1 significantly decreased with insulin deprivation”, demonstrating that specific plasma metabolites shift during insulin deprivation. Using multivariate statistics, they separated the proton spectra of the insulin-deprived (I-) and insulin-treated (I+) states based on several of these altered plasma metabolites. In addition, they found that “Allantoin, a product of nonenzymatic urate oxidation, was significantly elevated with insulin deprivation, indicative of increased oxidative stress when insulin is withdrawn from patients with type 1 diabetes”, which ties insulin deprivation to oxidative stress as well as to altered amino acid levels. Mass spectrometry is what revealed these perturbations in plasma amino acids and metabolites. The findings were to be expected, but they were still novel in that the authors successfully employed modern analytical methods in metabolomics to build upon the existing research on metabolic abnormalities in insulin deficiency. This lays the foundation for further innovative research, including possible advances in diabetes risk detection, prevention, and the mitigation of adverse health effects.


  • The benefits of using metabolomics in research are plentiful. Metabolomics is affordable and covers a broad range of exposures, often leading to network mapping for environmental chemicals.
  • It is a wonderful tool, but it has limitations: low-abundance chemicals are not found in everyone, the measurements suffer from analytical noise and drift, extraction algorithms are required, and mass spectrometry has a practical limit of a few hundred chemicals.

Despite these drawbacks, the more we use metabolomics, the more efficient it will become, and I believe that it will become standard practice in chemical research within the next decade.

Spatial Statistics Part 1: Geographic Information Systems

In spatial statistics, the Geographic Information System (GIS) is a “technology designed to capture, store, manipulate, analyze, and visualize georeferenced data” (Goodchild, Parks, and Steyaert 1993). Layering these data sets allows us to map the prevalence of chronic disease or of various exposures.

First, I will present findings on disparities in diabetes prevalence and health outcomes from before GIS. The paper, Genetic and Environmental Determinants of Type II Diabetes in Mexico City and San Antonio, compared the prevalence of Type II Diabetes Mellitus in populations of similar genotypes living in environments with different exposures (Mexicans living in Mexico City versus Mexicans living in San Antonio) (Stern et al. 1992). The researchers found that, despite similar genetic risk for Type II Diabetes Mellitus, the prevalence of diabetes was 36% higher among Mexicans living in San Antonio than in Mexico City, indicating that environmental factors play a crucial role in the development of Type II Diabetes Mellitus. The authors also found that Mexicans living in Mexico City had higher triglyceride and fasting insulin concentrations than Mexican-Americans living in San Antonio, despite being more active and leaner. This supports the idea that high-carbohydrate diets induce hypertriglyceridemia even in leaner populations who partake in more physical activity. These conclusions rest on the premise that physical activity is important for leanness, which in turn helps reduce or prevent Type II Diabetes Mellitus. This is complemented by the finding that Mexicans had lower rates of diabetes and were leaner than their American counterparts despite a comparable intake, indicating that their extra physical activity likely accounted for this healthy difference. The main conclusion of the paper is that environmental factors can override genetic susceptibility in the expression of the type II diabetes trait.

These findings were probably novel at the time of the study (1992), and they were definitely limited by the public health technology then available. They are significant to society because they show that we can slow the increase in Type II diabetes prevalence through behavior change, and that behavior is largely a reaction to one’s environment. However, these findings are hampered by our inability to quantify, isolate, and apply them, leaving us with mere speculation. The main question this research raises is how to assess the outcomes of these disparities. That’s where Geographic Information Systems come in. GIS tools allow us to expand our understanding of disparities in health outcomes across different communities.

Spatial Statistics Part 2: Bayesian Hierarchical Analysis

Almost 20 years after the Mexico City versus San Antonio study, Geraghty et al. published the paper, Using Geographic Information Systems (GIS) to Assess Outcome Disparities in Patients with Type 2 Diabetes and Hyperlipidemia (Geraghty et al, 2010). The researchers identified patients (7288 of them!) with diabetes mellitus from a variety of clinics in the greater Sacramento area using the University of California Davis Health System’s electronic medical records, and converted this information into a database file suitable for GIS. They then repeated this process for socioeconomic and demographic data about these patients, and conducted a regression analysis of A1c (glycohemoglobin) levels against each patient’s characteristics. The authors found an association between the socioeconomic status of one’s neighborhood and A1c levels, concluding that low socioeconomic status is a barrier to optimal glucose control. This finding is consistent with the Mexico City versus San Antonio study, but GIS takes the disparities one step further. It has far-reaching applications for diabetes management and socioeconomic status, implying that health literacy and access are paramount to chronic disease morbidity.
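The kind of regression described above can be sketched in a few lines. This is a deliberately simplified, single-predictor least-squares fit. The neighborhood socioeconomic index and the A1c values below are invented purely for illustration and are not data from Geraghty et al., whose analysis adjusted for many more patient characteristics.

```python
def fit_line(x, y):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical data: neighborhood socioeconomic index (1 = lowest, 5 = highest)
# and the mean A1c (%) of patients living there. Made-up numbers.
ses_index = [1, 2, 3, 4, 5]
mean_a1c = [8.4, 8.0, 7.7, 7.3, 7.1]

slope, intercept = fit_line(ses_index, mean_a1c)
print(f"A1c changes by {slope:.2f} points per step of the socioeconomic index")
print(f"predicted A1c at index 1: {intercept + slope * 1:.2f}")
```

In this toy data the slope is negative, which is the pattern the conclusion above describes: glucose control worsens as neighborhood socioeconomic status falls.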

Along the same lines, a Finnish case study examined regional patterns in the incidence of type 1 diabetes through a Bayesian hierarchical approach and GIS (Rytkonen et al, 2001). Bayesian analysis is a way to update the probability of a hypothesis as the body of available evidence evolves, and it can be combined quite effectively with GIS to derive patterns of environmental burden on disease prevalence. While GIS mapping provides a slice in time, Bayesian analysis estimates the future based on these snapshots. This study found geographic variation in diabetes risk across different areas of Finland, similar to the findings of Stern et al and Geraghty et al. However, it set itself apart from those studies by finding that this geographic variation in disease risk changed over time, suggesting that specific environmental exposures in certain parts of Finland have grown over time. Through further Bayesian analysis, we can infer whether this pattern will continue, and try to stop it in its tracks through intervention in the increasingly problematic areas.
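The idea of updating an estimate as evidence accumulates can be illustrated with a minimal sketch. It uses a Gamma-Poisson conjugate model for a regional incidence rate, which is a much simpler stand-in for the hierarchical model in the Finnish study. The prior and the case counts are hypothetical numbers chosen only to show the mechanics.

```python
def update_gamma(prior_shape, prior_rate, cases, person_years):
    """Conjugate update: Gamma(shape, rate) prior with Poisson-distributed case counts."""
    return prior_shape + cases, prior_rate + person_years


def gamma_mean(shape, rate):
    """Mean of a Gamma(shape, rate) distribution, here the expected incidence rate."""
    return shape / rate


# Hypothetical prior: roughly 40 cases per 100,000 person-years,
# carrying the weight of 10,000 person-years of earlier evidence.
shape, rate = 4.0, 10_000.0
prior_mean = gamma_mean(shape, rate)

# Hypothetical observations for one small region: (cases, person-years) per period.
periods = [(3, 2_000), (5, 2_000)]
for cases, person_years in periods:
    shape, rate = update_gamma(shape, rate, cases, person_years)
    print(f"after {cases} cases: {gamma_mean(shape, rate) * 100_000:.1f} per 100,000")

posterior_mean = gamma_mean(shape, rate)
raw_rate = sum(c for c, _ in periods) / sum(py for _, py in periods)
```

The posterior mean ends up between the prior and the raw observed rate. That shrinkage is what keeps sparsely populated areas from producing wild estimates on a map.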

  • GIS is useful for mapping disease prevalence and analyzing populations, for example when evaluating risk for type II diabetes based on obesogenic environments.
  • However, the main drawback of GIS is that it doesn’t actually tell us about causality or chemical burden. Bayesian statistics is the missing piece of the puzzle that connects these dots.

It is important to note that the areas that need to be most critically addressed for chronic disease prevention are not necessarily those with the highest prevalence, but those with the most rapidly growing incidence.
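A small sketch shows why the two rankings can disagree. The region names and figures are hypothetical, with incidence given as new cases per 1,000 people per year at two points in time.

```python
# Hypothetical regions: current prevalence plus incidence at two time points.
regions = {
    "Region A": {"prevalence": 0.12, "incidence_before": 8.0, "incidence_now": 8.4},
    "Region B": {"prevalence": 0.07, "incidence_before": 4.0, "incidence_now": 6.0},
    "Region C": {"prevalence": 0.09, "incidence_before": 6.0, "incidence_now": 6.3},
}


def incidence_growth(stats):
    """Relative change in incidence between the two time points."""
    return (stats["incidence_now"] - stats["incidence_before"]) / stats["incidence_before"]


highest_prevalence = max(regions, key=lambda name: regions[name]["prevalence"])
fastest_growth = max(regions, key=lambda name: incidence_growth(regions[name]))

print("highest prevalence:", highest_prevalence)
print("fastest-growing incidence:", fastest_growth)
```

Here the region with the highest prevalence is not the one where incidence is climbing fastest, so a map of prevalence alone would point intervention at the wrong place.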

Genetic Epidemiology: Genome-Wide Association Studies

Genetic epidemiology is the study of the genetic distribution and genetic determinants of disease, and it can be examined through genome-wide association studies (GWAS). GWAS tests for a relationship between “common” variants and disease, using the single nucleotide polymorphism (SNP) as the unit of analysis. Most GWAS analyze half a million to 2.5 million bases of our genome (out of roughly 3 billion bases), and compare allele frequencies between groups.
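Comparing allele frequencies between groups, for a single SNP, comes down to a test on a 2x2 table of allele counts. Below is a minimal sketch using a chi-square test with one degree of freedom. The counts are invented for illustration, and a real GWAS repeats this kind of test across every SNP and corrects for the number of tests.

```python
import math


def allele_association(case_risk, case_other, ctrl_risk, ctrl_other):
    """Chi-square test (1 df) and odds ratio for a 2x2 table of allele counts."""
    table = [[case_risk, case_other], [ctrl_risk, ctrl_other]]
    total = case_risk + case_other + ctrl_risk + ctrl_other
    row_totals = [sum(row) for row in table]
    col_totals = [case_risk + ctrl_risk, case_other + ctrl_other]

    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_totals[i] * col_totals[j] / total
            chi2 += (table[i][j] - expected) ** 2 / expected

    # Survival function of a chi-square variable with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    odds_ratio = (case_risk * ctrl_other) / (case_other * ctrl_risk)
    return chi2, p_value, odds_ratio


# Hypothetical counts: 1,000 cases and 1,000 controls, two alleles per person.
chi2, p_value, odds_ratio = allele_association(660, 1340, 600, 1400)
print(f"risk allele frequency: cases {660 / 2000:.2f}, controls {600 / 2000:.2f}")
print(f"chi-square = {chi2:.3f}, p = {p_value:.4f}, odds ratio = {odds_ratio:.3f}")
```

A p-value like this one would look interesting for a single test, but it is nowhere near the conventional genome-wide significance threshold of 5e-8, which is one reason GWAS needs such large sample sizes.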

The study, Common variant in MTNR1B associated with increased risk of type 2 diabetes and impaired early insulin secretion, used GWAS to find a common denominator in type II diabetics (Lyssenko et al, 2009). The researchers found that a variation in melatonin receptor 1B is associated with insulin and glucose concentrations. In carriers of the risk genotype of this single nucleotide polymorphism, they found an impaired early insulin response to oral and intravenous glucose, coupled with faster deterioration of insulin secretion over time, and this predicted future onset of Type II Diabetes Mellitus. This research suggests that “the circulating hormone melatonin, which is predominantly released from the pineal gland in the brain, is involved in the pathogenesis of T2D”, likely via a direct inhibitory effect on beta cells. The researchers also hypothesized that blocking the melatonin ligand-receptor system could be a possible therapeutic avenue for patients with Type II Diabetes Mellitus.

  • The benefits of using GWAS are that it can assess dosage changes and reveal novel biology without the need to control for ancestry or families, which is in line with how this study was conducted.
  • The limitations, however, are that it can only isolate common variants, is subject to bias, works best with huge sample sizes, requires replication, and can identify location only (identifying a specific gene or variant requires further analysis).

The meta-analysis of GWAS, The Genetics of Type 2 diabetes: what have we learned from GWAS, provides the further analysis needed to flesh out these GWAS conclusions on Type II Diabetes Mellitus (Billings and Florez, 2010). The authors bolster the findings of primary GWAS research through holistic analysis. Regarding the MTNR1B study described above, they find the variant compelling enough to link circadian rhythmicity with glucose metabolism. That justifies a deeper look into how environmental exposures affect circadian rhythm, which in turn affects diabetes-related hormones like insulin, leptin, adiponectin, and cortisol. This interaction warrants our attention.

A Primer on Circadian Rhythmicity and Hormonal Glucose Metabolism

Leptin levels correspond with one’s fat mass (Caro, 1996) and fluctuate with acute energy balance, namely through carbohydrate intake (Schoeller, 2000). Leptin has downstream effects on influential metabolic hormones, such as neuropeptide-Y and T3, that keep body fat in a comfortable equilibrium. As with leptin, higher levels of adiponectin are favorable. Adiponectin also follows a diurnal pattern, but unlike leptin, it peaks during the daytime and has an inverse relationship to insulin levels. In obese individuals, chronically high insulin may lead to insulin resistance and inflammation, pushing chronically high leptin into leptin resistance and keeping adiponectin chronically low. “Stress can be defined as any challenge to homeostasis of an individuum that requires an adaptive response of that individuum.” (Newport, Nemeroff, 2002), and cortisol is the adaptive response. Applying this to the sleep/wake cycle, waking up from sleep is a grand challenge to homeostasis, which is why we have the cortisol awakening response (abbreviated CAR). Cortisol reaches its peak at breakfast time, activating the glucocorticoid receptors, which in turn alters the non-genomic interaction between insulin and cortisol to create a synergistic effect (Dallman et al, 1995). We have now arrived at what I believe to be a key player in Type II Diabetes formation, and this mess of hormonal interactions (as responses to environmental exposures) all links back to the single nucleotide polymorphism in MTNR1B found through GWAS.

“The Whole is Greater than the Sum of its Parts” – Aristotle

Type II Diabetes Mellitus is growing at epidemic proportions (in conjunction with obesity). It needs to be at the forefront of the issues we address now, and reducing it could save us billions of dollars and many lives. Mass spectrometry proved to be a valuable tool in finding perturbed metabolic pathways in insulin-deficient diabetics. GIS was extremely helpful as well for mapping data and disease/exposure prevalence, especially in conjunction with Bayesian statistics to predict incidence. Lastly, GWAS was critical in isolating SNPs in diabetics, allowing me to examine the interplay of hormonal intermediaries. Through these techniques, I examined the environmental influences and associated biological responses to endogenous, environmental, dietary, and behavioral exposures. Evaluating these exposures, known as the exposome concept, allowed me to critically analyze Type II Diabetes symptoms, causes, risk factors, complications, treatment, prevention, and overall pathogenesis in innovative ways that inform current understanding and future research that can make Type II Diabetes Mellitus morbidity a thing of the past.


Sources Cited:


  1. Miller GW, Jones DP. The nature of nurture: refining the definition of the exposome. Toxicol Sci 2014;137(1):1-2.
  2. Phillips LS, Ratner RE, Buse JB, Kahn SE. We can change the natural history of type 2 diabetes. Diabetes Care. 2014;37(10):2668-2676.
  3. Lanza IR, Zhang S, Ward LE, Karakelides H, Raftery D, et al. Quantitative Metabolomics by 1H-NMR and LC-MS/MS Confirms Altered Metabolic Pathways in Diabetes. PLoS ONE. 2010;5(5):e10538. doi: 10.1371/journal.pone.0010538
  4. Goodchild, M.F., Parks, B.O, and Steyaert, L.T., 1993. Environmental Modeling with GIS. New York: Oxford University Press.
  5. Stern MP, Gonzalez C, Mitchell BD, Villalpando E, Haffner SM, Hazuda HP. Genetic and environmental determinants of type II diabetes in Mexico City and San Antonio. Diabetes. 1992;41(4):484–492. Epub 1992/04/01.
  6. Geraghty EM, Balsbaugh T, Nuovo J, Tandon S. Using geographic information systems (GIS) to assess outcome disparities in patients with type 2 diabetes and hyperlipidemia. J Am Board Fam Med 2010;23(1):88–96. 10.3122/jabfm.2010.01.090149
  7. Rytkonen M, Ranta J, Tuomilehto J, Karvonen M, SPAT Study Group and the Finnish Childhood Diabetes Registry Group. Bayesian analysis of geographical variation in the incidence of type 1 diabetes in Finland. Diabetologia. 2001;44(Suppl 3):B37–44. doi: 10.1007/PL00002952
  8. Lyssenko V, Nagorny CL, Erdos MR, Wierup N, Jonsson A, Spegel P, et al. Common variant in MTNR1B associated with increased risk of type 2 diabetes and impaired early insulin secretion. Nat Genet. 2009;41(1):82–88. doi: 10.1038/ng.288. PMID: 19060908
  9. Billings LK, Florez JC. The genetics of type 2 diabetes: what have we learned from GWAS? Ann N Y Acad Sci. 2010;1212(1):59–77. doi: 10.1111/j.1749-6632.2010.05805.x. PMID: 21039589
  10. Schoeller, D. A. (2000). Twenty-Four-Hour Leptin Levels Respond to Cumulative Short-Term Energy Imbalance and Predict Subsequent Intake. The Journal of Clinical Endocrinology & Metabolism, 85, 2685-2691.
  11. Caro, J. F. (1996). Serum Immunoreactive-Leptin Concentrations in Normal-Weight and Obese Humans. New England Journal of Medicine, 292-295.
  12. Newport, D.J. and Nemeroff, C.B. (2002) Stress. In: (Ed. in chief), Encyclopedia of the Human Brain, Vol. 4. Elsevier, pp. 449-462.
  13. Dallman MF, Akana SF, Strack AM, Hanson ES, Sebastian RJ. The neural network that regulates energy balance is responsive to glucocorticoids and insulin and also regulates HPA axis responsivity at a site proximal to CRF neurons. Stress: Basic Mechanisms Clin Implicat 1995; 771: 730–742.

John Sabbas

John is a highly functional human being in at least three dimensions. His greatest strength is his curiosity, followed closely by his 405lb bench press.
