The publication data currently available has been vetted by Vanderbilt faculty, staff, administrators and trainees. The data itself is retrieved directly from NCBI's PubMed and is automatically updated on a weekly basis to ensure accuracy and completeness.
BACKGROUND - Observations from statin clinical trials and from Mendelian randomization studies suggest that low low-density lipoprotein cholesterol (LDL-C) concentrations may be associated with increased risk of type 2 diabetes mellitus (T2DM). Despite these trial and genetic findings, there is little direct evidence implicating low LDL-C concentrations in increased risk of T2DM.
METHODS AND FINDINGS - We used de-identified electronic health records (EHRs) at Vanderbilt University Medical Center to compare the risk of T2DM in a cross-sectional study among individuals with very low (≤60 mg/dl, N = 8,943) and normal (90-130 mg/dl, N = 71,343) LDL-C levels calculated using the Friedewald formula. LDL-C levels associated with statin use, hospitalization, or a serum albumin level < 3 g/dl were excluded. We used a 2-phase approach: in 1/3 of the sample (discovery) we used T2DM phenome-wide association study codes (phecodes) to identify cases and controls, and in the remaining 2/3 (validation) we identified T2DM cases and controls using a validated algorithm. The analysis plan for the validation phase was prespecified when that component of the study was designed. The prevalence of T2DM in the very low and normal LDL-C groups was compared using logistic regression with adjustment for age, race, sex, body mass index (BMI), high-density lipoprotein cholesterol, triglycerides, and duration of care. Secondary analyses included prespecified stratification by sex, race, BMI, and LDL-C level. In the discovery cohort, phecodes related to T2DM were significantly more frequent in the very low LDL-C group. In the validation cohort (N = 33,039 after applying the T2DM algorithm to identify cases and controls), the risk of T2DM was increased in the very low compared to normal LDL-C group (odds ratio [OR] 2.06, 95% CI 1.80-2.37; P < 2 × 10⁻¹⁶). The findings remained significant in sensitivity analyses.
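The Friedewald calculation used to derive LDL-C here is a fixed arithmetic formula, LDL-C = total cholesterol − HDL-C − triglycerides/5 (all in mg/dl), and the exposure groups are defined by simple cutoffs. A minimal sketch in Python; the function names and the TG > 400 mg/dl validity guard are illustrative additions, not taken from the study:

```python
def ldl_friedewald(total_chol, hdl, triglycerides):
    """Estimate LDL-C (mg/dl) with the Friedewald formula.

    The formula is conventionally considered unreliable when
    triglycerides exceed 400 mg/dl, hence the guard below.
    """
    if triglycerides > 400:
        raise ValueError("Friedewald estimate unreliable for TG > 400 mg/dl")
    return total_chol - hdl - triglycerides / 5.0


def ldl_group(ldl):
    """Assign the study's exposure groups (cutoffs from the abstract)."""
    if ldl <= 60:
        return "very low"
    if 90 <= ldl <= 130:
        return "normal"
    return "excluded"
```

For example, total cholesterol 180, HDL-C 50, and triglycerides 100 mg/dl give an estimated LDL-C of 110 mg/dl, which falls in the study's normal group.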
The association between low LDL-C levels and T2DM was significant in males (OR 2.43, 95% CI 2.00-2.95; P < 2 × 10⁻¹⁶) and females (OR 1.74, 95% CI 1.42-2.12; P = 6.88 × 10⁻⁸); in normal weight (OR 2.18, 95% CI 1.59-2.98; P = 1.1 × 10⁻⁶), overweight (OR 2.17, 95% CI 1.65-2.83; P = 1.73 × 10⁻⁸), and obese (OR 2.00, 95% CI 1.65-2.41; P = 8 × 10⁻¹³) categories; and in individuals with LDL-C < 40 mg/dl (OR 2.31, 95% CI 1.71-3.10; P = 3.01 × 10⁻⁸) and LDL-C 40-60 mg/dl (OR 1.99, 95% CI 1.71-2.32; P < 2.0 × 10⁻¹⁶). The association was significant in individuals of European ancestry (OR 2.67, 95% CI 2.25-3.17; P < 2 × 10⁻¹⁶) but not in those of African ancestry (OR 1.09, 95% CI 0.81-1.46; P = 0.56). A limitation was that we only compared groups with very low and normal LDL-C levels; also, since this was not an inception cohort, we cannot exclude the possibility of reverse causation.
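The odds ratios and confidence intervals quoted throughout are standard transformations of a fitted logistic-regression coefficient (a log odds ratio) and its standard error. The sketch below shows the arithmetic; the helper name is illustrative, not the study's code:

```python
import math

def odds_ratio_with_ci(beta, se, z=1.96):
    """Convert a logistic-regression coefficient (log odds ratio) and its
    standard error into an odds ratio with a ~95% confidence interval.

    Returns (OR, lower bound, upper bound).
    """
    return (math.exp(beta),
            math.exp(beta - z * se),
            math.exp(beta + z * se))
```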
CONCLUSIONS - Very low LDL-C concentrations occurring in the absence of statin treatment were significantly associated with T2DM risk in a large EHR population; this increased risk was present in both sexes and all BMI categories, and in individuals of European ancestry but not of African ancestry. Longitudinal cohort studies to assess the relationship between very low LDL-C levels not associated with lipid-lowering therapy and risk of developing T2DM will be important.
The completion of the Human Genome Project has unleashed a wealth of human genomics information, but it remains unclear how best to implement this information for the benefit of patients. The standard approach of biomedical research, with researchers pursuing advances in knowledge in the laboratory and, separately, clinicians translating research findings into the clinic as much as decades later, will need to give way to new interdisciplinary models for research in genomic medicine. These models should include scientists and clinicians actively working as teams to study patients and populations recruited in clinical settings and communities to make genomics discoveries (through the combined efforts of data scientists, clinical researchers, epidemiologists, and basic scientists) and to rapidly apply these discoveries in the clinic for the prediction, prevention, diagnosis, prognosis, and treatment of cardiovascular diseases and stroke. The highly publicized US Precision Medicine Initiative, also known as All of Us, is a large-scale program funded by the US National Institutes of Health that will energize these efforts, but several ongoing studies such as the UK Biobank Initiative; the Million Veteran Program; the Electronic Medical Records and Genomics Network; the Kaiser Permanente Research Program on Genes, Environment and Health; and the DiscovEHR collaboration are already providing exemplary models of this kind of interdisciplinary work. In this statement, we outline the opportunities and challenges in broadly implementing new interdisciplinary models in academic medical centers and community settings and bringing the promise of genomics to fruition.
© 2018 American Heart Association, Inc.
OBJECTIVE - Hepatorenal syndrome (HRS) is a devastating form of acute kidney injury (AKI) with high morbidity and mortality in patients with advanced liver disease, but phenotyping algorithms have not yet been developed using large electronic health record (EHR) databases. We evaluated and compared multiple phenotyping methods to achieve an accurate algorithm for HRS identification.
MATERIALS AND METHODS - A national retrospective cohort of patients with cirrhosis and AKI admitted to 124 Veterans Affairs hospitals was assembled from electronic health record data collected from 2005 to 2013. AKI was defined by the Kidney Disease: Improving Global Outcomes criteria. Five hundred and four hospitalizations were selected for manual chart review and served as the gold standard. EHR-based predictors were identified from structured and free-text clinical data, the latter processed with natural language processing (NLP) via the clinical Text Analysis and Knowledge Extraction System (cTAKES). We explored several dimension-reduction techniques for the NLP data, including newer high-throughput phenotyping and word-embedding methods, and ascertained their effectiveness in identifying the phenotype without structured predictor variables. With the combined structured and NLP variables, we analyzed five phenotyping algorithms: penalized logistic regression, naïve Bayes, support vector machines, random forest, and gradient boosting. Calibration and discrimination metrics were calculated using 100 bootstrap iterations. For the final model, we report odds ratios and 95% confidence intervals.
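The discrimination metric described here, AUC estimated over 100 bootstrap iterations, can be sketched without a modeling library using the Mann-Whitney formulation of AUC. A minimal illustration assuming binary labels and continuous risk scores; the helper names are not from the paper:

```python
import random

def auc(labels, scores):
    """AUC as the probability that a random positive case outranks a
    random negative case (Mann-Whitney U formulation); ties count 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def bootstrap_auc_ci(labels, scores, n_boot=100, seed=0):
    """Percentile-bootstrap interval for the AUC; n_boot=100 mirrors the
    abstract's 100 bootstrap iterations."""
    rng = random.Random(seed)
    n = len(labels)
    stats = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        ss = [scores[i] for i in idx]
        if 0 < sum(ys) < n:  # resample must contain both classes
            stats.append(auc(ys, ss))
    stats.sort()
    return stats[int(0.025 * len(stats))], stats[int(0.975 * len(stats))]
```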
RESULTS - The area under the receiver operating characteristic curve (AUC) for the different models ranged from 0.73 to 0.93, with penalized logistic regression having the best discriminatory performance. Calibration for logistic regression was modest, but gradient boosting and support vector machines were superior. NLP identified 6,985 variables; a priori variable selection performed similarly to dimensionality reduction using high-throughput phenotyping and semantic similarity-informed clustering (AUC 0.81-0.82).
CONCLUSION - This study demonstrated improved phenotyping of a challenging AKI etiology, HRS, over ICD-9 coding. We also compared performance among multiple approaches to EHR-derived phenotyping, and found similar results between methods. Lastly, we showed that automated NLP dimension reduction is viable for acute illness.
Copyright © 2018 Elsevier Inc. All rights reserved.
New therapeutic approaches are needed for gestational diabetes mellitus (GDM), but must show safety and efficacy in a historically understudied population. We studied associations between electronic medical record (EMR) phenotypes and genetic variants to uncover drugs currently considered safe in pregnancy that could treat or prevent GDM. We identified 129 systemically active drugs considered safe in pregnancy targeting the proteins produced from 196 genes. We tested for associations between GDM and/or type 2 diabetes (DM2) and 306 SNPs in 130 genes represented on the Illumina Infinium Human Exome Bead Chip (DM2 was included due to shared pathophysiological features with GDM). In parallel, we tested the association between drugs and glucose tolerance during pregnancy as measured by the glucose recorded during a routine 50-g glucose tolerance test (GTT). We found an association between GDM/DM2 and the genes targeted by 11 drug classes. In the EMR analysis, 6 drug classes were associated with changes in GTT. Two classes were identified in both analyses. L-type calcium channel blocking antihypertensives (CCBs) were associated with a 3.18 mg/dL (95% CI -6.18 to -0.18) decrease in glucose during GTT, and serotonin receptor type 3 (5HT-3) antagonist antinausea medications were associated with a 3.54 mg/dL (95% CI 1.86-5.23) increase in glucose during GTT. CCBs were thus identified as a class of drugs considered safe in pregnancy that could have efficacy in treating or preventing GDM. 5HT-3 antagonists may be associated with worse glucose tolerance.
Copyright © 2018 Elsevier Ltd. All rights reserved.
Obstetric care refers to the care provided to patients during ante-, intra-, and postpartum periods. Predicting length of stay (LOS) for these patients during their hospitalizations can assist healthcare organizations in allocating hospital resources more effectively and efficiently, ultimately improving maternal care quality and reducing costs to patients. In this paper, we investigate the extent to which LOS can be forecast from a patient's medical history. We introduce a machine learning framework to incorporate a patient's prior conditions (e.g., diagnostic codes) as features in a predictive model for LOS. We evaluate the framework with three years of historical billing data from the electronic medical records of 9188 obstetric patients in a large academic medical center. The results indicate that our framework achieved an average accuracy of 49.3%, which is higher than the baseline accuracy of 37.7% (from a model relying solely on a patient's age). The most predictive features were found to have statistically significant discriminative ability. These features included billing codes for normal delivery (indicative of shorter stay) and antepartum hypertension (indicative of longer stay).
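The feature construction this framework relies on, turning a patient's prior billing codes into model inputs, can be sketched as one-hot encoding over a code vocabulary. The vocabulary entries below are hypothetical placeholders, not the features reported by the study:

```python
# Hypothetical vocabulary of prior-condition billing codes; in the study,
# features come from three years of historical billing data.
VOCABULARY = ["code_normal_delivery", "code_antepartum_htn", "code_other"]

def code_features(patient_codes, vocabulary=VOCABULARY):
    """One-hot encode a patient's prior billing codes as a binary vector
    suitable for a length-of-stay classifier."""
    present = set(patient_codes)
    return [int(code in present) for code in vocabulary]
```

The resulting binary vectors (optionally concatenated with demographics such as age, the baseline predictor above) feed any standard classifier over discretized LOS categories.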
OBJECTIVE - To evaluate the relationship between genetic ancestry and uterine fibroid characteristics.
DESIGN - Cross-sectional study.
SETTING - Not applicable.
PATIENT(S) - A total of 609 African American participants with image- or surgery-confirmed fibroids from the Vanderbilt University electronic health record biorepository and the Coronary Artery Risk Development in Young Adults (CARDIA) study were included.
INTERVENTION(S) - None.
MAIN OUTCOME MEASURE(S) - Outcome measures include fibroid number (single vs. multiple), volume of largest fibroid, and largest fibroid dimension of all fibroid measurements.
RESULT(S) - Global ancestry meta-analyses revealed a significant inverse association between percentage of European ancestry and risk of multiple fibroids (odds ratio: 0.78; 95% confidence interval 0.66, 0.93; P=6.05 × 10). Local ancestry meta-analyses revealed five suggestive (P<4.80 × 10) admixture mapping peaks in 2q14.3-2q21.1, 3p14.2-3p14.1, 7q32.2-7q33, 10q21.1, and 14q24.2-14q24.3 for number of fibroids, and one suggestive admixture mapping peak (P<1.97 × 10) in 10q24.1-10q24.32 for volume of largest fibroid. Single-variant association meta-analyses of the strongest associated region from admixture mapping of fibroid number (10q21.1) revealed a strong association at single nucleotide polymorphism rs12219990 (odds ratio: 0.41; 95% confidence interval 0.28, 0.60; P=3.82 × 10) that remained significant after correction for multiple testing.
CONCLUSION(S) - Increasing African ancestry is associated with multiple fibroids but not with fibroid size. Local ancestry analyses identified several novel genomic regions not previously associated with fibroid number or volume. Future studies are needed to explore the role genetic ancestry plays in the development of fibroid characteristics.
Copyright © 2017 American Society for Reproductive Medicine. Published by Elsevier Inc. All rights reserved.
OBJECTIVE - The traditional fee-for-service approach to healthcare can lead to the management of a patient's conditions in a siloed manner, inducing various negative consequences. It has been recognized that a bundled approach to healthcare, one that manages a collection of health conditions together, may enable greater efficacy and cost savings. However, it is not always evident which sets of conditions should be managed in a bundled manner. In this study, we investigate if a data-driven approach can automatically learn potential bundles.
METHODS - We designed a framework to infer health condition collections (HCCs) based on the similarity of their clinical workflows, according to electronic medical record (EMR) utilization. We evaluated the framework with data from over 16,500 inpatient stays from Northwestern Memorial Hospital in Chicago, Illinois. The plausibility of the inferred HCCs for bundled care was assessed through an online survey of a panel of five experts, whose responses were analyzed via an analysis of variance (ANOVA) at a 95% confidence level. We further assessed the face validity of the HCCs using evidence in the published literature.
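One simple way to quantify "similarity of clinical workflows" is to represent each health condition as a vector of EMR-utilization counts and compare vectors with cosine similarity. This representation and measure are illustrative assumptions; the abstract does not state which similarity measure the framework uses:

```python
import math

def cosine_similarity(u, v):
    """Cosine similarity between two EMR-utilization count vectors,
    e.g. counts of record views or order actions per condition.

    Returns a value in [0, 1] for non-negative count vectors, with 1
    meaning identical utilization profiles up to scale."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)
```

Conditions whose utilization profiles are mutually similar under such a measure can then be grouped, which is the kind of inference the HCC framework performs at scale.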
RESULTS - The framework inferred four HCCs, indicative of (1) fetal abnormalities, (2) late pregnancies, (3) prostate problems, and (4) chronic diseases, with congestive heart failure featuring prominently. Each HCC was substantiated with evidence in the literature and was deemed plausible for bundled care by the experts at a statistically significant level.
CONCLUSIONS - The findings suggest that an automated, EMR data-driven framework can provide a basis for discovering bundled care opportunities. Still, translating such findings into actual care management will require further refinement, implementation, and evaluation.
Copyright © 2017 Elsevier Inc. All rights reserved.
Drug development continues to be costly and slow, with medications failing due to lack of efficacy or presence of toxicity. The promise of pharmacogenomic discovery includes tailoring therapeutics based on an individual's genetic makeup, rational drug development, and repurposing medications. Rapid growth of large research cohorts, linked to electronic health record (EHR) data, fuels discovery of new genetic variants predicting drug action, supports Mendelian randomization experiments to show drug efficacy, and suggests new indications for existing medications. New biomedical informatics and machine-learning approaches advance the ability to interpret clinical information, enabling identification of complex phenotypes and subpopulations of patients. We review the recent history of use of "big data" from EHR-based cohorts and biobanks supporting these activities. Future studies using EHR data, other information sources, and new methods will promote a foundation for discovery to more rapidly advance precision medicine.
© 2017 American Society for Clinical Pharmacology and Therapeutics.