
Evaluating statistical approaches to leverage large clinical datasets for uncovering therapeutic and adverse medication effects.

Choi L, Carroll RJ, Beck C, Mosley JD, Roden DM, Denny JC, Van Driest SL
Bioinformatics. 2018; 34(17): 2988-2996

PMID: 29912272 · PMCID: PMC6129383 · DOI:10.1093/bioinformatics/bty306

Motivation - Phenome-wide association studies (PheWAS) have been used to discover many genotype-phenotype relationships and have the potential to identify therapeutic and adverse drug outcomes using longitudinal data within electronic health records (EHRs). However, the statistical methods for PheWAS applied to longitudinal EHR medication data have not been established.

Results - In this study, we developed methods to address two challenges faced with reuse of EHR for this purpose: confounding by indication, and low exposure and event rates. We used Monte Carlo simulation to assess propensity score (PS) methods, focusing on two of the most commonly used methods, PS matching and PS adjustment, to address confounding by indication. We also compared two logistic regression approaches (the default of Wald versus Firth's penalized maximum likelihood, PML) to address complete separation due to sparse data with low exposure and event rates. PS adjustment resulted in greater power than PS matching, while controlling Type I error at 0.05. The PML method provided reasonable P-values, even in cases with complete separation, with well controlled Type I error rates. Using PS adjustment and the PML method, we identify novel latent drug effects in pediatric patients exposed to two common antibiotic drugs, ampicillin and gentamicin.
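To illustrate why a penalized fit still gives finite estimates under separation, here is a minimal sketch of Firth-type penalized logistic regression in Python with NumPy. It is not the authors' implementation, which is distributed as R code. The function name `firth_logistic`, the step cap, and the toy data are assumptions made for this example, and the sketch returns coefficient estimates only, not the P-values discussed above.

```python
import numpy as np


def firth_logistic(X, y, max_iter=100, tol=1e-8):
    """Firth-type penalized logistic regression (illustrative sketch).

    X: (n, p) design matrix that already includes an intercept column.
    y: (n,) array of 0/1 outcomes.
    Returns the penalized maximum likelihood coefficient estimates.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    beta = np.zeros(X.shape[1])
    for _ in range(max_iter):
        p = 1.0 / (1.0 + np.exp(-X @ beta))      # fitted probabilities
        w = p * (1.0 - p)                        # logistic variance weights
        info = X.T @ (X * w[:, None])            # Fisher information X'WX
        info_inv = np.linalg.inv(info)
        # Diagonal of the hat matrix: h_i = w_i * x_i' (X'WX)^{-1} x_i
        h = w * np.einsum("ij,jk,ik->i", X, info_inv, X)
        # Firth's modified score adds h_i * (0.5 - p_i) to each residual
        score = X.T @ (y - p + h * (0.5 - p))
        step = info_inv @ score
        # Cap very large steps to keep the iteration stable (assumed safeguard)
        largest = np.max(np.abs(step))
        if largest > 5.0:
            step *= 5.0 / largest
        beta = beta + step
        if np.max(np.abs(step)) < tol:
            break
    return beta


# Toy data (hypothetical): 10 unexposed patients with 2 events,
# 4 exposed patients who all have the event, so one cell is empty.
exposure = np.r_[np.zeros(10), np.ones(4)]
outcome = np.r_[np.ones(2), np.zeros(8), np.ones(4)]
design = np.column_stack([np.ones_like(exposure), exposure])

beta_hat = firth_logistic(design, outcome)
print("intercept, log odds ratio:", beta_hat)
```

In the toy data every exposed patient has the event, so the ordinary maximum likelihood estimate of the exposure log odds ratio is infinite. The penalized estimate stays finite; for a single binary exposure it should coincide with the log odds ratio obtained after adding 0.5 to each cell of the 2x2 table. A propensity score could be added to the design matrix as an extra column to mimic PS adjustment, though that step is not shown here.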

Availability and implementation - R packages PheWAS and EHR are available at https://github.com/PheWAS/PheWAS and at CRAN (https://www.r-project.org/), respectively. The R script for data processing and the main analysis is available at https://github.com/choileena/EHR.

Supplementary information - Supplementary data are available at Bioinformatics online.

MeSH Terms (7)

Datasets as Topic · Drug-Related Side Effects and Adverse Reactions · Drug Discovery · Electronic Health Records · Humans · Logistic Models · Probability
