The publication data currently available has been vetted by Vanderbilt faculty, staff, administrators and trainees. The data itself is retrieved directly from NCBI's PubMed and is automatically updated on a weekly basis to ensure accuracy and completeness.
If you have any questions or comments, please contact us.
Genome-wide association studies have identified breast cancer risk variants in over 150 genomic regions, but the mechanisms underlying risk remain largely unknown. These regions were explored by combining association analysis with in silico genomic feature annotations. We defined 205 independent risk-associated signals with the set of credible causal variants in each one. In parallel, we used a Bayesian approach (PAINTOR) that combines genetic association, linkage disequilibrium and enriched genomic features to determine variants with high posterior probabilities of being causal. Potentially causal variants were significantly over-represented in active gene regulatory regions and transcription factor binding sites. We applied our INQUSIT pipeline for prioritizing genes as targets of those potentially causal variants, using gene expression (expression quantitative trait loci), chromatin interaction and functional annotations. Known cancer drivers, transcription factors and genes in the developmental, apoptosis, immune system and DNA integrity checkpoint gene ontology pathways were over-represented among the highest-confidence target genes.
Adopting a systems approach, we devise a general workflow to define actionable subtypes in human cancers. Applied to small cell lung cancer (SCLC), the workflow identifies four subtypes based on global gene expression patterns and ontologies. Three correspond to known subtypes (SCLC-A, SCLC-N, and SCLC-Y), while the fourth is a previously undescribed ASCL1+ neuroendocrine variant (NEv2, or SCLC-A2). Tumor deconvolution with subtype gene signatures shows that all of the subtypes are detectable in varying proportions in human and mouse tumors. To understand how multiple stable subtypes can arise within a tumor, we infer a network of transcription factors and develop BooleaBayes, a minimally-constrained Boolean rule-fitting approach. In silico perturbations of the network identify master regulators and destabilizers of its attractors. Specific to NEv2, BooleaBayes predicts ELF3 and NR0B1 as master regulators of the subtype, and TCF3 as a master destabilizer. Since the four subtypes exhibit differential drug sensitivity, with NEv2 consistently least sensitive, these findings may lead to actionable therapeutic strategies that consider SCLC intratumoral heterogeneity. Our systems-level approach should generalize to other cancer types.
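The idea of attractors and in silico perturbations can be illustrated with a toy Boolean network. The sketch below is not BooleaBayes and does not use the SCLC transcription factor network: the two genes `A` and `B`, their mutual-repression rules, and the helper names `step` and `fixed_points` are all hypothetical, chosen only to show how forcing one node can remove a stable state.

```python
from itertools import product

# Hypothetical toggle switch: each gene is ON only when the other is OFF.
# These rules are illustrative and are not taken from the SCLC network.
RULES = {
    "A": lambda s: not s["B"],
    "B": lambda s: not s["A"],
}
GENES = sorted(RULES)


def step(state, forced=None):
    """Synchronously update every gene; `forced` pins genes to fixed values."""
    forced = forced or {}
    current = dict(zip(GENES, state))
    nxt = {g: bool(RULES[g](current)) for g in GENES}
    nxt.update(forced)  # an in silico knockout or overexpression
    return tuple(nxt[g] for g in GENES)


def fixed_points(forced=None):
    """Enumerate steady states (point attractors) by exhaustive search."""
    forced = forced or {}
    points = []
    for state in product([False, True], repeat=len(GENES)):
        # Skip states that contradict the perturbation.
        if any(state[GENES.index(g)] != v for g, v in forced.items()):
            continue
        if step(state, forced) == state:
            points.append(state)
    return points


unperturbed = fixed_points()            # two stable states: A-high or B-high
knockout_a = fixed_points({"A": False})  # only the B-high state survives
```

In this toy case, knocking out `A` destabilizes the A-high state, which is the kind of effect a "master destabilizer" analysis looks for. Exhaustive enumeration only scales to small networks.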
BACKGROUND - Immune checkpoint inhibitors (ICIs) have substantially improved clinical outcomes in multiple cancer types and are increasingly being used in early disease settings and in combinations of different immunotherapies. However, ICIs can also cause severe or fatal immune-related adverse-events (irAEs). We aimed to identify and characterise cardiovascular irAEs that are significantly associated with ICIs.
METHODS - In this observational, retrospective, pharmacovigilance study, we used VigiBase, WHO's global database of individual case safety reports, to compare cardiovascular adverse event reporting in patients who received ICIs (ICI subgroup) with this reporting in the full database. This study included all cardiovascular irAEs classified by group queries according to the Medical Dictionary for Regulatory Activities, between inception on Nov 14, 1967, and Jan 2, 2018. We evaluated the association between ICIs and cardiovascular adverse events using the reporting odds ratio (ROR) and the information component (IC). IC is an indicator value for disproportionate Bayesian reporting that compares observed and expected values to find associations between drugs and adverse events. IC025 is the lower end of the IC 95% credibility interval, and an IC025 value of more than zero is deemed significant. This study is registered with ClinicalTrials.gov, number NCT03387540.
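The two disproportionality measures named above can be sketched as follows. The abstract does not give the formulas, so this assumes the commonly used definitions: the ROR from a 2x2 table with a Wald-type 95% confidence interval, and the IC as a log2 ratio of observed to expected counts with 0.5 added to each as shrinkage. The function names and the counts are hypothetical, and the lower credibility bound of the IC is not computed here.

```python
import math


def reporting_odds_ratio(a, b, c, d):
    """ROR and 95% CI from a 2x2 table of report counts.

    a: drug and event        b: drug, no event
    c: no drug, event        d: no drug, no event
    """
    ror = (a / b) / (c / d)
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)  # SE of log(ROR)
    lower = math.exp(math.log(ror) - 1.96 * se)
    upper = math.exp(math.log(ror) + 1.96 * se)
    return ror, lower, upper


def information_component(n_observed, n_drug, n_event, n_total):
    """IC point estimate: log2 of shrunken observed/expected counts."""
    expected = n_drug * n_event / n_total
    return math.log2((n_observed + 0.5) / (expected + 0.5))


# Hypothetical counts, not taken from VigiBase.
ror, ror_lower, ror_upper = reporting_odds_ratio(a=20, b=80, c=100, d=9800)
ic = information_component(n_observed=20, n_drug=100, n_event=120, n_total=10000)
```

A positive IC means the drug-event pair is reported more often than expected if drug and event were independent in the database.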
FINDINGS - We identified 31 321 adverse events reported in patients who received ICIs and 16 343 451 adverse events reported in patients treated with any drugs (full database) in VigiBase. Compared with the full database, ICI treatment was associated with higher reporting of myocarditis (5515 reports for the full database vs 122 for ICIs, ROR 11·21 [95% CI 9·36-13·43]; IC025 3·20), pericardial diseases (12 800 vs 95, 3·80 [3·08-4·62]; IC025 1·63), and vasculitis (33 289 vs 82, 1·56 [1·25-1·94]; IC025 0·03), including temporal arteritis (696 vs 18, 12·99 [8·12-20·77]; IC025 2·59) and polymyalgia rheumatica (1709 vs 16, 5·13 [3·13-8·40]; IC025 1·33). Pericardial diseases were reported more often in patients with lung cancer (49 [56%] of 87 patients), whereas myocarditis (42 [41%] of 103 patients) and vasculitis (42 [60%] of 70 patients) were more commonly reported in patients with melanoma (χ² test for overall subgroup comparison, p<0·0001). Vision was impaired in five (28%) of 18 patients with temporal arteritis. Cardiovascular irAEs were severe in the majority of cases (>80%), with death occurring in 61 (50%) of 122 myocarditis cases, 20 (21%) of 95 pericardial disease cases, and five (6%) of 82 vasculitis cases (χ² test for overall comparison between pericardial diseases, myocarditis, and vasculitis, p<0·0001).
INTERPRETATION - Treatment with ICIs can lead to severe and disabling inflammatory cardiovascular irAEs soon after commencement of therapy. In addition to life-threatening myocarditis, these toxicities include pericardial diseases and temporal arteritis with a risk of blindness. These events should be considered in patient care and in combination clinical trial designs (ie, combinations of different immunotherapies as well as immunotherapies and chemotherapy).
FUNDING - The Cancer Institut Thématique Multi-Organisme of the French National Alliance for Life and Health Sciences (AVIESAN) Plan Cancer 2014-2019; US National Cancer Institute, National Institutes of Health; the James C. Bradford Jr. Melanoma Fund; and the Melanoma Research Foundation.
Copyright © 2018 Elsevier Ltd. All rights reserved.
Defining the full spectrum of human disease associated with a biomarker is necessary to advance the biomarker into clinical practice. We hypothesize that associating biomarker measurements with electronic health record (EHR) populations based on shared genetic architectures would establish the clinical epidemiology of the biomarker. We use Bayesian sparse linear mixed modeling to calculate SNP weightings for 53 biomarkers from the Atherosclerosis Risk in Communities study. We use the SNP weightings to compute predicted biomarker values in an EHR population and test associations with 1139 diagnoses. Here we report 116 associations meeting a Bonferroni level of significance. A false discovery rate (FDR)-based significance threshold reveals more known and undescribed associations across a broad range of biomarkers, including biometric measures, plasma proteins and metabolites, functional assays, and behaviors. We confirm an inverse association between LDL-cholesterol level and septicemia risk in an independent epidemiological cohort. This approach efficiently discovers biomarker-disease associations.
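The difference between a Bonferroni threshold and an FDR-based threshold can be sketched on a toy list of p-values. The abstract does not state which FDR procedure was used or the exact number of tests, so this assumes the Benjamini-Hochberg step-up procedure; the function names and p-values are hypothetical.

```python
def bonferroni_significant(p_values, alpha=0.05):
    """Flag p-values at or below alpha divided by the number of tests."""
    threshold = alpha / len(p_values)
    return [p <= threshold for p in p_values]


def benjamini_hochberg_significant(p_values, alpha=0.05):
    """Flag p-values rejected by the Benjamini-Hochberg step-up procedure."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest rank k with p_(k) <= alpha * k / m.
    max_rank = 0
    for rank, i in enumerate(order, start=1):
        if p_values[i] <= alpha * rank / m:
            max_rank = rank
    significant = [False] * m
    for i in order[:max_rank]:
        significant[i] = True
    return significant


# Hypothetical p-values for five association tests.
p_values = [0.001, 0.012, 0.02, 0.03, 0.5]
by_bonferroni = bonferroni_significant(p_values)
by_fdr = benjamini_hochberg_significant(p_values)
```

On this toy input Bonferroni keeps one association while the FDR procedure keeps four, which mirrors why an FDR threshold surfaces additional associations.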
Cognitive models aim to explain complex human behavior in terms of hypothesized mechanisms of the mind. These mechanisms can be formalized in terms of mathematical structures containing parameters that are theoretically meaningful. For example, in the case of perceptual decision making, model parameters might correspond to theoretical constructs like response bias, evidence quality, response caution, and the like. Formal cognitive models go beyond verbal models in that cognitive mechanisms are instantiated in terms of mathematics and they go beyond statistical models in that cognitive model parameters are psychologically interpretable. We explore three key elements used to formally evaluate cognitive models: parameter estimation, model prediction, and model selection. We compare and contrast traditional approaches with Bayesian statistical approaches to performing each of these three elements. Traditional approaches rely on an array of seemingly ad hoc techniques, whereas Bayesian statistical approaches rely on a single, principled, internally consistent system. We illustrate the Bayesian statistical approach to evaluating cognitive models using a running example of the Linear Ballistic Accumulator model of decision making (Brown SD, Heathcote A. The simplest complete model of choice response time: linear ballistic accumulation. Cogn Psychol 2008, 57:153-178). WIREs Cogn Sci 2018, 9:e1458. doi: 10.1002/wcs.1458 This article is categorized under: Neuroscience > Computation Psychology > Reasoning and Decision Making Psychology > Theory and Methods.
© 2017 Wiley Periodicals, Inc.
Rare genetic variants are abundant in humans and are expected to contribute to individual disease risk. While genetic association studies have successfully identified common genetic variants associated with susceptibility, these studies are not practical for identifying rare variants. Efforts to distinguish pathogenic variants from benign rare variants have leveraged the genetic code to identify deleterious protein-coding alleles, but no analogous code exists for non-coding variants. Therefore, ascertaining which rare variants have phenotypic effects remains a major challenge. Rare non-coding variants have been associated with extreme gene expression in studies using single tissues, but their effects across tissues are unknown. Here we identify gene expression outliers, or individuals showing extreme expression levels for a particular gene, across 44 human tissues by using combined analyses of whole genomes and multi-tissue RNA-sequencing data from the Genotype-Tissue Expression (GTEx) project v6p release. We find that 58% of underexpression and 28% of overexpression outliers have nearby conserved rare variants compared to 8% of non-outliers. Additionally, we developed RIVER (RNA-informed variant effect on regulation), a Bayesian statistical model that incorporates expression data to predict a regulatory effect for rare variants with higher accuracy than models using genomic annotations alone. Overall, we demonstrate that rare variants contribute to large gene expression changes across tissues and provide an integrative method for interpretation of rare variants in individual genomes.
Gene co-expression networks capture biologically important patterns in gene expression data, enabling functional analyses of genes, discovery of biomarkers, and interpretation of genetic variants. Most network analyses to date have been limited to assessing correlation between total gene expression levels in a single tissue or small sets of tissues. Here, we built networks that additionally capture the regulation of relative isoform abundance and splicing, along with tissue-specific connections unique to each of a diverse set of tissues. We used the Genotype-Tissue Expression (GTEx) project v6 RNA sequencing data across 50 tissues and 449 individuals. First, we developed a framework called Transcriptome-Wide Networks (TWNs) for combining total expression and relative isoform levels into a single sparse network, capturing the interplay between the regulation of splicing and transcription. We built TWNs for 16 tissues and found that hubs in these networks were strongly enriched for splicing and RNA binding genes, demonstrating their utility in unraveling regulation of splicing in the human transcriptome. Next, we used a Bayesian biclustering model that identifies network edges unique to a single tissue to reconstruct Tissue-Specific Networks (TSNs) for 26 distinct tissues and 10 groups of related tissues. Finally, we found genetic variants associated with pairs of adjacent nodes in our networks, supporting the estimated network structures and identifying 20 genetic variants with distant regulatory impact on transcription and splicing. Our networks provide an improved understanding of the complex relationships of the human transcriptome across tissues.
© 2017 Saha et al.; Published by Cold Spring Harbor Laboratory Press.
OBJECTIVE - Secure messaging through patient portals is an increasingly popular way that consumers interact with healthcare providers. The increasing burden of secure messaging can affect clinic staffing and workflows. Manual management of portal messages is costly and time consuming. Automated classification of portal messages could potentially expedite message triage and delivery of care.
MATERIALS AND METHODS - We developed automated patient portal message classifiers with rule-based and machine learning techniques using bag of words and natural language processing (NLP) approaches. To evaluate classifier performance, we used a gold standard of 3253 portal messages manually categorized using a taxonomy of communication types (i.e., main categories of informational, medical, logistical, social, and other communications, and subcategories including prescriptions, appointments, problems, tests, follow-up, contact information, and acknowledgement). We evaluated our classifiers' accuracies in identifying individual communication types within portal messages with area under the receiver-operator curve (AUC). Portal messages often contain more than one type of communication. To predict all communication types within single messages, we used the Jaccard Index. We extracted the variables of importance for the random forest classifiers.
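For messages that carry more than one communication type, the Jaccard Index compares the set of predicted types with the set of gold-standard types. The sketch below assumes the per-message index is averaged across messages, which the abstract does not specify; the function names and example labels are hypothetical.

```python
def jaccard_index(true_labels, predicted_labels):
    """Size of the intersection divided by size of the union of two label sets."""
    true_set, pred_set = set(true_labels), set(predicted_labels)
    union = true_set | pred_set
    if not union:
        return 1.0  # both empty: treat as perfect agreement
    return len(true_set & pred_set) / len(union)


def mean_jaccard(gold, predicted):
    """Average the per-message Jaccard Index over all messages."""
    scores = [jaccard_index(t, p) for t, p in zip(gold, predicted)]
    return sum(scores) / len(scores)


# Hypothetical labels for two portal messages.
gold = [{"medical", "logistical"}, {"social"}]
predicted = [{"medical"}, {"social"}]
score = mean_jaccard(gold, predicted)
```

The first message scores 0.5 because one of its two communication types was missed, and the second scores 1.0.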
RESULTS - The best performing approaches to classification for the major communication types were: logistic regression for medical communications (AUC: 0.899); basic (rule-based) for informational communications (AUC: 0.842); and random forests for social communications and logistical communications (AUCs: 0.875 and 0.925, respectively). The best performing classification approach of classifiers for individual communication subtypes was random forests for Logistical-Contact Information (AUC: 0.963). The Jaccard Indices by approach were: basic classifier, Jaccard Index: 0.674; Naïve Bayes, Jaccard Index: 0.799; random forests, Jaccard Index: 0.859; and logistic regression, Jaccard Index: 0.861. For medical communications, the most predictive variables were NLP concepts (e.g., Temporal_Concept, which maps to 'morning', 'evening' and Idea_or_Concept which maps to 'appointment' and 'refill'). For logistical communications, the most predictive variables contained similar numbers of NLP variables and words (e.g., Telephone mapping to 'phone', 'insurance'). For social and informational communications, the most predictive variables were words (e.g., social: 'thanks', 'much', informational: 'question', 'mean').
CONCLUSIONS - This study applies automated classification methods to the content of patient portal messages and evaluates the application of NLP techniques on consumer communications in patient portal messages. We demonstrated that random forest and logistic regression approaches accurately classified the content of portal messages, although the best approach to classification varied by communication type. Words were the most predictive variables for classification of most communication types, although NLP variables were most predictive for medical communication types. As adoption of patient portals increases, automated techniques could assist in understanding and managing growing volumes of messages. Further work is needed to improve classification performance to potentially support message triage and answering.
Copyright © 2017 Elsevier B.V. All rights reserved.
Objective - Predictive analytics create opportunities to incorporate personalized risk estimates into clinical decision support. Models must be well calibrated to support decision-making, yet calibration deteriorates over time. This study explored the influence of modeling methods on performance drift and connected observed drift with data shifts in the patient population.
Materials and Methods - Using 2003 admissions to Department of Veterans Affairs hospitals nationwide, we developed 7 parallel models for hospital-acquired acute kidney injury using common regression and machine learning methods, validating each over 9 subsequent years.
Results - Discrimination was maintained for all models. Calibration declined as all models increasingly overpredicted risk. However, the random forest and neural network models maintained calibration across ranges of probability, capturing more admissions than did the regression models. The magnitude of overprediction increased over time for the regression models while remaining stable and small for the machine learning models. Changes in the rate of acute kidney injury were strongly linked to increasing overprediction, while changes in predictor-outcome associations corresponded with diverging patterns of calibration drift across methods.
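One simple way to see overprediction emerge over time is the observed-to-expected (O/E) ratio per validation year. This is a sketch of that single summary, not necessarily the calibration metric used in the study; the function names, years, outcomes, and predicted risks below are hypothetical.

```python
def observed_to_expected(outcomes, predicted_risks):
    """Observed event count divided by the sum of predicted probabilities.

    A ratio below 1 means the model overpredicts risk on average.
    """
    return sum(outcomes) / sum(predicted_risks)


def calibration_by_year(records):
    """Compute the O/E ratio for each year from (outcomes, risks) pairs."""
    return {
        year: observed_to_expected(outcomes, risks)
        for year, (outcomes, risks) in sorted(records.items())
    }


# Hypothetical data: the event rate falls while predictions stay the same.
records = {
    2004: ([1, 0, 0, 1], [0.5, 0.5, 0.5, 0.5]),
    2012: ([1, 0, 0, 0], [0.5, 0.5, 0.5, 0.5]),
}
drift = calibration_by_year(records)
```

Here the ratio drops from 1.0 to 0.5, the pattern expected when the outcome rate declines and a model is not updated.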
Conclusions - Efficient and effective updating protocols will be essential for maintaining accuracy of, user confidence in, and safety of personalized risk predictions to support decision-making. Model updating protocols should be tailored to account for variations in calibration drift across methods and respond to periods of rapid performance drift rather than be limited to regularly scheduled annual or biannual intervals.
© The Author 2017. Published by Oxford University Press on behalf of the American Medical Informatics Association. All rights reserved.
Current approaches separately analyze concurrently acquired diffusion tensor imaging (DTI) and functional magnetic resonance imaging (fMRI) data. The primary limitation of these approaches is that they do not take advantage of the information from DTI that could potentially enhance estimation of resting-state functional connectivity (FC) between brain regions. To overcome this limitation, we develop a Bayesian hierarchical spatiotemporal model that incorporates structural connectivity (SC) into estimating FC. In our proposed approach, SC based on DTI data is used to construct an informative prior for FC based on resting-state fMRI data through the Cholesky decomposition. Simulation studies showed that incorporating both data types produced significantly lower mean squared errors than the standard approach of analyzing the two modalities separately. We applied our model to resting-state DTI and fMRI data to estimate FC between brain regions hypothesized to be important in the origination and spread of temporal lobe epilepsy seizures. Our analysis indicates that the proposed model achieves lower false positive rates and is much more robust to data decimation than the conventional approach.