The publication data currently available has been vetted by Vanderbilt faculty, staff, administrators and trainees. The data itself is retrieved directly from NCBI's PubMed and is automatically updated on a weekly basis to ensure accuracy and completeness.
If you have any questions or comments, please contact us.
The genetic architecture of psychiatric disorders is characterized by a large number of small-effect variants located primarily in non-coding regions, suggesting that the underlying causal effects may influence disease risk by modulating gene expression. We provide comprehensive analyses using transcriptome data from an unprecedented collection of tissues to gain pathophysiological insights into the role of the brain, neuroendocrine factors (adrenal gland) and gastrointestinal systems (colon) in psychiatric disorders. In each tissue, we perform PrediXcan analysis and identify trait-associated genes for schizophrenia (n associations = 499; n unique genes = 275), bipolar disorder (n associations = 17; n unique genes = 13), attention deficit hyperactivity disorder (n associations = 19; n unique genes = 12) and broad depression (n associations = 41; n unique genes = 31). Importantly, both PrediXcan and summary-data-based Mendelian randomization/heterogeneity in dependent instruments analyses suggest potentially causal genes in non-brain tissues, showing the utility of these tissues for mapping psychiatric disease genetic predisposition. Our analyses further highlight the importance of joint tissue approaches as 76% of the genes were detected only in difficult-to-acquire tissues.
BACKGROUND - Genome-phenome studies have identified thousands of variants that are statistically associated with disease or traits; however, their functional roles are largely unclear. A comprehensive investigation of regulatory mechanisms and the gene regulatory networks between phenome-wide association study (PheWAS) and genome-wide association study (GWAS) is needed to identify novel regulatory variants contributing to risk for human diseases.
METHODS - In this study, we developed an integrative functional genomics framework that maps 215,107 significant single nucleotide polymorphism (SNP) traits generated from the PheWAS Catalog and 28,870 genome-wide significant SNP traits collected from the GWAS Catalog into a global human genome regulatory map via incorporating various functional annotation data, including transcription factor (TF)-based motifs, promoters, enhancers, and expression quantitative trait loci (eQTLs) generated from four major functional genomics databases: FANTOM5, ENCODE, NIH Roadmap, and Genotype-Tissue Expression (GTEx). In addition, we performed a tissue-specific regulatory circuit analysis through the integration of the identified regulatory variants and tissue-specific gene expression profiles in 7051 samples across 32 tissues from GTEx.
RESULTS - We found that the disease-associated loci in both the PheWAS and GWAS Catalogs were significantly enriched with functional SNPs. The integration of functional annotations significantly improved the power of detecting novel associations in PheWAS, through which we found a number of functional associations with strong regulatory evidence in the PheWAS Catalog. Finally, we constructed tissue-specific regulatory circuits for several complex traits: mental diseases, autoimmune diseases, and cancer, via exploring tissue-specific TF-promoter/enhancer-target gene interaction networks. We uncovered several promising tissue-specific regulatory TFs or genes for Alzheimer's disease (e.g. ZIC1 and STX1B) and asthma (e.g. CSF3 and IL1RL1).
CONCLUSIONS - This study offers powerful tools for exploring the functional consequences of variants generated from genome-phenome association studies, in terms of the mechanisms by which they affect multiple complex diseases and traits.
The impact of inherited genetic variation on gene expression in humans is well-established. The majority of known expression quantitative trait loci (eQTLs) impact expression of local genes (cis-eQTLs). More research is needed to identify effects of genetic variation on distant genes (trans-eQTLs) and understand their biological mechanisms. One common trans-eQTL mechanism is "mediation" by a local (cis) transcript. Thus, mediation analysis can be applied to genome-wide SNP and expression data in order to identify transcripts that are "cis-mediators" of trans-eQTLs, including those "cis-hubs" involved in regulation of many trans-genes. Identifying such mediators helps us understand regulatory networks and suggests biological mechanisms underlying trans-eQTLs, both of which are relevant for understanding susceptibility to complex diseases. The multitissue expression data from the Genotype-Tissue Expression (GTEx) program provides a unique opportunity to study cis-mediation across human tissue types. However, the presence of complex hidden confounding effects in biological systems can make mediation analyses challenging and prone to confounding bias, particularly when conducted among diverse samples. To address this problem, we propose a new method: Genomic Mediation analysis with Adaptive Confounding adjustment (GMAC). It enables the search of a very large pool of variables, and adaptively selects potential confounding variables for each mediation test. Analyses of simulated data and GTEx data demonstrate that the adaptive selection of confounders by GMAC improves the power and precision of mediation analysis. Application of GMAC to GTEx data provides new insights into the observed patterns of cis-hubs and trans-eQTL regulation across tissue types.
© 2017 Yang et al.; Published by Cold Spring Harbor Laboratory Press.
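As a rough illustration of the mediation idea in the abstract above, the sketch below compares a SNP's effect on a distant (trans) gene before and after conditioning on a local (cis) transcript, while adjusting for a known confounder. It is not the GMAC method: the adaptive selection of confounders is omitted, the data are simulated, and all names (`snp_effect`, `confounder`, the effect sizes) are assumptions made for illustration.

```python
import numpy as np


def ols_coef(y, X):
    """Least-squares coefficients of y on X (X already contains an intercept column)."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta


def snp_effect(snp, trans, covariates):
    """Coefficient of the SNP on trans-gene expression, adjusting for the covariates."""
    X = np.column_stack([np.ones(len(snp)), snp] + list(covariates))
    return ols_coef(trans, X)[1]  # index 1 is the SNP column


# Simulated data (hypothetical effect sizes): SNP -> cis transcript -> trans gene,
# with one confounder that influences both transcripts.
rng = np.random.default_rng(0)
n = 2000
confounder = rng.normal(size=n)
snp = rng.binomial(2, 0.3, size=n).astype(float)  # genotype coded 0/1/2
cis = 0.8 * snp + 0.5 * confounder + rng.normal(size=n)
trans = 0.7 * cis + 0.5 * confounder + rng.normal(size=n)

# Total effect: SNP on trans gene, adjusted for the confounder only.
total = snp_effect(snp, trans, [confounder])
# Direct effect: same model, additionally conditioning on the cis transcript.
direct = snp_effect(snp, trans, [confounder, cis])

print(f"total effect: {total:.3f}, direct effect: {direct:.3f}")
```

When the cis transcript fully mediates the association, as in this simulation, the direct effect shrinks toward zero once the mediator is included. Leaving the confounder out of both models would bias the comparison, which is the problem the abstract says adaptive confounder adjustment is meant to address.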
Gene co-expression networks capture biologically important patterns in gene expression data, enabling functional analyses of genes, discovery of biomarkers, and interpretation of genetic variants. Most network analyses to date have been limited to assessing correlation between total gene expression levels in a single tissue or small sets of tissues. Here, we built networks that additionally capture the regulation of relative isoform abundance and splicing, along with tissue-specific connections unique to each of a diverse set of tissues. We used the Genotype-Tissue Expression (GTEx) project v6 RNA sequencing data across 50 tissues and 449 individuals. First, we developed a framework called Transcriptome-Wide Networks (TWNs) for combining total expression and relative isoform levels into a single sparse network, capturing the interplay between the regulation of splicing and transcription. We built TWNs for 16 tissues and found that hubs in these networks were strongly enriched for splicing and RNA binding genes, demonstrating their utility in unraveling regulation of splicing in the human transcriptome. Next, we used a Bayesian biclustering model that identifies network edges unique to a single tissue to reconstruct Tissue-Specific Networks (TSNs) for 26 distinct tissues and 10 groups of related tissues. Finally, we found genetic variants associated with pairs of adjacent nodes in our networks, supporting the estimated network structures and identifying 20 genetic variants with distant regulatory impact on transcription and splicing. Our networks provide an improved understanding of the complex relationships of the human transcriptome across tissues.
© 2017 Saha et al.; Published by Cold Spring Harbor Laboratory Press.
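For readers unfamiliar with co-expression networks, the sketch below shows the simplest form of the idea: connect two genes when their expression levels are strongly correlated across samples. This is a deliberately simplified stand-in, not the Transcriptome-Wide Network or Bayesian biclustering approach described in the abstract above; the function name, the threshold of 0.8, and the simulated data are all assumptions.

```python
import numpy as np


def coexpression_edges(expr, gene_names, threshold=0.8):
    """Return (gene_a, gene_b, correlation) for gene pairs with |Pearson r| >= threshold.

    expr is a samples-by-genes matrix of expression values.
    """
    corr = np.corrcoef(expr, rowvar=False)  # genes-by-genes correlation matrix
    edges = []
    n_genes = len(gene_names)
    for i in range(n_genes):
        for j in range(i + 1, n_genes):
            if abs(corr[i, j]) >= threshold:
                edges.append((gene_names[i], gene_names[j], float(corr[i, j])))
    return edges


# Simulated example: G1 and G2 share a common signal, G3 is independent.
rng = np.random.default_rng(1)
shared = rng.normal(size=200)
expr = np.column_stack([
    shared + 0.1 * rng.normal(size=200),
    shared + 0.1 * rng.normal(size=200),
    rng.normal(size=200),
])
edges = coexpression_edges(expr, ["G1", "G2", "G3"])
print(edges)
```

Thresholding raw correlations tends to produce dense networks with many indirect edges, which is one reason sparse estimation is generally preferred for real data.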
Gastric cancer (GC) is a leading cause of cancer-related deaths worldwide. The Tff1 knockout (KO) mouse model develops gastric lesions that include low-grade dysplasia (LGD), high-grade dysplasia (HGD), and adenocarcinomas. In this study, we used Affymetrix microarray gene expression platforms to analyze molecular signatures in the mouse stomach [Tff1-KO (LGD) and Tff1 wild-type (normal)] and in human gastric cancer tissues and their adjacent normal tissue samples. Combined integrated bioinformatics analysis of mouse and human datasets indicated that 172 genes were consistently deregulated in both human gastric cancer samples and Tff1-KO LGD lesions (P < .05). Using Ingenuity pathway analysis, these genes mapped to important transcription networks that include MYC, STAT3, β-catenin, RELA, NFATC2, HIF1A, and ETS1 in both human and mouse. Further analysis demonstrated activation of FOXM1 and inhibition of TP53 transcription networks in human gastric cancers but not in Tff1-KO LGD lesions. Using real-time RT-PCR, we validated the deregulated expression of several genes (VCAM1, BGN, CLDN2, COL1A1, COL1A2, COL3A1, EpCAM, IFITM1, MMP9, MMP12, MMP14, PDGFRB, PLAU, and TIMP1) that map to altered transcription networks in both mouse and human gastric neoplasia. Our study demonstrates significant similarities in deregulated transcription networks in human gastric cancer and gastric tumorigenesis in the Tff1-KO mouse model. The data also suggest that activation of MYC, STAT3, RELA, and β-catenin transcription networks could be an early molecular step in gastric carcinogenesis.
© 2017 Wiley Periodicals, Inc.
The incidence of esophageal adenocarcinoma (EAC) is rapidly rising in the United States and Western countries. In this study, we carried out an integrative molecular analysis to identify interactions between genomic and epigenomic alterations in regulating gene expression networks in EAC. We detected significant alterations in DNA copy numbers (CN), gene expression levels, and DNA methylation profiles. The integrative analysis demonstrated that altered expression of 1,755 genes was associated with changes in CN or methylation. We found that expression alterations in 84 genes were associated with changes in both CN and methylation. These data suggest a strong interaction between genetic and epigenetic events to modulate gene expression in EAC. Of note, bioinformatics analysis detected a prominent K-RAS signature and predicted activation of several important transcription factor networks, including β-catenin, MYB, TWIST1, SOX7, GATA3 and GATA6. Notably, we detected hypomethylation and overexpression of several pro-inflammatory genes such as COX2, IL8 and IL23R, suggesting an important role of epigenetic regulation of these genes in the inflammatory cascade associated with EAC. In summary, this integrative analysis demonstrates a complex interaction between genetic and epigenetic mechanisms providing several novel insights for our understanding of molecular events in EAC.
The comprehensive understanding of cellular signaling pathways remains a challenge due to multiple layers of regulation that may become evident only when the pathway is probed at different levels or critical nodes are eliminated. To discover regulatory mechanisms in canonical WNT signaling, we conducted a systematic forward genetic analysis through reporter-based screens in haploid human cells. Comparison of screens for negative, attenuating and positive regulators of WNT signaling, mediators of R-spondin-dependent signaling and suppressors of constitutive signaling induced by loss of the tumor suppressor adenomatous polyposis coli or casein kinase 1α uncovered new regulatory features at most levels of the pathway. These include a requirement for the transcription factor AP-4, a role for the DAX domain of AXIN2 in controlling β-catenin transcriptional activity, a contribution of glycophosphatidylinositol anchor biosynthesis and glypicans to R-spondin-potentiated WNT signaling, and two different mechanisms that regulate signaling when distinct components of the β-catenin destruction complex are lost. The conceptual and methodological framework we describe should enable the comprehensive understanding of other signaling systems.
Artificial transcription factors (ATFs) are precision-tailored molecules designed to bind DNA and regulate transcription in a preprogrammed manner. Libraries of ATFs enable the high-throughput screening of gene networks that trigger cell fate decisions or phenotypic changes. We developed a genome-scale library of ATFs that display an engineered interaction domain (ID) to enable cooperative assembly and synergistic gene expression at targeted sites. We used this ATF library to screen for key regulators of the pluripotency network and discovered three combinations of ATFs capable of inducing pluripotency without exogenous expression of Oct4 (POU domain, class 5, TF 1). Cognate site identification, global transcriptional profiling, and identification of ATF binding sites reveal that the ATFs do not directly target Oct4; instead, they target distinct nodes that converge to stimulate the endogenous pluripotency network. This forward genetic approach enables cell type conversions without a priori knowledge of potential key regulators and reveals unanticipated gene network dynamics that drive cell fate choices.
Genetic suppression occurs when the phenotypic defects caused by a mutation in a particular gene are rescued by a mutation in a second gene. To explore the principles of genetic suppression, we examined both literature-curated and unbiased experimental data, involving systematic genetic mapping and whole-genome sequencing, to generate a large-scale suppression network among yeast genes. Most suppression pairs identified novel relationships among functionally related genes, providing new insights into the functional wiring diagram of the cell. In addition to suppressor mutations, we identified frequent secondary mutations, in a subset of genes, that likely cause a delay in the onset of stationary phase, which appears to promote their enrichment within a propagating population. These findings allow us to formulate and quantify general mechanisms of genetic suppression.
Copyright © 2016, American Association for the Advancement of Science.
MOTIVATION - Analyzing genome wide association data in the context of biological pathways helps us understand how genetic variation influences phenotype and increases power to find associations. However, the utility of pathway-based analysis tools is hampered by undercuration and reliance on a distribution of signal across all of the genes in a pathway. Methods that combine genome wide association results with genetic networks to infer the key phenotype-modulating subnetworks combat these issues, but have primarily been limited to network definitions with yes/no labels for gene-gene interactions. A recent method (EW_dmGWAS) incorporates a biological network with weighted edge probability by requiring a secondary phenotype-specific expression dataset. In this article, we combine an algorithm for weighted-edge module searching and a probabilistic interaction network in order to develop a method, STAMS, for recovering modules of genes with strong associations to the phenotype and probable biologic coherence. Our method builds on EW_dmGWAS but does not require a secondary expression dataset and performs better in six test cases.
RESULTS - We show that our algorithm improves over EW_dmGWAS and standard gene-based analysis by measuring precision and recall of each method on separately identified associations. In the Wellcome Trust Rheumatoid Arthritis study, STAMS-identified modules were more enriched for separately identified associations than EW_dmGWAS modules (STAMS P-value 3.0 × 10; EW_dmGWAS P-value = 0.8). We demonstrate that the area under the Precision-Recall curve is 5.9 times higher with STAMS than with EW_dmGWAS when run on the Wellcome Trust Type 1 Diabetes data.
AVAILABILITY AND IMPLEMENTATION - STAMS is implemented as an R package and is freely available at https://simtk.org/projects/stams
CONTACT - firstname.lastname@example.org
SUPPLEMENTARY INFORMATION - Supplementary data are available at Bioinformatics online.
© The Author 2016. Published by Oxford University Press.
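The STAMS abstract above compares methods by the area under the Precision-Recall curve. A minimal sketch of how such an area can be estimated from ranked gene scores is given below, using average precision (the mean of the precision values at the rank of each true positive). This is one common estimator and not necessarily the one the authors used; the function name and the toy scores and labels are hypothetical, and tied scores are not handled specially.

```python
import numpy as np


def precision_recall_auc(scores, labels):
    """Estimate the area under the precision-recall curve via average precision."""
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels, dtype=bool)
    order = np.argsort(-scores, kind="stable")  # rank genes from highest to lowest score
    hits = labels[order]
    true_positives = np.cumsum(hits)
    precision = true_positives / np.arange(1, len(hits) + 1)
    # Average the precision observed at the rank of each true positive.
    return precision[hits].sum() / hits.sum()


# Toy data: four genes, two of which are "separately identified" associations.
labels = [1, 0, 1, 0]
method_a = [0.9, 0.8, 0.7, 0.6]  # ranks the true associations 1st and 3rd
method_b = [0.6, 0.9, 0.7, 0.8]  # ranks the true associations 4th and 3rd

auc_a = precision_recall_auc(method_a, labels)
auc_b = precision_recall_auc(method_b, labels)
print(f"method A: {auc_a:.3f}, method B: {auc_b:.3f}, ratio: {auc_a / auc_b:.2f}")
```

A ratio of the two areas, as reported in the abstract, summarizes how much better one ranking concentrates the known associations near the top of the list.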