The publication data currently available has been vetted by Vanderbilt faculty, staff, administrators and trainees. The data itself is retrieved directly from NCBI's PubMed and is automatically updated on a weekly basis to ensure accuracy and completeness.
BACKGROUND - Non-coding gene regulatory enhancers are essential to transcription in mammalian cells. As a result, a large variety of experimental and computational strategies have been developed to identify cis-regulatory enhancer sequences. Given the differences in the biological signals assayed, some variation in the enhancers identified by different methods is expected; however, the concordance of enhancers identified by different methods has not been comprehensively evaluated. This is critically needed, since in practice, most studies consider enhancers identified by only a single method. Here, we compare enhancer sets from eleven representative strategies in four biological contexts.
RESULTS - All sets we evaluated overlap significantly more than expected by chance; however, there is significant dissimilarity in their genomic, evolutionary, and functional characteristics, both at the element and base-pair level, within each context. The disagreement is sufficient to influence interpretation of candidate SNPs from GWAS studies, and to lead to disparate conclusions about enhancer and disease mechanisms. Most regions identified as enhancers are supported by only one method, and we find limited evidence that regions identified by multiple methods are better candidates than those identified by a single method. As a result, we cannot recommend the use of any single enhancer identification strategy in all settings.
CONCLUSIONS - Our results highlight the inherent complexity of enhancer biology and identify an important challenge to mapping the genetic architecture of complex disease. Greater appreciation of how the diverse enhancer identification strategies in use today relate to the dynamic activity of gene regulatory regions is needed to enable robust and reproducible results.
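The base-pair-level comparison of enhancer sets described in the abstract above can be illustrated with a small sketch. This is not the authors' pipeline; it assumes two hypothetical enhancer sets on a single chromosome, given as half-open (start, end) intervals, and computes a Jaccard similarity over the bases they cover. The function names are illustrative only.

```python
def merge_intervals(intervals):
    """Merge overlapping or touching half-open (start, end) intervals."""
    merged = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            merged[-1][1] = max(merged[-1][1], end)
        else:
            merged.append([start, end])
    return [(s, e) for s, e in merged]


def covered_bp(intervals):
    """Number of distinct bases covered by a set of intervals."""
    return sum(end - start for start, end in merge_intervals(intervals))


def intersect_bp(set_a, set_b):
    """Number of bases covered by both interval sets (two-pointer sweep)."""
    a, b = merge_intervals(set_a), merge_intervals(set_b)
    i = j = total = 0
    while i < len(a) and j < len(b):
        lo = max(a[i][0], b[j][0])
        hi = min(a[i][1], b[j][1])
        if lo < hi:
            total += hi - lo
        # Advance whichever interval ends first.
        if a[i][1] < b[j][1]:
            i += 1
        else:
            j += 1
    return total


def jaccard_bp(set_a, set_b):
    """Base-pair Jaccard similarity: shared bases / bases in either set."""
    shared = intersect_bp(set_a, set_b)
    union = covered_bp(set_a) + covered_bp(set_b) - shared
    return shared / union if union else 0.0


# Hypothetical enhancer calls from two methods on one chromosome.
method_a = [(100, 200), (300, 400)]
method_b = [(150, 250), (390, 500)]
similarity = jaccard_bp(method_a, method_b)
```

A value near 0 means the two methods mark largely different bases as enhancers; whether an observed value exceeds chance would still require a permutation or other null model, which this sketch does not attempt.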
Evolutionary changes in enhancers are widely associated with variation in human traits and diseases. However, studies comprehensively quantifying levels of selection on enhancers at multiple evolutionary periods during recent human evolution and how enhancer evolution varies across human tissues are lacking. To address these questions, we integrated a dataset of 41,561 transcribed enhancers active in 41 different human tissues (FANTOM Consortium) with whole genome sequences of 1,668 individuals from the African, Asian, and European populations (1000 Genomes Project). Our analyses based on four different metrics (Tajima's D, F_ST, H12, nSL) showed that ∼5.90% of enhancers showed evidence of recent positive selection and that genes associated with enhancers under very recent positive selection are enriched for diverse immune-related functions. The distributions of these metrics for brain and testis enhancers were often statistically significantly different and in the direction suggestive of less positive selection compared to those of other tissues; the same was true for brain and testis enhancers that are tissue-specific compared to those that are tissue-broad and for testis enhancers associated with tissue-enriched and non-tissue-enriched genes. These differences varied considerably across metrics and tissues and were generally in the form of changes in distributions' shapes rather than shifts in their values. Collectively, these results suggest that many human enhancers experienced recent positive selection throughout multiple time periods in human evolutionary history, that this selection occurred in a tissue-dependent and immune-related functional context, and that much like the evolution of their protein-coding gene counterparts, the evolution of brain and testis enhancers has been markedly different from that of enhancers in other tissues.
Copyright © 2019 Moon et al.
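As a rough illustration of one of the selection metrics named in the abstract above, the sketch below computes Tajima's statistic from a small set of aligned haplotypes using the standard 1989 formulation. It is an assumption-laden toy, not the authors' analysis: the input format (equal-length strings, one per sampled chromosome) and the function name are hypothetical, and no windowing or missing-data handling is attempted.

```python
from collections import Counter
from math import sqrt


def tajimas_d(haplotypes):
    """Tajima's D for aligned, equal-length haplotype strings.

    Returns None when there are no segregating sites.
    """
    n = len(haplotypes)
    if n < 4:
        # The normalizing variance is zero or undefined for fewer samples.
        raise ValueError("need at least 4 sequences")
    length = len(haplotypes[0])
    if any(len(h) != length for h in haplotypes):
        raise ValueError("sequences must be aligned to equal length")

    n_pairs = n * (n - 1) / 2
    seg_sites = 0
    pi = 0.0  # mean number of pairwise differences
    for column in zip(*haplotypes):
        counts = Counter(column)
        if len(counts) < 2:
            continue  # monomorphic site
        seg_sites += 1
        same_pairs = sum(c * (c - 1) / 2 for c in counts.values())
        pi += (n_pairs - same_pairs) / n_pairs
    if seg_sites == 0:
        return None

    # Constants from Tajima (1989).
    a1 = sum(1 / i for i in range(1, n))
    a2 = sum(1 / i**2 for i in range(1, n))
    b1 = (n + 1) / (3 * (n - 1))
    b2 = 2 * (n**2 + n + 3) / (9 * n * (n - 1))
    c1 = b1 - 1 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1**2
    e1 = c1 / a1
    e2 = c2 / (a1**2 + a2)

    variance = e1 * seg_sites + e2 * seg_sites * (seg_sites - 1)
    return (pi - seg_sites / a1) / sqrt(variance)


# Toy inputs: an excess of rare variants versus intermediate-frequency variants.
rare_variants = ["AAA", "TAA", "ATA", "AAT"]
balanced_variants = ["AA", "AA", "TT", "TT"]
d_rare = tajimas_d(rare_variants)
d_balanced = tajimas_d(balanced_variants)
```

Negative values reflect an excess of rare variants, a pattern that can follow a selective sweep or population expansion, while positive values reflect an excess of intermediate-frequency variants.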
Enhancers and promoters both regulate gene expression by recruiting transcription factors (TFs); however, the degree to which enhancer versus promoter activity is due to differences in their sequences or to genomic context is the subject of ongoing debate. We examined this question by analyzing the sequences of thousands of transcribed enhancers and promoters from hundreds of cellular contexts previously identified by cap analysis of gene expression. Support vector machine classifiers trained on counts of all possible 6-bp-long sequences (6-mers) were able to accurately distinguish promoters from enhancers and distinguish their breadth of activity across tissues. Classifiers trained to predict enhancer activity also performed well when applied to promoter prediction tasks, but promoter-trained classifiers performed poorly on enhancers. This suggests that the learned sequence patterns predictive of enhancer activity generalize to promoters, but not vice versa. Our classifiers also indicate that there are functionally relevant differences in enhancer and promoter GC content beyond the influence of CpG islands. Furthermore, sequences characteristic of broad promoter or broad enhancer activity matched different TFs, with predicted ETS- and RFX-binding sites indicative of promoters, and AP-1 sites indicative of enhancers. Finally, we evaluated the ability of our models to distinguish enhancers and promoters defined by histone modifications. Separating these classes was substantially more difficult, and this difference may contribute to ongoing debates about the similarity of enhancers and promoters. In summary, our results suggest that high-confidence transcribed enhancers and promoters can largely be distinguished based on biologically relevant sequence properties.
Copyright © 2019 by the Genetics Society of America.
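The 6-mer count features mentioned in the abstract above can be sketched as follows. This is only a minimal illustration of turning a DNA sequence into a fixed-length vector of 6-mer counts; the function names are hypothetical, windows containing non-ACGT characters are skipped by assumption, and details such as merging reverse complements or normalizing by length are left out because the abstract does not specify them.

```python
from itertools import product


def kmer_index(k=6):
    """Map every possible k-mer over ACGT to a column position."""
    return {"".join(p): i for i, p in enumerate(product("ACGT", repeat=k))}


def kmer_counts(seq, k=6, index=None):
    """Count occurrences of each k-mer in seq as a list of length 4**k."""
    if index is None:
        index = kmer_index(k)
    counts = [0] * len(index)
    seq = seq.upper()
    for i in range(len(seq) - k + 1):
        position = index.get(seq[i:i + k])
        if position is not None:  # skip windows with N or other symbols
            counts[position] += 1
    return counts


index = kmer_index(6)
features = kmer_counts("ACGTACGTAC", k=6, index=index)
```

Vectors like `features`, one per enhancer or promoter sequence, could then be stacked into a matrix and passed to any support vector machine implementation.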
Placental dysfunction is implicated in many pregnancy complications, including preeclampsia and preterm birth (PTB). While both these syndromes are influenced by environmental risk factors, they also have a substantial genetic component that is not well understood. Precisely controlled gene expression during development is crucial to proper placental function and often mediated through gene regulatory enhancers. However, we lack accurate maps of placental enhancer activity due to the challenges of assaying the placenta and the difficulty of comprehensively identifying enhancers. To address the gap in our knowledge of gene regulatory elements in the placenta, we used a two-step machine learning pipeline to synthesize existing functional genomics studies, transcription factor (TF) binding patterns, and evolutionary information to predict placental enhancers. The trained classifiers accurately distinguish enhancers from the genomic background and placental enhancers from enhancers active in other tissues. Genomic features collected from tissues and cell lines involved in pregnancy are the most predictive of placental regulatory activity. Applying the classifiers genome-wide enabled us to create a map of 33,010 predicted placental enhancers, including 4,562 high-confidence enhancer predictions. The genome-wide placental enhancers are significantly enriched near genes associated with placental development and birth disorders and for SNPs associated with gestational age. These genome-wide predicted placental enhancers provide candidate regions for further testing in vitro, will assist in guiding future studies of genetic associations with pregnancy phenotypes, and aid interpretation of potential mechanisms of action for variants found through genetic studies.
Genomic regions with gene regulatory enhancer activity turn over rapidly across mammals. In contrast, gene expression patterns and transcription factor binding preferences are largely conserved between mammalian species. Based on this conservation, we hypothesized that enhancers active in different mammals would exhibit conserved sequence patterns in spite of their different genomic locations. To investigate this hypothesis, we evaluated the extent to which sequence patterns that are predictive of enhancers in one species are predictive of enhancers in other mammalian species by training and testing two types of machine learning models. We trained support vector machine (SVM) and convolutional neural network (CNN) classifiers to distinguish enhancers defined by histone marks from the genomic background based on DNA sequence patterns in human, macaque, mouse, dog, cow, and opossum. The classifiers accurately identified many adult liver, developing limb, and developing brain enhancers, and the CNNs outperformed the SVMs. Furthermore, classifiers trained in one species and tested in another performed nearly as well as classifiers trained and tested on the same species. We observed similar cross-species conservation when applying the models to human and mouse enhancers validated in transgenic assays. This indicates that many short sequence patterns predictive of enhancers are largely conserved. The sequence patterns most predictive of enhancers in each species matched the binding motifs for a common set of TFs enriched for expression in relevant tissues, supporting the biological relevance of the learned features. Thus, despite the rapid change of active enhancer locations between mammals, cross-species enhancer prediction is often possible. Our results suggest that short sequence patterns encoding enhancer activity have been maintained across more than 180 million years of mammalian evolution.
The role of enhancers, a key class of non-coding regulatory DNA elements, in cancer development has increasingly been appreciated. Here, we present the detection and characterization of a large number of expressed enhancers in a genome-wide analysis of 8928 tumor samples across 33 cancer types using TCGA RNA-seq data. Compared with matched normal tissues, global enhancer activation was observed in most cancers. Across cancer types, global enhancer activity was positively associated with aneuploidy, but not mutation load, suggesting a hypothesis centered on "chromatin-state" to explain their interplay. Integrating eQTL, mRNA co-expression, and Hi-C data analysis, we developed a computational method to infer causal enhancer-gene interactions, revealing enhancers of clinically actionable genes. Having identified an enhancer ∼140 kb downstream of PD-L1, a major immunotherapy target, we validated it experimentally. This study provides a systematic view of enhancer activity in diverse tumor contexts and suggests the clinical implications of enhancers.
Copyright © 2018 The Authors. Published by Elsevier Inc. All rights reserved.
OBJECTIVE - Homozygous loss-of-function mutations in the gene coding for the homeobox transcription factor (TF) PDX1 lead to pancreatic agenesis, whereas heterozygous mutations can cause Maturity-Onset Diabetes of the Young 4 (MODY4). Although the function of Pdx1 is well studied in pre-clinical models during insulin-producing β-cell development and homeostasis, it remains elusive how this TF controls human pancreas development by regulating a downstream transcriptional program. Also, comparative studies of PDX1 binding patterns in pancreatic progenitors and adult β-cells have not been conducted so far. Furthermore, many studies reported the association between single nucleotide polymorphisms (SNPs) and T2DM, and it has been shown that islet enhancers are enriched in T2DM-associated SNPs. Whether regions harboring T2DM-associated SNPs are PDX1 bound and active at the pancreatic progenitor stage has not been reported so far.
METHODS - In this study, we have generated a novel induced pluripotent stem cell (iPSC) line that efficiently differentiates into human pancreatic progenitors (PPs). Furthermore, PDX1 and H3K27ac chromatin immunoprecipitation sequencing (ChIP-seq) was used to identify PDX1 transcriptional targets and active enhancer and promoter regions. To address potential differences in the function of PDX1 during development and adulthood, we compared PDX1 binding profiles from PPs and adult islets. Moreover, combining ChIP-seq and GWAS meta-analysis data, we identified T2DM-associated SNPs in PDX1 binding sites and active chromatin regions.
RESULTS - ChIP-seq for PDX1 revealed a total of 8088 PDX1-bound regions that map to 5664 genes in iPSC-derived PPs. The PDX1 target regions include important pancreatic TFs, such as PDX1 itself, RFX6, HNF1B, and MEIS1, which were activated during the differentiation process as revealed by the active chromatin mark H3K27ac and mRNA expression profiling, suggesting that auto-regulatory feedback regulation maintains PDX1 expression and initiates a pancreatic TF program. Remarkably, we identified several PDX1 target genes that have not been reported in the literature in human so far, including RFX3, required for ciliogenesis and endocrine differentiation in mouse, and the ligand of the Notch receptor DLL1, which is important for endocrine induction and tip-trunk patterning. The comparison of PDX1 profiles from PPs and adult human islets identified sets of stage-specific target genes, associated with early pancreas development and adult β-cell function, respectively. Furthermore, we found an enrichment of T2DM-associated SNPs in active chromatin regions from iPSC-derived PPs. Two of these SNPs fall into PDX1 occupied sites that are located in the intronic regions of TCF7L2 and HNF1B. Both of these genes are key transcriptional regulators of endocrine induction and mutations in cis-regulatory regions predispose to diabetes.
CONCLUSIONS - Our data provide stage-specific target genes of PDX1 during in vitro differentiation of stem cells into pancreatic progenitors that could be useful to identify pathways and molecular targets that predispose for diabetes. In addition, we show that T2DM-associated SNPs are enriched in active chromatin regions at the pancreatic progenitor stage, suggesting that the susceptibility to T2DM might originate from imperfect execution of a β-cell developmental program.
Copyright © 2018 The Authors. Published by Elsevier GmbH. All rights reserved.
Studies of regulatory activity and gene expression have revealed an intriguing dichotomy: There is substantial turnover in the regulatory activity of orthologous sequences between species; however, the expression level of orthologous genes is largely conserved. Understanding how distal regulatory elements, for example, enhancers, evolve and function is critical, as alterations in gene expression levels can drive the development of both complex disease and functional divergence between species. In this study, we investigated determinants of the conservation of regulatory enhancer activity for orthologous sequences across mammalian evolution. Using liver enhancers identified from genome-wide histone modification profiles in ten diverse mammalian species, we compared orthologous sequences that exhibited regulatory activity in all species (conserved-activity enhancers) to shared sequences active only in a single species (species-specific-activity enhancers). Conserved-activity enhancers have greater regulatory potential than species-specific-activity enhancers, as quantified by both the density and diversity of transcription factor binding motifs. Consistent with their greater regulatory potential, conserved-activity enhancers have greater regulatory activity in humans than species-specific-activity enhancers: They are active across more cellular contexts, and they regulate more genes than species-specific-activity enhancers. Furthermore, the genes regulated by conserved-activity enhancers are expressed in more tissues and are less tolerant of loss-of-function mutations than those targeted by species-specific-activity enhancers. These consistent results across various stages of gene regulation demonstrate that conserved-activity enhancers are more pleiotropic than their species-specific-activity counterparts. This suggests that pleiotropy is associated with the conservation of regulatory activity across mammalian evolution.
© The Author 2017. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.
BACKGROUND - Enhancers are DNA regulatory elements that influence gene expression. There is substantial diversity in enhancers' activity patterns: some enhancers drive expression in a single cellular context, while others are active across many. Sequence characteristics, such as transcription factor (TF) binding motifs, influence the activity patterns of regulatory sequences; however, the regulatory logic through which specific sequences drive enhancer activity patterns is poorly understood. Recent analysis of Drosophila enhancers suggested that short dinucleotide repeat motifs (DRMs) are general enhancer sequence features that drive broad regulatory activity. However, it is not known whether the regulatory role of DRMs is conserved across species.
RESULTS - We performed a comprehensive analysis of the relationship between short DNA sequence patterns, including DRMs, and human enhancer activity in 38,538 enhancers across 411 different contexts. In a machine-learning framework, the occurrence patterns of short sequence motifs accurately predicted broadly active human enhancers. However, DRMs alone were weakly predictive of broad enhancer activity in humans and showed different enrichment patterns than in Drosophila. In general, GC-rich sequence motifs were significantly associated with broad enhancer activity, and consistent with this enrichment, broadly active human TFs recognize GC-rich motifs.
CONCLUSIONS - Our results reveal the importance of specific sequence motifs in broadly active human enhancers, demonstrate the lack of evolutionary conservation of the role of DRMs, and provide a computational framework for investigating the logic of enhancer sequences.
The visual responses of vertebrates are sensitive to the overall composition of retinal interneurons including amacrine cells, which tune the activity of the retinal circuitry. The expression of Pax6 is regulated by multiple cis-DNA elements including the intronic α-enhancer, which is active in GABAergic amacrine cell subsets. Here, we report that the transforming growth factor β1-induced transcript 1 protein (Tgfb1i1) interacts with the LIM domain transcription factors Lhx3 and Isl1 to inhibit the α-enhancer in the post-natal mouse retina. Tgfb1i1-/- mice show elevated α-enhancer activity leading to overproduction of the Pax6ΔPD isoform that supports the GABAergic amacrine cell fate maintenance. Consequently, the retinas of these mice show a sustained light response, which becomes more transient in mice with the auto-stimulation-defective mutation. Together, we show the antagonistic regulation of the α-enhancer activity by Pax6 and the LIM protein complex is necessary for the establishment of an inner retinal circuitry, which controls visual adaptation.