The publication data currently available has been vetted by Vanderbilt faculty, staff, administrators and trainees. The data itself is retrieved directly from NCBI's PubMed and is automatically updated on a weekly basis to ensure accuracy and completeness.
Context - Vitamin D inadequacy is common in the adult population of the United States. Although the genetic determinants underlying vitamin D inadequacy have been studied in people of European ancestry, less is known about populations with Hispanic or African ancestry.
Objective - The Trans-Ethnic Evaluation of Vitamin D (TRANSCEN-D) genomewide association study (GWAS) consortium was assembled to replicate genetic associations with 25-hydroxyvitamin D [25(OH)D] concentrations from the Study of Underlying Genetic Determinants of Vitamin D and Highly Related Traits (SUNLIGHT) meta-analyses of European ancestry and to identify genetic variants related to vitamin D concentrations in African and Hispanic ancestries.
Design - Ancestry-specific (Hispanic and African) and transethnic (Hispanic, African, and European) meta-analyses were performed with Meta-Analysis Helper software (METAL).
Patients or Other Participants - In total, 8541 African American and 3485 Hispanic American (from North America) participants from 12 cohorts and 16,124 European participants from SUNLIGHT were included in the study.
Main Outcome Measures - Blood concentrations of 25(OH)D were measured for all participants.
Results - Ancestry-specific analyses in African and Hispanic Americans replicated single nucleotide polymorphisms (SNPs) in GC (2 and 4 SNPs, respectively). An SNP (rs79666294) near the KIF4B gene was identified in the African American cohort. Transethnic evaluation replicated GC and DHCR7 region SNPs. Additionally, the transethnic analyses revealed SNPs rs719700 and rs1410656 near the ANO6/ARID2 and HTR2A genes, respectively.
Conclusions - Ancestry-specific and transethnic GWASs of 25(OH)D confirmed findings in GC and DHCR7 for African and Hispanic American samples and revealed findings near KIF4B, ANO6/ARID2, and HTR2A. The biological mechanisms that link these regions with 25(OH)D metabolism warrant further investigation.
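The meta-analyses described in this abstract pool per-cohort association estimates into a single estimate per variant. The abstract does not say which weighting scheme was used, so the following is only a minimal sketch of one common option, a fixed-effect inverse-variance-weighted combination. The function name and the example numbers are hypothetical and are not taken from the study.

```python
import math


def inverse_variance_meta(betas, ses):
    """Combine per-cohort effect estimates with fixed-effect
    inverse-variance weights (illustrative sketch only)."""
    if not betas or len(betas) != len(ses):
        raise ValueError("betas and ses must be non-empty and equal in length")
    if any(se <= 0 for se in ses):
        raise ValueError("standard errors must be positive")

    # Each cohort is weighted by the inverse of its sampling variance.
    weights = [1.0 / (se ** 2) for se in ses]
    total_weight = sum(weights)

    pooled_beta = sum(w * b for w, b in zip(weights, betas)) / total_weight
    pooled_se = math.sqrt(1.0 / total_weight)
    z = pooled_beta / pooled_se

    # Two-sided p-value under a standard normal approximation.
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return pooled_beta, pooled_se, z, p_value


if __name__ == "__main__":
    # Hypothetical effect sizes for one SNP in two cohorts.
    beta, se, z, p = inverse_variance_meta([0.10, 0.20], [0.05, 0.05])
    print(f"beta={beta:.3f} se={se:.4f} z={z:.2f} p={p:.2e}")
```

With equal standard errors the pooled estimate is the simple average of the cohort estimates, and the pooled standard error shrinks by a factor of the square root of the number of cohorts.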
A major challenge in evaluating the contribution of rare variants to complex disease is identifying enough copies of the rare alleles to permit informative statistical analysis. To investigate the contribution of rare variants to the risk of type 2 diabetes (T2D) and related traits, we performed deep whole-genome analysis of 1,034 members of 20 large Mexican-American families with high prevalence of T2D. If rare variants of large effect accounted for much of the diabetes risk in these families, our experiment was powered to detect association. Using gene expression data on 21,677 transcripts for 643 pedigree members, we identified evidence for large-effect rare-variant cis-expression quantitative trait loci that could not be detected in population studies, validating our approach. However, we did not identify any rare variants of large effect associated with T2D, or the related traits of fasting glucose and insulin, suggesting that large-effect rare variants account for only a modest fraction of the genetic risk of these traits in this sample of families. Reliable identification of large-effect rare variants will require larger samples of extended pedigrees or different study designs that further enrich for such variants.
Genome-wide association studies (GWAS) have identified >250 loci for body mass index (BMI), implicating pathways related to neuronal biology. Most GWAS loci represent clusters of common, noncoding variants from which pinpointing causal genes remains challenging. Here we combined data from 718,734 individuals to discover rare and low-frequency (minor allele frequency (MAF) < 5%) coding variants associated with BMI. We identified 14 coding variants in 13 genes, of which 8 variants were in genes (ZBTB7B, ACHE, RAPGEF3, RAB21, ZFHX3, ENTPD6, ZFR2 and ZNF169) newly implicated in human obesity, 2 variants were in genes (MC4R and KSR2) previously observed to be mutated in extreme obesity and 2 variants were in GIPR. The effect sizes of rare variants are ~10 times larger than those of common variants, with the largest effect observed in carriers of an MC4R mutation introducing a stop codon (p.Tyr35Ter, MAF = 0.01%), who weighed ~7 kg more than non-carriers. Pathway analyses based on the variants associated with BMI confirm enrichment of neuronal genes and provide new evidence for adipocyte and energy expenditure biology, widening the potential of genetically supported therapeutic targets in obesity.
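The frequency threshold in this abstract (MAF < 5%) can be made concrete with a short calculation. The sketch below computes a minor allele frequency from genotype counts and assigns a frequency class. The 5% cutoff comes from the abstract; the 1% boundary between "rare" and "low-frequency" is a common convention assumed here, not a value stated in the text, and the function names and counts are hypothetical.

```python
def minor_allele_frequency(n_ref_hom, n_het, n_alt_hom):
    """Return the minor allele frequency for a biallelic variant,
    given counts of the three genotype classes."""
    n_individuals = n_ref_hom + n_het + n_alt_hom
    if n_individuals <= 0:
        raise ValueError("at least one genotyped individual is required")

    # Each individual carries two alleles; heterozygotes carry one alternate allele.
    alt_freq = (n_het + 2 * n_alt_hom) / (2 * n_individuals)

    # The minor allele is whichever of the two alleles is less frequent.
    return min(alt_freq, 1.0 - alt_freq)


def frequency_class(maf, rare_cutoff=0.01, low_cutoff=0.05):
    """Classify a variant by MAF. rare_cutoff is an assumed convention;
    low_cutoff matches the 5% threshold quoted in the abstract."""
    if maf < rare_cutoff:
        return "rare"
    if maf < low_cutoff:
        return "low-frequency"
    return "common"


if __name__ == "__main__":
    # Hypothetical counts: 2 heterozygous carriers among 10,000 people.
    maf = minor_allele_frequency(9998, 2, 0)
    print(f"MAF = {maf:.4%} -> {frequency_class(maf)}")
```

In the hypothetical example, 2 carriers among 10,000 people give a MAF of 0.01%, the same order of magnitude as the MC4R stop-codon variant mentioned above.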
Genomic maps of local ancestry identify ancestry transitions: points on a chromosome where recent recombination events in admixed individuals have joined two different ancestral haplotypes. These events bring together alleles that evolved within separate continental populations, providing a unique opportunity to evaluate the joint effect of these alleles on health outcomes. In this work, we evaluate the impact of genetic variants in the context of nearby local ancestry transitions within a sample of nearly 10,000 adults of African ancestry with traits derived from electronic health records. Genetic data was located using the Metabochip and used to derive local ancestry. We develop a model that captures the effect of both single variants and local ancestry, and use it to identify examples where local ancestry transitions significantly interact with nearby variants to influence metabolic traits. In our most compelling example, we find that the minor allele of rs16890640 occurring on a European background with a downstream local ancestry transition to African ancestry results in significantly lower mean corpuscular hemoglobin and volume. This finding represents a new way of discovering genetic interactions, and is supported by molecular data that suggest changes to local ancestry may impact local chromatin looping.
Biomarker definitions for preclinical Alzheimer's disease (AD) have identified individuals with neurodegeneration (ND+) without β-amyloidosis (Aβ-) and labeled them with suspected non-AD pathophysiology (SNAP). We evaluated Apolipoprotein E (APOE) ε2 and ε4 allele frequencies across four biomarker groups: Aβ-/ND- (n = 268), Aβ+/ND- (n = 236), Aβ-/ND+ or SNAP (n = 78), and Aβ+/ND+ (n = 204). We hypothesized that SNAP would have an APOE profile comparable to Aβ-/ND-. Using AD Neuroimaging Initiative data (n = 786, 72±7 years, 48% female), amyloid status (Aβ+ or Aβ-) was defined by cerebrospinal fluid (CSF) Aβ-42 levels, and neurodegeneration status (ND+ or ND-) was defined by hippocampal volume from MRI. Binary logistic regression related biomarker status to APOE ε2 and ε4 allele carrier status, adjusting for age, sex, education, and cognitive diagnosis. Compared to the biomarker negative (Aβ-/ND-) participants, higher proportions of ε4 and lower proportions of ε2 carriers were observed among Aβ+/ND- (ε4: OR = 6.23, p<0.001; ε2: OR = 0.53, p = 0.03) and Aβ+/ND+ participants (ε4: OR = 12.07, p<0.001; ε2: OR = 0.29, p = 0.004). SNAP participants were statistically comparable to biomarker negative participants (p-values>0.30). In supplemental analyses, comparable results were observed when coding SNAP using amyloid imaging and when using CSF tau levels. In contrast to APOE, a polygenic risk score for AD that excluded APOE did not show an association with amyloidosis or neurodegeneration (p-values>0.15), but did show an association with SNAP defined using CSF tau (β = 0.004, p = 0.02). Thus, in a population with low levels of cerebrovascular disease and a lower prevalence of SNAP than the general population, APOE and known genetic drivers of AD do not appear to contribute to the neurodegeneration observed in SNAP. Additional work in population-based samples is needed to better elucidate the genetic contributors to various etiological drivers of SNAP.
Most genome-wide association studies have been of European individuals, even though most genetic variation in humans is seen only in non-European samples. To search for novel loci associated with blood lipid levels and clarify the mechanism of action at previously identified lipid loci, we used an exome array to examine protein-coding genetic variants in 47,532 East Asian individuals. We identified 255 variants at 41 loci that reached chip-wide significance, including 3 novel loci and 14 East Asian-specific coding variant associations. After a meta-analysis including >300,000 European samples, we identified an additional nine novel loci. Sixteen genes were identified by protein-altering variants in both East Asians and Europeans, and thus are likely to be functional genes. Our data demonstrate that most of the low-frequency or rare coding variants associated with lipids are population specific, and that examining genomic data across diverse ancestries may facilitate the identification of functional genes at associated loci.
Melanoma is the deadliest form of skin cancer and presents a significant health care burden in many countries. In addition to ultraviolet radiation in sunlight, the main causal factor for melanoma, genetic factors also play an important role in melanoma susceptibility. Although genome-wide association studies have identified many single nucleotide polymorphisms associated with melanoma, little is known about the proportion of disease risk attributable to these loci and their distribution throughout the genome. Here, we investigated the genetic architecture of melanoma in 1,888 cases and 990 controls of European non-Hispanic ancestry. We estimated the overall narrow-sense heritability of melanoma to be 0.18 (P < 0.03), indicating that genetics contributes significantly to the risk of sporadically-occurring melanoma. We then demonstrated that only a small proportion of this risk is attributable to known risk variants, suggesting that much remains unknown of the role of genetics in melanoma. To investigate further the genetic architecture of melanoma, we partitioned the heritability by chromosome, minor allele frequency, and functional annotations. We showed that common genetic variation contributes significantly to melanoma risk, with a risk model defined by a handful of genomic regions rather than many risk loci distributed throughout the genome. We also demonstrated that variants affecting gene expression in skin account for a significant proportion of the heritability, and are enriched among melanoma risk loci. Finally, by incorporating skin color into our analyses, we observed both a shift in significance for melanoma-associated loci and an enrichment of expression quantitative trait loci among melanoma susceptibility variants. These findings suggest that skin color may be an important modifier of melanoma risk. We speculate that incorporating skin color and other non-genetic factors into genetic studies may allow for an improved understanding of melanoma susceptibility and guide future investigations to identify melanoma risk genes.
We recently developed base editing, the programmable conversion of target C:G base pairs to T:A without inducing double-stranded DNA breaks (DSBs) or requiring homology-directed repair using engineered fusions of Cas9 variants and cytidine deaminases. Over the past year, the third-generation base editor (BE3) and related technologies have been successfully used by many researchers in a wide range of organisms. The product distribution of base editing (the frequency with which the target C:G is converted to mixtures of undesired by-products, along with the desired T:A product) varies in a target site-dependent manner. We characterize determinants of base editing outcomes in human cells and establish that the formation of undesired products is dependent on uracil N-glycosylase (UNG) and is more likely to occur at target sites containing only a single C within the base editing activity window. We engineered CDA1-BE3 and AID-BE3, which use cytidine deaminase homologs that increase base editing efficiency for some sequences. On the basis of these observations, we engineered fourth-generation base editors (BE4 and SaBE4) that increase the efficiency of C:G to T:A base editing by approximately 50%, while halving the frequency of undesired by-products compared to BE3. Fusing BE3, BE4, SaBE3, or SaBE4 to Gam, a bacteriophage Mu protein that binds DSBs, greatly reduces indel formation during base editing, in most cases to below 1.5%, and further improves product purity. BE4, SaBE4, BE4-Gam, and SaBE4-Gam represent the state of the art in C:G-to-T:A base editing, and we recommend their use in future efforts.
Uterine fibroids are benign tumors of the uterus affecting up to 77% of women by menopause. They are the leading indication for hysterectomy, and account for $34 billion annually in the United States. Race/ethnicity and age are the strongest known risk factors. African American (AA) women have higher prevalence, earlier onset, and larger and more numerous fibroids than European American women. We conducted a multi-stage genome-wide association study (GWAS) of fibroid risk among AA women followed by in silico genetically predicted gene expression profiling of top hits. In Stage 1, cases and controls were confirmed by pelvic imaging, genotyped and imputed to 1000 Genomes. Stage 2 used self-reported fibroid and GWAS data from 23andMe, Inc. and the Black Women's Health Study. Associations with fibroid risk were modeled using logistic regression adjusted for principal components, followed by meta-analysis of results. We observed a significant association among 3399 AA cases and 4764 AA controls at rs739187 (risk-allele frequency = 0.27) in CYTH4 (OR (95% confidence interval) = 1.23 (1.16-1.30), p value = 7.82 × 10). Evaluation of the genetic association results with MetaXcan identified lower predicted gene expression of CYTH4 in thyroid tissue as significantly associated with fibroid risk (p value = 5.86 × 10). In this first multi-stage GWAS for fibroids among AA women, we identified a novel risk locus for fibroids within CYTH4 that impacts gene expression in thyroid and has potential biological relevance for fibroids.
BACKGROUND - Genomic data is increasingly collected by a wide array of organizations. As such, there is a growing demand to make summary information about such collections available more widely. However, over the past decade, a series of investigations have shown that attacks, rooted in statistical inference methods, can be applied to discern the presence of a known individual's DNA sequence in the pool of subjects. Recently, it was shown that the Beacon Project of the Global Alliance for Genomics and Health, a web service for querying about the presence (or absence) of a specific allele, was vulnerable. The Integrating Data for Analysis, Anonymization, and Sharing (iDASH) Center modeled a track in their third Privacy Protection Challenge on how to mitigate the Beacon vulnerability. We developed the winning solution for this track.
METHODS - This paper describes our computational method to optimize the tradeoff between the utility and the privacy of the Beacon service. We generalize the genomic data sharing problem beyond that which was introduced in the iDASH Challenge to be more representative of real world scenarios to allow for a more comprehensive evaluation. We then conduct a sensitivity analysis of our method with respect to several state-of-the-art methods using a dataset of 400,000 positions in Chromosome 10 for 500 individuals from Phase 3 of the 1000 Genomes Project. All methods are evaluated for utility, privacy and efficiency.
RESULTS - Our method achieves better performance than all state-of-the-art methods, irrespective of how key factors (e.g., the allele frequency in the population, the size of the pool and utility weights) change from the original parameters of the problem. We further illustrate that it is possible for our method to exhibit subpar performance under special cases of allele query sequences. However, we show our method can be extended to address this issue when the query sequence is fixed and known a priori to the data custodian, so that they may stage their responses accordingly.
CONCLUSIONS - This research shows that it is possible to thwart the attack on Beacon services, without substantially altering the utility of the system, using computational methods. The method we initially developed is limited by the design of the scenario and evaluation protocol for the iDASH Challenge; however, it can be improved by allowing the data custodian to act in a staged manner.
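To make the utility and privacy tradeoff in this abstract concrete, the sketch below models a toy Beacon-style service that answers presence or absence queries for alleles in a pooled dataset, with an optional chance of hiding a true "yes" answer. This is not the authors' winning method, which the abstract does not specify. Random flipping is a naive stand-in chosen only for illustration, and the class name, allele identifiers, and parameters are all hypothetical.

```python
import random


class ToyBeacon:
    """Toy presence/absence service over a pool of genomes.

    Each genome is represented as a set of allele identifiers
    (a hypothetical format chosen for this sketch)."""

    def __init__(self, genomes, flip_prob=0.0, seed=0):
        if not 0.0 <= flip_prob <= 1.0:
            raise ValueError("flip_prob must be between 0 and 1")
        self._alleles = set().union(*genomes)
        self._flip_prob = flip_prob
        self._rng = random.Random(seed)
        self._answers = {}  # cache so repeated queries get a consistent answer

    def query(self, allele):
        """Return True if the service reports the allele as present."""
        if allele not in self._answers:
            present = allele in self._alleles
            # Naive mitigation: sometimes hide a true "yes" answer.
            if present and self._rng.random() < self._flip_prob:
                present = False
            self._answers[allele] = present
        return self._answers[allele]


def utility(beacon, genomes, queries):
    """Fraction of queries that the beacon answers truthfully."""
    truth = set().union(*genomes)
    correct = sum(beacon.query(a) == (a in truth) for a in queries)
    return correct / len(queries)


if __name__ == "__main__":
    pool = [{"chr10:100:A", "chr10:200:T"}, {"chr10:200:T", "chr10:300:G"}]
    queries = ["chr10:100:A", "chr10:200:T", "chr10:300:G", "chr10:400:C"]
    for prob in (0.0, 0.5, 1.0):
        beacon = ToyBeacon(pool, flip_prob=prob, seed=1)
        print(prob, utility(beacon, pool, queries))
```

Hiding more "yes" answers gives a re-identification attack less signal but lowers the fraction of truthful responses, which is the tension that any mitigation for a Beacon service has to balance.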