The publication data currently available has been vetted by Vanderbilt faculty, staff, administrators and trainees. The data itself is retrieved directly from NCBI's PubMed and is automatically updated on a weekly basis to ensure accuracy and completeness.
The missing heritability of breast cancer could be partially attributed to rare variants (MAF < 0.5%). To identify breast cancer-associated rare coding variants, we conducted whole-exome sequencing (~50×) in genomic DNA samples obtained from 831 breast cancer cases and 839 controls, all Chinese women. Using burden tests for each gene that included rare missense or predicted deleterious variants, we identified 29 genes showing promising associations with breast cancer risk. We replicated the association for two genes, OGDHL and BRCA2, at a Bonferroni-corrected p < 0.05, by genotyping an independent set of samples from 1,628 breast cancer cases and 1,943 controls. The association for OGDHL was primarily driven by three predicted deleterious variants (p.Val827Met, p.Pro839Leu, p.Phe836Ser; p < 0.01 for all). For BRCA2, we characterized a total of 27 disruptive variants, including 18 nonsense, six frameshift and three splicing variants, all of which were detected only in cases and in none of the controls. All of these variants were either very rare (AF < 0.1%) or not detected in >4,500 East Asian women from the Genome Aggregation Database (gnomAD), providing additional support for our findings. Our study revealed a potential novel gene and multiple disruptive variants of BRCA2 for breast cancer risk, which may help identify high-risk women in Chinese populations.
© 2019 UICC.
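The gene-level burden test described in the abstract above can be illustrated with a minimal sketch. It assumes a simple carrier-collapsing design (a sample counts as a carrier if it has at least one qualifying rare variant in the gene) and a one-sided Fisher's exact test; the abstract does not specify which burden statistic the authors used, and all sample IDs, variant IDs, and counts below are hypothetical toy data, not study results.

```python
from math import comb


def fisher_exact_greater(a, b, c, d):
    """One-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Rows are cases/controls, columns are carriers/non-carriers.
    Returns P(case carriers >= a) under the hypergeometric null.
    """
    n_cases = a + b
    n_carriers = a + c
    n_total = a + b + c + d
    upper = min(n_cases, n_carriers)
    tail = sum(
        comb(n_cases, k) * comb(n_total - n_cases, n_carriers - k)
        for k in range(a, upper + 1)
    )
    return tail / comb(n_total, n_carriers)


def burden_test(carried, case_ids, control_ids, qualifying):
    """Collapse qualifying variants in one gene into carrier status and test.

    carried: dict mapping sample ID -> set of variant IDs the sample carries.
    qualifying: set of rare missense / predicted deleterious variant IDs.
    """
    def count_carriers(ids):
        return sum(1 for s in ids if carried.get(s, set()) & qualifying)

    a = count_carriers(case_ids)
    c = count_carriers(control_ids)
    p = fisher_exact_greater(a, len(case_ids) - a, c, len(control_ids) - c)
    return a, c, p


# Hypothetical toy data: 10 cases, 10 controls, one gene with three
# qualifying variants (v1-v3); v9 is a non-qualifying variant.
case_ids = [f"case{i}" for i in range(10)]
control_ids = [f"ctrl{i}" for i in range(10)]
qualifying = {"v1", "v2", "v3"}
carried = {
    "case0": {"v1"},
    "case1": {"v2"},
    "case2": {"v3"},
    "case3": {"v1", "v9"},
    "case4": {"v2"},
    "ctrl0": {"v9"},
}

case_carriers, control_carriers, p_value = burden_test(
    carried, case_ids, control_ids, qualifying
)
print(case_carriers, control_carriers, round(p_value, 4))
```

In a real analysis the per-gene p-values would then be compared against a multiple-testing threshold, such as the Bonferroni correction the abstract mentions for the replication stage.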
One of the primary goals of genomic medicine is to improve diagnosis through identification of genomic conditions, which could improve clinical management, prevent complications, and promote health. We explore how genomic medicine is being used to obtain molecular diagnoses for patients with previously undiagnosed diseases in prenatal, paediatric, and adult clinical settings. We focus on the role of clinical genomic sequencing (exome and genome) in aiding patients with conditions that are undiagnosed even after extensive clinical evaluation and testing. In particular, we explore the impact of combining genomic and phenotypic data and integrating multiple data types to improve diagnoses for patients with undiagnosed diseases, and we discuss how these genomic sequencing diagnoses could change clinical management.
Copyright © 2019 Elsevier Ltd. All rights reserved.
BACKGROUND - Congenital hydrocephalus (CH) is a highly morbid disease that features enlarged brain ventricles and impaired cerebrospinal fluid homeostasis. Although early linkage or targeted sequencing studies in large multigenerational families have localized several genes for CH, the etiology of most CH cases remains unclear. Recent advances in whole exome sequencing (WES) have identified five new bona fide CH genes, implicating impaired regulation of neural stem cell fate in CH pathogenesis. Nonetheless, in the majority of CH cases, the pathological etiology remains unknown, suggesting more genes await discovery.
METHODS - WES of family members of a sporadic and familial form of severe L1CAM mutation-negative CH associated with aqueductal stenosis was performed. Rare genetic variants were analyzed, prioritized, and validated. De novo copy number variants (CNVs) were identified using the XHMM algorithm and validated using qPCR. Xenopus oocyte experiments were performed to assess mutation impact on protein function and expression.
RESULTS - A novel inherited protein-damaging mutation (p.Pro605Leu) in SLC12A6, encoding the K⁺-Cl⁻ cotransporter KCC3, was identified in both affected members of multiplex kindred CHYD110. p.Pro605 is conserved in KCC3 orthologs and among all human KCC paralogs. The p.Pro605Leu mutation maps to the ion-transporting domain, and significantly reduces KCC3-dependent K⁺ transport. A novel de novo CNV (deletion) was identified in SLC12A7, encoding the KCC3 paralog and binding partner KCC4, in another family (CHYD130) with sporadic CH.
CONCLUSION - These findings identify two novel, related genes associated with CH, and implicate genetically encoded impairments in ion transport for the first time in CH pathogenesis.
© 2019 The Authors. Molecular Genetics & Genomic Medicine published by Wiley Periodicals, Inc.
SUMMARY - Single cell RNA sequencing is a revolutionary technique to characterize intercellular transcriptomic heterogeneity. However, the data are noise-prone because gene expression is often driven by both technical artifacts and genuine biological variation. Proper disentanglement of these two effects is critical to prevent spurious results. While several tools exist to detect and remove low-quality cells in a single cell RNA-seq dataset, there is a lack of approaches for examining consistency between sample sets and detecting systematic biases, batch effects and outliers. We present scRNABatchQC, an R package to compare multiple sample sets simultaneously over numerous technical and biological features, which gives valuable hints to distinguish technical artifacts from biological variation. scRNABatchQC helps identify and systematically characterize sources of variability in single cell transcriptome data. The examination of consistency across datasets allows visual detection of biases and outliers.
AVAILABILITY AND IMPLEMENTATION - scRNABatchQC is freely available at https://github.com/liuqivandy/scRNABatchQC as an R package.
SUPPLEMENTARY INFORMATION - Supplementary data are available at Bioinformatics online.
© The Author(s) 2019. Published by Oxford University Press.
Elevated serum urate levels can cause gout, an excruciating disease with suboptimal treatment. Previous GWAS identified common variants with modest effects on serum urate. Here we report large-scale whole-exome sequencing association studies of serum urate and kidney function among ≤19,517 European ancestry and African-American individuals. We identify aggregate associations of low-frequency damaging variants in the urate transporters SLC22A12 (URAT1; p = 1.3 × 10) and SLC2A9 (p = 4.5 × 10). Gout risk in rare SLC22A12 variant carriers is halved (OR = 0.5, p = 4.9 × 10). Selected rare variants in SLC22A12 are validated in transport studies, confirming three as loss-of-function (R325W, R405C, and T467M) and illustrating the therapeutic potential of the new URAT1-blocker lesinurad. In SLC2A9, mapping of rare variants of large effects onto the predicted protein structure reveals new residues that may affect urate binding. These findings provide new insights into the genetic architecture of serum urate, and highlight molecular targets in SLC22A12 and SLC2A9 for lowering serum urate and preventing gout.
Our comprehensive analysis of alternative splicing across 32 The Cancer Genome Atlas cancer types from 8,705 patients detects alternative splicing events and tumor variants by reanalyzing RNA and whole-exome sequencing data. Tumors have up to 30% more alternative splicing events than normal samples. Association analysis of somatic variants with alternative splicing events confirmed known trans associations with variants in SF3B1 and U2AF1 and identified additional trans-acting variants (e.g., TADA1, PPP2R1A). Many tumors have thousands of alternative splicing events not detectable in normal samples; on average, we identified ≈930 exon-exon junctions ("neojunctions") in tumors not typically found in GTEx normals. From Clinical Proteomic Tumor Analysis Consortium data available for breast and ovarian tumor samples, we confirmed ≈1.7 neojunction- and ≈0.6 single nucleotide variant-derived peptides per tumor sample that are also predicted major histocompatibility complex-I binders ("putative neoantigens").
Copyright © 2018 Elsevier Inc. All rights reserved.
Genome-wide association studies have identified numerous variants associated with lipid levels; yet, the majority are located in non-coding regions with unclear mechanisms. In the Insulin Resistance Atherosclerosis Family Study (IRASFS), heritability estimates suggest a strong genetic basis: low-density lipoprotein (LDL, h² = 0.50), high-density lipoprotein (HDL, h² = 0.57), total cholesterol (TC, h² = 0.53), and triglyceride (TG, h² = 0.42) levels. Exome sequencing of 1,205 Mexican Americans (90 pedigrees) from the IRASFS identified 548,889 variants, and association and linkage analyses with lipid levels were performed. One genome-wide significant signal was detected in APOA5 with TG (rs651821, P = 3.67 × 10, LOD = 2.36, MAF = 14.2%). In addition, two correlated SNPs (r² = 1.0) rs189547099 (P = 6.31 × 10, LOD = 3.13, MAF = 0.50%) and chr4:157997598 (P = 6.31 × 10, LOD = 3.13, MAF = 0.50%) reached exome-wide significance (P < 9.11 × 10). rs189547099 is an intronic SNP in FNIP2 and SNP chr4:157997598 is intronic in GLRB. Linkage analysis revealed 46 SNPs with a LOD > 3 with the strongest signal at rs1141070 (LOD = 4.30, P = 0.33, MAF = 21.6%) in DFFB. A total of 53 nominally associated variants (P < 5.00 × 10, MAF ≥ 1.0%) were selected for replication in six Mexican-American cohorts (N = 3,280). The strongest signal observed was a synonymous variant (rs1160983, P = 4.44 × 10, MAF = 2.7%) in TOMM40. Beyond primary findings, previously reported lipid loci were fine-mapped using exome sequencing in IRASFS. These results support that exome sequencing complements and extends insights into the genetics of lipid levels.
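As a worked check on the exome-wide significance threshold quoted in the abstract above, the short sketch below computes a Bonferroni-corrected threshold for the 548,889 variants reported. It assumes a family-wise α of 0.05 and one test per variant; the abstract does not state how the threshold was derived, so the agreement with the quoted mantissa (9.11) is an inference rather than a documented method. The function name `bonferroni_threshold` is introduced here for illustration only.

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold under a Bonferroni correction."""
    if n_tests <= 0:
        raise ValueError("n_tests must be positive")
    return alpha / n_tests


# Assumption: alpha = 0.05, one test for each of the 548,889 variants.
threshold = bonferroni_threshold(0.05, 548_889)
print(f"{threshold:.2e}")  # prints 9.11e-08
```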
The Cancer Genome Atlas (TCGA) cancer genomics dataset includes over 10,000 tumor-normal exome pairs across 33 different cancer types, in total >400 TB of raw data files requiring analysis. Here we describe the Multi-Center Mutation Calling in Multiple Cancers project, our effort to generate a comprehensive encyclopedia of somatic mutation calls for the TCGA data to enable robust cross-tumor-type analyses. Our approach accounts for variance and batch effects introduced by the rapid advancement of DNA extraction, hybridization-capture, sequencing, and analysis methods over time. We present best practices for applying an ensemble of seven mutation-calling algorithms with scoring and artifact filtering. The dataset created by this analysis includes 3.5 million somatic variants and forms the basis for PanCan Atlas papers. The results have been made available to the research community along with the methods used to generate them. This project is the result of collaboration from a number of institutes and demonstrates how team science drives extremely large genomics projects.
Copyright © 2018 The Authors. Published by Elsevier Inc. All rights reserved.
Genetic association studies often examine features independently, potentially missing subpopulations with multiple phenotypes that share a single cause. We describe an approach that aggregates phenotypes on the basis of patterns described by Mendelian diseases. We mapped the clinical features of 1204 Mendelian diseases into phenotypes captured from the electronic health record (EHR) and summarized this evidence as phenotype risk scores (PheRSs). In an initial validation, PheRS distinguished cases and controls of five Mendelian diseases. Applying PheRS to 21,701 genotyped individuals uncovered 18 associations between rare variants and phenotypes consistent with Mendelian diseases. In 16 patients, the rare genetic variants were associated with severe outcomes such as organ transplants. PheRS can augment rare-variant interpretation and may identify subsets of patients with distinct genetic causes for common diseases.
Copyright © 2018 The Authors, some rights reserved; exclusive licensee American Association for the Advancement of Science. No claim to original U.S. Government Works.
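The phenotype risk score (PheRS) idea in the abstract above, aggregating EHR phenotypes that match the clinical features of a Mendelian disease, can be sketched as a weighted sum. The weighting used here, the log of the inverse prevalence of each phenotype in the cohort, is an assumption made for illustration; the abstract does not give the scoring formula. Patient IDs and phenotype labels are hypothetical.

```python
from math import log


def phenotype_weights(ehr):
    """Weight each phenotype by log(1 / prevalence) within the cohort.

    ehr: dict mapping patient ID -> set of phenotype labels in the record.
    Rare phenotypes receive larger weights than common ones (assumed scheme).
    """
    n_patients = len(ehr)
    counts = {}
    for phenotypes in ehr.values():
        for p in phenotypes:
            counts[p] = counts.get(p, 0) + 1
    return {p: log(n_patients / c) for p, c in counts.items()}


def phers(patient_phenotypes, disease_features, weights):
    """Sum the weights of the disease's features present in one patient."""
    return sum(weights.get(p, 0.0) for p in disease_features & patient_phenotypes)


# Hypothetical toy cohort with placeholder phenotype labels A, B, C.
ehr = {
    "p1": {"A", "B"},
    "p2": {"A"},
    "p3": {"C"},
    "p4": set(),
}
disease_features = {"A", "B"}  # features of one hypothetical Mendelian disease

weights = phenotype_weights(ehr)
scores = {pid: phers(phenos, disease_features, weights) for pid, phenos in ehr.items()}
for pid in sorted(scores):
    print(pid, round(scores[pid], 3))
```

Patients whose records contain several rare, disease-matching features score highest, which is the property that lets a score of this kind flag candidates for rare-variant follow-up.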
Keloids are benign dermal tumors occurring approximately 20 times more often in individuals of African descent than in individuals of European descent. While most keloids occur sporadically, a genetic predisposition is supported by both familial aggregation of some keloids and large differences in risk among populations. Despite Africans and African Americans being at increased risk over lighter-skinned individuals, little genetic research exists into this phenotype. Using a combination of admixture mapping and exome analysis, we reported multiple common variants within chr15q21.2-22.3 associated with risk of keloid formation in African Americans. Here we describe a gene-based association analysis using 478 African American samples with exome genotyping data to identify genes containing low-frequency variants associated with keloids, with evaluation of genetically-predicted gene expression in skin tissues using association summary statistics. The strongest signal from gene-based association was located in C15orf63 (P-value = 6.6 × 10 ) located at 15q15.3. The top result from gene expression was increased predicted DCAF4 expression (P-value = 5.5 × 10 ) in non-sun-exposed skin, followed by increased predicted OR10A3 expression in sun-exposed skin (P-value = 6.9 × 10 ). Our findings identify variation with putative roles in keloid formation, enhanced by the use of predicted gene expression to support the biological roles of variation identified only through genetic association studies.
© 2018 John Wiley & Sons Ltd/University College London.