The publication data currently available has been vetted by Vanderbilt faculty, staff, administrators and trainees. The data itself is retrieved directly from NCBI's PubMed and is automatically updated on a weekly basis to ensure accuracy and completeness.
The missing heritability of breast cancer could be partially attributed to rare variants (MAF < 0.5%). To identify breast cancer-associated rare coding variants, we conducted whole-exome sequencing (~50×) in genomic DNA samples obtained from 831 breast cancer cases and 839 controls, all Chinese women. Using burden tests for each gene that included rare missense or predicted deleterious variants, we identified 29 genes showing promising associations with breast cancer risk. We replicated the association for two genes, OGDHL and BRCA2, at a Bonferroni-corrected p < 0.05, by genotyping an independent set of samples from 1,628 breast cancer cases and 1,943 controls. The association for OGDHL was primarily driven by three predicted deleterious variants (p.Val827Met, p.Pro839Leu, p.Phe836Ser; p < 0.01 for all). For BRCA2, we characterized a total of 27 disruptive variants, including 18 nonsense, six frameshift and three splicing variants, all of which were detected only in cases and in none of the controls. All of these variants were either very rare (AF < 0.1%) or not detected in >4,500 East Asian women from the Genome Aggregation Database (gnomAD), providing additional support for our findings. Our study revealed a potential novel gene and multiple disruptive variants of BRCA2 for breast cancer risk, which may identify high-risk women in Chinese populations.
© 2019 UICC.
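The gene-level burden test and Bonferroni threshold described in the breast cancer exome abstract above can be illustrated with a minimal sketch. It assumes a simple carrier-count burden test (a one-sided Fisher's exact test on carriers versus non-carriers) and assumes the correction is applied across the 29 candidate genes; the abstract does not specify either choice, and the carrier counts below are hypothetical, not taken from the study.

```python
from math import comb


def fisher_one_sided(case_carriers, n_cases, control_carriers, n_controls):
    """One-sided Fisher's exact p-value for enrichment of carriers in cases.

    Sums hypergeometric probabilities for observing at least
    `case_carriers` carriers among cases, given the table margins.
    """
    total_carriers = case_carriers + control_carriers
    total = n_cases + n_controls
    upper = min(total_carriers, n_cases)
    numerator = sum(
        comb(n_cases, k) * comb(n_controls, total_carriers - k)
        for k in range(case_carriers, upper + 1)
    )
    return numerator / comb(total, total_carriers)


def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold under Bonferroni correction."""
    return alpha / n_tests


# Assumption: correction over the 29 candidate genes from the discovery stage.
threshold = bonferroni_threshold(0.05, 29)

# Hypothetical gene: 10 rare-variant carriers among 831 cases, 0 among 839 controls.
p_value = fisher_one_sided(10, 831, 0, 839)
print(f"p = {p_value:.2e}, threshold = {threshold:.2e}, significant = {p_value < threshold}")
```

Published burden tests usually also adjust for covariates and weight variants by frequency or predicted effect, so this sketch only shows the basic counting logic.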
One of the primary goals of genomic medicine is to improve diagnosis through identification of genomic conditions, which could improve clinical management, prevent complications, and promote health. We explore how genomic medicine is being used to obtain molecular diagnoses for patients with previously undiagnosed diseases in prenatal, paediatric, and adult clinical settings. We focus on the role of clinical genomic sequencing (exome and genome) in aiding patients with conditions that are undiagnosed even after extensive clinical evaluation and testing. In particular, we explore the impact of combining genomic and phenotypic data and integrating multiple data types to improve diagnoses for patients with undiagnosed diseases, and we discuss how these genomic sequencing diagnoses could change clinical management.
Copyright © 2019 Elsevier Ltd. All rights reserved.
BACKGROUND - Congenital hydrocephalus (CH) is a highly morbid disease that features enlarged brain ventricles and impaired cerebrospinal fluid homeostasis. Although early linkage or targeted sequencing studies in large multigenerational families have localized several genes for CH, the etiology of most CH cases remains unclear. Recent advances in whole exome sequencing (WES) have identified five new bona fide CH genes, implicating impaired regulation of neural stem cell fate in CH pathogenesis. Nonetheless, in the majority of CH cases, the pathological etiology remains unknown, suggesting more genes await discovery.
METHODS - WES of family members of a sporadic and familial form of severe L1CAM mutation-negative CH associated with aqueductal stenosis was performed. Rare genetic variants were analyzed, prioritized, and validated. De novo copy number variants (CNVs) were identified using the XHMM algorithm and validated using qPCR. Xenopus oocyte experiments were performed to assess mutation impact on protein function and expression.
RESULTS - A novel inherited protein-damaging mutation (p.Pro605Leu) in SLC12A6, encoding the K⁺-Cl⁻ cotransporter KCC3, was identified in both affected members of multiplex kindred CHYD110. p.Pro605 is conserved in KCC3 orthologs and among all human KCC paralogs. The p.Pro605Leu mutation maps to the ion-transporting domain and significantly reduces KCC3-dependent K⁺ transport. A novel de novo CNV (deletion) was identified in SLC12A7, encoding the KCC3 paralog and binding partner KCC4, in another family (CHYD130) with sporadic CH.
CONCLUSION - These findings identify two novel, related genes associated with CH, and implicate genetically encoded impairments in ion transport for the first time in CH pathogenesis.
© 2019 The Authors. Molecular Genetics & Genomic Medicine published by Wiley Periodicals, Inc.
SUMMARY - Single cell RNA sequencing is a revolutionary technique to characterize inter-cellular transcriptomic heterogeneity. However, the data are noise-prone because gene expression is often driven by both technical artifacts and genuine biological variation. Proper disentanglement of these two effects is critical to prevent spurious results. While several tools exist to detect and remove low-quality cells in a single cell RNA-seq dataset, there is a lack of approaches for examining consistency between sample sets and detecting systematic biases, batch effects and outliers. We present scRNABatchQC, an R package to compare multiple sample sets simultaneously over numerous technical and biological features, which gives valuable hints to distinguish technical artifacts from biological variation. scRNABatchQC helps identify and systematically characterize sources of variability in single cell transcriptome data. The examination of consistency across datasets allows visual detection of biases and outliers.
AVAILABILITY AND IMPLEMENTATION - scRNABatchQC is freely available at https://github.com/liuqivandy/scRNABatchQC as an R package.
SUPPLEMENTARY INFORMATION - Supplementary data are available at Bioinformatics online.
© The Author(s) 2019. Published by Oxford University Press.
Our comprehensive analysis of alternative splicing across 32 The Cancer Genome Atlas cancer types from 8,705 patients detects alternative splicing events and tumor variants by reanalyzing RNA and whole-exome sequencing data. Tumors have up to 30% more alternative splicing events than normal samples. Association analysis of somatic variants with alternative splicing events confirmed known trans associations with variants in SF3B1 and U2AF1 and identified additional trans-acting variants (e.g., TADA1, PPP2R1A). Many tumors have thousands of alternative splicing events not detectable in normal samples; on average, we identified ≈930 exon-exon junctions ("neojunctions") in tumors not typically found in GTEx normals. From Clinical Proteomic Tumor Analysis Consortium data available for breast and ovarian tumor samples, we confirmed ≈1.7 neojunction- and ≈0.6 single nucleotide variant-derived peptides per tumor sample that are also predicted major histocompatibility complex-I binders ("putative neoantigens").
Copyright © 2018 Elsevier Inc. All rights reserved.
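The idea of a "neojunction" in the alternative splicing abstract above (an exon-exon junction seen in a tumor but not typically found in normal samples) can be sketched as a filtered set difference. This is an illustration only: the junction representation, the function name, and the `max_normal_fraction` cutoff are assumptions, since the abstract does not define "not typically found" numerically.

```python
def neojunctions(tumor_junctions, normal_samples, max_normal_fraction=0.01):
    """Return tumor junctions seen in at most `max_normal_fraction` of normals.

    tumor_junctions: set of (chrom, donor_pos, acceptor_pos) tuples.
    normal_samples:  list of sets in the same format, one per normal sample.
    The cutoff is a hypothetical parameter, not a value from the study.
    """
    n_normals = len(normal_samples)
    kept = set()
    for junction in tumor_junctions:
        n_seen = sum(1 for sample in normal_samples if junction in sample)
        if n_normals == 0 or n_seen / n_normals <= max_normal_fraction:
            kept.add(junction)
    return kept


# Hypothetical toy data: two normal samples and one tumor sample.
normals = [
    {("chr1", 100, 200), ("chr1", 300, 400)},
    {("chr1", 100, 200)},
]
tumor = {("chr1", 100, 200), ("chr1", 300, 400), ("chr2", 500, 900)}

print(sorted(neojunctions(tumor, normals)))
```

With real data the normal panel would contain many samples, so a small nonzero cutoff tolerates junctions that appear only sporadically in normals.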
Genome-wide association studies have identified numerous variants associated with lipid levels; yet, the majority are located in non-coding regions with unclear mechanisms. In the Insulin Resistance Atherosclerosis Family Study (IRASFS), heritability estimates suggest a strong genetic basis: low-density lipoprotein (LDL, h² = 0.50), high-density lipoprotein (HDL, h² = 0.57), total cholesterol (TC, h² = 0.53), and triglyceride (TG, h² = 0.42) levels. Exome sequencing of 1,205 Mexican Americans (90 pedigrees) from the IRASFS identified 548,889 variants, and association and linkage analyses with lipid levels were performed. One genome-wide significant signal was detected in APOA5 with TG (rs651821, P = 3.67 × 10, LOD = 2.36, MAF = 14.2%). In addition, two correlated SNPs (r² = 1.0) rs189547099 (P = 6.31 × 10, LOD = 3.13, MAF = 0.50%) and chr4:157997598 (P = 6.31 × 10, LOD = 3.13, MAF = 0.50%) reached exome-wide significance (P < 9.11 × 10). rs189547099 is an intronic SNP in FNIP2 and SNP chr4:157997598 is intronic in GLRB. Linkage analysis revealed 46 SNPs with a LOD > 3 with the strongest signal at rs1141070 (LOD = 4.30, P = 0.33, MAF = 21.6%) in DFFB. A total of 53 nominally associated variants (P < 5.00 × 10, MAF ≥ 1.0%) were selected for replication in six Mexican-American cohorts (N = 3,280). The strongest signal observed was a synonymous variant (rs1160983, P = 4.44 × 10, MAF = 2.7%) in TOMM40. Beyond primary findings, previously reported lipid loci were fine-mapped using exome sequencing in IRASFS. These results support that exome sequencing complements and extends insights into the genetics of lipid levels.
The Cancer Genome Atlas (TCGA) cancer genomics dataset includes over 10,000 tumor-normal exome pairs across 33 different cancer types, in total >400 TB of raw data files requiring analysis. Here we describe the Multi-Center Mutation Calling in Multiple Cancers project, our effort to generate a comprehensive encyclopedia of somatic mutation calls for the TCGA data to enable robust cross-tumor-type analyses. Our approach accounts for variance and batch effects introduced by the rapid advancement of DNA extraction, hybridization-capture, sequencing, and analysis methods over time. We present best practices for applying an ensemble of seven mutation-calling algorithms with scoring and artifact filtering. The dataset created by this analysis includes 3.5 million somatic variants and forms the basis for PanCan Atlas papers. The results have been made available to the research community along with the methods used to generate them. This project is the result of collaboration from a number of institutes and demonstrates how team science drives extremely large genomics projects.
Copyright © 2018 The Authors. Published by Elsevier Inc. All rights reserved.
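The ensemble approach in the TCGA mutation-calling abstract above combines calls from several algorithms. A minimal sketch of one common way to do this, simple vote counting, follows. The caller names, the variant tuple format, and the `min_callers=2` rule are all assumptions for illustration; the abstract does not state the project's actual scoring or filtering rules.

```python
from collections import Counter


def consensus_calls(calls_by_caller, min_callers=2):
    """Keep variants reported by at least `min_callers` distinct callers.

    calls_by_caller: dict mapping a caller name to an iterable of variants,
    each variant a (chrom, pos, ref, alt) tuple.
    Returns a dict mapping each retained variant to its support count.
    """
    support = Counter()
    for variants in calls_by_caller.values():
        # set() ensures a caller votes at most once per variant.
        support.update(set(variants))
    return {variant: n for variant, n in support.items() if n >= min_callers}


# Hypothetical calls from three placeholder callers.
calls = {
    "caller_a": [("chr7", 55191822, "T", "G"), ("chr12", 25245350, "C", "T")],
    "caller_b": [("chr7", 55191822, "T", "G")],
    "caller_c": [("chr7", 55191822, "T", "G"), ("chr1", 1000, "A", "C")],
}

for variant, n in sorted(consensus_calls(calls).items()):
    print(variant, "supported by", n, "callers")
```

Vote counting trades sensitivity for specificity: raising `min_callers` removes more caller-specific artifacts but can also drop real low-frequency variants.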
We hypothesize that the relative mitochondrial copy number (MTCN) can be estimated by comparing the abundance of mitochondrial DNA reads to nuclear DNA reads in high-throughput sequencing data. To test this hypothesis, we examined relative MTCN across 13 breast cancer cell lines using the RT-PCR-based NovaQUANT Human Mitochondrial to Nuclear DNA Ratio Kit as the gold standard. Six distinct computational approaches were used to estimate the relative MTCN for comparison with the RT-PCR measurements. The results demonstrate that relative MTCN estimated from exome sequencing data, but not from RNA-seq data, correlates well with the RT-PCR measurements. Through analysis of copy number variants (CNVs) in The Cancer Genome Atlas, we show that the two nuclear genes used in the NovaQUANT assay to represent the nuclear genome often experience CNVs in tumor cells, calling into question the accuracy of this gold-standard method when it is applied to tumor cells.
Copyright © 2017 Elsevier Inc. All rights reserved.
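The core idea of the MTCN abstract above, comparing mitochondrial to nuclear read abundance, can be sketched as a ratio scaled to a reference sample. This is one plausible formulation under stated assumptions, not any of the six approaches evaluated in the study (which the abstract does not describe); the sample names and read counts are hypothetical.

```python
def mt_to_nuclear_ratio(mt_reads, nuclear_reads):
    """Ratio of reads mapped to the mitochondrial genome to nuclear reads."""
    if nuclear_reads <= 0:
        raise ValueError("nuclear_reads must be positive")
    return mt_reads / nuclear_reads


def relative_mtcn(read_counts, reference):
    """Scale each sample's mt:nuclear ratio to that of a reference sample.

    read_counts: dict mapping sample name to (mt_reads, nuclear_reads).
    reference:   name of the sample whose relative MTCN is defined as 1.0.
    """
    ref_ratio = mt_to_nuclear_ratio(*read_counts[reference])
    return {
        name: mt_to_nuclear_ratio(mt, nuc) / ref_ratio
        for name, (mt, nuc) in read_counts.items()
    }


# Hypothetical read counts for two cell lines.
counts = {
    "line_A": (1_000, 100_000),
    "line_B": (3_000, 150_000),
}

for name, value in relative_mtcn(counts, reference="line_A").items():
    print(f"{name}: relative MTCN = {value:.2f}")
```

Because the estimate is relative, capture efficiency and sequencing depth largely cancel out only if all samples were processed with the same protocol.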
This study describes a 13-yr-old girl with orthostatic intolerance, respiratory weakness, multiple endocrine abnormalities, pancreatic insufficiency, and multiorgan failure involving the gut and bladder. Exome sequencing revealed a de novo, loss-of-function allele in SLC12A2, the gene encoding the Na⁺-K⁺-2Cl⁻ cotransporter-1 (NKCC1). The 11-bp deletion in exon 22 results in a frameshift (p.Val1026Phe*2) and truncation of the carboxy-terminal tail of the cotransporter. Preliminary studies in heterologous expression systems demonstrate that the mutation leads to a nonfunctional transporter, which is expressed and trafficked to the plasma membrane alongside wild-type NKCC1. The truncated protein, visible at higher molecular sizes, indicates either enhanced dimerization or misfolded aggregates. No significant dominant-negative effect was observed. K⁺ transport experiments performed in fibroblasts from the patient showed reduced total and NKCC1-mediated K⁺ influx. The absence of a bumetanide effect on K⁺ influx in patient fibroblasts only under hypertonic conditions suggests a deficit in NKCC1 regulation. We propose that disruption in NKCC1 function might affect sensory afferents and/or smooth muscle cells, as their functions depend on NKCC1 creating a Cl⁻ gradient across the plasma membrane. This Cl⁻ gradient allows the γ-aminobutyric acid (GABA) receptor or other Cl⁻ channels to depolarize the membrane, affecting processes such as neurotransmission or cell contraction. Under this hypothesis, disrupted sensory and smooth muscle function in a diverse set of tissues could explain the patient's phenotype.
Genetically engineered mouse models (GEMMs) of cancer are increasingly being used to assess putative driver mutations identified by large-scale sequencing of human cancer genomes. To accurately interpret experiments that introduce additional mutations, an understanding of the somatic genetic profile and evolution of GEMM tumors is necessary. Here, we performed whole-exome sequencing of tumors from three GEMMs of lung adenocarcinoma driven by mutant epidermal growth factor receptor (EGFR), mutant Kirsten rat sarcoma viral oncogene homolog (Kras), or overexpression of MYC proto-oncogene. Tumors from EGFR- and Kras-driven models exhibited, respectively, 0.02 and 0.07 nonsynonymous mutations per megabase, a dramatically lower average mutational frequency than observed in human lung adenocarcinomas. Tumors from models driven by strong cancer drivers (mutant EGFR and Kras) harbored few mutations in known cancer genes, whereas tumors driven by MYC, a weaker initiating oncogene in the murine lung, acquired recurrent clonal oncogenic Kras mutations. In addition, although EGFR- and Kras-driven models both exhibited recurrent whole-chromosome DNA copy number alterations, the specific chromosomes altered by gain or loss were different in each model. These data demonstrate that GEMM tumors exhibit relatively simple somatic genotypes compared with human cancers of a similar type, making these autochthonous model systems useful for additive engineering approaches to assess the potential of novel mutations on tumorigenesis, cancer progression, and drug sensitivity.