The publication data currently available has been vetted by Vanderbilt faculty, staff, administrators and trainees. The data itself is retrieved directly from NCBI's PubMed and is automatically updated on a weekly basis to ensure accuracy and completeness.
The human genome contains approximately 20 thousand protein-coding genes, but the size of the collection of antigen receptors of the adaptive immune system that is generated by the recombination of gene segments with non-templated junctional additions (on B cells) is unknown, although it is certainly orders of magnitude larger. It has not been established whether individuals possess unique (or private) repertoires or substantial components of shared (or public) repertoires. Here we sequence recombined and expressed B cell receptor genes in several individuals to determine the size of their B cell receptor repertoires, and the extent to which these are shared between individuals. Our experiments revealed that the circulating repertoire of each individual contained between 9 and 17 million B cell clonotypes. The three individuals that we studied shared many clonotypes, including between 1 and 6% of B cell heavy-chain clonotypes shared between two subjects (0.3% of clonotypes shared by all three) and 20 to 34% of λ or κ light chains shared between two subjects (16 or 22% of λ or κ light chains, respectively, were shared by all three). Some of the B cell clonotypes had thousands of clones, or somatic variants, within the clonotype lineage. Although some of these shared lineages might be driven by exposure to common antigens, previous exposure to foreign antigens was not the only force that shaped the shared repertoires, as we also identified shared clonotypes in umbilical cord blood samples and all adult repertoires. The unexpectedly high prevalence of shared clonotypes in B cell repertoires, and identification of the sequences of these shared clonotypes, should enable better understanding of the role of B cell immune repertoires in health and disease.
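The pairwise and three-way sharing percentages reported in this abstract can be illustrated with plain set operations. The sketch below is not the authors' pipeline. It assumes each clonotype has already been reduced to a hashable key, and it assumes the percentage is taken relative to the smallest repertoire involved; the abstract does not state which denominator was used. The function name `percent_shared` is hypothetical.

```python
from itertools import combinations


def percent_shared(repertoires):
    """Return (pairwise, shared_by_all) sharing percentages.

    repertoires: dict mapping subject id -> set of clonotype keys.
    Denominator (an assumption): size of the smallest repertoire involved.
    """
    pairwise = {}
    for a, b in combinations(sorted(repertoires), 2):
        shared = repertoires[a] & repertoires[b]
        denom = min(len(repertoires[a]), len(repertoires[b]))
        pairwise[(a, b)] = 100.0 * len(shared) / denom if denom else 0.0

    # Clonotypes present in every subject's repertoire
    common = set.intersection(*repertoires.values())
    denom_all = min(len(r) for r in repertoires.values())
    shared_by_all = 100.0 * len(common) / denom_all if denom_all else 0.0
    return pairwise, shared_by_all


# Toy clonotype keys, invented purely for illustration
toy = {
    "S1": {"a", "b", "c", "d"},
    "S2": {"a", "b", "e", "f"},
    "S3": {"a", "g", "h", "i"},
}
pairwise, shared_by_all = percent_shared(toy)
```

With real data, how a clonotype key is defined (for example, which gene segments and junction sequence are compared) largely determines how much sharing is observed, so that choice should be reported alongside the percentages.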
BACKGROUND - Wilms tumor (WT) is the most common childhood kidney cancer worldwide, yet its incidence and clinical behavior vary according to race and access to adequate healthcare resources. To guide and streamline therapy in the war-torn and resource-constrained city of Baghdad, Iraq, we conducted a first-ever molecular analysis of 20 WT specimens to characterize the biological features of this lethal disease within this challenged population.
METHODS - Next-generation sequencing of ten target genes associated with WT development and treatment resistance (WT1, CTNNB1, WTX, IGF2, CITED1, SIX2, p53, N-MYC, CRABP2, and TOP2A) was completed. Immunohistochemistry was performed for 6 marker proteins of WT (WT1, CTNNB1, NCAM, CITED1, SIX2, and p53). Patient outcomes were compiled.
RESULTS - Mutations were detected in previously described WT "hot spots" (e.g., WT1 and CTNNB1) as well as novel loci that may be unique to the Iraqi population. Immunohistochemistry showed expression domains most typical of blastemal-predominant WT. Remarkably, despite the challenges facing families and care providers, only one child, with combined WT1 and CTNNB1 mutations, was confirmed dead from disease. Median clinical follow-up was 40.5 months (range 6-78 months).
CONCLUSIONS - These data suggest that WT biology within a population of Iraqi children manifests features both similar to and distinct from disease variants in other regions of the world. These observations will help to risk-stratify WT patients living in this difficult environment to more or less intensive therapies and to focus treatment on cell-specific targets.
Hyperpolarization-activated Cyclic Nucleotide-gated (HCN) channels are important regulators of excitability in neural, cardiac, and other pacemaking cells, which are often altered in disease. In mice, loss of HCN2 leads to cardiac dysrhythmias, persistent spike-wave discharges similar to those seen in absence epilepsy, ataxia, tremor, reduced neuropathic and inflammatory pain, antidepressant-like behavior, infertility, and severely restricted growth. While many of these phenotypes have tissue-specific mechanisms, the cause of restricted growth in HCN2 knockout animals remains unknown. Here, we characterize a novel, 3 kb insertion mutation of Hcn2 in the Tremor and Reduced Lifespan 2 (TRLS/2J) mouse that leads to complete loss of HCN2 protein, and we show that this mutation causes many phenotypes similar to other mice lacking HCN2 expression. We then demonstrate that while TRLS/2J mice have low blood glucose levels and impaired growth, dysfunction in hormonal secretion from the pancreas, pituitary, and thyroid is unlikely to lead to this phenotype. Instead, we find that homozygous TRLS/2J mice have abnormal gastrointestinal function that is characterized by less food consumption and delayed gastrointestinal transit as compared to wildtype mice. In summary, a novel mutation in HCN2 likely leads to impaired GI motility, causing the severe growth restriction seen in mice with mutations that eliminate HCN2 expression.
BACKGROUND - Adverse viral and medication effects on adipose tissue contribute to the development of metabolic disease in HIV-infected persons, but T cells also have a central role modulating local inflammation and adipocyte function. We sought to characterize potentially proinflammatory T-cell populations in adipose tissue among persons on long-term antiretroviral therapy and assess whether adipose tissue CD8 T cells represent an expanded, oligoclonal population.
METHODS - We recruited 10 HIV-infected, non-diabetic, overweight or obese adults on efavirenz, tenofovir, and emtricitabine for >4 years with consistent viral suppression. We collected fasting blood and subcutaneous abdominal adipose tissue to measure the percentage of CD4 and CD8 T cells expressing activation, exhaustion, late differentiation/senescence, and memory surface markers. We performed T-cell receptor (TCR) sequencing on sorted CD8 cells. We compared the proportion of each T-cell subset and the TCR repertoire diversity in blood versus adipose tissue.
RESULTS - Adipose tissue had a higher percentage of CD3+CD8+ T cells compared with blood (61.0% vs. 51.7%, P < 0.01) and was enriched for both activated CD8+HLA-DR+ T cells (5.5% vs. 0.9%, P < 0.01) and late-differentiated CD8+CD57+ T cells (37.4% vs. 22.7%, P < 0.01). Adipose tissue CD8 T cells displayed distinct TCRβ V and J gene usage, and the Shannon Entropy index, a measure of overall TCRβ repertoire diversity, was lower compared with blood (4.39 vs. 4.46; P = 0.05).
CONCLUSIONS - Adipose tissue is enriched for activated and late-differentiated CD8 T cells with distinct TCR usage. These cells may contribute to tissue inflammation and impaired adipocyte fitness in HIV-infected persons.
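The Shannon entropy index cited in the results of this abstract summarizes how evenly sequencing reads are spread across clonotypes: H = -Σ p_i log(p_i), where p_i is the fraction of reads belonging to clonotype i. A minimal sketch follows. The logarithm base and any normalization used in the study are not stated in the abstract, so the natural-log default here is an assumption, and `shannon_entropy` is a hypothetical helper name.

```python
import math


def shannon_entropy(clonotype_counts, base=math.e):
    """Shannon entropy of a repertoire from per-clonotype read counts.

    clonotype_counts: iterable of non-negative counts, one per clonotype.
    base: logarithm base (assumption: natural log by default).
    """
    counts = [c for c in clonotype_counts if c > 0]
    total = sum(counts)
    if total <= 0:
        raise ValueError("at least one clonotype must have a positive count")

    entropy = 0.0
    for count in counts:
        p = count / total  # fraction of reads in this clonotype
        entropy -= p * math.log(p, base)
    return entropy


# Four equally abundant clonotypes give the maximum for four classes
even = shannon_entropy([25, 25, 25, 25], base=2)
# A single dominant clonotype pulls the index down
skewed = shannon_entropy([97, 1, 1, 1], base=2)
```

A lower value indicates a repertoire dominated by fewer clonotypes, which is the direction of the adipose-versus-blood difference reported above.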
BACKGROUND - High throughput sequencing technology enables both the human genome and transcriptome to be screened at single-nucleotide resolution. Tools have been developed to infer single nucleotide variants (SNVs) from both DNA and RNA sequencing data. To evaluate how much difference can be expected between DNA and RNA sequencing data, and among tissue sources, we designed a study to examine the single nucleotide difference among five sources of high throughput sequencing data generated from the same individual, including exome sequencing from blood, tumor and adjacent normal tissue, and RNAseq from tumor and adjacent normal tissue.
RESULTS - Through careful quality control and analysis of the SNVs, we found little difference between DNA-DNA pairs (1%-2%). However, between DNA-RNA pairs, SNV differences ranged anywhere from 10% to 20%.
CONCLUSIONS - Only a small portion of these differences can be explained by RNA editing. Instead, the majority of the DNA-RNA differences should be attributed to technical errors from sequencing and post-processing of RNAseq data. Our analysis results suggest that SNV detection using RNAseq is subject to high false positive rates.
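One simple way to express an SNV difference between two call sets from the same individual is the percentage of jointly called sites where the genotypes disagree. The sketch below assumes that definition, which the abstract does not spell out, and assumes each call set is a mapping from a (chromosome, position) key to a genotype string. The name `snv_discordance` is hypothetical.

```python
def snv_discordance(calls_a, calls_b):
    """Percentage of jointly called sites with differing genotypes.

    calls_a, calls_b: dicts mapping (chrom, pos) -> genotype string.
    Sites called in only one data set are ignored (an assumption).
    """
    shared_sites = calls_a.keys() & calls_b.keys()
    if not shared_sites:
        raise ValueError("the two call sets have no sites in common")

    mismatches = sum(1 for site in shared_sites if calls_a[site] != calls_b[site])
    return 100.0 * mismatches / len(shared_sites)


# Toy genotype calls, invented for illustration
exome_tumor = {("chr1", 100): "A/G", ("chr1", 200): "C/C",
               ("chr2", 50): "T/T", ("chr2", 60): "G/G"}
rnaseq_tumor = {("chr1", 100): "A/G", ("chr1", 200): "C/T",
                ("chr2", 50): "T/T", ("chr2", 60): "G/G",
                ("chr3", 10): "A/A"}
difference_pct = snv_discordance(exome_tumor, rnaseq_tumor)
```

Restricting the comparison to sites with adequate coverage in both data sets matters in practice, because RNAseq coverage depends on expression level.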
We hypothesize that the relative mitochondrial copy number (MTCN) can be estimated by comparing the abundance of mitochondrial DNA to nuclear DNA reads using high throughput sequencing data. To test this hypothesis, we examined relative MTCN across 13 breast cancer cell lines using the RT-PCR-based NovaQUANT Human Mitochondrial to Nuclear DNA Ratio Kit as the gold standard. Six distinct computational approaches were used to estimate the relative MTCN in order to compare to the RT-PCR measurements. The results demonstrate that relative MTCN correlates well with the RT-PCR measurements using exome sequencing data, but not RNA-seq data. Through analysis of copy number variants (CNVs) in The Cancer Genome Atlas, we show that the two nuclear genes used in the NovaQUANT assay to represent the nuclear genome often experience CNVs in tumor cells, questioning the accuracy of this gold-standard method when it is applied to tumor cells.
Copyright © 2017 Elsevier Inc. All rights reserved.
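The idea of comparing mitochondrial to nuclear read abundance can be sketched as a ratio of read densities. This is one plausible formulation and is not claimed to be any of the six approaches evaluated in the study, which the abstract does not describe. The function `relative_mtcn` and its parameters are hypothetical; `nuclear_target_bp` stands for the number of nuclear bases the reads could map to (for exome data, the capture target size).

```python
MITO_LENGTH_BP = 16_569  # length of the human mitochondrial reference (rCRS)


def relative_mtcn(mito_reads, nuclear_reads, nuclear_target_bp,
                  mito_bp=MITO_LENGTH_BP):
    """Relative mitochondrial copy number as a ratio of read densities.

    mito_reads: reads aligned to the mitochondrial genome.
    nuclear_reads: reads aligned to the nuclear target region.
    nuclear_target_bp: size of that nuclear region in base pairs (assumption:
        supplied by the caller, e.g. the exome capture size).
    """
    if nuclear_reads <= 0 or nuclear_target_bp <= 0 or mito_bp <= 0:
        raise ValueError("read counts and region sizes must be positive")

    mito_density = mito_reads / mito_bp                   # reads per mito base
    nuclear_density = nuclear_reads / nuclear_target_bp   # reads per nuclear base
    return mito_density / nuclear_density


# Invented counts for two samples; only the ratio between samples is meaningful
sample_a = relative_mtcn(mito_reads=200_000, nuclear_reads=40_000_000,
                         nuclear_target_bp=50_000_000)
sample_b = relative_mtcn(mito_reads=400_000, nuclear_reads=40_000_000,
                         nuclear_target_bp=50_000_000)
```

The value is relative: it is intended for comparison across samples processed the same way, since capture efficiency and mapping filters affect the absolute number.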
BACKGROUND - Enhancers are DNA regulatory elements that influence gene expression. There is substantial diversity in enhancers' activity patterns: some enhancers drive expression in a single cellular context, while others are active across many. Sequence characteristics, such as transcription factor (TF) binding motifs, influence the activity patterns of regulatory sequences; however, the regulatory logic through which specific sequences drive enhancer activity patterns is poorly understood. Recent analysis of Drosophila enhancers suggested that short dinucleotide repeat motifs (DRMs) are general enhancer sequence features that drive broad regulatory activity. However, it is not known whether the regulatory role of DRMs is conserved across species.
RESULTS - We performed a comprehensive analysis of the relationship between short DNA sequence patterns, including DRMs, and human enhancer activity in 38,538 enhancers across 411 different contexts. In a machine-learning framework, the occurrence patterns of short sequence motifs accurately predicted broadly active human enhancers. However, DRMs alone were weakly predictive of broad enhancer activity in humans and showed different enrichment patterns than in Drosophila. In general, GC-rich sequence motifs were significantly associated with broad enhancer activity, and consistent with this enrichment, broadly active human TFs recognize GC-rich motifs.
CONCLUSIONS - Our results reveal the importance of specific sequence motifs in broadly active human enhancers, demonstrate the lack of evolutionary conservation of the role of DRMs, and provide a computational framework for investigating the logic of enhancer sequences.
Summary - After the introduction of high-throughput sequencing, genotyping arrays continue to be a viable source for conducting large-scale genetic studies. Currently, Illumina is one of the largest genotyping array manufacturers. One technical issue that has always plagued the post-processing of Illumina genotyping array data is the strand definition. Against convention, Illumina uses its own definition of strand, which is inconsistent with the standard reference forward and reverse definition. This issue has been a major obstacle in the consistency of reporting, meta-analysis and correct interpretation of phenotype association results. To date, the strand issue has not been adequately addressed, prompting us to develop StrandScript, a tool that can convert all genotyping data generated from Illumina genotyping arrays to the reference forward strand. StrandScript works independently of the Illumina array version and is future-proof for newer Illumina array designs. Furthermore, StrandScript can examine an Illumina genotyping array manifest file and can detect all problematic SNPs, including SNPs with the wrong RS ID and SNPs with mismatched probe sequences. Here, we introduce StrandScript's design and development, and demonstrate its effectiveness using real genotyping data.
Availability and Implementation - https://github.com/seasky002002/Strandscript.
Contact - email@example.com.
Supplementary information - Supplementary data are available at Bioinformatics online.
© The Author (2017). Published by Oxford University Press. All rights reserved. For Permissions, please email: firstname.lastname@example.org
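Converting genotypes to the reference forward strand, as described in the StrandScript summary, comes down to complementing alleles for probes that sit on the reverse strand. The sketch below illustrates only that general step and is not StrandScript's implementation. It assumes the strand of each SNP relative to the reference is already known; the function names are hypothetical.

```python
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def to_forward_strand(alleles, on_reverse_strand):
    """Return alleles expressed on the reference forward strand.

    alleles: sequence of single-base alleles, e.g. ("A", "G").
    on_reverse_strand: True if the alleles are reported on the reverse strand
        (assumption: determined beforehand, e.g. by aligning the probe).
    """
    if not on_reverse_strand:
        return tuple(alleles)
    return tuple(COMPLEMENT[allele] for allele in alleles)


def is_strand_ambiguous(allele1, allele2):
    """True for A/T and C/G SNPs, whose allele pair looks the same on both strands."""
    return COMPLEMENT[allele1] == allele2


flipped = to_forward_strand(("A", "G"), on_reverse_strand=True)
ambiguous = is_strand_ambiguous("A", "T")
```

A/T and C/G SNPs are the reason allele labels alone cannot resolve strand: complementing them yields the same pair of letters, so the strand has to come from the probe sequence or manifest.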
Near the end of the Pleistocene epoch, populations of the woolly mammoth (Mammuthus primigenius) were distributed across parts of three continents, from western Europe and northern Asia through Beringia to the Atlantic seaboard of North America. Nonetheless, questions about the connectivity and temporal continuity of mammoth populations and species remain unanswered. We use a combination of targeted enrichment and high-throughput sequencing to assemble and interpret a dataset of 143 mammoth mitochondrial genomes, sampled from fossils recovered from across their Holarctic range. Our dataset includes 54 previously unpublished mitochondrial genomes and significantly increases the coverage of the Eurasian range of the species. The resulting global phylogeny confirms that the Late Pleistocene mammoth population comprised three distinct mitochondrial lineages that began to diverge ~1.0-2.0 million years ago (Ma). We also find that mammoth mitochondrial lineages were strongly geographically partitioned throughout the Pleistocene. In combination, our genetic results and the pattern of morphological variation in time and space suggest that male-mediated gene flow, rather than large-scale dispersals, was important in the Pleistocene evolutionary history of mammoths.
Analysis of high throughput sequencing data starts with alignment against a reference genome, which is the foundation for all re-sequencing data analyses. Each new release of the human reference genome has improved in accuracy and completeness. It is presumed that the latest release of the human reference genome, GRCh38, will contribute more to high throughput sequencing data analysis by providing more accuracy, but the amount of improvement has not yet been quantified. We conducted a study to compare the genomic analysis results between the GRCh38 reference and its predecessor GRCh37. Through analyses of alignment, single nucleotide polymorphisms, small insertions/deletions, copy number and structural variants, we show that GRCh38 offers overall more accurate analysis of human sequencing data. More importantly, GRCh38 produced fewer false positive structural variants. In conclusion, GRCh38 not only improves on GRCh37 as a genome assembly but also yields more reliable genomic analysis results.
Copyright © 2017. Published by Elsevier Inc.