The publication data currently available has been vetted by Vanderbilt faculty, staff, administrators and trainees. The data itself is retrieved directly from NCBI's PubMed and is automatically updated on a weekly basis to ensure accuracy and completeness.
Closely spaced clusters of tandemly duplicated genes (CTDGs) contribute to the diversity of many phenotypes, including chemosensation, snake venom, and animal body plans. CTDGs have traditionally been identified subjectively as genomic neighborhoods containing several gene duplicates in close proximity; however, CTDGs are often highly variable with respect to gene number, intergenic distance, and synteny. This lack of formal definition hampers the study of CTDG evolutionary dynamics and the discovery of novel CTDGs in the exponentially growing body of genomic data. To address this gap, we developed a novel homology-based algorithm, CTDGFinder, which formalizes and automates the identification of CTDGs by examining the physical distribution of individual members of families of duplicated genes across chromosomes. Application of CTDGFinder accurately identified CTDGs for many well-known gene clusters (e.g., Hox and beta-globin gene clusters) in the human, mouse and 20 other mammalian genomes. Differences between previously annotated gene clusters and our inferred CTDGs were due to the exclusion of nonhomologs that have historically been considered parts of specific gene clusters, the inclusion or absence of genes between the CTDGs and their corresponding gene clusters, and the splitting of certain gene clusters into distinct CTDGs. Examination by CTDGFinder of human genes showing tissue-specific enhancement of their expression identified members of several well-known gene clusters (e.g., cytochrome P450s and olfactory receptors) and revealed that they were unequally distributed across tissues. By formalizing and automating CTDG identification, CTDGFinder will facilitate understanding of CTDG evolutionary dynamics, their functional implications, and how they are associated with phenotypic diversity.
© The Author 2016. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.
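The idea of defining a CTDG from the physical distribution of family members along a chromosome can be illustrated with a minimal sketch. This is not CTDGFinder's actual procedure, which the abstract does not detail; it is a simple distance-threshold grouping, and the function name `cluster_family`, the tuple layout, and the `max_gap` parameter are all hypothetical choices made for illustration.

```python
from collections import defaultdict


def cluster_family(genes, max_gap):
    """Group members of ONE homologous gene family into positional clusters.

    genes   : list of (gene_id, chromosome, start, end) tuples (hypothetical layout)
    max_gap : largest allowed distance in bp between neighbouring members
              (an assumed, user-chosen threshold)
    Returns a list of (chromosome, [gene_ids]) with at least two members each.
    """
    by_chrom = defaultdict(list)
    for gene_id, chrom, start, end in genes:
        by_chrom[chrom].append((start, end, gene_id))

    clusters = []
    for chrom, members in by_chrom.items():
        members.sort()  # order family members by start coordinate
        current = [members[0]]
        for gene in members[1:]:
            # distance from the furthest end seen so far to the next start
            prev_end = max(g[1] for g in current)
            if gene[0] - prev_end <= max_gap:
                current.append(gene)
            else:
                if len(current) >= 2:
                    clusters.append((chrom, [g[2] for g in current]))
                current = [gene]
        if len(current) >= 2:
            clusters.append((chrom, [g[2] for g in current]))
    return clusters


# Toy coordinates, invented for illustration only.
toy_family = [
    ("a", "chr1", 100, 200),
    ("b", "chr1", 300, 400),
    ("c", "chr1", 100000, 100100),
    ("d", "chr2", 50, 80),
]
print(cluster_family(toy_family, max_gap=1000))
```

With the toy input, genes `a` and `b` fall into one cluster, while `c` and `d` are left out because each is isolated. A fixed threshold like this makes the subjectivity problem described in the abstract concrete: the result depends entirely on the chosen `max_gap`.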
Helicobacter pylori genetic variation is a crucial component of colonization and persistence within the inhospitable niche of the gastric mucosa. As such, numerous H. pylori genes have been shown to vary in terms of presence and genomic location within this pathogen. Among the variable factors, the Bab family of outer membrane proteins (OMPs) has been shown to differ within subsets of strains. To better understand genetic variation among the bab genes and to determine whether this variation differed among isolates obtained from different geographic locations, we characterized the distribution of the Bab family members in 80 American H. pylori clinical isolates (AH) and 80 South Korean H. pylori clinical isolates (KH). Overall, we identified 23 different bab genotypes (19 in AH and 11 in KH), but only 5 occurred in more than 5 isolates. Regardless of strain origin, a strain in which locus A and locus B were both occupied by a bab gene was the most common (85%); locus C was occupied only in those isolates that carried a bab paralog at both locus A and locus B. While the babA/babB/- genotype predominated in the KH (78.8%), no single genotype could account for more than 40% in the AH collection. In addition to basic genotyping, we also identified associations between bab genotype and the well-known virulence factors cagA and vacA. Specifically, significant associations between babA at locus A and the cagA EPIYA-ABD motif (P<0.0001) and the vacA s1/i1/m1 allele (P<0.0001) were identified. Log-linear modeling further revealed a three-way association between bab carried at locus A, vacA, and number of OMPs from the HOM family (P<0.002). Taken together, this study provides a detailed characterization of the bab genotypes from two distinct populations. Our analysis suggests greater variability in the AH, perhaps due to adaptation to a more diverse host population. Furthermore, an association was observed when the presence or absence of both the bab and homA/B paralogs at their given loci was considered together with the vacA genotype. Our results highlight the multifactorial nature of H. pylori-mediated disease and the importance of considering how the specific combinations of H. pylori virulence genes and their multiple interactions with the host will collectively impact disease progression.
Stalled replication forks are a critical problem for the cell because they can lead to complex genome rearrangements that underlie cell death and disease. Processes such as DNA damage tolerance and replication fork reversal protect stalled forks from these events. A central mediator of these DNA damage responses in humans is the Rad5-related DNA translocase, HLTF. Here, we present biochemical and structural evidence that the HIRAN domain, an ancient and conserved domain found in HLTF and other DNA processing proteins, is a modified oligonucleotide/oligosaccharide-binding (OB) fold that binds to 3' ssDNA ends. We demonstrate that the HIRAN domain promotes HLTF-dependent fork reversal in vitro through its interaction with 3' ssDNA ends found at forks. Finally, we show that HLTF restrains replication fork progression in cells in a HIRAN-dependent manner. These findings establish a mechanism of HLTF-mediated fork reversal and provide insight into the requirement for distinct fork remodeling activities in the cell.
Copyright © 2015 Elsevier Inc. All rights reserved.
Titin is the largest known protein and a critical determinant of myofibril elasticity and sarcomere structure in striated muscle. Accumulating evidence that mRNA transcripts are post-transcriptionally regulated by specific motifs located in the flanking untranslated regions (UTRs) led us to consider the role of the titin 5'-UTR in regulating its translational efficiency. The titin 5'-UTR is highly homologous between human, mouse, and rat, and sequence analysis revealed the presence of a stem-loop and two upstream AUG codons (uAUGs) converging on a shared in-frame stop codon. We generated a mouse titin 5'-UTR luciferase reporter construct and targeted the stem-loop and each uAUG for mutation. The wild-type and mutated constructs were transfected into the cardiac HL-1 cell line and primary neonatal rat ventricular myocytes (NRVM). SV40-driven 5'-UTR luciferase activity was significantly suppressed by the wild-type titin 5'-UTR (∼70% in HL-1 cells and ∼60% in NRVM). Mutating both uAUGs was found to alleviate titin 5'-UTR suppression, while eliminating the stem-loop had no effect. Treatment with various growth stimuli (pacing, PMA, or neuregulin) had no effect on titin 5'-UTR luciferase activity. Doxorubicin stress reduced titin 5'-UTR suppression, while H2O2 had no effect. A reported single nucleotide polymorphism (SNP), rs13422986, at position -4 of uAUG2 was introduced and found to further repress titin 5'-UTR luciferase activity. We conclude that the uAUG motifs in the titin 5'-UTR serve as translational repressors in the control of titin gene expression, and that mutations/SNPs of the uAUGs or doxorubicin stress could alter titin translational efficiency.
Copyright © 2014 Elsevier Inc. All rights reserved.
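The sequence feature described in the titin abstract, upstream AUG codons that converge on a shared in-frame stop codon, can be illustrated with a short scan. This is only a sketch under stated assumptions: the function name `find_uaugs` is hypothetical, the toy sequence is invented and is not the titin 5'-UTR, and the study's own sequence analysis method is not described in the abstract.

```python
STOP_CODONS = {"UAA", "UAG", "UGA"}


def find_uaugs(utr):
    """Locate every AUG in a 5'-UTR and its first in-frame stop codon.

    utr : RNA or DNA string (T is converted to U)
    Returns a list of (aug_index, stop_index) pairs using 0-based positions;
    stop_index is None when no in-frame stop occurs within the UTR.
    """
    utr = utr.upper().replace("T", "U")
    results = []
    for i in range(len(utr) - 2):
        if utr[i:i + 3] != "AUG":
            continue
        stop = None
        # walk downstream codon by codon, staying in the frame set by this AUG
        for j in range(i + 3, len(utr) - 2, 3):
            if utr[j:j + 3] in STOP_CODONS:
                stop = j
                break
        results.append((i, stop))
    return results


# Invented toy sequence: two AUGs in the same frame sharing one stop codon.
toy_utr = "GGATGCCCATGAAATAAGG"
print(find_uaugs(toy_utr))
```

In the toy sequence both AUGs report the same stop position, which is the "converging on a shared in-frame stop codon" arrangement in miniature.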
The TWEAK-fibroblast growth factor-inducible 14 (Fn14) system is a critical regulator of denervation-induced skeletal muscle atrophy. Although the expression of Fn14 is a rate-limiting step in muscle atrophy on denervation, mechanisms regulating gene expression of Fn14 remain unknown. Methylation of CpG sites within a promoter region is an important epigenetic mechanism for gene silencing. Our study demonstrates that the Fn14 promoter contains a CpG island close to the transcription start site. The Fn14 promoter also contains multiple consensus DNA sequences for the transcription factors activator protein 1 (AP1) and specificity protein 1 (SP1). Denervation diminishes overall genomic DNA methylation and causes hypomethylation at specific CpG sites in the Fn14 promoter, leading to increased gene expression of Fn14 in skeletal muscle. Abundance of DNA methyltransferase 3a (Dnmt3a) and its interaction with the Fn14 promoter are repressed in denervated skeletal muscle of mice. Overexpression of Dnmt3a inhibits the gene expression of Fn14 and attenuates skeletal muscle atrophy upon denervation. Denervation also causes the activation of ERK1/2, JNK1/2, and ERK5 MAPKs and AP1 and SP1, which stimulate the expression of Fn14 in skeletal muscle. Collectively, our study provides novel evidence that Dnmt3a and MAPK signaling regulate the levels of Fn14 in skeletal muscle on denervation.
© 2014 by The American Society for Biochemistry and Molecular Biology, Inc.
Current methods for engineering the segmented double-stranded RNA genome of rotavirus (RV) are limited by inefficient recovery of the recombinant virus. In an effort to expand the utility of RV reverse genetics, we developed a method to recover recombinant viruses in which independent selection strategies are used to engineer single-gene replacements. We coupled a mutant SA11 RV encoding a temperature-sensitive (ts) defect in the NSP2 protein with RNAi-mediated degradation of NSP2 mRNAs to isolate a virus containing a single recombinant gene that evades both selection mechanisms. Recovery is rapid and simple; after two rounds of selective passage the recombinant virus reaches titers of ≥10^4 pfu/mL. We used this reverse genetics method to generate a panel of viruses with chimeric NSP2 genes. For one of the chimeric viruses, the introduced NSP2 sequence was obtained from a pathogenic, noncultivated human RV isolate, demonstrating that this reverse genetics system can be used to study the molecular biology of circulating RVs. Combining characterized RV ts mutants and validated siRNA targets should permit the extension of this "two-hit" reverse genetics methodology to other RV genes. Furthermore, application of a dual selection strategy to previously reported reverse genetics methods for RV may enhance the efficiency of recombinant virus recovery.
The diversity of immunoglobulin (Ig) and T cell receptor (TCR) genes available to form the lymphocyte repertoire has the capacity to produce a broad array of both protective and harmful specificities. In type 1 diabetes (T1D), the presence of antibodies to insulin and other islet antigens predicts disease development in both mice and humans, and demonstrates that immune tolerance is lost early in the disease process. Anti-insulin T cells isolated from T1D-prone non-obese diabetic (NOD) mice use polymorphic TCRalpha chains, suggesting that the available T cell repertoire is altered in these autoimmune mice. To probe whether insulin-binding B cells also possess polymorphic V genes, Ig light chains were isolated and sequenced from NOD mice that harbor an Ig heavy chain transgene. Three insulin-binding Vkappa genes were identified, all of which were polymorphic relative to the closest germline sequence matches present in the GenBank database. Additional analysis of over 300 light chain sequences from multiple sources, including germline DNA, shows that polymorphisms are spread throughout the entire NOD Igkappa locus, as these polymorphic sequences represent 43 distinct Vkappa genes which belong to 14 Vkappa families. Database searches reveal that a majority of polymorphic Vkappa genes identified in NOD are identical to Vkappa genes isolated from SLE-prone NZBxNZW F1 or MRL strains of mice, suggesting that a shared Igkappa haplotype may be present. Predicted amino acid changes preferentially occur in CDR, and thus could alter antigen recognition by the germline B cell repertoire of autoimmune versus non-autoimmune mouse strains.
Next-generation sequencing has opened the door to genomic analysis of nonmodel organisms. Technologies generating long-sequence reads (200-400 bp) are increasingly used in evolutionary studies of nonmodel organisms, but the short-sequence reads (30-50 bp) that can be produced at lower cost are thought to be of limited utility for de novo sequencing applications. Here, we tested this assumption by short-read sequencing the transcriptomes of the tropical disease vectors Aedes aegypti and Anopheles gambiae, for which complete genome sequences are available. Comparison of our results to the reference genomes allowed us to accurately evaluate the quantity, quality, and functional and evolutionary information content of our "test" data. We produced more than 0.7 billion nucleotides of sequenced data per species that assembled into more than 21,000 test contigs larger than 100 bp per species and covered approximately 27% of the Aedes reference transcriptome. Remarkably, the substitution error rate in the test contigs was approximately 0.25% per site, with very few indels or assembly errors. Test contigs of both species were enriched for genes involved in energy production and protein synthesis and underrepresented in genes involved in transcription and differentiation. Ortholog prediction using the test contigs was accurate across hundreds of millions of years of evolution. Our results demonstrate the considerable utility of short-read transcriptome sequencing for genomic studies of nonmodel organisms and suggest an approach for assessing the information content of next-generation data for evolutionary studies.
Cytochrome c oxidase (COX) deficiency is a frequent cause of mitochondrial disease in infants. Mutations in the COX assembly gene SCO2 cause fatal infantile cardioencephalomyopathy. All patients reported to date with SCO2 deficiency share a common p.E140K mutation in at least 1 allele. In order to further the understanding of the genotype-phenotype spectrum associated with fatal infantile cardioencephalomyopathy, we describe a novel homozygous SCO2 mutation p.G193S in a patient with fatal infantile cardioencephalomyopathy born to consanguineous parents of Indian ancestry.