The publication data currently available has been vetted by Vanderbilt faculty, staff, administrators and trainees. The data itself is retrieved directly from NCBI's PubMed and is automatically updated on a weekly basis to ensure accuracy and completeness.
At least five of the six genes of the bikaverin secondary metabolic gene cluster were shown to have undergone horizontal gene transfer (HGT) from a Fusarium donor to the Botrytis lineage. Of these five, two enzyme-encoding genes are found as pseudogenes in B. cinerea whereas two regulatory genes and the transporter remain intact. To reconstruct the evolutionary events leading to decay of this gene cluster and infer a more precise timing of its transfer, we examined the genomes of nine additional broadly sampled Botrytis species. We found evidence that a Botrytis ancestor acquired the entire gene cluster through an ancient HGT that occurred before the diversification of the genus. During the subsequent evolution and diversification of the genus, four of the 10 genomes appear to have lost the gene cluster, while in the other six the cluster is in various stages of degeneration. Across the Botrytis genomes, the modes of gene decay in the cluster differed between enzyme-encoding genes, which had higher rates of transition to or retention of pseudogenes and were universally inactivated, and regulatory genes (particularly the non-pathway-specific regulator bik4), which more frequently appeared intact. Consistent with these results, the regulatory genes bik4 and bik5 showed stronger evidence of transcriptional expression than other bikaverin genes under multiple conditions in B. cinerea. These results could be explained by pleiotropy in the bikaverin regulatory genes, either through rewiring or their interaction with more central pathways, or by constraints on the order of gene loss driven by the intrinsic toxicity of the pathway. Our finding that most of the bikaverin pathway genes have been lost or pseudogenized in these Botrytis genomes suggests that the incidence of HGT of gene cluster-encoded metabolic pathways might be higher than can be inferred from analyses of isolated genomes.
Pseudogenes populate the mammalian genome as remnants of artefactual incorporation of coding messenger RNAs into transposon pathways. Here we show that a subset of pseudogenes generates endogenous small interfering RNAs (endo-siRNAs) in mouse oocytes. These endo-siRNAs are often processed from double-stranded RNAs formed by hybridization of spliced transcripts from protein-coding genes to antisense transcripts from homologous pseudogenes. An inverted repeat pseudogene can also generate abundant small RNAs directly. A second class of endo-siRNAs may enforce repression of mobile genetic elements, acting together with Piwi-interacting RNAs. Loss of Dicer, a protein integral to small RNA production, increases expression of endo-siRNA targets, demonstrating their regulatory activity. Our findings indicate a function for pseudogenes in regulating gene expression by means of the RNA interference pathway and may, in part, explain the evolutionary pressure to conserve argonaute-mediated catalysis in mammals.
Of the 57 human cytochromes P450 (P450) and 58 pseudogenes discovered to date (http://drnelson.utmem.edu/CytochromeP450.html), one-quarter still remain "orphans" in the sense that their function, expression sites, and regulation are still largely not elucidated. The post-human genome-sequencing project era has presented the research community with novel challenges. Despite many insights gathered about gene location and genetic variations in our human genome, we still lack important knowledge about these novel P450 enzymes and their functions in endogenous and exogenous metabolism, as well as their possible roles in the metabolism of toxicants and carcinogens. Our own list of such orphans currently consists of 13 members: P450 2A7, 2S1, 2U1, 2W1, 3A43, 4A22, 4F11, 4F22, 4V2, 4X1, 4Z1, 20A1, and 27C1. Some of the orphans, e.g. P450s 2W1 and 2U1, already have putative assigned functions in arachidonic acid metabolism and may activate carcinogens. However, at this point, for the majority of them more knowledge is available about their genes and single nucleotide polymorphisms than about their biological functions. It is noteworthy that most P450 orphans show high interspecies sequence conservation and have orthologs in rodents (e.g. CYP4X1/Cyp4x1, CYP4V2/Cyp4v3). This review summarizes recent knowledge about the P450 orphans and questions remaining about their specific roles in human metabolism.
We have performed X-inactivation and sequence analyses on 350 kb of sequence from human Xp11.2, a region shown previously to contain a cluster of genes that escape X inactivation, and we compared this region with the region of conserved synteny in mouse. We identified several new transcripts from this region in human and in mouse, which defined the full extent of the domain escaping X inactivation in both species. In human, escape from X inactivation involves an uninterrupted 235-kb domain of multiple genes. Despite highly conserved gene content and order between the two species, Smcx is the only mouse gene from the conserved segment that escapes inactivation. As repetitive sequences are believed to facilitate spreading of X inactivation along the chromosome, we compared the repetitive sequence composition of this region between the two species. We found that long terminal repeats (LTRs) were decreased in the human domain of escape, but not in the majority of the conserved mouse region adjacent to Smcx in which genes were subject to X inactivation, suggesting that these repeats might be excluded from escape domains to prevent spreading of silencing. Our findings indicate that genomic context and gene-specific regulatory elements interact to determine expression of a gene from the inactive X chromosome.
Copyright 2004 Cold Spring Harbor Laboratory Press
Three linked genes for the CXC-chemokine melanoma growth stimulatory activity/growth related protein (MGSA/GRO) have been previously characterized and mapped to chromosome 4q12-q13. We have isolated and characterized a pseudogene, MGSA/GRO delta, which is 83% similar to the MGSA/GRO alpha gene in the region spanning the 5' UTR, first and second exons, and the first intron. The 5' upstream sequence for the MGSA/GRO delta gene, which is also very similar to that of the MGSA/GRO alpha, beta, and gamma genes, contains a conserved NF-kappa B motif, a TATA box, and a transcription initiation site. However, the sequence becomes markedly divergent after the second exon, and hybridization studies indicate that sequences similar to the third and fourth exons of other MGSA/GRO genes are not present in this gene. Additional sequence differences include alteration of the MGSA/GRO delta translation initiation codon and a one-base insertion resulting in an apparent frameshift and early termination within exon 2. Multiple mutations such as these are characteristic of pseudogenes.
Sterol 14alpha-demethylase (P45014DM) encoded by CYP51 is a member of the cytochrome P450 (CYP) gene superfamily involved in sterol biosynthesis in fungi, plants, and animals. Constraints imposed by the specific function of CYP51 have severely limited sequence divergence in this family. Consequently, CYP51 is the only P450 family recognizable across all eukaryotic phyla. We have determined the structure of the functional human CYP51 gene, which spans 22 kb, is divided into 10 exons, and maps to 7q21.2-q21.3. The 5' portion of intron 1 is GC-rich and contains potential binding sites for several transcription factors. Primer extension studies reveal predominant transcription initiation sites in liver, kidney, lung, and placenta 250 and 249 bp upstream from the translation start site and a second major site at -100 bp. Ubiquitous expression of human CYP51 (Strömstedt et al., Arch. Biochem. Biophys. 329: 73-81, 1996), the absence of TATA and CAAT patterns, a GC-rich sequence in the promoter region, and initiation of CYP51 transcription at more than one site indicate that CYP51 is a housekeeping gene. The 5'-flanking region, exon 1, and a portion of intron 1 show the characteristics of a CpG island, with the observed/expected CpG ratio of 0.79. Sterol responsive element-like motifs were present in this region, suggesting regulation by oxysterols via a mechanism similar to that associated with other genes involved in cholesterol homeostasis. Comparison of the human CYP51 gene structure with structures of other mammalian and fungal CYP gene families shows that 7 of the 9 CYP51 introns are located at unique positions. More than 80 intron locations exist in mammalian and fungal CYP gene families, and it seems very unlikely that all these introns could have been present in the primordial CYP gene.
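The observed/expected CpG ratio quoted above (0.79) is conventionally computed as (number of CpG dinucleotides × sequence length) / (number of C × number of G). The abstract does not say how its value was calculated, so the following is only a minimal sketch of that standard formula; the function name `cpg_obs_exp` and the example sequences are hypothetical and are not taken from the CYP51 gene.

```python
def cpg_obs_exp(seq: str) -> float:
    """Observed/expected CpG ratio for one strand of DNA.

    Uses the common definition (CpG count * length) / (C count * G count).
    Returns 0.0 when the sequence has no C or no G, since the expected
    CpG count is then zero and the ratio is undefined.
    """
    s = seq.upper()
    c_count = s.count("C")
    g_count = s.count("G")
    cpg_count = s.count("CG")  # "CG" cannot overlap itself, so count() is exact
    if c_count == 0 or g_count == 0:
        return 0.0
    return cpg_count * len(s) / (c_count * g_count)


if __name__ == "__main__":
    # Hypothetical toy sequences, not real CYP51 sequence.
    print(cpg_obs_exp("CGCGCG"))
    print(cpg_obs_exp("ACGT"))
```

A ratio near or above 0.6 over a sufficiently long GC-rich window is the usual working criterion for a CpG island, so a value of 0.79 is consistent with the description in the abstract.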
The three human lanosterol 14 alpha-demethylase (CYP51) genes have been mapped to human chromosomes 3, 7, and 13 using a polychromosomal somatic cell hybrid panel. Two of the genes have been cloned from human chromosome 3-specific (CYP51P1) or from human chromosome 13-containing (CYP51P2) cell hybrids. Both were found to be processed pseudogenes, the first reported in the cytochrome P450 (CYP) gene superfamily. The functional CYP51 gene resides on human chromosome 7. CYP51P1 is 96.5% identical to the human CYP51 coding sequence and is not interrupted with introns but has six in-frame stop codons resulting from point mutations. The intronless CYP51P2 gene is 97.2% identical to the CYP51 cDNA coding region. It has a 1-bp insertion leading to a change of reading frame after codon 9 and a stop codon after amino acid 81. In addition, the CYP51P2 sequence is interrupted with a 5' truncated 131-bp LINE-1 element after nucleotide 606. The element belongs to the youngest LINE subfamily Sb and is 98.2% identical to the LINE-1 element expressed in human teratocarcinoma cells. CYP51 processed pseudogenes are the only known examples of the reverse flow of genetic information during evolution of the large (more than 480 genes) CYP superfamily, suggesting expression in the germ line and a housekeeping function of the lanosterol 14 alpha-demethylase gene. CYP51 pseudogenes evolved by two independent reverse transcription events of the human CYP51 mRNA approximately 9.5 MYR (CYP51P2) and approximately 11.7 MYR (CYP51P1) ago and were inactivated soon after the insertion. The truncated L1 element was inserted into CYP51P2 approximately 6 MYR ago.
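The two measurements used above to characterize the processed pseudogenes, percent identity to the functional coding sequence and the number of in-frame stop codons, can be illustrated with a short sketch. It assumes the two sequences are already aligned to equal length with `-` marking gaps, and that identity is counted only over ungapped columns; the abstract does not specify the convention used. The function names and the toy sequences are hypothetical.

```python
from typing import List

STOP_CODONS = {"TAA", "TAG", "TGA"}


def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over alignment columns where neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for x, y in zip(aligned_a.upper(), aligned_b.upper()):
        if x == "-" or y == "-":
            continue  # skip gapped columns (an assumed convention)
        compared += 1
        if x == y:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0


def in_frame_stops(seq: str, frame: int = 0) -> List[int]:
    """Return 1-based codon numbers of stop codons in the given reading frame."""
    s = seq.upper()
    stops = []
    for i in range(frame, len(s) - 2, 3):
        if s[i:i + 3] in STOP_CODONS:
            stops.append((i - frame) // 3 + 1)
    return stops


if __name__ == "__main__":
    # Hypothetical toy sequences, not real CYP51 data.
    print(percent_identity("ACGT", "ACGA"))
    print(in_frame_stops("ATGTAAGGGTGA"))
```

Point mutations of the kind reported for CYP51P1 show up as extra entries in the `in_frame_stops` list, whereas a 1-bp insertion such as the one in CYP51P2 shifts the frame, so stops downstream of it would be found by scanning with a different `frame` value.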
The genomic copy multiplicity of the CCAAT transcription complex component enhancer factor I subunit A (EFIA) has been examined. When a mammalian genomic Southern blot was hybridized to a rat EFIA cDNA, a complex pattern consisting of numerous related sequences was found in all the species examined, with Bos taurus being the least complex. An EFIA#1 cDNA from Bos taurus was isolated from a primary lung endothelial cell cDNA library by screening with the 1489-bp rat EFIA cDNA. The deduced bovine EFIA#1 amino acid (aa) sequence is 98% identical to rat EFIA and 100% identical to human EFIA/DbpB/YB-1 family member DNA-binding protein B (DbpB). In addition, a processed EFIA pseudogene from Bos taurus, designated bovine psi EFIA#1, was obtained from a genomic library by screening with a rat EFIA cDNA probe. The bovine psi EFIA#1 gene has an ORF which, if expressed, would encode a 140-aa sequence, with aa 31-140 having 84% identity to bovine EFIA#1. The genomic cloning data indicate that processed pseudogenes are partially responsible for the complexity of the EFIA genomic Southern blots. The phenomenon of 'repeat induced point mutation' (ripping) at bovine psi EFIA#1 gene CpG dinucleotides occurs at a 6.5-fold higher frequency than expected from random mutagenesis. Therefore, ripping is likely to be the mechanism by which the bovine EFIA#1 pseudogene's ectopic recombination potential was inactivated.
Genomic Southern blot analysis of rat EFIA (gene encoding enhancer factor I subunit A) reveals a complex band pattern when cDNA subfragment probes are used. Screening a rat genomic library with a rat EFIA cDNA probe yields two different processed EFIA pseudogenes, designated rat psi EFIA#(2/3) and #(4/7), in addition to two other different, but less extensively characterized clones. psi EFIA#(4/7) has no open reading frame (ORF) sequences. psi EFIA#(2/3) contains two ORFs (83 and 178 codons), the products of which (if expressed) might be negative-acting EFIA transcription factors. Located nearly 0.6 kb upstream from psi EFIA#(2/3) is a perfect 69-bp dinucleotide (CT) tandem repeat, a sequence element associated with other isolated pseudogenes. Additionally, the 3' end of this processed gene is interrupted by an unusual retroposon, an inverted dimeric B1-like short interspersed repetitive element (SINE). The isolation of several independent clones of the same EFIA processed pseudogenes indicates that they comprise a significant component of the rat EFIA copy multiplicity. The phenomenon of repeat induced point mutagenesis (ripping) at rat EFIA pseudogene CpG doublets occurs at a frequency at least 6.5 times higher than predicted from random mutagenesis. This is consonant with the proposal that ripping may be the mechanism which inactivates the ectopic recombination potential of the rat EFIA pseudogenes.
Platelet endothelial cell adhesion molecule-1 (PECAM-1) is a cell-cell adhesion molecule that is expressed on circulating platelets, on leukocytes, and at the intercellular junctions of vascular endothelial cells and mediates the interactions of these cells during the process of transendothelial cell migration. The cDNA for PECAM-1 encodes an open reading frame of 738 amino acids (aa) that is organized into a 27-aa signal peptide, a 574-aa extracellular domain composed of 6 Ig homology units, and a relatively long cytoplasmic tail of 118 aa containing multiple sites for posttranslational modification and postreceptor signal transduction. To provide a molecular basis for the precise evaluation of the structure and function of this transmembrane glycoprotein, we have determined the organization of the human PECAM-1 gene. The PECAM-1 gene, which has been localized to human chromosome 17, is a single-copy gene of approximately 65 kb in length and is broken into 16 exons by introns ranging in size from 86 to greater than 12,000 bp in length. Typical of other members of the Ig superfamily, each of the extracellular Ig homology domains is encoded by a separate exon, consistent with PECAM-1 having arisen by gene duplication and exon shuffling of ancestral Ig superfamily genes. However, the cytoplasmic domain was found to be surprisingly complex, being encoded by seven short exons that may represent discrete functional entities. Alternative splicing of the cytoplasmic tail appears to generate multiple PECAM-1 isoforms that may regulate phosphorylation, cytoskeletal association, and affinity modulation of the mature protein. Finally, a processed pseudogene having 76% identity with PECAM-1 cDNA was identified and localized to human chromosome 3. These findings should have important implications for structure/function analysis of PECAM-1 and its role in vascular adhesive interactions.