We determined the structural basis for the presence of electrophoretically distinct, antigenically related forms of invariant chain in Ia oligomers, and established the mechanisms by which they can be expressed from a single gene. S1 nuclease protection assays indicated that, in B cells, transcription of this gene initiates at a minimum of three sites. Contrary to what was previously thought, invariant chain mRNAs therefore have heterogeneous 5' untranslated segments that may differentially affect initiation of translation. Further, restriction mapping and nucleotide sequencing of cDNAs revealed two kinds of invariant chain mRNAs that differ by an internal coding segment of 192 bp. Nucleotide sequencing of the corresponding genomic regions demonstrated that this segment is an alternatively spliced exon. The exon (exon X) encodes a cysteine-rich stretch of 64 amino acids near the COOH terminus that displays a striking and unexpected homology to an internal amino acid repeat of thyroglobulin, suggesting an evolutionary mechanism of exon shuffling. Transient expression of cDNAs indicated that both types of alternatively spliced mRNAs contain two in-frame AUGs that function as alternate start sites for translation. Accordingly, transfection with exon X-lacking cDNAs resulted in the expression of Mr 33,000 and 31,000 proteins, which were detected by immunoprecipitation with anti-invariant chain antisera and were identical by two-dimensional (2-D) gel analysis to the B cell invariant chain forms gamma 1 (Mr 31,000), gamma 2, and gamma 3 (Mr 33,000). Similarly, exon X-containing cDNAs expressed Mr 43,000 and 41,000 proteins, which were also identical in 2-D migration to Ia-associated proteins. Human Ia molecules thus contain four forms of invariant chain of closely related but nonidentical primary structure, generated from a single gene by a complex pattern of alternate transcriptional start sites, alternative exon splicing, and alternate translational start sites.