The gene encoding rat seminal vesicle secretion II (SVS II) protein has been cloned from a rat genomic DNA library using a cDNA probe generated from rat dorsal prostate androgen-dependent mRNA. The cloned 7.3-kilobase pair genomic fragment contains approximately 5000 base pairs (bp) of the 5'-flanking region and the entire coding region of the SVS II protein within two exons. A sequence of 4156 bp of the rat SVS II gene has been determined, including 2037 bp of the 5'-flanking region, exon 1 (95 bp), intron 1 (236 bp), exon 2 (1171 bp), and 614 bp of the 3'-flanking region. The 5'-flanking region contains three conserved elements found in other seminal vesicle secretion genes (SVS IV-VI proteins) within 250 bp of the transcription start site as well as a glucocorticoid response element at position -314 in the SVS II gene. The first exon encodes a 22-amino acid leader peptide plus the first 2 amino acids of the secreted protein. The second exon encodes the remaining amino acids in the SVS II protein sequence. The mature protein contains 392 residues and has an Mr of 43,116. Concomitant with the gene analysis, the rat SVS II protein was purified to homogeneity, and 333 residues (85%) of the amino acid sequence were determined by automated Edman degradation. The DNA-deduced sequence and that determined by direct analysis of the protein are in complete agreement. The blocked NH2-terminal amino acid was identified as pyroglutamic acid by mass spectrometry and aminopeptidase digestion. A 13-residue structure with the consensus sequence GSQLKSFGQVKSS is repeated 13 times within the SVS II protein and appears to be involved in the formation of the rat copulatory plug via a transglutaminase reaction cross-linking glutamine and lysine residues. 
Overall, the SVS II protein sequence shows little structural relatedness to any other known protein sequence; however, the 13-residue repeat shows some similarity to a repeating structure in the guinea pig seminal vesicle clotting protein that is also an apparent transglutaminase substrate.