While the role of the homeodomain in HOX function has been evaluated extensively, little attention has been given to the non-homeodomain portions of the HOX proteins. To investigate the evolution of the HOXA13 protein and to identify conserved residues in the N-terminal region of the protein with potential functional significance, N-terminal Hoxa13 coding sequences were PCR-amplified from fish, amphibian, reptile, chicken, and marsupial and eutherian mammal genomic DNA. Compared with fish HOXA13, the mammalian protein has increased in size by 35% primarily owing to the accumulation of alanine repeats and flanking segments rich in proline, glycine, or serine within the first 215 amino acids. Certain residues and amino acid motifs were strongly conserved, and several HOXA13 N-terminal domains were also shared in the paralogous HOXB13 and HOXD13 genes; however, other conserved regions appear to be unique to HOXA13. Two domains highly conserved in HOXA13 orthologs are shared with Drosophila AbdB and other vertebrate AbdB-like proteins. Marsupial and eutherian mammalian HOXA13 proteins have three large homopolymeric alanine repeats of 14, 12, and 17-18 residues that are absent in reptiles, birds, and fish. Thus, the repeats arose after the divergence of reptiles from the lineage that would give rise to the mammals. In contrast, other short homopolymeric alanine repeats in mammalian HOXA13 have remained virtually the same length, suggesting that forces driving or limiting repeat expansion are context dependent. Consecutive stretches of identical third-base usage in alanine codons within the large repeats were found, supporting replication slippage as a mechanism for their generation. However, numerous species-specific base substitutions affecting third-base alanine repeat codon positions were observed, particularly in the largest repeat.
Therefore, if the large alanine repeats were present prior to eutherian mammal development as is suggested by the opossum data, then a dynamic process of recurring replication slippage and point mutation within alanine repeat codons must be considered to reconcile these observations. This model might also explain why the alanine repeats are flanked by proline, serine, and glycine-rich sequences, and it reveals a biological mechanism that promotes increases in protein size and, potentially, acquisition of new functions.
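The observation of consecutive stretches of identical third-base usage in alanine codons can be illustrated with a minimal sketch. It is not the method used in this study. The function names, the `min_run` threshold, and the toy sequence are hypothetical, and the input is assumed to be an in-frame coding sequence. The sketch splits the sequence into codons and reports runs of the same alanine codon (GCT, GCC, GCA, or GCG), the pattern expected if a repeat grew by replication slippage.

```python
from itertools import groupby

# The four alanine codons in the standard genetic code.
ALANINE_CODONS = {"GCT", "GCC", "GCA", "GCG"}


def codons(seq):
    """Split an in-frame coding sequence into codons, dropping any trailing partial codon."""
    seq = seq.upper()
    usable = len(seq) - len(seq) % 3
    return [seq[i:i + 3] for i in range(0, usable, 3)]


def alanine_codon_runs(seq, min_run=2):
    """Return (codon_index, codon, run_length) for each run of identical alanine codons.

    min_run is an illustrative threshold (an assumption), not a value from the study.
    """
    runs = []
    index = 0
    for codon, group in groupby(codons(seq)):
        length = len(list(group))
        if codon in ALANINE_CODONS and length >= min_run:
            runs.append((index, codon, length))
        index += length
    return runs


# Toy sequence made up for illustration; it is not Hoxa13 sequence data.
toy = "ATGGCGGCGGCGGCAGCCGCCCCGTAA"
print(alanine_codon_runs(toy))
```

In such a sketch, long runs of one codon would be consistent with slippage, whereas a repeat in which third-base positions vary from codon to codon would point to later point mutation, the two processes that the proposed model combines.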