Common diseases with a genetic basis are likely to have a very complex etiology, in which the mapping between genotype and phenotype is far from straightforward. A new comprehensive statistical and computational strategy for identifying the missing link between genotype and phenotype has been proposed, which emphasizes the need to address heterogeneity in the first stage of any analysis and gene-gene interactions in the second stage. We applied this two-stage analysis strategy to late-onset Alzheimer disease (LOAD) data, which included functional and positional candidate genes and markers in a region of interest on chromosome 10. Bayesian classification found statistically significant clusterings for independent family-based and case-control datasets, which used the same five markers in leucine-rich repeat transmembrane neuronal 3 (LRRTM3) as the most influential in determining cluster assignment. In subsequent analyses to detect main effects and gene-gene interactions, markers in three genes--urokinase-type plasminogen activator (PLAU), angiotensin 1 converting enzyme (ACE) and cell division cycle 2 (CDC2)--were found to be associated with LOAD in particular subsets of the data based on their LRRTM3 multilocus genotype. All of these genes are viable candidates for LOAD based on their known biological function, even though PLAU, CDC2 and LRRTM3 were initially identified as positional candidates. Further studies are needed to replicate these statistical findings and to elucidate possible biological interaction mechanisms between LRRTM3 and these genes.