Alternative contingency table measures improve the power and detection of multifactor dimensionality reduction.

Bush WS, Edwards TL, Dudek SM, McKinney BA, Ritchie MD
BMC Bioinformatics. 2008 9: 238

PMID: 18485205 · PMCID: PMC2412877 · DOI:10.1186/1471-2105-9-238

BACKGROUND - Multifactor Dimensionality Reduction (MDR) has been introduced previously as a non-parametric statistical method for detecting gene-gene interactions. MDR performs a dimensional reduction by assigning multi-locus genotypes to either high- or low-risk groups and measuring the percentage of cases and controls that this classification labels incorrectly (the classification error). The combination of variables that produces the lowest classification error is selected as the best, or most fit, model. The correctly and incorrectly labelled cases and controls can be expressed as a two-way contingency table. We sought to improve the ability of MDR to detect gene-gene interactions by replacing classification error with a different measure to score model quality.
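
The reduction step described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the input layout (one genotype tuple and one 0/1 status per individual), and the rule that ties are treated as low-risk are all assumptions made for illustration.

```python
from collections import Counter

def mdr_contingency(genotypes, status):
    """Label each multi-locus genotype high- or low-risk and return
    the resulting two-way table as (tp, fp, fn, tn).

    genotypes: list of tuples, one multi-locus genotype per individual
    status:    list of 0/1 values, 1 = case, 0 = control
    (Both input formats are assumptions for this sketch.)
    """
    n_cases = sum(status)
    n_controls = len(status) - n_cases
    cases = Counter(g for g, s in zip(genotypes, status) if s == 1)
    controls = Counter(g for g, s in zip(genotypes, status) if s == 0)

    high_risk = set()
    for g in set(cases) | set(controls):
        # High-risk if the cell's case:control ratio exceeds the overall
        # ratio; cross-multiplied to avoid dividing by zero.
        # Ties are treated as low-risk here (an assumption).
        if cases[g] * n_controls > controls[g] * n_cases:
            high_risk.add(g)

    tp = sum(cases[g] for g in high_risk)      # cases labelled high-risk
    fp = sum(controls[g] for g in high_risk)   # controls labelled high-risk
    fn = n_cases - tp                          # cases labelled low-risk
    tn = n_controls - fp                       # controls labelled low-risk
    return tp, fp, fn, tn

def classification_error(tp, fp, fn, tn):
    """Fraction of individuals placed in the wrong risk group."""
    return (fp + fn) / (tp + fp + fn + tn)
```

In a full MDR search, a table like this would be computed for every candidate combination of loci, and the combination with the best score would be kept.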

RESULTS - In this study, we compare the detection and power of MDR using a variety of measures for two-way contingency table analysis. We simulated 40 genetic models, varying the number of disease loci in the model (2-5), the allele frequencies of the disease loci (0.2/0.8 or 0.4/0.6), and the broad-sense heritability of the model (0.05-0.3). Overall, detection using NMI was 65.36% across all models, with specific detection of 59.4%, versus detection of 62% and specific detection of 52.2% using classification error.

CONCLUSION - Of the 10 measures evaluated, the likelihood ratio and normalized mutual information (NMI) are measures that consistently improve the detection and power of MDR in simulated data over using classification error. These measures also reduce the inclusion of spurious variables in a multi-locus model. Thus, MDR, which has already been demonstrated as a powerful tool for detecting gene-gene interactions, can be improved with the use of alternative fitness functions.
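
As a rough illustration of the two measures named above, the following Python sketch scores a two-way table with the likelihood ratio (G) statistic and with one common form of normalized mutual information. The table layout (rows are predicted risk group, columns are true case/control status) and the choice to normalize by the entropy of the true status are assumptions; the paper's exact definitions may differ.

```python
import math

def likelihood_ratio(table):
    """G statistic: 2 * sum(O * ln(O / E)) over the non-empty cells."""
    total = sum(sum(row) for row in table)
    row_sums = [sum(row) for row in table]
    col_sums = [sum(col) for col in zip(*table)]
    g = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            if observed > 0:
                expected = row_sums[i] * col_sums[j] / total
                g += observed * math.log(observed / expected)
    return 2.0 * g

def entropy(counts):
    """Shannon entropy (in bits) of a list of counts."""
    total = sum(counts)
    return -sum(c / total * math.log2(c / total) for c in counts if c > 0)

def normalized_mutual_information(table):
    """Mutual information between rows and columns, divided by the
    entropy of the column variable (assumed here to be true status)."""
    row_sums = [sum(row) for row in table]
    col_sums = [sum(col) for col in zip(*table)]
    joint = [cell for row in table for cell in row]
    mutual_info = entropy(row_sums) + entropy(col_sums) - entropy(joint)
    h_true = entropy(col_sums)
    return mutual_info / h_true if h_true > 0 else 0.0
```

Unlike classification error, where lower is better, both of these scores increase as the risk labels become more informative about case/control status, so a search using them would keep the model with the highest value.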

MeSH Terms (13)

Bias; Diagnostic Errors; Gene Frequency; Gene Regulatory Networks; Genetic Markers; Genotype; Information Storage and Retrieval; Models, Genetic; Odds Ratio; Risk Assessment; Sensitivity and Specificity; Statistics, Nonparametric; Weights and Measures
