Supplementary Material

Pattaro C, Ruczinski I, Fallin DM, Parmigiani G (2008)
Haplotype Block Partitioning as a Tool for Dimensionality Reduction in SNP Association Studies
Unpublished.

Supplementary Figures

Figure 1:

Agreement between all block partitioning methods. Upper panels: pairwise kappa statistics between the four most common methods, by sample size. Lower panel: kappa statistics between MATILDE and the four most common methods, by probability cutoff (x-axis) and sample size. Symbols: triangle = DprimeCI, rhomb = SSD, reversed triangle = 4Gamete, and square = HapBlock.
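
The agreement measure in this figure can be illustrated with a minimal sketch of Cohen's kappa. The encoding is an assumption made only for illustration, not necessarily the one used in the paper: each partition is represented as a 0/1 vector over the intervals between adjacent SNPs, with 1 marking a block boundary. The function name and the example vectors are hypothetical.

```python
def cohen_kappa(a, b):
    """Cohen's kappa for two equal-length sequences of categorical labels."""
    if len(a) != len(b) or len(a) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(a)
    labels = set(a) | set(b)
    # Observed proportion of positions where the two partitions agree.
    p_obs = sum(x == y for x, y in zip(a, b)) / n
    # Agreement expected by chance, from each partition's label frequencies.
    p_exp = sum((a.count(lab) / n) * (b.count(lab) / n) for lab in labels)
    if p_exp == 1.0:
        # Both sequences are constant and identical; treat as full agreement.
        return 1.0
    return (p_obs - p_exp) / (1.0 - p_exp)


# Hypothetical boundary indicators (assumed encoding):
# 1 = a block boundary falls between adjacent SNPs i and i+1.
method_a = [0, 0, 1, 0, 0, 1, 0, 0]
method_b = [0, 0, 1, 0, 1, 0, 0, 0]
kappa = cohen_kappa(method_a, method_b)
print(round(kappa, 3))
```

Under this encoding, kappa equals 1 when two methods place every boundary identically and is close to 0 when they agree no more often than chance would predict.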


Figure 2:

Comparison of the block partitions on a simulated sample of (from left to right) 200, 400, 600, and 800 subjects. The method used is indicated on the left. On the fifth, unlabeled line, ticks are at the positions where at least three of the four methods above it agreed. MATILDE block structures are reported at different probability cutoffs.
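
The consensus line described above can be sketched as a simple vote count. This sketch assumes that "agreed" means at least three methods place a block boundary at the same position, and that each partition is encoded as a 0/1 boundary vector; both are assumptions for illustration, and the function name and example data are hypothetical.

```python
def consensus_boundaries(partitions, min_votes=3):
    """Return positions where at least `min_votes` partitions mark a boundary."""
    n = len(partitions[0])
    if any(len(p) != n for p in partitions):
        raise ValueError("all partitions must cover the same positions")
    # Count, position by position, how many methods mark a boundary.
    votes = [sum(column) for column in zip(*partitions)]
    return [i for i, v in enumerate(votes) if v >= min_votes]


# Hypothetical boundary vectors for the four methods (1 = boundary).
dprime_ci = [0, 1, 0, 0, 1, 0]
ssd       = [0, 1, 0, 1, 1, 0]
gamete4   = [0, 1, 0, 0, 0, 0]
hapblock  = [1, 0, 0, 0, 1, 0]

ticks = consensus_boundaries([dprime_ci, ssd, gamete4, hapblock])
print(ticks)
```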


Figure 3:

Comparison of methods' sensitivity and specificity. Data refer to the simulation of (from left to right) 100, 200, 300, 400 cases and 100, 200, 300, 400 controls assuming a dominant model. Each panel reports the sensitivity/specificity tradeoff for DprimeCI (triangle), 4Gamete (reversed triangle), SSD (rhomb), HapBlock (square) and MATILDE (represented by points on the ROC curve, graphed as circles, and a smooth estimate of the ROC curve). In addition, an allele-at-single-locus association analysis is represented by an "x", while a SNP-by-SNP association analysis is represented by a "+". Four effect sizes were considered: odds ratios (OR) of 1.2, 1.4, 1.6 and 1.8.


Figure 4:

Parallel distribution of the statistics R (relative position of the block containing the true SNP) and B (number of SNPs belonging to blocks classified not worse than the true SNP) for a sample size of (from left to right) 100, 200, 300, 400 cases and 100, 200, 300, 400 controls. Four effect sizes were considered: odds ratios (OR) of 1.2, 1.4, 1.6 and 1.8. For each panel, the simulation results are listed for the allele-at-single-locus method, the SNP-by-SNP analysis, the four common methods (DprimeCI, 4Gamete, SSD and HapBlock) and MATILDE at various cutoff thresholds.
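
One way to read the two statistics is sketched below. The caption does not give formulas, so this sketch assumes that blocks are ranked by association p-value, that "not worse" means a p-value less than or equal to that of the block containing the true SNP, and that R is that block's rank divided by the number of blocks. The function name and the example data are hypothetical.

```python
def rank_statistics(blocks, p_values, true_snp):
    """Return (R, B) for one partition under the assumptions stated above.

    blocks   : list of lists of SNP indices, one list per block
    p_values : association p-value for each block, in the same order
    true_snp : index of the SNP simulated as causal
    """
    # Locate the block that contains the true SNP.
    true_block = next(i for i, blk in enumerate(blocks) if true_snp in blk)
    p_true = p_values[true_block]
    # Blocks ranked at least as well as the true block (ties included).
    not_worse = [i for i, p in enumerate(p_values) if p <= p_true]
    r = len(not_worse) / len(blocks)            # assumed definition of R
    b = sum(len(blocks[i]) for i in not_worse)  # SNPs in those blocks
    return r, b


# Hypothetical partition of ten SNPs into four blocks.
blocks = [[0, 1, 2], [3], [4, 5], [6, 7, 8, 9]]
p_values = [0.20, 0.01, 0.03, 0.50]
r_stat, b_stat = rank_statistics(blocks, p_values, true_snp=5)
print(r_stat, b_stat)
```

Under this reading, smaller values of both statistics indicate that the partition ranks the causal region near the top while leaving few SNPs for follow-up.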

