Statistics for Laboratory Scientists ( 140.615 )

Analysis of Variance - Nested Models

Example 1 from class: mosquitos

Read the data, and verify that the ANOVA factors are stored as factors in the R object.

library(SPH.140.615)
summary(mosq)
##      length      individual cage 
##  Min.   :49.30   1:6        A:8  
##  1st Qu.:58.25   2:6        B:8  
##  Median :67.05   3:6        C:8  
##  Mean   :66.63   4:6             
##  3rd Qu.:72.03                   
##  Max.   :84.00
str(mosq)
## 'data.frame':    24 obs. of  3 variables:
##  $ length    : num  58.5 59.5 77.8 80.9 84 83.6 70.1 68.3 69.8 69.8 ...
##  $ individual: Factor w/ 4 levels "1","2","3","4": 1 1 2 2 3 3 4 4 1 1 ...
##  $ cage      : Factor w/ 3 levels "A","B","C": 1 1 1 1 1 1 1 1 2 2 ...

Plot the data.

par(las=1)
stripchart(split(jitter(mosq$length,factor=2), list(mosq$individual, mosq$cage)), ylab="", pch=1, vertical=TRUE)
abline(v=c(4.5,8.5),lty=2)

Let’s give this a go.

mosq.aov <- aov(length ~ cage / individual, data=mosq)
summary(mosq.aov)
##                 Df Sum Sq Mean Sq F value   Pr(>F)    
## cage             2  665.7   332.8   255.7 1.45e-10 ***
## cage:individual  9 1720.7   191.2   146.9 6.98e-11 ***
## Residuals       12   15.6     1.3                     
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

This is wrong. By default, R tests both factors by calculating the mean-squares with the error in the denominator, and there is no option to tell it otherwise. Later, we will learn how to do this analysis with maximum likelihood estimation. For now, we use a custom function for the sums of squares approach, which is available in the SPH.140.615 package.

nested.anova(mosq.aov)
## Analysis of Variance Table
## 
## Response: length
##                 Df  Sum Sq Mean Sq  F value    Pr(>F)    
## cage             2  665.68  332.84   1.7409    0.2295    
## cage:individual  9 1720.68  191.19 146.8781 6.981e-11 ***
## Residuals       12   15.62    1.30                       
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Example 2 from class: flies

The data are available in the SPH.140.615 package.

str(flies)
## 'data.frame':    192 obs. of  3 variables:
##  $ response: num  19 19.6 22.9 26 28.6 ...
##  $ jar     : Factor w/ 3 levels "A","B","C": 1 1 1 1 1 1 1 1 2 2 ...
##  $ strain  : Factor w/ 8 levels "BS","LC","LDD",..: 3 3 3 3 3 3 3 3 3 3 ...
summary(flies)
##     response     jar        strain  
##  Min.   :17.15   A:64   BS     :24  
##  1st Qu.:25.51   B:64   LC     :24  
##  Median :28.95   C:64   LDD    :24  
##  Mean   :29.52          NH     :24  
##  3rd Qu.:33.23          NKS    :24  
##  Max.   :48.45          OL     :24  
##                         (Other):48

Plot the data.

par(las=2)
stripchart(split(flies$response, list(flies$jar, flies$strain)), ylab="", pch=1, vertical=TRUE, xlim=c(1.2,23.8))
abline(v=seq(3.5,24,by=3),lty=2)

nested.anova(aov(response ~ strain / jar, data=flies))
## Analysis of Variance Table
## 
## Response: response
##             Df Sum Sq Mean Sq F value    Pr(>F)    
## strain       7 1323.4 189.059  8.4673 0.0002196 ***
## strain:jar  16  357.3  22.328  0.8044 0.6793007    
## Residuals  168 4663.2  27.757                      
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1