Read the data, and verify that the ANOVA factors are stored as factors in the R object.
library(SPH.140.615)
summary(mosq)
## length individual cage
## Min. :49.30 1:6 A:8
## 1st Qu.:58.25 2:6 B:8
## Median :67.05 3:6 C:8
## Mean :66.63 4:6
## 3rd Qu.:72.03
## Max. :84.00
str(mosq)
## 'data.frame': 24 obs. of 3 variables:
## $ length : num 58.5 59.5 77.8 80.9 84 83.6 70.1 68.3 69.8 69.8 ...
## $ individual: Factor w/ 4 levels "1","2","3","4": 1 1 2 2 3 3 4 4 1 1 ...
## $ cage : Factor w/ 3 levels "A","B","C": 1 1 1 1 1 1 1 1 2 2 ...
Plot the data.
par(las=1)
stripchart(split(jitter(mosq$length,factor=2), list(mosq$individual, mosq$cage)), ylab="", pch=1, vertical=TRUE)
abline(v=c(4.5,8.5),lty=2)
Let’s give this a go.
mosq.aov <- aov(length ~ cage / individual, data=mosq)
summary(mosq.aov)
## Df Sum Sq Mean Sq F value Pr(>F)
## cage 2 665.7 332.8 255.7 1.45e-10 ***
## cage:individual 9 1720.7 191.2 146.9 6.98e-11 ***
## Residuals 12 15.6 1.3
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
This is wrong. By default, R tests both factors by calculating the mean-squares with the error in the denominator, and there is no option to tell it otherwise. Later, we will learn how to do this analysis with maximum likelihood estimation. For now, we use a custom function for the sums of squares approach, which is available in the SPH.140.615 package.
nested.anova(mosq.aov)
## Analysis of Variance Table
##
## Response: length
## Df Sum Sq Mean Sq F value Pr(>F)
## cage 2 665.68 332.84 1.7409 0.2295
## cage:individual 9 1720.68 191.19 146.8781 6.981e-11 ***
## Residuals 12 15.62 1.30
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
The data are available in the SPH.140.615 package.
str(flies)
## 'data.frame': 192 obs. of 3 variables:
## $ response: num 19 19.6 22.9 26 28.6 ...
## $ jar : Factor w/ 3 levels "A","B","C": 1 1 1 1 1 1 1 1 2 2 ...
## $ strain : Factor w/ 8 levels "BS","LC","LDD",..: 3 3 3 3 3 3 3 3 3 3 ...
summary(flies)
## response jar strain
## Min. :17.15 A:64 BS :24
## 1st Qu.:25.51 B:64 LC :24
## Median :28.95 C:64 LDD :24
## Mean :29.52 NH :24
## 3rd Qu.:33.23 NKS :24
## Max. :48.45 OL :24
## (Other):48
Plot the data.
par(las=2)
stripchart(split(flies$response, list(flies$jar, flies$strain)), ylab="", pch=1, vertical=TRUE, xlim=c(1.2,23.8))
abline(v=seq(3.5,24,by=3),lty=2)
nested.anova(aov(response ~ strain / jar, data=flies))
## Analysis of Variance Table
##
## Response: response
## Df Sum Sq Mean Sq F value Pr(>F)
## strain 7 1323.4 189.059 8.4673 0.0002196 ***
## strain:jar 16 357.3 22.328 0.8044 0.6793007
## Residuals 168 4663.2 27.757
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1