Statistics for Laboratory Scientists ( 140.615 )

Some of the data used in the following examples are in the SPH.140.615 package.

library(SPH.140.615)

Linear Mixed Effects - Analysis of Variance

Use the function install.packages() if you haven’t installed the following packages, i.e. use install.packages(“lme4”) etc in the code chunk. Alternatively, hit the ‘Knit’ button, and you should be prompted about installing the packages.

library(lattice)
library(Matrix)
library(lme4)

The more complicated ANOVA examples in class (involving random effects) had balanced designs, so the sums of squares approach was appropriate. When you have an unbalanced designs, all sorts of problems can arise which can invalidate your inference (for example, the null distribution of the test statistics might not be F anymore), and using likelihood-based approaches are typically better for the inference. Here, we re-visit some of the examples and designs, using likelihood-based linear mixed effects methods.

The mosquito example from class - a nested ANOVA

Plot the data.

xyplot(length ~ individual|cage, data=mosq,type=c("p"), layout=c(3,1), xlab="mosquito",
       strip=strip.custom(var.name=c("Cage"), strip.names=TRUE))

Fit cage as a fixed effect
lme.fit <- lmer(length ~ cage + (1|cage:individual), data=mosq)
summary(lme.fit)
## Linear mixed model fit by REML ['lmerMod']
## Formula: length ~ cage + (1 | cage:individual)
##    Data: mosq
## 
## REML criterion at convergence: 116.3
## 
## Scaled residuals: 
##      Min       1Q   Median       3Q      Max 
## -1.31971 -0.57739  0.05871  0.57288  1.39743 
## 
## Random effects:
##  Groups          Name        Variance Std.Dev.
##  cage:individual (Intercept) 94.942   9.744   
##  Residual                     1.302   1.141   
## Number of obs: 24, groups:  cage:individual, 12
## 
## Fixed effects:
##             Estimate Std. Error t value
## (Intercept)   72.838      4.889  14.899
## cageB        -12.875      6.914  -1.862
## cageC         -5.738      6.914  -0.830
## 
## Correlation of Fixed Effects:
##       (Intr) cageB 
## cageB -0.707       
## cageC -0.707  0.500

Test for the cage effect.

lme.fit.red <- lmer(length ~ 1 + (1|cage:individual), data=mosq)
summary(lme.fit.red)
## Linear mixed model fit by REML ['lmerMod']
## Formula: length ~ 1 + (1 | cage:individual)
##    Data: mosq
## 
## REML criterion at convergence: 130.8
## 
## Scaled residuals: 
##      Min       1Q   Median       3Q      Max 
## -1.29169 -0.58861  0.01665  0.54393  1.42545 
## 
## Random effects:
##  Groups          Name        Variance Std.Dev.
##  cage:individual (Intercept) 107.820  10.384  
##  Residual                      1.302   1.141  
## Number of obs: 24, groups:  cage:individual, 12
## 
## Fixed effects:
##             Estimate Std. Error t value
## (Intercept)   66.633      3.007   22.16
anova(lme.fit,lme.fit.red)
## refitting model(s) with ML (instead of REML)
## Data: mosq
## Models:
## lme.fit.red: length ~ 1 + (1 | cage:individual)
## lme.fit: length ~ cage + (1 | cage:individual)
##             npar    AIC    BIC  logLik deviance  Chisq Df Pr(>Chisq)
## lme.fit.red    3 140.78 144.32 -67.392   134.78                     
## lme.fit        5 140.86 146.75 -65.430   130.86 3.9246  2     0.1405
Fit cage as a random effect
lme.fit <- lmer(length ~ 1 + (1|cage) + (1|cage:individual), data=mosq)
summary(lme.fit)
## Linear mixed model fit by REML ['lmerMod']
## Formula: length ~ 1 + (1 | cage) + (1 | cage:individual)
##    Data: mosq
## 
## REML criterion at convergence: 130.5
## 
## Scaled residuals: 
##      Min       1Q   Median       3Q      Max 
## -1.29844 -0.58190  0.03584  0.55002  1.41870 
## 
## Random effects:
##  Groups          Name        Variance Std.Dev.
##  cage:individual (Intercept) 94.942   9.744   
##  cage            (Intercept) 17.706   4.208   
##  Residual                     1.302   1.141   
## Number of obs: 24, groups:  cage:individual, 12; cage, 3
## 
## Fixed effects:
##             Estimate Std. Error t value
## (Intercept)   66.633      3.724   17.89
summary(lme.fit)$varcor
##  Groups          Name        Std.Dev.
##  cage:individual (Intercept) 9.7438  
##  cage            (Intercept) 4.2079  
##  Residual                    1.1409

The penicillin example - a two-way ANOVA without replicates

After loading the lme4 package, type ?Penicillin to see a description of the experiment.

attach(Penicillin)
interaction.plot(plate, sample, diameter, col=rainbow(6))

interaction.plot(sample, plate, diameter, col=rainbow(24))

Fit both plate and sample as random effects
lme.fit <- lmer(diameter ~ 1 + (1|plate) + (1|sample), data=Penicillin)
summary(lme.fit)
## Linear mixed model fit by REML ['lmerMod']
## Formula: diameter ~ 1 + (1 | plate) + (1 | sample)
##    Data: Penicillin
## 
## REML criterion at convergence: 330.9
## 
## Scaled residuals: 
##      Min       1Q   Median       3Q      Max 
## -2.07923 -0.67140  0.06292  0.58377  2.97959 
## 
## Random effects:
##  Groups   Name        Variance Std.Dev.
##  plate    (Intercept) 0.7169   0.8467  
##  sample   (Intercept) 3.7311   1.9316  
##  Residual             0.3024   0.5499  
## Number of obs: 144, groups:  plate, 24; sample, 6
## 
## Fixed effects:
##             Estimate Std. Error t value
## (Intercept)  22.9722     0.8086   28.41
Fit sample as fixed and plate as random effect
lme.fit <- lmer(diameter ~ sample + (1|plate), data=Penicillin)
summary(lme.fit)
## Linear mixed model fit by REML ['lmerMod']
## Formula: diameter ~ sample + (1 | plate)
##    Data: Penicillin
## 
## REML criterion at convergence: 308.3
## 
## Scaled residuals: 
##     Min      1Q  Median      3Q     Max 
## -2.0730 -0.6603  0.0565  0.5949  2.9797 
## 
## Random effects:
##  Groups   Name        Variance Std.Dev.
##  plate    (Intercept) 0.7169   0.8467  
##  Residual             0.3024   0.5499  
## Number of obs: 144, groups:  plate, 24
## 
## Fixed effects:
##             Estimate Std. Error t value
## (Intercept)  25.1667     0.2061 122.117
## sampleB      -3.2083     0.1587 -20.210
## sampleC      -0.2500     0.1587  -1.575
## sampleD      -2.2917     0.1587 -14.436
## sampleE      -2.2083     0.1587 -13.911
## sampleF      -5.2083     0.1587 -32.809
## 
## Correlation of Fixed Effects:
##         (Intr) samplB samplC samplD samplE
## sampleB -0.385                            
## sampleC -0.385  0.500                     
## sampleD -0.385  0.500  0.500              
## sampleE -0.385  0.500  0.500  0.500       
## sampleF -0.385  0.500  0.500  0.500  0.500

Test for the sample effect.

lme.fit.red <- lmer(diameter ~ 1 + (1|plate), data=Penicillin)
summary(lme.fit.red)
## Linear mixed model fit by REML ['lmerMod']
## Formula: diameter ~ 1 + (1 | plate)
##    Data: Penicillin
## 
## REML criterion at convergence: 613.3
## 
## Scaled residuals: 
##      Min       1Q   Median       3Q      Max 
## -2.41583 -0.55523  0.01212  0.59232  1.95241 
## 
## Random effects:
##  Groups   Name        Variance Std.Dev.
##  plate    (Intercept) 0.09509  0.3084  
##  Residual             4.03333  2.0083  
## Number of obs: 144, groups:  plate, 24
## 
## Fixed effects:
##             Estimate Std. Error t value
## (Intercept)  22.9722     0.1788   128.5
anova(lme.fit,lme.fit.red)
## refitting model(s) with ML (instead of REML)
## Data: Penicillin
## Models:
## lme.fit.red: diameter ~ 1 + (1 | plate)
## lme.fit: diameter ~ sample + (1 | plate)
##             npar    AIC    BIC  logLik deviance  Chisq Df Pr(>Chisq)    
## lme.fit.red    3 617.63 626.54 -305.81   611.63                         
## lme.fit        8 311.66 335.42 -147.83   295.66 315.97  5  < 2.2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

A simulated example - a two-way ANOVA with replicates

Five replicates for each of four treatments (A-D; fixed effects) recorded on each of 5 separate days (blocks B1-B5; random effects).

head(simdat,20)
##    response treat block
## 1  75.65923     A    B1
## 2  79.58293     A    B1
## 3  80.33561     A    B1
## 4  79.84798     A    B1
## 5  77.20447     A    B1
## 6  85.08627     A    B2
## 7  81.72046     A    B2
## 8  78.68721     A    B2
## 9  73.90683     A    B2
## 10 83.92572     A    B2
## 11 77.35831     A    B3
## 12 77.44454     A    B3
## 13 80.32462     A    B3
## 14 79.95678     A    B3
## 15 79.27482     A    B3
## 16 87.54277     A    B4
## 17 87.13225     A    B4
## 18 85.00954     A    B4
## 19 78.81779     A    B4
## 20 86.64532     A    B4
summary(simdat)
##     response     treat  block  
##  Min.   :67.08   A:25   B1:20  
##  1st Qu.:77.65   B:25   B2:20  
##  Median :82.89   C:25   B3:20  
##  Mean   :83.14   D:25   B4:20  
##  3rd Qu.:88.89          B5:20  
##  Max.   :96.46
str(simdat)
## 'data.frame':    100 obs. of  3 variables:
##  $ response: num  75.7 79.6 80.3 79.8 77.2 ...
##  $ treat   : Factor w/ 4 levels "A","B","C","D": 1 1 1 1 1 1 1 1 1 1 ...
##  $ block   : Factor w/ 5 levels "B1","B2","B3",..: 1 1 1 1 1 2 2 2 2 2 ...

Plot the data.

interaction.plot(simdat$block, simdat$treat, simdat$response)

interaction.plot(simdat$treat, simdat$block, simdat$response)

A two-way mixed effects ANOVA with interaction.

lme.fit <- lmer(response ~ treat + (1|block) + (1|treat:block), data=simdat)
## boundary (singular) fit: see help('isSingular')
summary(lme.fit)
## Linear mixed model fit by REML ['lmerMod']
## Formula: response ~ treat + (1 | block) + (1 | treat:block)
##    Data: simdat
## 
## REML criterion at convergence: 484
## 
## Scaled residuals: 
##      Min       1Q   Median       3Q      Max 
## -2.76976 -0.57671  0.09739  0.62893  2.38758 
## 
## Random effects:
##  Groups      Name        Variance Std.Dev.
##  treat:block (Intercept) 0.000    0.000   
##  block       (Intercept) 5.314    2.305   
##  Residual                7.060    2.657   
## Number of obs: 100, groups:  treat:block, 20; block, 5
## 
## Fixed effects:
##             Estimate Std. Error t value
## (Intercept)  80.6071     1.1598  69.498
## treatB        5.1383     0.7515   6.837
## treatC       -4.8547     0.7515  -6.460
## treatD        9.8301     0.7515  13.080
## 
## Correlation of Fixed Effects:
##        (Intr) treatB treatC
## treatB -0.324              
## treatC -0.324  0.500       
## treatD -0.324  0.500  0.500
## optimizer (nloptwrap) convergence code: 0 (OK)
## boundary (singular) fit: see help('isSingular')

Note the REML estimate for the (random) interaction. The formal test for no interaction.

lme.fit.red <- lmer(response ~ treat + (1|block), data=simdat)
summary(lme.fit.red)
## Linear mixed model fit by REML ['lmerMod']
## Formula: response ~ treat + (1 | block)
##    Data: simdat
## 
## REML criterion at convergence: 484
## 
## Scaled residuals: 
##      Min       1Q   Median       3Q      Max 
## -2.76976 -0.57671  0.09739  0.62893  2.38758 
## 
## Random effects:
##  Groups   Name        Variance Std.Dev.
##  block    (Intercept) 5.314    2.305   
##  Residual             7.060    2.657   
## Number of obs: 100, groups:  block, 5
## 
## Fixed effects:
##             Estimate Std. Error t value
## (Intercept)  80.6071     1.1598  69.498
## treatB        5.1383     0.7515   6.837
## treatC       -4.8547     0.7515  -6.460
## treatD        9.8301     0.7515  13.080
## 
## Correlation of Fixed Effects:
##        (Intr) treatB treatC
## treatB -0.324              
## treatC -0.324  0.500       
## treatD -0.324  0.500  0.500
anova(lme.fit,lme.fit.red)
## refitting model(s) with ML (instead of REML)
## Data: simdat
## Models:
## lme.fit.red: response ~ treat + (1 | block)
## lme.fit: response ~ treat + (1 | block) + (1 | treat:block)
##             npar    AIC    BIC  logLik deviance Chisq Df Pr(>Chisq)
## lme.fit.red    6 500.95 516.58 -244.48   488.95                    
## lme.fit        7 502.95 521.19 -244.48   488.95     0  1          1

The two-way mixed effects ANOVA without interaction, re-fit.

lme.fit <- lmer(response ~ treat + (1|block), data=simdat)
summary(lme.fit)
## Linear mixed model fit by REML ['lmerMod']
## Formula: response ~ treat + (1 | block)
##    Data: simdat
## 
## REML criterion at convergence: 484
## 
## Scaled residuals: 
##      Min       1Q   Median       3Q      Max 
## -2.76976 -0.57671  0.09739  0.62893  2.38758 
## 
## Random effects:
##  Groups   Name        Variance Std.Dev.
##  block    (Intercept) 5.314    2.305   
##  Residual             7.060    2.657   
## Number of obs: 100, groups:  block, 5
## 
## Fixed effects:
##             Estimate Std. Error t value
## (Intercept)  80.6071     1.1598  69.498
## treatB        5.1383     0.7515   6.837
## treatC       -4.8547     0.7515  -6.460
## treatD        9.8301     0.7515  13.080
## 
## Correlation of Fixed Effects:
##        (Intr) treatB treatC
## treatB -0.324              
## treatC -0.324  0.500       
## treatD -0.324  0.500  0.500

Testing for a treatment effect.

lme.fit.red <- lmer(response ~ 1 + (1|block), data=simdat)
summary(lme.fit.red)
## Linear mixed model fit by REML ['lmerMod']
## Formula: response ~ 1 + (1 | block)
##    Data: simdat
## 
## REML criterion at convergence: 651.7
## 
## Scaled residuals: 
##      Min       1Q   Median       3Q      Max 
## -2.38755 -0.74849  0.03201  0.85945  1.75378 
## 
## Random effects:
##  Groups   Name        Variance Std.Dev.
##  block    (Intercept)  3.733   1.932   
##  Residual             38.687   6.220   
## Number of obs: 100, groups:  block, 5
## 
## Fixed effects:
##             Estimate Std. Error t value
## (Intercept)   83.136      1.065   78.09
anova(lme.fit,lme.fit.red)
## refitting model(s) with ML (instead of REML)
## Data: simdat
## Models:
## lme.fit.red: response ~ 1 + (1 | block)
## lme.fit: response ~ treat + (1 | block)
##             npar    AIC    BIC  logLik deviance  Chisq Df Pr(>Chisq)    
## lme.fit.red    3 659.60 667.41 -326.80   653.60                         
## lme.fit        6 500.95 516.58 -244.48   488.95 164.65  3  < 2.2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1