Maximum Likelihood Estimation

Statistics for Laboratory Scientists ( 140.615 )

Example from class: Binomial

We have 22 successes out of 100 trials. Some Binomial probabilities.

dbinom(22,100,prob=22/100)
## [1] 0.09591895
dbinom(22,100,prob=0.10)
## [1] 0.0001977617
dbinom(22,100,prob=0.20)
## [1] 0.08489953
dbinom(22,100,prob=0.25)
## [1] 0.07493508
p <- seq(0,1,by=0.01)
p
##   [1] 0.00 0.01 0.02 0.03 0.04 0.05 0.06 0.07 0.08 0.09 0.10 0.11 0.12 0.13 0.14 0.15 0.16 0.17 0.18 0.19 0.20
##  [22] 0.21 0.22 0.23 0.24 0.25 0.26 0.27 0.28 0.29 0.30 0.31 0.32 0.33 0.34 0.35 0.36 0.37 0.38 0.39 0.40 0.41
##  [43] 0.42 0.43 0.44 0.45 0.46 0.47 0.48 0.49 0.50 0.51 0.52 0.53 0.54 0.55 0.56 0.57 0.58 0.59 0.60 0.61 0.62
##  [64] 0.63 0.64 0.65 0.66 0.67 0.68 0.69 0.70 0.71 0.72 0.73 0.74 0.75 0.76 0.77 0.78 0.79 0.80 0.81 0.82 0.83
##  [85] 0.84 0.85 0.86 0.87 0.88 0.89 0.90 0.91 0.92 0.93 0.94 0.95 0.96 0.97 0.98 0.99 1.00
plot(p,dbinom(22,100,prob=p))

That is the likelihood function!

curve(dbinom(22,100,x),from=0,to=1,n=251,ylab="likelihood")
abline(v=22/100,lwd=2,col="blue")     

curve(dbinom(22,100,x),from=0.1,to=0.4,n=251,ylab="likelihood")
abline(v=22/100,lwd=2,col="blue")     

The log likelihood function from class.

ll <- function(x,n,p){
  x*log(p)+(n-x)*log(1-p)+log(choose(n,x))
}

The log likelihood for some values of p.

p <- seq(0.1,0.4,by=0.01)
p
##  [1] 0.10 0.11 0.12 0.13 0.14 0.15 0.16 0.17 0.18 0.19 0.20 0.21 0.22 0.23 0.24 0.25 0.26 0.27 0.28 0.29 0.30
## [22] 0.31 0.32 0.33 0.34 0.35 0.36 0.37 0.38 0.39 0.40
ll(x=22,n=100,p=p)
##  [1] -8.528448 -7.303141 -6.270256 -5.400755 -4.672124 -4.066572 -3.569812 -3.170213 -2.858194 -2.625783
## [11] -2.466287 -2.374048 -2.344252 -2.372779 -2.456089 -2.591133 -2.775273 -3.006227 -3.282018 -3.600936
## [21] -3.961503 -4.362448 -4.802683 -5.281284 -5.797473 -6.350610 -6.940177 -7.565771 -8.227097 -8.923957
## [31] -9.656250
curve(ll(22,100,x),from=0.1,to=0.4,n=251,ylab="log-likelihood")
abline(v=22/100,lwd=2,col="blue") 

You can also use the dbinom() function with argument log=TRUE.

dbinom(22,100,p,log=TRUE)
##  [1] -8.528448 -7.303141 -6.270256 -5.400755 -4.672124 -4.066572 -3.569812 -3.170213 -2.858194 -2.625783
## [11] -2.466287 -2.374048 -2.344252 -2.372779 -2.456089 -2.591133 -2.775273 -3.006227 -3.282018 -3.600936
## [21] -3.961503 -4.362448 -4.802683 -5.281284 -5.797473 -6.350610 -6.940177 -7.565771 -8.227097 -8.923957
## [31] -9.656250
curve(dbinom(22,100,x,log=TRUE),from=0.1,to=0.4,n=251,ylab="log-likelihood")
abline(v=22/100,lwd=2,col="blue")     

To find the maximum likelihood estimate you do not need the constant in the log likelihood function, i.e., the last part that does not depend on p.

ll.2 <- function(x,n,p){
  x*log(p)+(n-x)*log(1-p)
}
curve(ll.2(22,100,x),from=0.1,to=0.4,n=251,ylab="log-likelihood (version 2)")
abline(v=22/100,lwd=2,col="blue") 

Example from class: Poisson log likelihood

We observe the following outcomes from a Poisson distribution.

dat <- c(1,1,0,1,2,5,2,4,0,0,1,0,3,1,0,0,1,1,3,4)
dat
##  [1] 1 1 0 1 2 5 2 4 0 0 1 0 3 1 0 0 1 1 3 4
table(dat)
## dat
## 0 1 2 3 4 5 
## 6 7 2 2 2 1
length(dat)
## [1] 20
mean(dat)
## [1] 1.5

The log likelihood function.

ll <- function(dat,lambda){
  -length(dat)*lambda+sum(dat)*log(lambda)-sum(log(gamma(dat+1)))
}

The log likelihood for some values of lambda.

ll(dat,seq(0.5,4.5,by=0.1))
##  [1] -46.90783 -43.43818 -40.81366 -38.80772 -37.27423 -36.11341 -35.25411 -34.64377 -34.24248 -34.01925
## [11] -33.94946 -34.01330 -34.19457 -34.47981 -34.85780 -35.31900 -35.85529 -36.45969 -37.12614 -37.84935
## [21] -38.62469 -39.44807 -40.31586 -41.22483 -42.17209 -43.15504 -44.17135 -45.21889 -46.29574 -47.40015
## [31] -48.53052 -49.68540 -50.86343 -52.06338 -53.28412 -54.52458 -55.78380 -57.06088 -58.35496 -59.66528
## [41] -60.99109

The plot.

curve(ll(dat,x),from=0.5,to=4,xlab=expression(lambda),ylab="log likelihood",main="n=20, mean=1.5")
abline(v=mean(dat),lwd=2,col="blue")

You can also use the dpois() function with argument log=TRUE.

dpois(dat,lambda=1.5,log=TRUE)
##  [1] -1.094535 -1.094535 -1.500000 -1.094535 -1.382217 -4.260166 -1.382217 -3.056193 -1.500000 -1.500000
## [11] -1.094535 -1.500000 -2.075364 -1.094535 -1.500000 -1.500000 -1.094535 -1.094535 -2.075364 -3.056193
sum(dpois(dat,lambda=1.5,log=TRUE))
## [1] -33.94946