Statistics for Laboratory Scientists ( 140.615 )

Confidence Intervals for Proportions

Tick example 1 from class

Suppose X ~ Binomial(n=29,p) and we wish to test H\(_0\): p=1/2.

Our observed data are X = 24.

The easy way.

binom.test(24,29)
## 
##  Exact binomial test
## 
## data:  24 and 29
## number of successes = 24, number of trials = 29, p-value = 0.0005461
## alternative hypothesis: true probability of success is not equal to 0.5
## 95 percent confidence interval:
##  0.6422524 0.9415439
## sample estimates:
## probability of success 
##              0.8275862
binom.test(24,29)$p.value
## [1] 0.0005461127

The drawn-out way. First, find the lower endpoint of rejection region.

qbinom(0.025,29,0.5) 
## [1] 9
pbinom(9,29,0.5)
## [1] 0.03071417

So 9 is too big, and 8 is the lower critical value.

pbinom(8,29,0.5)
## [1] 0.01205977

The actual significance level is Pr(X \(\leq\) 8 or X \(\geq\) 21 | p = 1/2).

pbinom(8,29,0.5) + 1-pbinom(20,29,0.5) 
## [1] 0.02411954

The p-value for the observed data X = 24 is 2*Pr(X \(\geq\) 24 | p = 1/2).

2*(1-pbinom(23,29,0.5)) 
## [1] 0.0005461127

The 95% confidence interval for p.

binom.test(24,29)$conf.int
## [1] 0.6422524 0.9415439
## attr(,"conf.level")
## [1] 0.95

Tick example 2 from class

Suppose X ~ Binomial(n=25,p) and we wish to test H\(_0\): p=1/2.

Our observed data are X = 17.

binom.test(17,25)
## 
##  Exact binomial test
## 
## data:  17 and 25
## number of successes = 17, number of trials = 25, p-value = 0.1078
## alternative hypothesis: true probability of success is not equal to 0.5
## 95 percent confidence interval:
##  0.4649993 0.8505046
## sample estimates:
## probability of success 
##                   0.68
binom.test(17,25)$p.value
## [1] 0.1077521
binom.test(17,25)$conf.int 
## [1] 0.4649993 0.8505046
## attr(,"conf.level")
## [1] 0.95

The case X = 0

X ~ Binomial(n=15,p), observe X = 0.

The 95% confidence interval for p.

The easy way.

binom.test(0,15)$conf.int
## [1] 0.0000000 0.2180194
## attr(,"conf.level")
## [1] 0.95

The direct way. The lower limit is 0, and the upper limit is

1-(0.025)^(1/15)
## [1] 0.2180194

The rule of thumb for the upper limit: 3/n.

3/15
## [1] 0.2

The case X = n

X ~ Binomial(n=15,p), observe X = 15.

The 95% confidence interval for p.

The easy way.

binom.test(15,15)$conf.int
## [1] 0.7819806 1.0000000
## attr(,"conf.level")
## [1] 0.95

The direct way. The upper limit is 1, and the lower limit is

(0.025)^(1/15)
## [1] 0.7819806

The rule of thumb for the lower limit: 1-3/n.

1-3/15
## [1] 0.8

The Normal approximation

Image you observe 22 successes in a Binomial experiment with n=100. Calculate a 95% confidence interval for the success probability p. 

x <- 22
n <- 100
phat <- x/n
phat
## [1] 0.22

The exact Binomial confidence interval.

binom.test(x,n)$conf.int
## [1] 0.1433036 0.3139197
## attr(,"conf.level")
## [1] 0.95

Rule of thumb: if both \(n \times \hat{p}\) and \(n \times \hat{p} \times (1-\hat{p})\) are larger than 5, you can also use a Normal approximation for the confidence interval.

n*phat
## [1] 22
n*phat*(1-phat)
## [1] 17.16

All clear - using the formula from class, we get:

phat+c(-1,1)*1.96*sqrt(phat*(1-phat)/n)
## [1] 0.1388077 0.3011923

Note that the two confidence intervals are indeed very close.

round(binom.test(x,n)$conf.int,2)
## [1] 0.14 0.31
## attr(,"conf.level")
## [1] 0.95
round(phat+c(-1,1)*1.96*sqrt(phat*(1-phat)/n),2)
## [1] 0.14 0.30

A simulation study, using a Binomial experiment with n=100 and p=0.3. Simulate 10,000 outcomes, make a histogram, and add the Normal curve with mean \(p\) and standard deviation \(\sqrt{p\times(1-p)/n}\).

n <- 100
p <- 0.3
set.seed(1)
x <- rbinom(10000,n,p)
phats <- x/n
hist(phats,breaks=seq(0.1,0.5,0.01),prob=TRUE)
curve(dnorm(x,mean=p,sd=sqrt(p*(1-p)/n)),add=TRUE,col="red",lwd=2)