```{r setup, include=FALSE}
knitr::opts_chunk$set(echo=TRUE, fig.align="center")
```

#### Statistics for Laboratory Scientists (140.615)

## Confidence Intervals for Proportions

### Tick example 1 from class

Suppose X ~ Binomial(n=29,p) and we wish to test H$_0$: p=1/2. Our observed data are X = 24.

The easy way.

```{r}
binom.test(24,29)
binom.test(24,29)$p.value
```

The drawn-out way. First, find the lower endpoint of the rejection region.

```{r}
qbinom(0.025,29,0.5)
pbinom(9,29,0.5)
```

Pr(X $\leq$ 9 | p = 1/2) exceeds 0.025, so 9 is too big, and 8 is the lower critical value.

```{r}
pbinom(8,29,0.5)
```

By symmetry, the upper critical value is 29 - 8 = 21. The actual significance level is Pr(X $\leq$ 8 or X $\geq$ 21 | p = 1/2).

```{r}
pbinom(8,29,0.5) + 1-pbinom(20,29,0.5)
```

The p-value for the observed data X = 24 is 2*Pr(X $\geq$ 24 | p = 1/2).

```{r}
2*(1-pbinom(23,29,0.5))
```

The 95% confidence interval for p.

```{r}
binom.test(24,29)$conf.int
```

### Tick example 2 from class

Suppose X ~ Binomial(n=25,p) and we wish to test H$_0$: p=1/2. Our observed data are X = 17.

```{r}
binom.test(17,25)
binom.test(17,25)$p.value
binom.test(17,25)$conf.int
```

### The case X = 0

X ~ Binomial(n=15,p), observe X = 0. The 95% confidence interval for p.

The easy way.

```{r}
binom.test(0,15)$conf.int
```

The direct way. The lower limit is 0, and the upper limit is

```{r}
1-(0.025)^(1/15)
```

The rule of thumb for the upper limit: 3/n.

```{r}
3/15
```

### The case X = n

X ~ Binomial(n=15,p), observe X = 15. The 95% confidence interval for p.

The easy way.

```{r}
binom.test(15,15)$conf.int
```

The direct way. The upper limit is 1, and the lower limit is

```{r}
(0.025)^(1/15)
```

The rule of thumb for the lower limit: 1-3/n.

```{r}
1-3/15
```

### The Normal approximation

Imagine you observe 22 successes in a Binomial experiment with n=100. Calculate a 95% confidence interval for the success probability p.

```{r}
x <- 22
n <- 100
phat <- x/n
phat
```

The exact Binomial confidence interval.

```{r}
binom.test(x,n)$conf.int
```

Rule of thumb: if both $n \times \hat{p}$ and $n \times \hat{p} \times (1-\hat{p})$ are larger than 5, you can also use a Normal approximation for the confidence interval.

```{r}
n*phat
n*phat*(1-phat)
```

All clear - using the formula from class, we get:

```{r}
phat+c(-1,1)*1.96*sqrt(phat*(1-phat)/n)
```

Note that the two confidence intervals are indeed very close.

```{r}
round(binom.test(x,n)$conf.int,2)
round(phat+c(-1,1)*1.96*sqrt(phat*(1-phat)/n),2)
```

A simulation study, using a Binomial experiment with n=100 and p=0.3. Simulate 10,000 outcomes, make a histogram, and add the Normal curve with mean $p$ and standard deviation $\sqrt{p\times(1-p)/n}$.

```{r}
n <- 100
p <- 0.3
set.seed(1)
x <- rbinom(10000,n,p)
phats <- x/n
hist(phats,breaks=seq(0.1,0.5,0.01),prob=TRUE)
curve(dnorm(x,mean=p,sd=sqrt(p*(1-p)/n)),add=TRUE,col="red",lwd=2)
```
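
A numerical check to go with the histogram. This is a minimal sketch, not part of the class material: it repeats the same simulation (n=100, p=0.3, 10,000 outcomes) and compares the simulated mean and standard deviation of $\hat{p}$ with the values used for the Normal curve. The names `se` and `inside` are introduced here for illustration only.

```{r}
# Repeat the simulation so this chunk runs on its own.
n <- 100
p <- 0.3
set.seed(1)
phats <- rbinom(10000,n,p)/n

# Standard deviation of phat under the Normal approximation.
se <- sqrt(p*(1-p)/n)

# Simulated versus theoretical mean and standard deviation.
c(simulated=mean(phats), theoretical=p)
c(simulated=sd(phats), theoretical=se)

# Fraction of simulated phats within 1.96 standard deviations of p.
# The Normal approximation suggests roughly 0.95; the Binomial is
# discrete, so the simulated fraction need not match exactly.
inside <- mean(abs(phats-p) <= 1.96*se)
inside
```

If the approximation is reasonable, the simulated mean and standard deviation should be close to the theoretical ones, and `inside` should be in the neighborhood of 0.95.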