```{r setup, include=FALSE}
knitr::opts_chunk$set(echo=TRUE, fig.align="center")
```

#### Statistics for Laboratory Scientists (140.615)

## Confidence Intervals for Proportions

### Tick example 1 from class

Suppose X ~ Binomial(n=29,p) and we wish to test H$_0$: p=1/2.

Our observed data are X = 24.

The easy way.

```{r}
binom.test(24,29)
binom.test(24,29)$p.value
```

The drawn-out way. First, find the lower endpoint of the rejection region.

```{r}
qbinom(0.025,29,0.5) 
pbinom(9,29,0.5)
```

Pr(X $\leq$ 9 | p = 1/2) exceeds 0.025, so 9 is too big, and 8 is the lower critical value.

```{r}
pbinom(8,29,0.5)
```

The actual significance level is Pr(X $\leq$ 8 or X $\geq$ 21 | p = 1/2).

```{r}
pbinom(8,29,0.5) + 1-pbinom(20,29,0.5) 
```
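
The search for the critical values can also be written out in one chunk. The sketch below assumes the test puts at most 0.025 in each tail and uses the symmetry of the Binomial(29, 1/2) distribution; the names `n_obs`, `k`, `lower_crit` and `upper_crit` are introduced here for illustration only.

```{r}
n_obs <- 29
k <- 0:n_obs
# largest k with Pr(X <= k | p = 1/2) no bigger than 0.025
lower_crit <- max(k[pbinom(k,n_obs,0.5) <= 0.025])
# by symmetry around n/2, the upper critical value mirrors the lower one
upper_crit <- n_obs - lower_crit
c(lower_crit, upper_crit)
# actual significance level: probability of the rejection region under the null
pbinom(lower_crit,n_obs,0.5) + 1-pbinom(upper_crit-1,n_obs,0.5)
```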

The p-value for the observed data X = 24 is 2*Pr(X $\geq$ 24 | p = 1/2).

```{r}
2*(1-pbinom(23,29,0.5)) 
```

The 95% confidence interval for p.

```{r}
binom.test(24,29)$conf.int
```
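
The interval returned by `binom.test` is the exact (Clopper-Pearson) interval, and its limits can be written as quantiles of Beta distributions. The chunk below is a minimal sketch of that calculation for these data; the names `x_obs`, `n_obs` and `alpha` are chosen here for illustration and are not from class.

```{r}
x_obs <- 24
n_obs <- 29
alpha <- 0.05
# lower limit: alpha/2 quantile of Beta(x, n-x+1)
lower <- qbeta(alpha/2, x_obs, n_obs-x_obs+1)
# upper limit: 1-alpha/2 quantile of Beta(x+1, n-x)
upper <- qbeta(1-alpha/2, x_obs+1, n_obs-x_obs)
c(lower, upper)
```

This form assumes 0 < X < n; the cases X = 0 and X = n have their own sections in these notes.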

### Tick example 2 from class

Suppose X ~ Binomial(n=25,p) and we wish to test H$_0$: p=1/2.

Our observed data are X = 17.

```{r}
binom.test(17,25)
binom.test(17,25)$p.value
binom.test(17,25)$conf.int 
```
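
As in example 1, the p-value can also be computed the drawn-out way. Because the null distribution is symmetric when p = 1/2, the two-sided p-value is 2*Pr(X $\geq$ 17 | p = 1/2). A short sketch (the name `p_drawn_out` is introduced here for illustration):

```{r}
# two-sided p-value: double the upper tail probability Pr(X >= 17)
p_drawn_out <- 2*(1-pbinom(16,25,0.5))
p_drawn_out
# compare with the value reported by binom.test
all.equal(p_drawn_out, binom.test(17,25)$p.value)
```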

### The case X = 0

X ~ Binomial(n=15,p), observe X = 0.

The 95% confidence interval for p.

The easy way.

```{r}
binom.test(0,15)$conf.int
```

The direct way. The lower limit is 0, and the upper limit is

```{r}
1-(0.025)^(1/15)
```
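
The expression above comes from setting Pr(X = 0 | p) = $(1-p)^n$ equal to 0.025 and solving for p. A quick check of that reasoning, as a sketch (the names `n_obs` and `p_upper` are introduced here):

```{r}
n_obs <- 15
p_upper <- 1-(0.025)^(1/n_obs)
# at the upper limit, the probability of observing X = 0 should be 0.025
dbinom(0,n_obs,p_upper)
# and the limit should agree with the interval from binom.test
all.equal(p_upper, binom.test(0,n_obs)$conf.int[2])
```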

The rule of thumb for the upper limit: 3/n.

```{r}
3/15
```
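
To see how rough the rule of thumb is, it can be compared with the direct calculation over a range of sample sizes. The sample sizes below are arbitrary choices for illustration, not values from class.

```{r}
n_values <- c(15, 30, 100, 1000)
# direct upper limit of the 95% confidence interval when X = 0
direct <- 1-(0.025)^(1/n_values)
# rule of thumb
rule <- 3/n_values
round(cbind(n=n_values, direct=direct, rule=rule), 4)
```

For large n the direct upper limit is approximately $-\log(0.025)/n \approx 3.69/n$, so for these sample sizes the rule 3/n gives a somewhat smaller value.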

### The case X = n

X ~ Binomial(n=15,p), observe X = 15.

The 95% confidence interval for p.

The easy way.

```{r}
binom.test(15,15)$conf.int
```

The direct way. The upper limit is 1, and the lower limit is

```{r}
(0.025)^(1/15)
```

The rule of thumb for the lower limit: 1-3/n.

```{r}
1-3/15
```

### The Normal approximation

Imagine you observe 22 successes in a Binomial experiment with n=100. Calculate a 95% confidence interval for the success probability p.

```{r}
x <- 22
n <- 100
phat <- x/n
phat
```

The exact Binomial confidence interval.

```{r}
binom.test(x,n)$conf.int
```

Rule of thumb: if both $n \times \hat{p}$ and $n \times \hat{p} \times (1-\hat{p})$ are larger than 5, you can also use a Normal approximation for the confidence interval.

```{r}
n*phat
n*phat*(1-phat)
```

All clear - using the formula from class, we get:

```{r}
phat+c(-1,1)*1.96*sqrt(phat*(1-phat)/n)
```
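
The multiplier 1.96 is the 0.975 quantile of the standard Normal distribution, rounded. The helper below is a sketch that makes the confidence level explicit, assuming the interval has the form $\hat{p} \pm z \times \sqrt{\hat{p} \times (1-\hat{p})/n}$ as in the chunk above; the function name `normal_ci` and its arguments are hypothetical, not from class.

```{r}
normal_ci <- function(x, n, conf.level=0.95) {
  phat <- x/n
  # Normal quantile matching the requested confidence level
  z <- qnorm(1-(1-conf.level)/2)
  phat+c(-1,1)*z*sqrt(phat*(1-phat)/n)
}
normal_ci(22,100)
normal_ci(22,100,conf.level=0.99)
```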

Note that the two confidence intervals are indeed very close.

```{r}
round(binom.test(x,n)$conf.int,2)
round(phat+c(-1,1)*1.96*sqrt(phat*(1-phat)/n),2)
```

A simulation study, using a Binomial experiment with n=100 and p=0.3. Simulate 10,000 outcomes, make a histogram, and add the Normal curve with mean $p$ and standard deviation $\sqrt{p\times(1-p)/n}$.

```{r}
n <- 100
p <- 0.3
set.seed(1)
x <- rbinom(10000,n,p)
phats <- x/n
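# center the bins on the possible values 0.10, 0.11, ..., 0.50 so the bars line up with the curve
hist(phats,breaks=seq(0.095,0.505,0.01),prob=TRUE)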
curve(dnorm(x,mean=p,sd=sqrt(p*(1-p)/n)),add=TRUE,col="red",lwd=2)
```
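
As a numerical companion to the plot, the mean and standard deviation of the simulated proportions can be compared with the values used for the Normal curve. This sketch repeats the simulation settings so that it runs on its own; the names `x_sim` and `phats_sim` are introduced here.

```{r}
n <- 100
p <- 0.3
set.seed(1)
x_sim <- rbinom(10000,n,p)
phats_sim <- x_sim/n
# simulated mean next to the theoretical mean p
c(mean(phats_sim), p)
# simulated standard deviation next to sqrt(p*(1-p)/n)
c(sd(phats_sim), sqrt(p*(1-p)/n))
```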