University of Sydney

Recap

Example: Soft drink cans

Soft drink contents in a pack of six cans (in milliliters) are:

374.8   375.0   375.3   374.8   374.4   374.9


Is the soft drink content less than the 375 mL claimed on the label?

Revision of hypothesis testing

Hypothesis testing


  • Hypotheses:   \(H_0: \ \theta=\theta_0\) vs \(H_1: \ \theta>\theta_0\)
  • Test statistic:   \(T=f(X_1, X_2,..., X_n)\).
  • Assumptions:   Sample \(X_1, X_2, ..., X_n \sim F_{\theta}\)
  • Significance:   \(p\mbox{-value} = P(T\geq t_0)\)
  • Decision:   If \(p\mbox{-value} < \alpha\), there is evidence against \(H_0\).

Hypothesis

  • The statement against which you search for evidence is called the null hypothesis, and is denoted by \(H_0\). It is generally a “no difference” statement.
  • The statement you wish to find evidence for (your claim) is called the alternative hypothesis, and is denoted by \(H_1\).

Typical Hypotheses:

  \(H_0 : \theta = \theta_0\)

    VS

  \(H_1 : \theta > \theta_0\) (upper-side alternative)

  \(H_1 : \theta < \theta_0\) (lower-side alternative)

  \(H_1 : \theta \ne \theta_0\) (two-sided alternative)

Test statistic

  • Since observations \(X_i\) vary from sample to sample we can never be sure whether \(H_0\) is true or not.
  • We use a test statistic \(T = f(X_1, ..., X_n)\) to assess whether the data are consistent with \(H_0\), chosen such that…
    1. the distribution of \(T\) is known assuming \(H_0\) is true;
    2. a large observed value of \(T\) (positive or negative, depending on \(H_1\)) is taken as evidence of poor agreement with \(H_0\).

Assumptions

  • Each observation \(X_1, X_2, ..., X_n\) is chosen at random from a population.
  • We say that such random variables are iid (independent and identically distributed).

Significance

  • The \(p\)-value (observed significance level) is the probability, computed assuming \(H_0\) is true, of obtaining a test statistic at least as extreme as the observed value \(t_0\)…

  \(p\)-value \(= P(T \ge t_0 )\); for \(H_1: \theta > \theta_0\)

  \(p\)-value \(= P(T \le t_0 )\); for \(H_1: \theta < \theta_0\)

  \(p\)-value \(= 2P(T \ge |t_0| )\); for \(H_1: \theta \ne \theta_0\)
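
For example, assuming the test statistic has a \(t\) distribution with df degrees of freedom under \(H_0\) (as in the t-tests below), and using an illustrative observed value t0, these three probabilities can be computed in R with pt():

t0 = 2.1  # hypothetical observed test statistic (for illustration only)
df = 9    # hypothetical degrees of freedom
pt(t0, df, lower.tail = FALSE)           # upper-sided: P(T >= t0)
pt(t0, df)                               # lower-sided: P(T <= t0)
2 * pt(abs(t0), df, lower.tail = FALSE)  # two-sided: 2 P(T >= |t0|)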


Decision

An observed large positive or negative value of \(t_0\), and hence a small \(p\)-value, is taken as evidence of poor agreement with \(H_0\).

  • If the \(p\)-value is small, then either \(H_0\) is true and the poor agreement is due to an unlikely event, or \(H_0\) is false. Therefore:

    • the smaller the p-value, the stronger the evidence against \(H_0\) in favour of \(H_1\).
  • A large \(p\)-value does not mean that there is evidence that \(H_0\) is true.

  • The level of significance, \(\alpha\), is the strength of evidence needed to reject \(H_0\) (often \(\alpha = 0.05\)).

t-test

Suppose we have a sample \(X_1, X_2, ..., X_n\) of size \(n\) drawn from a normal population with unknown variance \(\sigma^2\). Let \(x_1, x_2, ..., x_n\) be the observed values. We want to test a hypothesis about the population mean \(\mu\).

  • Hypothesis: \(H_0: \mu = \mu_0\) vs \(H_1: \mu > \mu_0, \ \mu < \mu_0\) (one-sided) or \(\mu \ne \mu_0\) (two-sided)
  • Test statistic: \(\tau_0 = \frac{\bar{x} - \mu_0}{s /\sqrt{n}}\)
  • Assumptions: \(X_i\) are iid random variables following \(N(\mu, \sigma^2)\).
  • \(P\)-value: \(P(t_{n-1} \ge \tau_0)\), \(P(t_{n-1} \le \tau_0)\) or \(2P(t_{n-1} \ge |\tau_0|)\), according to the chosen \(H_1\)
  • Decision: Reject \(H_0\) in favour of \(H_1\) if the p-value is less than \(\alpha\).

Example: Soft drink cans

Soft drink contents in a pack of six cans (in milliliters) are:

374.8   375.0   375.3   374.8   374.4   374.9


Is the soft drink content less than the 375 mL claimed on the label?

x = c(374.8, 375, 375.3, 374.8, 374.4, 374.9)
mean(x)
## [1] 374.8667
sd(x)
## [1] 0.294392
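
These summaries plug directly into the one-sample t-test formulas above and reproduce the t.test() output shown below (t = -1.1094, p-value = 0.1589); a minimal sketch:

n = length(x)
t0 = (mean(x) - 375) / (sd(x) / sqrt(n))  # test statistic
pt(t0, df = n - 1)                        # lower-sided p-value for H1: mu < 375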

In R

t.test(x, mu = 375, alternative = "less")
## 
##  One Sample t-test
## 
## data:  x
## t = -1.1094, df = 5, p-value = 0.1589
## alternative hypothesis: true mean is less than 375
## 95 percent confidence interval:
##      -Inf 375.1088
## sample estimates:
## mean of x 
##  374.8667

What if you have two samples?

Sometimes we want to test whether the means of the populations from which two samples are drawn are different.

There are two possible scenarios:

  • The two samples are independent
  • The two samples are dependent

Discuss and write down a few examples of situations where the two samples would be independent or dependent.

Smokers and blood platelet aggregation

Blood samples are taken from \(11\) smokers and \(11\) non-smokers to measure aggregation of blood platelets.

Non.Smokers = c(25, 25, 27, 44, 30, 67, 53, 53, 52, 60, 28)
Smokers = c(27, 29, 37, 36, 46, 82, 57, 80, 61, 59, 43)

c(mean(Smokers), mean(Non.Smokers), sd(Smokers), sd(Non.Smokers))
## [1] 50.63636 42.18182 18.89589 15.61293

Is the aggregation affected by smoking?

Smokers and blood platelet aggregation

library(ggplot2)
Blood.platelet.levels = c(Non.Smokers, Smokers)
Group = rep(c("Non-smokers", "Smokers"), c(length(Non.Smokers), length(Smokers)))
qplot(Group, Blood.platelet.levels, geom = "boxplot")

Two-sample t-test

  1. Hypotheses: \(H_0: \mu_x=\mu_y\) vs \(H_1: \mu_x>\mu_y, \ \mu_x<\mu_y\) or \(\mu_x\not =\mu_y\)

  2. Test statistic: \(\tau_0 = \frac{\bar{x} - \bar{y}}{s_p \sqrt{\frac{1}{n_x} + \frac{1}{n_y}}}\) where \(s^2_p = \frac{(n_x-1) s_{x}^2 + (n_y-1) s_{y}^2}{n_x+n_y-2}\).

  3. Assumptions: \(X_1,...,X_{n_x}\) are iid \(N(\mu_X,\sigma^2)\), \(Y_1,...,Y_{n_y}\) are iid \(N(\mu_Y,\sigma^2)\), and the \(X_i\) are independent of the \(Y_j\). Under \(H_0\), \(\tau_0 \sim t_{n_x+n_y-2}\).

  4. \(P\)-value: \(P(t_{n_x+n_y-2}\le \tau_{0})\), \(P(t_{n_x+n_y-2}\ge \tau_{0})\) or \(2P(t_{n_x+n_y-2} \ge |\tau_{0}|)\).

  5. Decision: If \(p\mbox{-value}<\alpha\), there is evidence against \(H_0\). If \(p\mbox{-value}> \alpha\), the data are consistent with \(H_0\).
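
As a sketch, steps 2 and 4 can be computed directly for the smokers data above. Because the two groups have equal sizes, this gives the same \(t\) statistic (1.144) as the Welch test on the next slide, but with \(n_x+n_y-2 = 20\) degrees of freedom:

nx = length(Smokers); ny = length(Non.Smokers)
sp2 = ((nx - 1) * var(Smokers) + (ny - 1) * var(Non.Smokers)) / (nx + ny - 2)  # pooled variance
t0 = (mean(Smokers) - mean(Non.Smokers)) / (sqrt(sp2) * sqrt(1/nx + 1/ny))     # test statistic
2 * pt(abs(t0), df = nx + ny - 2, lower.tail = FALSE)                          # two-sided p-value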

In R

t.test(Smokers, Non.Smokers, alternative = "two.sided")
## 
##  Welch Two Sample t-test
## 
## data:  Smokers and Non.Smokers
## t = 1.144, df = 19.313, p-value = 0.2666
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  -6.997031 23.906122
## sample estimates:
## mean of x mean of y 
##  50.63636  42.18182
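
Note that t.test() performs the Welch test by default, which does not assume equal variances. The pooled test described on the previous slide corresponds to setting var.equal = TRUE:

t.test(Smokers, Non.Smokers, alternative = "two.sided", var.equal = TRUE)  # pooled two-sample t-test, df = 20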

Paired sample t-test

Example: Smoking

Blood samples from \(11\) individuals before and after they smoked a cigarette are used to measure aggregation of blood platelets.

Before = c(25, 25, 27, 44, 30, 67, 53, 53, 52, 60, 28)
After = c(27, 29, 37, 36, 46, 82, 57, 80, 61, 59, 43)

Is the aggregation affected by smoking?

Graphical exploration

df = data.frame(Difference = After - Before, group = 1)
p1 = ggplot(df, aes(group, Difference)) + geom_boxplot() + theme_bw() + geom_hline(yintercept = 0, 
    linetype = "dashed") + ylab("Difference in blood platelet levels") + theme(axis.title.x = element_blank(), 
    axis.text.x = element_blank(), axis.ticks.x = element_blank())
p1

Paired sample t-test

t.test(Before - After)
## 
##  One Sample t-test
## 
## data:  Before - After
## t = -2.9065, df = 10, p-value = 0.01566
## alternative hypothesis: true mean is not equal to 0
## 95 percent confidence interval:
##  -14.93577  -1.97332
## sample estimates:
## mean of x 
## -8.454545
t.test(Before, After, paired = TRUE)
## 
##  Paired t-test
## 
## data:  Before and After
## t = -2.9065, df = 10, p-value = 0.01566
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  -14.93577  -1.97332
## sample estimates:
## mean of the differences 
##               -8.454545

Paired sample t-test

Examples

  • Will a paired-sample t-test always give smaller p-values than a two-sample t-test?
  • Discuss and write down a few examples that you think illustrate your answer.