Exercises: Hypothesis tests, parametric

Exercise 1 The hemoglobin value (Hb) in women is on average 140 g/L. You observe the following Hb values in a set of five male blood donors: 154, 140, 147, 162, 172. Assume that Hb is normally distributed. Is there a reason to believe that the mean Hb value in men differs from that in women?

Let \(X\) denote the Hb value in g/L for male blood donors and \(\mu\) its population mean. The hypotheses concern \(\mu\), not the sample mean \(\bar X\).

\[H_0: \mu = 140\] \[H_1: \mu \neq 140\]

Use the significance level \(\alpha=0.05\).

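Use the test statistic \(T = \frac{\bar X - \mu_0}{SE},\) where \(\mu_0\) is the population mean if \(H_0\) is true, i.e. 140, and \(SE\) is the standard error of the mean, \(SE=\frac{\sigma}{\sqrt{n}} \approx \frac{s}{\sqrt{n}}\). As \(\sigma\) is estimated by \(s\), \(T\) follows a t-distribution with \(n-1=4\) degrees of freedom under \(H_0\).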

Compute \(t_{obs} = \frac{m-140}{s/\sqrt{n}}\)

x <- c(154, 140, 147, 162, 172)
## sample mean
m <- mean(x)
## standard error of mean
SE <- sd(x)/sqrt(5)
## Observed value of test statistic
(tobs <- (m-140)/SE)
[1] 2.676865

Compute the p-value, \(P(|T|>|t_{obs}|) = P(T>|t_{obs}|) + P(T<-|t_{obs}|)\)

## P(T>tobs)
pt(tobs, df=4, lower.tail=FALSE)
[1] 0.02770443
## p = P(t>tobs) + P(t<-tobs) = 2 * P(t>tobs)
(p <- 2*pt(tobs, df=4, lower.tail=FALSE))
[1] 0.05540887

As \(p > \alpha\) the null hypothesis cannot be rejected, and we conclude that there is no reason to believe that the mean Hb value for men differs from that of women.

Note that the same can be achieved using the function t.test:

t.test(x, mu=140)

    One Sample t-test

data:  x
t = 2.6769, df = 4, p-value = 0.05541
alternative hypothesis: true mean is not equal to 140
95 percent confidence interval:
 139.442 170.558
sample estimates:
mean of x 
      155 
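
The confidence interval in the output can also be computed by hand, which shows how it relates to the test decision. A minimal sketch, repeating the data so that it runs on its own; the names `tcrit` and `ci` are introduced here only for illustration:

```r
## Hb values (g/L) for the five male blood donors
x <- c(154, 140, 147, 162, 172)
n <- length(x)
m <- mean(x)
SE <- sd(x)/sqrt(n)
## Critical value of the t-distribution for a two-sided test, alpha = 0.05
tcrit <- qt(0.975, df=n-1)
## 95% confidence interval for the mean
(ci <- m + c(-1, 1)*tcrit*SE)
## H0 is rejected only if |tobs| > tcrit, i.e. if 140 lies outside the interval
abs((m-140)/SE) > tcrit
```

Here 140 lies just inside the interval, in agreement with \(p > \alpha\).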

Exercise 2 The hemoglobin value (Hb) in men is on average 188 g/L. The Hb values in Exercise 1 were actually measured after the men had donated blood. Is there a reason to believe that the mean Hb level for men after blood donation is less than 188 g/L?

Let \(\mu\) denote the mean Hb value in men after blood donation.

\(H_0: \mu = 188\) \(H_1: \mu < 188\)

Use the significance level \(\alpha=0.05\).

Use the test statistic \(T = \frac{\bar X - \mu_0}{SE},\) where \(\mu_0=188\).

Compute \(t_{obs} = \frac{m-188}{s/\sqrt{n}}\)

## Observed value of test statistic
(tobs <- (m-188)/SE)
[1] -5.889103

Compute the p-value, \(P(T<t_{obs})\)

## p=P(T<tobs)
(p <- pt(tobs, df=4, lower.tail=TRUE))
[1] 0.002078429

As \(p < \alpha\) the null hypothesis is rejected and we can conclude that there is reason to believe that the mean Hb value after blood donation is less than 188 g/L.

Note that the same can be achieved using the function t.test:

t.test(x, mu=188, alternative="less")

    One Sample t-test

data:  x
t = -5.8891, df = 4, p-value = 0.002078
alternative hypothesis: true mean is less than 188
95 percent confidence interval:
    -Inf 166.946
sample estimates:
mean of x 
      155 
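
The result of t.test can also be stored and its components extracted, which is convenient when the numbers are needed in later computations. A sketch, where `res` and `upper` are names chosen for illustration:

```r
## Hb values (g/L) for the five male blood donors
x <- c(154, 140, 147, 162, 172)
res <- t.test(x, mu=188, alternative="less")
## Components of the returned object
res$statistic   # observed value of the test statistic
res$parameter   # degrees of freedom
res$p.value     # one-sided p-value
res$conf.int    # one-sided 95% confidence interval
## The upper confidence bound computed by hand
(upper <- mean(x) + qt(0.95, df=length(x)-1)*sd(x)/sqrt(length(x)))
```

For a one-sided test the confidence interval is one-sided as well, so only the upper bound is finite.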

Exercise 3 By observing the Hb values in 5 male blood donors: 154, 140, 147, 162, 172 g/L, and 5 female blood donors: 123, 140, 137, 132, 127 g/L, is there a reason to believe that the Hb level is higher in men than in women?

Let \(X_m\) denote the Hb value in men and \(X_w\) the Hb value in women, with population means \(\mu_m\) and \(\mu_w\).

\(H_0: \mu_m = \mu_w\) \(H_1: \mu_m > \mu_w\)

Use the significance level \(\alpha=0.05\).

Use the test statistic \(T = \frac{\bar X_m - \bar X_w}{SE}\). If we assume equal variances we can use Student’s t-test and compute \(SE = s_p\sqrt{\frac{1}{n_m} + \frac{1}{n_w}}\), where \(s_p\) is the pooled standard deviation. An alternative is Welch’s t-test (unequal variances t-test), with \(SE=\sqrt{\frac{s_m^2}{n_m} + \frac{s_w^2}{n_w}}\); the test statistic is then approximately t-distributed, with degrees of freedom given by the Welch-Satterthwaite equation, as implemented in t.test.

x <- c(154, 140, 147, 162, 172)
y <- c(123, 140, 137, 132, 127)
## Perform t-test with unequal variances
t.test(x, y, alternative="greater")

    Welch Two Sample t-test

data:  x and y
t = 3.6171, df = 6.2637, p-value = 0.00517
alternative hypothesis: true difference in means is greater than 0
95 percent confidence interval:
 10.8297     Inf
sample estimates:
mean of x mean of y 
    155.0     131.8 

As the p-value \(p=0.0051699\) is \(<\alpha\), we reject the null hypothesis and conclude that there is reason to believe that Hb on average is higher in male blood donors than in female blood donors.
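
To see what the Welch test computes, the test statistic and the degrees of freedom can be worked out step by step. This is a sketch of the standard Welch-Satterthwaite formulas rather than a replacement for t.test; the variable names `vx`, `vy` and `df` are chosen here for illustration:

```r
x <- c(154, 140, 147, 162, 172)   # men
y <- c(123, 140, 137, 132, 127)   # women
## Squared standard error of each sample mean
vx <- var(x)/length(x)
vy <- var(y)/length(y)
## Standard error of the difference in means (unequal variances)
SE <- sqrt(vx + vy)
(tobs <- (mean(x) - mean(y))/SE)
## Welch-Satterthwaite approximation of the degrees of freedom
(df <- (vx + vy)^2/(vx^2/(length(x)-1) + vy^2/(length(y)-1)))
## One-sided p-value, P(T > tobs)
(p <- pt(tobs, df=df, lower.tail=FALSE))
```

The values should agree with the t, df and p-value reported by t.test above.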

Exercise 4 Based on statistics from blodcentralen we learn that male Hb values (before donation) are normally distributed with mean 188 g/L and standard deviation 16 g/L. Using this new information and the following observed Hb values in five male donors after donation: 154, 140, 147, 162, 172 g/L, is there reason to believe that the mean Hb value for men after blood donation is lower than 188 g/L?

Let \(\mu\) denote the mean Hb value in men after blood donation.

\(H_0: \mu = 188\) \(H_1: \mu < 188\)

Use \(\alpha = 0.05\)

Under \(H_0\), with the second parameter of \(N\) denoting the standard deviation:

\(X \sim N(188, 16)\)

\(\bar X \sim N(188, 16/\sqrt{5})\)

Use the test statistic

\(Z = \frac{\bar X - 188}{16/\sqrt{5}} \sim N(0,1)\)

Compute \(z_{obs}\)

Compute the p-value \(P(\bar X < m) = P(Z < z_{obs})\):

[1] 1.995119e-06
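
The code behind the value above is not shown. One way to obtain it, as a sketch assuming the known standard deviation of 16 g/L stated in the exercise (the names `sigma`, `zobs` and `p` are chosen here for illustration):

```r
## Hb values (g/L) for the five male blood donors after donation
x <- c(154, 140, 147, 162, 172)
## Known population standard deviation (g/L)
sigma <- 16
## Observed value of the test statistic
(zobs <- (mean(x) - 188)/(sigma/sqrt(length(x))))
## p = P(Z < zobs), from the standard normal distribution
(p <- pnorm(zobs))
```

Because \(\sigma\) is known, the standard normal distribution is used instead of the t-distribution, which gives a much smaller p-value than in Exercise 2.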

As \(p \ll \alpha\), we reject \(H_0\): there is reason to believe that the mean Hb value for men after blood donation is lower than 188 g/L.

Exercise 5 In order to study the effect of a high-fat diet, 12 mice are fed a normal diet (control group) and 12 mice are fed a high-fat diet. After a couple of weeks the mouse weights in grams are recorded:

High-fat mice (g): 25, 30, 23, 18, 31, 24, 39, 26, 36, 29, 23, 32

Normal diet mice (g): 27, 25, 22, 23, 25, 37, 24, 26, 21, 26, 30, 24

Does a high-fat diet increase body weight in mice?

  1. Assume equal variances.
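  2. Don’t assume equal variances.

xHF <- c(25, 30, 23, 18, 31, 24, 39, 26, 36, 29, 23, 32)
xN <- c(27, 25, 22, 23, 25, 37, 24, 26, 21, 26, 30, 24)
## Student's t-test with pooled variances
t.test(xHF, xN, var.equal=TRUE, alternative="greater")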

    Two Sample t-test

data:  xHF and xN
t = 1.0234, df = 22, p-value = 0.1586
alternative hypothesis: true difference in means is greater than 0
95 percent confidence interval:
 -1.468785       Inf
sample estimates:
mean of x mean of y 
 28.00000  25.83333 
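
The pooled-variance test statistic can be checked by hand. A sketch using the standard pooled standard deviation formula; the names `sp`, `nHF` and `nN` are chosen here for illustration:

```r
xHF <- c(25, 30, 23, 18, 31, 24, 39, 26, 36, 29, 23, 32)
xN <- c(27, 25, 22, 23, 25, 37, 24, 26, 21, 26, 30, 24)
nHF <- length(xHF)
nN <- length(xN)
## Pooled standard deviation
sp <- sqrt(((nHF-1)*var(xHF) + (nN-1)*var(xN))/(nHF + nN - 2))
## Standard error of the difference in means
SE <- sp*sqrt(1/nHF + 1/nN)
(tobs <- (mean(xHF) - mean(xN))/SE)
## One-sided p-value with nHF + nN - 2 = 22 degrees of freedom
(p <- pt(tobs, df=nHF + nN - 2, lower.tail=FALSE))
```

With equal group sizes the pooled and the Welch standard errors coincide, which is why both tests report the same t and differ only in the degrees of freedom.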
## Unequal variances with Welch approximation to the degrees of freedom (the default)
t.test(xHF, xN, var.equal=FALSE, alternative="greater")

    Welch Two Sample t-test

data:  xHF and xN
t = 1.0234, df = 19.818, p-value = 0.1592
alternative hypothesis: true difference in means is greater than 0
95 percent confidence interval:
 -1.486449       Inf
sample estimates:
mean of x mean of y 
 28.00000  25.83333

In both cases \(p \approx 0.16\), which is larger than \(\alpha=0.05\) (assuming the same significance level as in Exercise 1), so the null hypothesis cannot be rejected: these data do not show that a high-fat diet increases body weight in mice.