In the population we can see that the average age for women is \[μ_f = 34.763\text{ years}\]
And for men it is \[μ_m = 33.560\text{ years}\]
so women are older than men, on average
This time we have a sample of men and a sample of women
We calculate \(\bar{X}_f\) and \(\bar{X}_m\)
Each one can be modelled by a Normal distribution (why?)
Then \((\bar{X}_f - \bar{X}_m)\) also follows a Normal distribution
What are the parameters of this distribution?
A normal distribution is defined by two parameters
Since we deal with averages, we have \[\begin{aligned}\bar{X}_f &∼ N(μ_f, σ^2_f/n_f) \\ \bar{X}_m &∼ N(μ_m, σ^2_m/n_m) \end{aligned}\]
What are the parameters for \((\bar{X}_f - \bar{X}_m)\)?
“Expected value of sum is sum of expected values” \[μ=μ_f - μ_m\]
“Variance of sum is sum of variances” (when the two samples are independent) \[σ^2=\frac{σ^2_f}{n_f} + \frac{σ^2_m}{n_m}\]
(how do we handle the signs?)
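
As a minimal sketch of the two rules above, the following computes the parameters of \((\bar{X}_f - \bar{X}_m)\). The population means come from the text; the standard deviations and sample sizes are hypothetical values chosen only for illustration, and the samples are assumed to be independent.

```python
import math

# Population means taken from the text.
mu_f, mu_m = 34.763, 33.560

# Hypothetical values, NOT from the text: standard deviations and sample sizes.
sigma_f, sigma_m = 20.0, 20.0
n_f, n_m = 100, 100


def difference_parameters(mu_f, mu_m, sigma_f, sigma_m, n_f, n_m):
    """Return (mean, standard deviation) of the difference of two sample means.

    Assumes the two samples are independent.
    """
    mu = mu_f - mu_m                                  # means subtract
    variance = sigma_f**2 / n_f + sigma_m**2 / n_m    # variances add
    return mu, math.sqrt(variance)


mu, sigma = difference_parameters(mu_f, mu_m, sigma_f, sigma_m, n_f, n_m)
print(f"mu = {mu:.3f}, sigma = {sigma:.3f}")
```

Note that the function returns the standard deviation \(σ\), not the variance \(σ^2\), because the intervals are written in terms of \(σ\).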
First, define the confidence level. Call it \((1-α)\)
Then, each tail must include \(α/2\) of the cases
We use the inverse of the standard Normal cumulative distribution function to find \(k_l\) and \(k_u\)
The interval for \((\bar{X}_f - \bar{X}_m)\) is \[[μ + k_l⋅ σ, μ + k_u⋅σ ]\]
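
The lookup of \(k_l\) and \(k_u\) can be sketched with Python's standard library, where `NormalDist().inv_cdf` plays the role of the inverse Normal function. The values passed for `mu` and `sigma` are hypothetical and only serve to show the call.

```python
from statistics import NormalDist


def normal_interval(mu, sigma, alpha):
    """Return (low, high, k_l, k_u) so that [low, high] holds 1 - alpha of N(mu, sigma^2)."""
    standard = NormalDist()                    # standard Normal: mean 0, sd 1
    k_l = standard.inv_cdf(alpha / 2)          # lower quantile (negative)
    k_u = standard.inv_cdf(1 - alpha / 2)      # upper quantile (positive)
    return mu + k_l * sigma, mu + k_u * sigma, k_l, k_u


# Hypothetical parameters, for illustration only.
low, high, k_l, k_u = normal_interval(mu=1.203, sigma=2.828, alpha=0.05)
print(f"k_l = {k_l:.3f}, k_u = {k_u:.3f}")
print(f"interval = [{low:.3f}, {high:.3f}]")
```

Because the Normal distribution is symmetric, \(k_l = -k_u\), so the interval is centered on \(μ\).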
Now we know \(y=(\bar{X}_f - \bar{X}_m)\) and want to find \(μ\)
We build a similar interval \[I=[y + k_l⋅ σ, y + k_u⋅σ ]\] then \[ℙ(μ∈I) = 1-α\]
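
One way to convince ourselves that \(ℙ(μ∈I) = 1-α\) is a small simulation: draw many values of \(y\) from \(N(μ, σ^2)\), build \(I\) around each one, and count how often \(I\) contains \(μ\). This is only a sketch; the parameter values, the number of trials, and the seed are arbitrary choices, not values from the text.

```python
import random
from statistics import NormalDist


def coverage(mu, sigma, alpha, trials=20_000, seed=1):
    """Estimate how often the interval built around y contains the true mu."""
    rng = random.Random(seed)                  # fixed seed for reproducibility
    k_u = NormalDist().inv_cdf(1 - alpha / 2)
    k_l = -k_u                                 # symmetry of the Normal distribution
    hits = 0
    for _ in range(trials):
        y = rng.gauss(mu, sigma)               # one simulated observed difference
        if y + k_l * sigma <= mu <= y + k_u * sigma:
            hits += 1
    return hits / trials


# Hypothetical parameters, for illustration only.
print(coverage(mu=1.203, sigma=2.828, alpha=0.05))
```

The printed fraction should be close to \(1-α\), here 0.95, with small fluctuations due to the finite number of trials.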
If the interval \(I\) does not contain 0, we can be confident that \(μ_f ≠ μ_m\)
How confident? Well, \((1-α)\)
If \(0∈I\) then it is possible that \(μ_f = μ_m\)
We do not have enough evidence to decide
The smallest \(α\) for which \(I\) still excludes 0 is the one that makes one of the interval limits equal to 0
If \(μ_f = μ_m\) (that is, \(μ=0\)), what is the probability that we observe a difference \((\bar{X}_f - \bar{X}_m)\) as large as \(y\)?
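
Setting the interval limit closest to 0 equal to 0 gives \(k_u = |y|/σ\), and therefore \(α = 2\,Φ(-|y|/σ)\), where \(Φ\) is the standard Normal cumulative distribution function. The sketch below computes this \(α\) and checks that the resulting interval touches 0. The values of `y` and `sigma` are hypothetical.

```python
from statistics import NormalDist


def smallest_alpha(y, sigma):
    """Alpha at which one limit of the interval around y is exactly 0."""
    return 2 * NormalDist().cdf(-abs(y) / sigma)


def interval_around(y, sigma, alpha):
    """Symmetric interval [y - k_u*sigma, y + k_u*sigma] at level 1 - alpha."""
    k_u = NormalDist().inv_cdf(1 - alpha / 2)
    return y - k_u * sigma, y + k_u * sigma


# Hypothetical observed difference and standard deviation.
y, sigma = 5.0, 2.828
alpha = smallest_alpha(y, sigma)
low, high = interval_around(y, sigma, alpha)
print(f"alpha = {alpha:.4f}, interval = [{low:.4f}, {high:.4f}]")
```

For any larger \(α\) the interval is narrower and excludes 0; for any smaller \(α\) it is wider and contains 0.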
There is a standard framework for these questions
We start by defining what we want to test, and what the alternative is \[\begin{aligned} H_0:&μ_f = μ_m\\ H_a:&μ_f ≠ μ_m\\ \end{aligned}\]
Here \(H_0\) is called the null hypothesis and \(H_a\) is the alternative hypothesis
Basically, we want to know the probability of observing a difference \(\bar{X}_f - \bar{X}_m\) at least as large as \(y\) in absolute value, assuming \(H_0\)
In this case we want to calculate \[ℙ(|\bar{X}_f- \bar{X}_m| ≥ |y| \mid μ = 0, σ )\]
This is called a two-sided test
If we declare our test as \[\begin{aligned} H_0:&μ_f = μ_m\\ H_a:&μ_f > μ_m\\ \end{aligned}\]
Then the question is
\[ℙ(\bar{X}_f - \bar{X}_m ≥ y| μ = 0, σ )?\]
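
Both probabilities can be computed from the Normal distribution of the difference under \(H_0\), that is, \(N(0, σ^2)\). The following is a minimal sketch that assumes \(σ\) is known; the values of `y` and `sigma` are hypothetical.

```python
from statistics import NormalDist


def p_values(y, sigma):
    """Return (two_sided, one_sided) probabilities under H_0: mu = 0."""
    null = NormalDist(mu=0.0, sigma=sigma)      # distribution of the difference under H_0
    one_sided = 1 - null.cdf(y)                 # P(difference >= y)
    two_sided = 2 * (1 - null.cdf(abs(y)))      # P(|difference| >= |y|)
    return two_sided, one_sided


# Hypothetical observed difference and standard deviation.
two_sided, one_sided = p_values(y=5.0, sigma=2.828)
print(f"two-sided: {two_sided:.4f}, one-sided: {one_sided:.4f}")
```

When \(y > 0\), the two-sided probability is exactly twice the one-sided one, because it counts both tails of the distribution.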