# Are men the same age as women, on average?

## If we know the population

In the population we can see that the average age for women is \[μ_f = 34.763\text{ years}\]

And for men it is \[μ_m = 33.560\text{ years}\]

so women are older than men, on average

## But we do not know the population

This time we have a sample of men and a sample of women

We calculate \(\bar{X}_f\) and \(\bar{X}_m\)

Each one can be modelled by a Normal distribution (why?)

Then \((\bar{X}_f - \bar{X}_m)\)
also follows a Normal distribution

What are the parameters of this distribution?

## Parameters of a Normal distribution

A normal distribution is defined by two parameters

- The mean \(μ\)
- The variance \(σ^2\)

Since we deal with averages, we have \[\begin{aligned}\bar{X}_f &∼ N(μ_f, σ^2_f/n_f)
\\
\bar{X}_m &∼ N(μ_m, σ^2_m/n_m)
\end{aligned}\]
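
A quick simulation can make the \(σ^2/n\) scaling concrete. This is only a sketch: the population below is a hypothetical, clearly non-Normal one (exponential with mean 34 years), and the sample size and number of repetitions are assumptions chosen for illustration.

```python
import random
from statistics import mean, pvariance

random.seed(42)

pop_mean = 34.0          # hypothetical population mean age (assumption)
pop_var = pop_mean ** 2  # an exponential distribution has variance = mean squared
n = 50                   # hypothetical sample size
repetitions = 5_000      # how many samples we draw

# Draw many samples of size n and keep the average of each one
averages = [
    mean(random.expovariate(1 / pop_mean) for _ in range(n))
    for _ in range(repetitions)
]

print("mean of the averages:    ", round(mean(averages), 2))
print("variance of the averages:", round(pvariance(averages), 2))
print("sigma^2 / n:             ", round(pop_var / n, 2))
```

The last two printed values should be close to each other, even though the individual ages are far from Normal.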

What are the parameters for \((\bar{X}_f -
\bar{X}_m)\)?

## Parameters of \((\bar{X}_f - \bar{X}_m)\)

“Expected value of sum is sum of expected values” \[μ=μ_f - μ_m\]

“Variance of sum is sum of variances” \[σ^2=\frac{σ^2_f}{n_f} +
\frac{σ^2_m}{n_m}\]

(how do we handle the signs?)
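
One way to see it, assuming the two samples are independent (an assumption not stated above): write the difference as \(\bar{X}_f + (-1)⋅\bar{X}_m\). A constant factor comes out of the expected value unchanged and out of the variance squared, so the minus sign survives in the mean and disappears in the variance \[\begin{aligned}
\mathbb{E}(\bar{X}_f - \bar{X}_m) &= \mathbb{E}(\bar{X}_f) + (-1)⋅\mathbb{E}(\bar{X}_m) = μ_f - μ_m\\
\operatorname{Var}(\bar{X}_f - \bar{X}_m) &= \operatorname{Var}(\bar{X}_f) + (-1)^2⋅\operatorname{Var}(\bar{X}_m) = \frac{σ^2_f}{n_f} + \frac{σ^2_m}{n_m}
\end{aligned}\]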

## Confidence interval for \((\bar{X}_f - \bar{X}_m)\)

First, define the confidence level. Call it \((1-α)\)

Then, each tail must include \(α/2\)
of the cases

We use the *inverse Normal* (quantile) function to find \(k_l\) and \(k_u\)

The interval for \((\bar{X}_f -
\bar{X}_m)\) is \[[μ + k_l⋅ σ, μ +
k_u⋅σ ]\]
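
As a minimal sketch, the steps above can be carried out with Python's standard library, where `NormalDist().inv_cdf` plays the role of the inverse Normal function. The population means are the ones shown earlier. The standard deviations `sigma_f`, `sigma_m` and the sample sizes `n_f`, `n_m` are hypothetical values, since they are not given here.

```python
from statistics import NormalDist

alpha = 0.05                      # confidence level is 1 - alpha = 95%
std_normal = NormalDist()         # standard Normal: mean 0, standard deviation 1

# Each tail holds alpha/2 of the cases
k_l = std_normal.inv_cdf(alpha / 2)
k_u = std_normal.inv_cdf(1 - alpha / 2)

mu = 34.763 - 33.560              # mu_f - mu_m from the population
sigma_f, sigma_m = 20.0, 19.0     # hypothetical population standard deviations
n_f, n_m = 100, 100               # hypothetical sample sizes

# Standard deviation of the difference of averages
sigma = (sigma_f**2 / n_f + sigma_m**2 / n_m) ** 0.5

interval = (mu + k_l * sigma, mu + k_u * sigma)
print("k_l, k_u:", round(k_l, 3), round(k_u, 3))
print("interval:", tuple(round(v, 3) for v in interval))
```

Because the Normal distribution is symmetric, \(k_l = -k_u\).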

## We do not know \(μ\). Finding it

Now we know \(y=(\bar{X}_f -
\bar{X}_m)\) and want to find \(μ\)

We build a similar interval \[I=[y + k_l⋅
σ, y + k_u⋅σ ]\] then \[ℙ(μ∈I) =
1-α\]
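
The claim \(ℙ(μ∈I) = 1-α\) can be checked by simulation. The sketch below assumes that \(σ\) is known and uses a hypothetical value for it. It repeats the experiment many times and counts how often the interval built around \(y\) contains the true \(μ\).

```python
import random
from statistics import NormalDist

random.seed(1)

alpha = 0.05
mu_true = 34.763 - 33.560   # true difference, known only because we simulate
sigma = 2.0                 # hypothetical standard deviation of the difference

k_u = NormalDist().inv_cdf(1 - alpha / 2)
k_l = -k_u

trials = 10_000
hits = 0
for _ in range(trials):
    y = random.gauss(mu_true, sigma)            # one simulated observed difference
    low, high = y + k_l * sigma, y + k_u * sigma
    if low <= mu_true <= high:                  # does the interval catch mu?
        hits += 1

coverage = hits / trials
print("fraction of intervals containing mu:", coverage)
```

The printed fraction should be close to \(1-α\).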

## Are men the same age as women?

If the interval \(I\) does not
contain 0, we can be confident that \(μ_f ≠
μ_m\)

How confident? Well, \((1-α)\)

If \(0∈I\) then it is possible that
\(μ_f = μ_m\)

We do not have enough evidence to decide

## What is the smallest \(α\) that works?

The smallest \(α\) is the one that
makes one of the interval limits equal to 0

If \(μ_f = μ_m\) (that is, \(μ=0\)), what is the probability that we
observe a difference \((\bar{X}_f - \bar{X}_m)\) at least as far from 0 as
\(y\)?
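
A small sketch of the smallest \(α\): an interval limit is 0 when \(|y| = k_u⋅σ\), which gives \(α = 2⋅(1 - Φ(|y|/σ))\), where \(Φ\) is the standard Normal cumulative distribution function. The values of `y` and `sigma` below are hypothetical.

```python
from statistics import NormalDist

y = 3.0       # hypothetical observed difference
sigma = 1.5   # hypothetical standard deviation of the difference

std_normal = NormalDist()

# Smallest alpha: the one that puts an interval limit exactly at 0
alpha_min = 2 * (1 - std_normal.cdf(abs(y) / sigma))

# Check: rebuild the interval with alpha_min and look at its limits
k_u = std_normal.inv_cdf(1 - alpha_min / 2)
k_l = -k_u
low, high = y + k_l * sigma, y + k_u * sigma

print("smallest alpha:", round(alpha_min, 4))
print("interval:", (round(low, 6), round(high, 6)))
```

Since `y` is positive here, it is the lower limit that lands on 0.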

## Hypothesis declaration

There is a standard framework for these questions

We start by defining what we want to test, and what the
alternative is \[\begin{aligned}
H_0:&μ_f = μ_m\\
H_a:&μ_f ≠ μ_m\\
\end{aligned}\]

Here \(H_0\) is called the *null
hypothesis* and \(H_a\) is the
*alternative hypothesis*

## Hypothesis test

Basically we want to know the probability of observing a difference \(|\bar{X}_f - \bar{X}_m|\) equal to \(|y|\) *or more*, assuming \(H_0\)

In this case we want to calculate \[ℙ(|\bar{X}_f- \bar{X}_m| ≥ |y| \mid μ = 0, σ
)?\]
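
A minimal sketch of this calculation, assuming \(σ\) is known. The function name `two_sided_p_value` and the value of `sigma` are hypothetical, introduced only for illustration.

```python
from statistics import NormalDist

def two_sided_p_value(y, sigma):
    """P(|D| >= |y|) when D ~ Normal(0, sigma), that is, assuming H_0."""
    null_dist = NormalDist(mu=0.0, sigma=sigma)
    lower_tail = null_dist.cdf(-abs(y))        # P(D <= -|y|)
    upper_tail = 1 - null_dist.cdf(abs(y))     # P(D >= |y|)
    return lower_tail + upper_tail

y = 1.203      # 34.763 - 33.560, used here as if it were a sample difference
sigma = 0.8    # hypothetical standard deviation of the difference
p_two = two_sided_p_value(y, sigma)
print("two-sided probability:", round(p_two, 4))
```

Both tails are counted because \(H_a\) does not say which group is older.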

This is called a *two-sided test*

## One-sided test

If we declare our test as \[\begin{aligned}
H_0:&μ_f = μ_m\\
H_a:&μ_f > μ_m\\
\end{aligned}\]

Then the question is

\[ℙ(\bar{X}_f - \bar{X}_m ≥ y| μ = 0, σ
)?\]
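
The one-sided version only counts the upper tail. As before, this is a sketch that assumes \(σ\) is known. The function name `one_sided_p_value` and the value of `sigma` are hypothetical.

```python
from statistics import NormalDist

def one_sided_p_value(y, sigma):
    """P(D >= y) when D ~ Normal(0, sigma), that is, assuming H_0."""
    null_dist = NormalDist(mu=0.0, sigma=sigma)
    return 1 - null_dist.cdf(y)

y = 1.203      # 34.763 - 33.560, used here as if it were a sample difference
sigma = 0.8    # hypothetical standard deviation of the difference
p_one = one_sided_p_value(y, sigma)
print("one-sided probability:", round(p_one, 4))
```

For a positive \(y\) this is half of the two-sided probability, since only one of the two symmetric tails is counted.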