# Are men the same age as women, on average?

## If we know the population

In the population we can see that the average age for women is $μ_f = 34.763\text{ years}$

And for men it is $μ_m = 33.560\text{ years}$

So women are about 1.2 years older than men, on average

## But we do not know the population

This time we have a sample of men and a sample of women

We calculate $$\bar{X}_f$$ and $$\bar{X}_m$$

Each one can be modelled by a Normal distribution (why?)

Then, if the two samples are independent, $$(\bar{X}_f - \bar{X}_m)$$ also follows a Normal distribution

What are the parameters of this distribution?

## Parameters of a Normal distribution

A normal distribution is defined by two parameters

• The mean $$μ$$
• The variance $$σ^2$$

Since we deal with averages, we have \begin{aligned}\bar{X}_f &∼ N(μ_f, σ^2_f/n_f) \\ \bar{X}_m &∼ N(μ_m, σ^2_m/n_m) \end{aligned}

What are the parameters for $$(\bar{X}_f - \bar{X}_m)$$?

## Parameters of $$(\bar{X}_f - \bar{X}_m)$$

“Expected value of sum is sum of expected values” $μ=μ_f - μ_m$

“Variance of sum is sum of variances” $σ^2=\frac{σ^2_f}{n_f} + \frac{σ^2_m}{n_m}$

(how do we handle the signs?)
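One way to explore the sign question is a small simulation. For independent samples, the minus sign is squared inside the variance, so the two variances still add while the means subtract. The sketch below checks this numerically; the standard deviations and sample sizes are hypothetical values chosen only for illustration, and the ages are drawn from a Normal distribution purely for convenience.

```python
import random
import statistics

random.seed(1)

# Hypothetical parameters, chosen only for illustration (not from the slides' data)
mu_f, sigma_f, n_f = 34.8, 20.0, 50
mu_m, sigma_m, n_m = 33.6, 19.0, 40

def sample_mean(mu, sigma, n):
    """Average of n independent Normal(mu, sigma) draws."""
    return statistics.fmean(random.gauss(mu, sigma) for _ in range(n))

# Repeat the experiment many times and record the difference of the sample means
diffs = [sample_mean(mu_f, sigma_f, n_f) - sample_mean(mu_m, sigma_m, n_m)
         for _ in range(5000)]

expected_mean = mu_f - mu_m                          # means subtract
expected_var = sigma_f**2 / n_f + sigma_m**2 / n_m   # variances add

print("mean of differences:    ", statistics.fmean(diffs), "expected", expected_mean)
print("variance of differences:", statistics.variance(diffs), "expected", expected_var)
```

The simulated values should land close to the expected ones, up to random noise.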

## Confidence interval for $$(\bar{X}_f - \bar{X}_m)$$

First, define the confidence level. Call it $$(1-α)$$

Then, each tail must include $$α/2$$ of the cases

We use the inverse of the standard Normal cumulative distribution function to find $$k_l$$ and $$k_u$$

The interval for $$(\bar{X}_f - \bar{X}_m)$$ is $[μ + k_l⋅ σ, μ + k_u⋅σ ]$
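As a minimal sketch, the values $$k_l$$ and $$k_u$$ can be obtained from Python's standard library. The choice $$α = 0.05$$ is only an example.

```python
from statistics import NormalDist

alpha = 0.05  # example value; the confidence level is 1 - alpha = 95%

# Quantiles of the standard Normal that leave alpha/2 in each tail
k_l = NormalDist().inv_cdf(alpha / 2)
k_u = NormalDist().inv_cdf(1 - alpha / 2)

print(k_l, k_u)  # approximately -1.96 and 1.96
```

Because the Normal distribution is symmetric, $$k_l = -k_u$$.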

## We do not know $$μ$$. Finding it

Now we know $$y=(\bar{X}_f - \bar{X}_m)$$ and want to find $$μ$$

We build a similar interval $I=[y + k_l⋅ σ, y + k_u⋅σ ]$. Then $ℙ(μ∈I) = 1-α$
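The interval $$I$$ can be sketched in a few lines. The function name `interval_for_mu` and all the numbers are hypothetical, and the sketch assumes that $$σ_f$$ and $$σ_m$$ are known. In practice they are usually estimated from the samples, which is only an approximation.

```python
from math import sqrt
from statistics import NormalDist

def interval_for_mu(y, sigma_f, n_f, sigma_m, n_m, alpha=0.05):
    """Interval [y + k_l*sigma, y + k_u*sigma] for mu = mu_f - mu_m."""
    # Standard deviation of the difference of the two sample means
    sigma = sqrt(sigma_f**2 / n_f + sigma_m**2 / n_m)
    k_l = NormalDist().inv_cdf(alpha / 2)
    k_u = NormalDist().inv_cdf(1 - alpha / 2)
    return (y + k_l * sigma, y + k_u * sigma)

# Hypothetical observed difference and (assumed known) standard deviations
low, high = interval_for_mu(y=1.2, sigma_f=20.0, n_f=50, sigma_m=19.0, n_m=40)
print(low, high)
```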

## Are men the same age as women?

If the interval $$I$$ does not contain 0, we can be confident that $$μ_f ≠ μ_m$$

How confident? Well, $$(1-α)$$

If $$0∈I$$ then it is possible that $$μ_f = μ_m$$

We do not have enough evidence to decide

## What is the smallest $$α$$ that works?

The smallest $$α$$ is the one that makes one of the interval limits equal to 0

If $$μ_f = μ_m$$ (that is, $$μ=0$$), what is the probability that we observe a difference $$(\bar{X}_f - \bar{X}_m)$$ at least as far from 0 as $$y$$?
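A small numerical check of this idea, with hypothetical values for $$y$$ and $$σ$$: the smallest $$α$$ is the two-tail probability beyond $$|y|/σ$$, and with that $$α$$ one limit of the interval lands on 0.

```python
from statistics import NormalDist

y = 9.0      # hypothetical observed difference
sigma = 4.0  # hypothetical standard deviation of the difference

# Two-tail probability beyond |y| / sigma under the standard Normal
z = abs(y) / sigma
alpha_min = 2 * (1 - NormalDist().cdf(z))

# Rebuild the interval using alpha_min as the significance level
k_l = NormalDist().inv_cdf(alpha_min / 2)
k_u = NormalDist().inv_cdf(1 - alpha_min / 2)
low, high = y + k_l * sigma, y + k_u * sigma

print(alpha_min)  # roughly 0.024
print(low, high)  # the lower limit is 0, up to rounding error
```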

## Hypothesis declaration

There is a standard framework for these questions

We start by defining what we want to test, and what the alternative is \begin{aligned} H_0:&μ_f = μ_m\\ H_a:&μ_f ≠ μ_m\\ \end{aligned}

Here $$H_0$$ is called the null hypothesis and $$H_a$$ is the alternative hypothesis

## Hypothesis test

Basically we want to know the probability of observing a difference $$|\bar{X}_f - \bar{X}_m|$$ equal to $$|y|$$ or more, assuming $$H_0$$

In this case we want to calculate $ℙ(|\bar{X}_f- \bar{X}_m| ≥ |y| \mid μ = 0, σ )$

This is called a two-sided test

## One-sided test

If we declare our test as \begin{aligned} H_0:&μ_f = μ_m\\ H_a:&μ_f > μ_m\\ \end{aligned}

Then the question is

$ℙ(\bar{X}_f - \bar{X}_m ≥ y| μ = 0, σ )?$
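Both probabilities can be sketched with the Normal cumulative distribution function. The helper name `p_value` and the numbers are hypothetical, and $$σ$$ is assumed known.

```python
from statistics import NormalDist

def p_value(y, sigma, two_sided=True):
    """Probability, under H_0 (mu = 0), of a difference at least as extreme as y."""
    null = NormalDist(mu=0.0, sigma=sigma)
    if two_sided:
        # Both tails: P(|X_f - X_m| >= |y|)
        return 2 * (1 - null.cdf(abs(y)))
    # Upper tail only: P(X_f - X_m >= y)
    return 1 - null.cdf(y)

y, sigma = 9.0, 4.0  # hypothetical observed difference and standard deviation
p_two = p_value(y, sigma, two_sided=True)
p_one = p_value(y, sigma, two_sided=False)
print(p_two, p_one)
```

When $$y > 0$$, the one-sided probability is half of the two-sided one, so the same data give stronger evidence against $$H_0$$ if the direction was declared in advance.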