23 September 2016

## Introduction to Probabilities

• A platonic ideal of data
• Random variables as representations of infinite sets
• Distributions as descriptions of random variables

## A vector of infinite size

• Now we imagine that we throw the die infinitely many times
• Instead of a vector $$\mathrm{y}$$ we have a random variable $$Y$$
• The size $$n$$ goes to infinity, but $$p(y)$$ remains bounded
• Now the function $$p(y)$$ is called a probability distribution
• As before, all information about $$Y$$ is contained in $$p(y)$$
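We cannot throw a die infinitely many times, but a quick simulation (a sketch, not part of the lecture) shows the empirical frequencies converging toward $p(y) = 1/6$:

```python
import random

random.seed(0)

# A fair die has p(y) = 1/6 for each face y in 1..6.
# With many throws the empirical frequency of each face approaches p(y).
n = 100_000
throws = [random.randint(1, 6) for _ in range(n)]
freq = {y: throws.count(y) / n for y in range(1, 7)}

for y in range(1, 7):
    print(f"p({y}) ≈ {freq[y]:.4f}  (exact: {1/6:.4f})")
```

As $n$ grows, the frequencies stabilize, which is the sense in which $p(y)$ describes the "infinite vector".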

## Distributions

• Each random process has a corresponding distribution
• Usually the distribution is a formula
• The formula depends on the mechanics of the process
• It also depends on some parameters (constants)
• The parameters depend on the specific conditions

## Examples: Coin

• Has 2 sides: Head and Tail
• $p(\mathrm{Head}) = 1 - p(\mathrm{Tail})$: this is the mechanics of the coin
• If the coin is symmetric, then $p(\mathrm{Head}) = p(\mathrm{Tail}) = 0.5$: this parameter depends on the specific coin
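A minimal simulation of the coin (an illustrative sketch; the value of `p_head` is the parameter, the constraint $p(\mathrm{Head}) + p(\mathrm{Tail}) = 1$ is the mechanics):

```python
import random

random.seed(1)

# Parameter for a symmetric coin; a bent coin would have a different value,
# but the mechanics p(Head) + p(Tail) = 1 always holds.
p_head = 0.5
n = 50_000
flips = [random.random() < p_head for _ in range(n)]
est_head = sum(flips) / n

print(f"p(Head) ≈ {est_head:.3f}, p(Tail) ≈ {1 - est_head:.3f}")
```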

## Important parameters

Expected value: equivalent to the mean of the infinite set
$E(Y) = \sum_y y \cdot p(y)$

Variance: equivalent to the mean squared deviation from $E(Y)$
$V(Y) = \sum_y (y-E(Y))^2\cdot p(y)$
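The two definitions above can be evaluated exactly for the fair die (a worked example using exact fractions):

```python
from fractions import Fraction

# E(Y) = sum_y y * p(y),  V(Y) = sum_y (y - E(Y))^2 * p(y)
# for a fair six-sided die, p(y) = 1/6 for y in 1..6.
p = {y: Fraction(1, 6) for y in range(1, 7)}

E = sum(y * p[y] for y in p)             # exact: 7/2
V = sum((y - E) ** 2 * p[y] for y in p)  # exact: 35/12

print(f"E(Y) = {E} = {float(E)}")
print(f"V(Y) = {V} ≈ {float(V):.4f}")
```

Using `Fraction` keeps the arithmetic exact: $E(Y) = 7/2$ and $V(Y) = 35/12 \approx 2.917$.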

## Important case: Normal distribution

• The bell-shaped Gaussian distribution is the most common in experiments
• It arises in many experimental conditions
• In particular for additive experimental noise
• There are some important exceptions
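Why additive noise tends to be Gaussian can be sketched with the central limit theorem: the sum of many small independent contributions is approximately normal, whatever the shape of each contribution (here uniform, purely for illustration):

```python
import random
import statistics

random.seed(2)

# Sum of 30 independent Uniform(-1, 1) terms: each term is far from
# bell-shaped, but the sum is approximately Normal(0, 30 * 1/3 = 10).
def noisy_sum(m=30):
    return sum(random.uniform(-1, 1) for _ in range(m))

samples = [noisy_sum() for _ in range(20_000)]

print(f"mean ≈ {statistics.fmean(samples):.3f} (expected 0)")
print(f"variance ≈ {statistics.variance(samples):.3f} (expected 10.0)")
```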

## Statistical inference

• How to estimate the parameters of a distribution
• How to determine confidence intervals

## Sampling the “infinite set”

We perform an experiment

• We cannot measure all values of the random variable
• We get only a few values ($$n$$) in a sample $$\mathbf y$$
• We assume that the sample is a subset of the random variable
• If we repeat the experiment we will have different values

What can we learn from the sample?
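A small simulation of "repeating the experiment" (an illustrative sketch; the distribution and its parameters are made up for the example):

```python
import random
import statistics

random.seed(3)

# Each repetition draws a fresh sample of n values from the same
# random variable; the samples, and hence their means, differ.
def experiment(n=20):
    return [random.gauss(5.0, 2.0) for _ in range(n)]  # E(Y)=5, V(Y)=4

means = [statistics.fmean(experiment()) for _ in range(5)]
print("sample means from 5 repetitions:", [round(m, 3) for m in means])
```

Every repetition gives a different sample mean, yet all hover around $E(Y)$; quantifying that scatter is the point of the next slides.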

## Estimating the Expected Value

Fortunately, each $$y_i$$ has expected value $$E(Y)$$

But it is probably not exactly equal to it

Chebyshev’s inequality: $\Pr\left(|y_i-E(Y)|\geq k\sqrt{V(Y)}\right)\leq \frac{1}{k^2}$ The probability that $$E(Y)$$ is outside $$y_i\pm k\sqrt{V(Y)}$$ is at most $$1/k^2$$
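Chebyshev’s inequality can be checked empirically for the fair die, where $E(Y)=3.5$ and $V(Y)=35/12$ (a sketch; the choice $k=1.2$ is arbitrary):

```python
import random
import math

random.seed(4)

# Check: Pr(|y_i - E(Y)| >= k * sqrt(V(Y))) <= 1/k^2
# for Y uniform on 1..6, so E(Y) = 3.5 and V(Y) = 35/12.
E, V = 3.5, 35 / 12
k = 1.2
n = 100_000
hits = sum(abs(random.randint(1, 6) - E) >= k * math.sqrt(V)
           for _ in range(n))

print(f"empirical Pr ≈ {hits / n:.4f}, Chebyshev bound = {1 / k**2:.4f}")
```

With $k = 1.2$ the threshold is about $2.05$, so only the faces 1 and 6 qualify: the true probability is $1/3$, comfortably below the bound $1/1.44 \approx 0.69$. Chebyshev is valid for any distribution, which is why it is loose.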

## Using averages

$E(\bar{\mathbf{y}})=E(Y)$

$V(\bar{\mathbf{y}})=\frac{V(Y)}{n}$

$\Pr\left(|\bar{\mathbf{y}}-E(Y)|\geq \frac{k}{\sqrt{n}}\sqrt{V(Y)}\right)\leq \frac{1}{k^2}$
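The shrinkage $V(\bar{\mathbf{y}}) = V(Y)/n$ is easy to verify by simulation (again with the fair die, $V(Y) = 35/12$):

```python
import random
import statistics

random.seed(5)

# The variance of the sample mean shrinks as V(Y)/n:
# averaging n throws of a fair die, repeated many times.
def sample_mean(n):
    return statistics.fmean(random.randint(1, 6) for _ in range(n))

for n in (1, 10, 100):
    means = [sample_mean(n) for _ in range(5_000)]
    print(f"n={n:3d}: V(mean) ≈ {statistics.variance(means):.4f} "
          f"(theory: {35/12/n:.4f})")
```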

## Meaning

$\Pr\left(|\bar{\mathbf{y}}-E(Y)|\geq \frac{k}{\sqrt{n}}\sqrt{V(Y)}\right)\leq \frac{1}{k^2}$ The probability that $$E(Y)$$ is outside $\bar{\mathbf{y}}\pm \frac{k}{\sqrt{n}}\sqrt{V(Y)}$ is at most $$1/k^2$$

| $k$ | $1/k^2$ |
|-----|---------|
| 1 | 1 |
| 1.414 | 0.5 |
| 3 | 0.1111 |
| 4 | 0.0625 |
| 10 | 0.01 |

## Normal distribution case

Here the intervals are narrower

But we need to know $$V(Y)$$

| $k$ | $1-\Phi(k)$ |
|-----|-------------|
| 1 | 0.1587 |
| 1.414 | 0.07865 |
| 3 | 0.00135 |
| 4 | 3.167e-05 |
| 10 | 7.62e-24 |
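These one-sided normal tail probabilities can be reproduced with the complementary error function, $1-\Phi(k) = \tfrac{1}{2}\,\mathrm{erfc}(k/\sqrt{2})$, and compared against the much looser Chebyshev bound:

```python
import math

# One-sided tail of a standard normal: 1 - Phi(k) = 0.5 * erfc(k / sqrt(2)).
# Compare with the distribution-free Chebyshev bound 1/k^2.
tails = {k: 0.5 * math.erfc(k / math.sqrt(2)) for k in (1, 1.414, 3, 4, 10)}
for k, tail in tails.items():
    print(f"k={k:6}: normal tail = {tail:.4g}, Chebyshev bound = {1/k**2:.4g}")
```

Already at $k=3$ the normal tail ($0.00135$) is nearly two orders of magnitude below Chebyshev’s $0.111$.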

## But we don’t know $$V(Y)$$

We find that $E(\mathrm{S}_n(\bar{y}, \mathbf{y})) = \frac{n-1}{n}V(Y)$, so the sample variance computed with a divisor of $n$ is biased

Instead we use the unbiased sample variance $\mathrm{S}_{n-1}(\mathbf{y})=\frac{1}{n-1}\sum_i (y_i-\bar{\mathbf{y}})^2$
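The bias factor $\frac{n-1}{n}$ shows up directly in simulation (a sketch; the normal distribution and its parameters are chosen arbitrarily, with $V(Y)=4$ and $n=5$ so the bias is visible):

```python
import random
import statistics

random.seed(6)

# Compare the biased (divide by n) and unbiased (divide by n-1)
# variance estimators over many small samples from Normal(0, 2), V(Y) = 4.
n, trials = 5, 50_000
biased, unbiased = [], []
for _ in range(trials):
    y = [random.gauss(0, 2) for _ in range(n)]
    m = statistics.fmean(y)
    ss = sum((v - m) ** 2 for v in y)
    biased.append(ss / n)          # S_n:    expectation (n-1)/n * V(Y)
    unbiased.append(ss / (n - 1))  # S_(n-1): expectation V(Y)

print(f"E(S_n)     ≈ {statistics.fmean(biased):.3f}  (theory: {4*(n-1)/n:.3f})")
print(f"E(S_(n-1)) ≈ {statistics.fmean(unbiased):.3f}  (theory: 4.000)")
```

With $n=5$ the biased estimator averages about $\frac{4}{5}\cdot 4 = 3.2$, while the $n-1$ divisor recovers $V(Y)=4$.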