23 September 2016

Probabilities

Introduction to Probabilities

  • A Platonic ideal of data
  • Random variables as representations of infinite sets
  • Distributions as descriptions of random variables

A vector of infinite size

  • Now we imagine that we throw the die an infinite number of times
  • Instead of a vector \(\mathbf{y}\) we have a random variable \(Y\)
  • The size \(n\) goes to infinity, but \(p(y)\) remains bounded
  • The function \(p(y)\) is now called a probability distribution
  • As before, all information about \(Y\) is contained in \(p(y)\)
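
A minimal sketch of this limit (assuming a fair six-sided die, simulated in Python): as the number of throws grows, the relative frequency of each face settles around \(p(y) = 1/6\).

    import random
    from collections import Counter

    random.seed(0)

    # Throw a fair six-sided die n times and compare the relative
    # frequency of each face with the ideal p(y) = 1/6 ~ 0.1667.
    for n in (100, 10_000, 1_000_000):
        counts = Counter(random.randint(1, 6) for _ in range(n))
        print(n, {y: round(counts[y] / n, 4) for y in range(1, 7)})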

Distributions

  • Each random process has a corresponding distribution
  • Usually the distribution is given by a formula
    • The formula depends on the mechanics of the process
  • It also depends on some parameters (constants)
    • The parameters depend on the specific conditions

Examples: Coin

  • A coin has 2 sides, Head and Tail: \[p(\mathrm{Head}) = 1 - p(\mathrm{Tail})\] This is the mechanics of the coin
  • If the coin is symmetric then \[p(\mathrm{Head}) = p(\mathrm{Tail}) = 0.5\] This parameter depends on the specific coin
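
A minimal sketch separating the two ideas (the function throw_coin and the parameter p_head are our names): the mechanics stay fixed, while the parameter varies from coin to coin.

    import random

    random.seed(0)

    def throw_coin(p_head):
        # Mechanics: only two outcomes, p(Tail) = 1 - p(Head).
        return "Head" if random.random() < p_head else "Tail"

    # Parameter: a symmetric coin uses p_head = 0.5; a bent coin
    # would use another constant, under the same mechanics.
    print([throw_coin(0.5) for _ in range(10)])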

Important parameters

Expected value: equivalent to the mean of the infinite set \[E(Y) = \sum_y y \cdot p(y)\] Variance: equivalent to the mean squared deviation from the expected value \[V(Y) = \sum_y (y-E(Y))^2\cdot p(y)\]
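
As a worked example (a fair six-sided die, chosen here for concreteness), both definitions can be evaluated directly:

    # E(Y) and V(Y) for a fair six-sided die, evaluated directly
    # from the two definitions above.
    p = {y: 1 / 6 for y in range(1, 7)}
    E = sum(y * py for y, py in p.items())             # 3.5
    V = sum((y - E) ** 2 * py for y, py in p.items())  # 35/12 ~ 2.9167
    print(E, V)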

Important case: Normal distribution

  • The bell-shaped Gaussian distribution is the most common in experiments
  • It arises in many experimental conditions
  • In particular for additive experimental noise, where many small independent effects sum together (central limit theorem; see the sketch after this list)
  • There are some important exceptions
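
One way to see why (a sketch of the central limit theorem at work; the uniform noise terms are an arbitrary choice):

    import random

    random.seed(0)

    # Each simulated measurement is a sum of many small independent
    # noise terms; the central limit theorem makes the sum nearly normal.
    def noisy_measurement(n_terms=100):
        return sum(random.uniform(-1, 1) for _ in range(n_terms))

    values = [noisy_measurement() for _ in range(10_000)]
    mean = sum(values) / len(values)
    var = sum((v - mean) ** 2 for v in values) / len(values)
    print(round(mean, 2), round(var, 2))  # mean ~ 0, variance ~ 100/3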

Statistical Inference

How to observe the “ideals”

Statistical inference

  • How to estimate the parameters of a distribution
  • How to determine confidence intervals

Sampling the “infinite set”

We perform an experiment

  • We cannot measure the entire random variable
  • We get only a few values (\(n\)) in a sample \(\mathbf y\)
  • We assume that the sample is a subset of the random variable
  • If we repeat the experiment we will get different values
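
A small simulation of this (the “experiment” here draws \(n=10\) throws of a fair die): each repetition produces a different sample and a different sample mean.

    import random

    random.seed(0)

    # The same experiment repeated three times: each run draws a new
    # sample of n values from the same random variable Y.
    n = 10
    for repetition in range(3):
        y = [random.randint(1, 6) for _ in range(n)]
        print(y, "mean =", sum(y) / n)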

What can we learn from the sample?

Estimating the Expected Value

Fortunately, each \(y_i\) has expected value \(E(Y)\)

But each observed value is probably not exactly equal to it

Chebyshev’s inequality: \[\Pr\left(|y_i-E(Y)|\geq \sqrt{V(Y)} \cdot k\right)\leq \frac{1}{k^2}\] The probability that \(E(Y)\) is outside \(y_i\pm \sqrt{V(Y)} \cdot k\) is at most \(1/k^2\)

Using averages

\[E(\bar{\mathbf{y}})=E(Y)\] \[V(\bar{\mathbf{y}})=\frac{V(Y)}{n}\] \[\Pr\left(|\bar{\mathbf{y}}-E(Y)|\geq \sqrt{V(Y)} \cdot \frac{k}{\sqrt{n}}\right)\leq \frac{1}{k^2}\]

Meaning

\[\Pr\left(|\bar{\mathbf{y}}-E(Y)|\geq \sqrt{V(Y)} \cdot \frac{k}{\sqrt{n}}\right)\leq \frac{1}{k^2}\] The probability that \(E(Y)\) is outside \[\bar{\mathbf{y}}\pm \sqrt{V(Y)} \cdot \frac{k}{\sqrt{n}}\] is at most \(1/k^2\)
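
A numerical check (simulating fair-die throws; all names and the choices of \(n\) and \(k\) are ours): the observed fraction of sample means outside the interval stays below the \(1/k^2\) bound.

    import random

    random.seed(0)

    # Empirical check of the bound for the mean of n fair-die throws:
    # E(Y) = 3.5, V(Y) = 35/12, so the threshold is k*sqrt(V(Y)/n).
    E, V, n, k = 3.5, 35 / 12, 25, 2.0
    threshold = k * (V / n) ** 0.5

    repeats, outside = 100_000, 0
    for _ in range(repeats):
        ybar = sum(random.randint(1, 6) for _ in range(n)) / n
        if abs(ybar - E) >= threshold:
            outside += 1

    print(outside / repeats, "<=", 1 / k ** 2)  # observed vs. bound 0.25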

Numbers

k      \(1/k^2\)
1      1
1.414  0.5
3      0.1111
4      0.0625
10     0.01

Normal distribution case

Here the intervals are narrower

But we need to know \(V(Y)\)

k      \(\Pr(Z \geq k)\)
1      0.1587
1.414  0.07865
3      0.00135
4      3.167e-05
10     7.62e-24
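
Both columns of numbers can be reproduced (a sketch assuming scipy is available; norm.sf(k) is the one-sided Gaussian tail \(\Pr(Z \geq k)\)):

    from scipy.stats import norm

    # Chebyshev bound vs. one-sided Gaussian tail for the same k.
    for k in (1, 1.414, 3, 4, 10):
        print(k, 1 / k ** 2, norm.sf(k))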

But we don’t know \(V(Y)\)

We find that \[E(\mathrm{S}_n(\mathbf{y})) = \frac{n-1}{n}V(Y), \quad\text{where } \mathrm{S}_n(\mathbf{y})=\frac{1}{n}\sum_i (y_i-\bar{\mathbf{y}})^2,\] so the naive sample variance is biased: on average it underestimates \(V(Y)\)

Instead we use the corrected (unbiased) sample variance, which estimates the population variance \(V(Y)\): \[\mathrm{S}_{n-1}(\mathbf{y})=\frac{1}{n-1}\sum_i (y_i-\bar{\mathbf{y}})^2\]
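
A simulation of this bias (fair die again; variable names are ours): dividing by \(n\) underestimates \(V(Y)\) on average, while dividing by \(n-1\) does not.

    import random

    random.seed(0)

    # Average both estimators over many small samples. S_n (divide by n)
    # comes out low by the factor (n-1)/n; S_{n-1} matches V(Y) = 35/12.
    n, repeats = 5, 100_000
    sn = sn1 = 0.0
    for _ in range(repeats):
        y = [random.randint(1, 6) for _ in range(n)]
        ybar = sum(y) / n
        ss = sum((yi - ybar) ** 2 for yi in y)
        sn += ss / n
        sn1 += ss / (n - 1)
    print(sn / repeats, sn1 / repeats)  # ~2.33 (biased) vs. ~2.92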

Student’s t-distribution

When \(V(Y)\) is replaced by the estimate \(\mathrm{S}_{n-1}(\mathbf{y})\), the standardized mean \((\bar{\mathbf{y}}-E(Y))/\sqrt{\mathrm{S}_{n-1}(\mathbf{y})/n}\) follows, for normally distributed \(Y\), Student’s t-distribution with \(n-1\) degrees of freedom, giving slightly wider confidence intervals
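
A minimal sketch of the resulting confidence interval (using scipy.stats.t; the sample values are made up):

    from statistics import mean, stdev  # stdev already divides by n-1
    from scipy.stats import t

    # 95% confidence interval for E(Y) with V(Y) estimated from the
    # sample: Student's t quantile with n-1 degrees of freedom.
    y = [4.1, 3.8, 4.4, 3.9, 4.0, 4.3]
    n = len(y)
    half_width = t.ppf(0.975, df=n - 1) * stdev(y) / n ** 0.5
    print(mean(y) - half_width, mean(y) + half_width)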

Thanks