23 September 2016

- A platonic ideal of data
- *Random variables* as representations of infinite sets
- *Distributions* as descriptions of random variables

- Now we imagine that we throw the dice infinitely many times
- Instead of a vector \(\mathbf{y}\) we have a *random variable* \(Y\)
- The size \(n\) goes to infinity, but \(p(y)\) remains bounded
- Now the function \(p(y)\) is called a *probability distribution*
- As before, all information about \(Y\) is contained in \(p(y)\)

- Each random process has a corresponding distribution
- Usually the distribution is a formula
- The formula depends on the *mechanics* of the process
- It also depends on some *parameters* (constants)
- The parameters depend on the specific conditions

- A coin has 2 sides: *Head* and *Tail* \[p(\mathrm{Head}) = 1 - p(\mathrm{Tail})\] This is the mechanics of the coin
- If the coin is symmetric then \[p(\mathrm{Head}) = p(\mathrm{Tail}) = 0.5\] This parameter depends on the specific coin

**Expected value:** equivalent to the *mean* of the infinite set \[E(Y) = \sum_y y \cdot p(y)\] **Variance:** equivalent to the *mean square error* \[V(Y) = \sum_y (y-E(Y))^2\cdot p(y)\]
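These two sums are easy to evaluate for a concrete random process. A minimal sketch, using a fair six-sided die as the illustrative example:

```python
# Expected value and variance of a discrete random variable,
# illustrated with a fair die: p(y) = 1/6 for y = 1..6.
p = {y: 1 / 6 for y in range(1, 7)}

E = sum(y * py for y, py in p.items())             # E(Y) = sum_y y * p(y)
V = sum((y - E) ** 2 * py for y, py in p.items())  # V(Y) = sum_y (y-E)^2 * p(y)

print(E)  # 3.5 (up to float rounding)
print(V)  # 35/12 ≈ 2.9167 (up to float rounding)
```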

- The bell-shaped Gaussian distribution is the most common in experiments
- It arises in many experimental conditions
- In particular for additive experimental noise
- There are some important exceptions
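The additive-noise case can be seen in a short simulation: summing many small independent noise terms (here uniform, a stand-in assumption for any bounded noise source) produces an approximately Gaussian total, per the central limit theorem.

```python
import random
import statistics

random.seed(0)

# Each "measurement" is the sum of 12 small independent uniform noise terms.
# The sum is approximately Gaussian with mean 12 * 0.5 = 6
# and variance 12 * (1/12) = 1.
samples = [sum(random.random() for _ in range(12)) for _ in range(20000)]

m = statistics.mean(samples)
v = statistics.variance(samples)
print(m, v)  # mean close to 6, variance close to 1
```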

- How to estimate the parameters of a distribution
- How to determine confidence intervals

We make an experiment

- We cannot measure all values of the random variable
- We get only a few values (\(n\)) in a *sample* \(\mathbf{y}\)
- We *assume* that the sample is a *subset* of the random variable
- If we repeat the experiment we will get different values
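A sketch of this idea, again with the die as the assumed experiment: each repetition draws a fresh sample of \(n = 5\) values, and the values differ from run to run.

```python
import random

random.seed(1)

# A hypothetical experiment: roll a fair die n = 5 times.
# Each repetition yields a different sample y.
faces = range(1, 7)
for trial in range(3):
    y = random.choices(faces, k=5)  # sample with replacement
    print(y)
```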

What can we learn from the sample?

Fortunately, each \(y_i\) has expected value \(E(Y)\)

But any single \(y_i\) is probably not exactly equal to it

Chebyshev’s inequality: \[\Pr\left(|y_i-E(Y)|\geq k\sqrt{V(Y)}\right)\leq \frac{1}{k^2}\] The probability that \(E(Y)\) is outside \(y_i\pm k\sqrt{V(Y)}\) is at most \(1/k^2\)

\[E(\bar{\mathbf{y}})=E(Y)\] \[V(\bar{\mathbf{y}})=\frac{V(Y)}{n}\] \[\Pr\left(|\bar{\mathbf{y}}-E(Y)|\geq \sqrt{V(Y)} \cdot \frac{k}{\sqrt{n}}\right)\leq \frac{1}{k^2}\] The probability that \(E(Y)\) is outside \[\bar{\mathbf{y}}\pm \sqrt{V(Y)} \cdot \frac{k}{\sqrt{n}}\] is at most \(1/k^2\)
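The bound can be checked empirically. A minimal sketch, assuming the fair-die experiment with \(n = 20\) rolls and \(k = 2\): the observed frequency of the mean falling outside the interval should not exceed \(1/k^2 = 0.25\).

```python
import random
import statistics

random.seed(2)

# Empirical check of the Chebyshev bound for the sample mean of a fair die:
# E(Y) = 3.5, V(Y) = 35/12, n = 20, k = 2  ->  bound 1/k^2 = 0.25.
EY, VY, n, k = 3.5, 35 / 12, 20, 2
threshold = (VY ** 0.5) * k / n ** 0.5

trials = 10000
misses = 0
for _ in range(trials):
    ybar = statistics.mean(random.choices(range(1, 7), k=n))
    if abs(ybar - EY) >= threshold:
        misses += 1

freq = misses / trials
print(freq)  # well below the Chebyshev bound 0.25
```

Chebyshev is distribution-free, so the bound is loose; the observed frequency is much smaller than 0.25.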

k | \(1/k^2\)
---|---
1 | 1
1.414 | 0.5
3 | 0.1111
4 | 0.0625
10 | 0.01

For a Gaussian distribution the intervals are narrower

But we need to know \(V(Y)\)

k | \(\Pr(Z \geq k)\)
---|---
1 | 0.1587
1.414 | 0.07865
3 | 0.00135
4 | 3.167e-05
10 | 7.62e-24
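These entries are Gaussian tail probabilities \(\Pr(Z \geq k) = 1 - \Phi(k)\), and they can be reproduced with the standard library's complementary error function:

```python
import math

# One-sided Gaussian tail probability Pr(Z >= k) = 1 - Phi(k),
# computed via the complementary error function.
def gauss_tail(k):
    return 0.5 * math.erfc(k / math.sqrt(2))

for k in (1, 1.414, 3, 4, 10):
    print(k, gauss_tail(k), 1 / k ** 2)  # Gaussian tail vs Chebyshev bound
```

Comparing the two columns shows how much tighter the Gaussian intervals are than the distribution-free Chebyshev bound.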

We find that \[E(\mathrm{S}_n(\bar{y}, \mathbf{y})) = \frac{n-1}{n}V(Y)\] so the *variance of the sample* is biased

Instead we use the *unbiased estimator* of the population variance (Bessel’s correction) \[\mathrm{S}_{n-1}(\mathbf{y})=\frac{1}{n-1}\sum_i (y_i-\bar{\mathbf{y}})^2\]
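The bias can be verified by simulation. A sketch, again assuming the fair die (\(V(Y) = 35/12\)): averaging the biased estimator \(\mathrm{S}_n\) (which divides by \(n\)) over many samples recovers \(\frac{n-1}{n}V(Y)\), not \(V(Y)\).

```python
import random
import statistics

random.seed(3)

# For a fair die, V(Y) = 35/12.  statistics.pvariance divides by n
# (the biased S_n); statistics.variance divides by n-1 (unbiased S_{n-1}).
n, reps = 5, 40000
VY = 35 / 12

biased = [statistics.pvariance(random.choices(range(1, 7), k=n))
          for _ in range(reps)]

mean_biased = statistics.mean(biased)
print(mean_biased)       # close to (n-1)/n * VY
print((n - 1) / n * VY)  # = 4/5 * 35/12 ≈ 2.333
```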