23 September 2016


Introduction to Probabilities

  • A platonic ideal of data
  • Random variables as representations of infinite sets
  • Distributions as descriptions of random variables

A vector of infinite size

  • Now imagine that we throw the die an infinite number of times
  • Instead of a vector \(\mathbf{y}\) we have a random variable \(Y\)
  • The size \(n\) goes to infinity, but \(p(y)\) remains bounded
  • The function \(p(y)\) is now called the probability distribution
  • As before, all information about \(Y\) is contained in \(p(y)\)


  • Each random process has a corresponding distribution
  • Usually the distribution is a formula
    • The formula depends on the mechanics of the process
  • It also depends on some parameters (constants)
    • The parameters depend on the specific conditions

Examples: Coin

  • Has 2 sides: Head and Tail \[p(\mathrm{Head}) = 1 - p(\mathrm{Tail})\] This is the mechanics of the coin
  • If the coin is symmetric, then \[p(\mathrm{Head}) = p(\mathrm{Tail}) = 0.5\] This parameter depends on the specific coin
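The mechanics and the parameter can be separated in a short simulation; a minimal sketch, assuming a symmetric coin with `p_head = 0.5` (the helper names are illustrative, not from the lecture):

```python
import random

random.seed(42)  # fixed seed so the run is reproducible

p_head = 0.5  # parameter of this specific (symmetric) coin

# Mechanics: each throw is Head with probability p_head, Tail otherwise
n = 100_000
heads = sum(random.random() < p_head for _ in range(n))

# The relative frequency of Head approaches p(Head) as n grows
print(heads / n)
```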

Important parameters

Expected value: equivalent to the mean of the infinite set \[E(Y) = \sum_y y \cdot p(y)\] Variance: equivalent to the mean squared deviation from the expected value \[V(Y) = \sum_y (y-E(Y))^2\cdot p(y)\]
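As a minimal check of the two formulas above, assuming a fair six-sided die (so \(p(y)=1/6\) for \(y=1,\dots,6\); this example is not from the lecture):

```python
# Fair die: p(y) = 1/6 for y in 1..6
p = {y: 1 / 6 for y in range(1, 7)}

# E(Y) = sum_y y * p(y)
E = sum(y * p[y] for y in p)

# V(Y) = sum_y (y - E(Y))^2 * p(y)
V = sum((y - E) ** 2 * p[y] for y in p)

print(E, V)  # E(Y) = 3.5, V(Y) = 35/12 ≈ 2.9167
```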

Important case: Normal distribution

  • The bell-shaped Gaussian distribution is the most common in experiments
  • It arises in many experimental conditions
  • In particular for additive experimental noise
  • There are some important exceptions

Statistical Inference

How to observe the “ideals”

Statistical inference

  • How to estimate the parameters of a distribution
  • How to determine confidence intervals

Sampling the “infinite set”

We perform an experiment

  • We cannot observe the whole random variable
  • We get only a few values (\(n\)) in a sample \(\mathbf y\)
  • We assume that the sample is a random subset of the values of the random variable
  • If we repeat the experiment we will get different values

What can we learn from the sample?

Estimating the Expected Value

Fortunately, each \(y_i\) has expected value \(E(Y)\)

But each observed value is probably not exactly equal to it

Chebyshev’s inequality: \[\Pr\left(|y_i-E(Y)|\geq \sqrt{V(Y)} \cdot k\right)\leq \frac{1}{k^2}\] The probability that \(E(Y)\) is outside \(y_i\pm \sqrt{V(Y)} \cdot k\) is at most \(1/k^2\)
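Chebyshev’s bound can be verified exactly for a small discrete case; the fair die below is an illustrative assumption, not part of the lecture:

```python
import math

# Fair die: p(y) = 1/6 for y in 1..6
p = {y: 1 / 6 for y in range(1, 7)}
E = sum(y * p[y] for y in p)                             # 3.5
sigma = math.sqrt(sum((y - E) ** 2 * p[y] for y in p))   # sqrt(V(Y))

# Exact probability that a single observation deviates by at least
# k * sigma, compared with the Chebyshev bound 1/k^2
for k in (1.0, 1.2, 1.5):
    prob = sum(p[y] for y in p if abs(y - E) >= k * sigma)
    assert prob <= 1 / k**2   # the bound always holds
    print(k, prob, 1 / k**2)
```

The bound is loose: for the die the actual deviation probabilities are well below \(1/k^2\).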

Using averages

\[E(\bar{\mathbf{y}})=E(Y)\] \[V(\bar{\mathbf{y}})=\frac{V(Y)}{n}\] \[\Pr\left(|\bar{\mathbf{y}}-E(Y)|\geq \sqrt{V(Y)} \cdot \frac{k}{\sqrt{n}}\right)\leq \frac{1}{k^2}\] The probability that \(E(Y)\) is outside \[\bar{\mathbf{y}}\pm \sqrt{V(Y)} \cdot \frac{k}{\sqrt{n}}\] is at most \(1/k^2\)


  \(k\)    \(1/k^2\)
  1        1
  1.414    0.5
  3        0.1111
  4        0.0625
  10       0.01
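A quick simulation (die throwing again, an illustrative choice not taken from the lecture) shows the \(\sqrt{n}\) shrinking of the spread of the averages:

```python
import random
import statistics

random.seed(1)  # reproducible run

n = 100        # sample size per experiment
reps = 2_000   # number of repeated experiments

# Sample means of n die throws; their spread shrinks like sqrt(V(Y)/n)
means = [statistics.fmean(random.randint(1, 6) for _ in range(n))
         for _ in range(reps)]

sd_single = (35 / 12) ** 0.5        # sqrt(V(Y)) for a fair die
sd_mean = statistics.stdev(means)   # observed spread of the averages

# sd_mean should be close to sd_single / sqrt(n)
print(sd_single, sd_mean, sd_single / n ** 0.5)
```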

Normal distribution case

Here the intervals are much narrower

But we need to know \(V(Y)\)

  \(k\)    \(\Pr(Z \geq k)\)
  1        0.1587
  1.414    0.07865
  3        0.00135
  4        3.167e-05
  10       7.62e-24
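The normal tail probabilities above can be reproduced with the standard normal survival function, \(\Pr(Z\geq k)=\tfrac12\,\mathrm{erfc}(k/\sqrt{2})\), using only the standard library:

```python
import math

def normal_tail(k):
    """Pr(Z >= k) for a standard normal Z."""
    return 0.5 * math.erfc(k / math.sqrt(2))

# Reproduce the table values
for k in (1, 1.414, 3, 4, 10):
    print(k, normal_tail(k))
```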

But we don’t know \(V(Y)\)

We find that \[E(\mathrm{S}_n(\mathbf{y})) = \frac{n-1}{n}V(Y) \quad\text{where}\quad \mathrm{S}_n(\mathbf{y})=\frac{1}{n}\sum_i (y_i-\bar{\mathbf{y}})^2\] so the sample variance is a biased estimator of \(V(Y)\)

Instead we use the unbiased (Bessel-corrected) sample variance \[\mathrm{S}_{n-1}(\mathbf{y})=\frac{1}{n-1}\sum_i (y_i-\bar{\mathbf{y}})^2\]
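The two estimators correspond to `statistics.pvariance` (divide by \(n\)) and `statistics.variance` (divide by \(n-1\)) in Python; a small hypothetical sample illustrates the difference:

```python
import statistics

y = [1, 2, 3, 4]          # a small illustrative sample
ybar = statistics.fmean(y)

# Biased estimator S_n: divide by n
s_n = sum((yi - ybar) ** 2 for yi in y) / len(y)

# Unbiased estimator S_{n-1}: divide by n - 1 (Bessel's correction)
s_n1 = sum((yi - ybar) ** 2 for yi in y) / (len(y) - 1)

assert abs(s_n - statistics.pvariance(y)) < 1e-12
assert abs(s_n1 - statistics.variance(y)) < 1e-12
print(s_n, s_n1)  # 1.25 and about 1.667
```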

Student’s t-distribution