English explorer, inventor, anthropologist

(1822–1911)

- Cousin of Charles Darwin

He studied medicine and mathematics at Cambridge University.

He coined the phrase *“nature versus nurture”*

In his will, he donated funds for a professorship in genetics at University College London.

We consider each ball independently

At every *level*, the ball bounces either *left* or *right*

We represent these options as -1 and 1

At the last level, the position is **the sum** of all bounces

Larger `M`, larger variance

Now we have a coin \(X\) with two possible outcomes: +1 and -1
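The bouncing process can be simulated directly. A minimal sketch (the helper names `galton_positions` and `variance` are ours, and the sample sizes are arbitrary): each ball sums `M` independent ±1 bounces, and more levels give a larger spread of final positions.

```python
import random

def galton_positions(n_balls, levels, seed=0):
    """Final positions on a Galton board: each ball sums `levels` ±1 bounces."""
    rng = random.Random(seed)
    return [sum(rng.choice((-1, 1)) for _ in range(levels))
            for _ in range(n_balls)]

def variance(xs):
    m = sum(xs) / len(xs)
    return sum((x - m) ** 2 for x in xs) / len(xs)

# More levels M -> larger spread: for M independent ±1 bounces, variance is M.
v_small = variance(galton_positions(5000, 4))
v_large = variance(galton_positions(5000, 16))
```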

To make life easy, we assume 𝑝=0.5

What are the expected value and variance of \(X\)?

We throw the coin \(n\) times, and we calculate \(Y\), the sum of all \(X_i\) \[Y=\sum_{i=1}^{n} X_i\]

What are the expected value and variance of \(Y\)?

- \(Y\) is basically a (shifted) Binomial random variable
- \(𝔼 Y = 0,\) because \(𝔼 X = 0\)
- \(𝕍 Y = n,\) because \(𝕍 X = 1\)
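These expectations can be checked by simulation. A minimal sketch (the sample sizes are arbitrary):

```python
import random

rng = random.Random(42)
n = 100          # coin throws per experiment
trials = 10000   # number of repeated experiments

# Each Y is the sum of n independent ±1 throws of a fair coin (p = 0.5).
ys = [sum(rng.choice((-1, 1)) for _ in range(n)) for _ in range(trials)]

mean_y = sum(ys) / trials
var_y = sum((y - mean_y) ** 2 for y in ys) / trials
# Theory above: E[Y] = n*E[X] = 0 and V[Y] = n*V[X] = n = 100
```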

Now consider \(Z_n=Y/\sqrt{n}\)

It is easy to see that \(𝔼 Z_n = 0\) and \(𝕍 Z_n = 1\), for every \(n\)

The possible values of \(Z_n\) are not integers. Not even rationals

What happens with \(Z_n\) when \(n\) is *really big*?
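We can look at this numerically. A minimal sketch (the helper name `z_sample` is ours): draw many values of \(Z_n\) for a large \(n\) and check that the fraction falling in \([-1, 1]\) is close to the standard Normal value \(≈ 0.6827\).

```python
import math
import random

def z_sample(n, trials, seed=1):
    """Draw `trials` values of Z_n = (sum of n fair ±1 coins) / sqrt(n)."""
    rng = random.Random(seed)
    out = []
    for _ in range(trials):
        heads = bin(rng.getrandbits(n)).count("1")    # n fair coin flips
        out.append((2 * heads - n) / math.sqrt(n))    # sum of ±1's, rescaled
    return out

# Fraction of Z_n values inside [-1, 1]; for Normal(0,1) this is ≈ 0.6827.
frac = sum(-1 <= z <= 1 for z in z_sample(2500, 20000)) / 20000
```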

This “bell-shaped” curve is found in many experiments, especially when they involve the sum of many small contributions

- Measurement errors
- Height of a population
- Scores on University Admission exams

It is called the *Gaussian distribution*, or also the *Normal distribution*

Here *outcomes* are *real numbers*

*Any* real number is possible

Probability of any single value \(x\) is zero (!)

We look for probabilities of *intervals*
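Interval probabilities for the Normal can be computed from the error function in the standard library. A minimal sketch (the helper names are ours): a single point gets probability zero, while an interval gets the difference of two CDF values.

```python
import math

def normal_cdf(x, mu=0.0, sigma=1.0):
    """P(X <= x) for X ~ Normal(mu, sigma^2), via the error function."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))

def interval_prob(a, b, mu=0.0, sigma=1.0):
    """P(a <= X <= b): single points get probability zero, intervals do not."""
    return normal_cdf(b, mu, sigma) - normal_cdf(a, mu, sigma)

p_point = interval_prob(1.0, 1.0)   # a single value: probability 0
p_band = interval_prob(-1.0, 1.0)   # one sd around the mean: ≈ 0.6827
```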

“The sum of several **independent** random variables converges to a *Normal* distribution”

The sum should have many terms, the terms should be independent, and they should have a well-defined variance

(In Biology sometimes the variables are not independent, so be careful)

When \(n→∞,\) the distribution of \(Z_n=\sum_i X_i/\sqrt{n}\) will converge to a **Normal** distribution \[\lim_{n→∞} Z_n ∼ \text{Normal}(0,1)\]

If \(X_i\) is a set of i.i.d. random variables, with \[𝔼X_i=μ\quad\text{and}\quad 𝕍X_i=σ^2\quad\text{for all }i\] then, when \(n\) is large \[\lim_{n→∞} \frac{\sum_i X_i-nμ}{σ\sqrt{n}} ∼ \text{Normal}(0,1)\]

If \(X_i\) is a set of i.i.d. random variables, with \[𝔼X_i=μ\quad\text{and}\quad 𝕍X_i=σ^2\quad\text{for all }i\] then, when \(n\) is large \[\lim_{n→∞} \frac{\sum_i X_i-nμ}{\sqrt{n}} ∼ \text{Normal}(0, σ^2)\]
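The theorem is not restricted to ±1 coins. A small sketch (assuming Exponential(1) inputs, which have \(μ = σ = 1\); the sample sizes are arbitrary): standardizing the sum of skewed i.i.d. variables still gives values that look standard Normal.

```python
import math
import random

rng = random.Random(7)
n, trials = 500, 4000
mu = sigma = 1.0   # Exponential(1) has mean 1 and variance 1

zs = []
for _ in range(trials):
    s = sum(rng.expovariate(1.0) for _ in range(n))    # skewed i.i.d. terms
    zs.append((s - n * mu) / (sigma * math.sqrt(n)))   # standardized sum

# Shape check: for Normal(0,1), P(-1 <= Z <= 1) ≈ 0.6827
frac = sum(-1 <= z <= 1 for z in zs) / trials
```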

- Thermal noise is the sum of many small vibrations in all directions
- they add up and usually cancel each other out

- Phenotype depends on several genetic conditions
- Height, weight and similar attributes depend on the combination of several attributes

- Not all combined effects are *sums*
- some effects are multiplicative

- Some effects may not have finite variance
- sometimes variance is infinite

- Not all effects are independent
- this is the most critical issue
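To illustrate the first caveat: when many small effects multiply instead of add, the result is skewed, but its logarithm, being a sum, is again ≈ Normal. A minimal sketch, with the factor sizes chosen arbitrarily:

```python
import math
import random

rng = random.Random(11)
trials, n = 4000, 200

# Each sample is a *product* of n small positive factors.
# The log of a product is a sum, so the log (not the product) is ≈ Normal.
prods = [math.prod(math.exp(rng.gauss(0.0, 0.1)) for _ in range(n))
         for _ in range(trials)]
logs = [math.log(p) for p in prods]

mean_log = sum(logs) / trials
var_log = sum((v - mean_log) ** 2 for v in logs) / trials  # theory: n*0.01 = 2
median_prod = sorted(prods)[trials // 2]
mean_prod = sum(prods) / trials   # right-skewed: mean well above the median
```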

≈95% of *normal population* is between \(-2\cdot\text{sd}(\mathbf x)\) and \(2\cdot\text{sd}(\mathbf x)\)

≈99.7% of *normal population* is between \(-3\cdot\text{sd}(\mathbf x)\) and \(3\cdot\text{sd}(\mathbf x)\)
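These coverage figures can be verified by sampling. A minimal sketch (the helper name `within` is ours, and the sample size is arbitrary):

```python
import random

rng = random.Random(3)
xs = [rng.gauss(0.0, 1.0) for _ in range(100_000)]

def within(k):
    """Fraction of samples within k standard deviations of the mean (0)."""
    return sum(abs(x) <= k for x in xs) / len(xs)

p1, p2, p3 = within(1), within(2), within(3)   # ≈ 0.683, 0.954, 0.997
```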