Class 2: Statistical significance

Methodology of Scientific Research

Andrés Aravena, PhD

24 March 2021

Using GEO

Series, Platform, Samples

Data to analyze

Using GEO2R

control vs treatment

Results

Expression and Log

Raw data is light intensity (luminescence) for Control and Treatment \[C, T\]

We work with the logarithm (base 2) of these values \[LC=\log_2(C)\\LT=\log_2(T)\]

Average Expression & Fold Change

Then Average expression is \[AvgExp = \frac{LT + LC}{2}\] and Fold change is \[logFC = LT - LC=\log_2\left(\frac{T}{C}\right)\]

Expression v/s Fold Change

Volcano plot

Hypothesis test

Classical statistics. We have two scenarios, called hypothesis

H₀: Nothing happens
H₁: Something happens

But our experiment cannot tell us directly which one is true

What we measure

Every experiment has 4 contributions

Biology (or Nature)
Natural variability
Instrument
Noise

Example: Black and white horses

We want to test the hypothesis “black horses are taller or shorter than white horses”

We want to compare average height of black horses v/s white horses
Each particular horse can be short or tall
We use a measuring tape
Horses do not stay put

Experiments are random variables

We may have bad luck. Maybe black and white horses have the same average height, but

we got only the small black horses and the tall white ones
the horses move a lot and we cannot measure the correct values

Therefore, we cannot be 100% sure that our results correspond to reality

But we can have a degree of confidence that we are not far away

p-value and null hypothesis

In terms of hypothesis test, we have

H₀: the real averages are the same. We only see variability and noise.
H₁: the real averages are different. The measurements are too different to be only noise.

The p-value is the probability of observing the experimental data \(X\), assuming that H₀ is true \[ℙ(X|H_0)\]

Is this what we want?

Notice that \[ℙ(X|H_0)≠ℙ(H_0|X)\] In other words, the p-value is not the probability that the null hypothesis is true, given the experimental result

Ideally we will like to know this last probability, but it is hard to do so

Calculating the probability

https://xkcd.com/892/

Calculating the probability

Under the null hypothesis, the height difference is 0

If we can also assume that the noise and variability follow a Normal distribution \(N(0,σ^2),\) we have

Student’s t-test

We have another problem. We do not know σ²

A clever biologist found a solution

We measure the variance in our data and we use it

But we have to pay a price: We have less confidence

Student’s distribution

When is it significant?

Joke

https://xkcd.com/1478/

When is it significant?

Traditionally, 5% and 1% are used as p-value thresholds

But there is nothing to decide that these are good values

Indeed, in Gene Expression, these value are usually too big

The problem with multiple tests

https://xkcd.com/882/

Multiple test correction

There are several approaches

Family Wise Error Rate control (Bonferroni)
False Discovery Rate control (Hochberg)
Others

The basic difference is the trade-off between False Positives and False Negatives

Two ways of being wrong

In every hypothesis test, we can be wrong in two ways

Type 1: False positive
- putting innocents in the jail
Type 2: False negatives
- freeing criminals

Usually improving one means worsening of the other

How does FWER work

Family Wise Error Rate multiplies each p-value by the number of cases \[p.adj = p.value ⋅ N\]

It reduces False Positives and increases False Negatives

Sometimes we get nothing significant

How does FDR work

FDR sorts the p-values and multiplies each by an increasing value \[p.adj = p.value \cdot\frac{i}{N}\]

If we get \(p.adj<0.05\) then the probability that it is a false positive is 5%

Class 2: Statistical significance

Methodology of Scientific Research

Andrés Aravena, PhD

24 March 2021

Using GEO

Series, Platform, Samples

Data to analyze

Using GEO2R

control vs treatment

Results

Expression and Log

Average Expression & Fold Change

Expression v/s Fold Change

Volcano plot

Hypothesis test

What we measure

Example: Black and white horses

Experiments are random variables

p-value and null hypothesis

Is this what we want?

Calculating the probability

Calculating the probability

Student’s t-test

Student’s distribution

When is it significant?

When is it significant?

The problem with multiple tests

Multiple test correction

Two ways of being wrong

How does FWER work

How does FDR work

Original v/s adjusted p-value

That is how we make a Volcano plot