Class 4: Probabilities

Methodology of Scientific Research

Andrés Aravena, PhD

April 7, 2021

An event is a set of outcomes

The set of all possible outcomes is often called Ω

An event 𝐴 can be seen as the set of all outcomes that make the event true

For example, \[\textrm{Fever}=\{\textrm{Temp}>37.5°\textrm{C}\}\]

Evaluating rational beliefs

An event will become either true or false after an experiment

For example, a die roll either shows 4 or it does not

We want to give a value to our rational belief that the event will become true after the experiment

The numeric value is called Probability

Probabilities as Areas

It is useful to think of the probability of an event as the area of its region in a drawing of Ω

The total area of Ω is 1

Usually we do not know the shape of 𝐴

Probabilities depend on our knowledge

Our rational beliefs depend on our knowledge

If we represent our knowledge (or hypothesis) by 𝑍, then the probability of an event 𝐴 is written as \[ℙ(A|Z)\] We read this as “the probability of event 𝐴, given that we know 𝑍”

For example, “the probability that we get a 4, given that the die is symmetrical”

Important idea

The order is relevant: in general \[ℙ(A|Z)≠ℙ(Z|A)\] There are two events, 𝐴 and 𝑍

The one written after | is what we assume to be true

The one written before | is what we are asking for

One we know, the other we do not

Visually

Now outcomes are limited only to the 𝑍 region

We measure the area of the part of 𝐴 that lies inside 𝑍 with respect to the area of 𝑍 instead of Ω. That ratio is \(ℙ(A|Z)\)

The shape of 𝑍 is often unknown
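The area picture can be sketched in code under a simplifying assumption: a fair six-sided die, so that every outcome has the same area. The helper name `prob` and the two events are illustrative choices, not part of the lecture.

```python
from fractions import Fraction

# Assumption: a fair six-sided die, so all outcomes have equal "area".
OMEGA = frozenset({1, 2, 3, 4, 5, 6})

def prob(event, given=OMEGA):
    """P(event | given): area of the event inside `given`, relative to `given`."""
    event, given = frozenset(event), frozenset(given)
    return Fraction(len(event & given), len(given))

A = {4}          # "the die shows 4"
Z = {2, 4, 6}    # "the die shows an even number"

p_a_given_z = prob(A, given=Z)   # 1 of the 3 outcomes in Z
p_z_given_a = prob(Z, given=A)   # the only outcome in A is even
print(p_a_given_z, p_z_given_a)  # prints: 1/3 1
```

The two results differ, which illustrates why the order around the bar matters.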

Degrees of belief

If, given our knowledge 𝑍, the event 𝐵 is more plausible than the event 𝐴, then \[ℙ(A|Z)≤ℙ(B|Z)\]

For example, given that the die is symmetrical, the probability that we get either 4, 5 or 6 is at least as large as the probability that we get a 4 \[ℙ(\{4\}|Z)≤ℙ(\{4,5,6\}|Z)\]

Degrees of belief

On the other hand, if we get new information, the probabilities may change

The same event 𝐴 may be more plausible under a new hypothesis 𝑌 than under the initial hypothesis 𝑍

Then \[ℙ(A|Z)≤ℙ(A|Y)\]

Probability rules based on these two ideas

It has been proven that, to be consistent with these two ideas, probabilities must follow these rules

  1. A probability is a number between 0 and 1 inclusive \[ℙ(A) ≥ 0\textrm{ and }ℙ(A)≤1\]

  2. The probability of a sure event is 1 \[ℙ(\textrm{True}) = 1\]

  3. The probability of an impossible event is 0 \[ℙ(\textrm{False}) = 0\]

Complex events

We are interested in non-trivial events, which are usually combinations of smaller events

For example, we may ask “what is the probability that, in a group of 𝑛 people, at least two people have the same birthday”

Fortunately, any complex event can be decomposed into simpler events, combined with and, or and not connectors

Exercise: decompose the birthday event into simpler ones

Probability of not 𝐴

If the event 𝐴 becomes more and more plausible, then the opposite event not 𝐴 becomes less and less plausible

It can be shown that we always have \[ℙ(\textrm{not }A) = 1-ℙ(A)\]
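A quick numerical check of the negation rule, assuming a fair six-sided die where probabilities are counted as fractions of equally likely outcomes. The names `OMEGA` and `prob` are hypothetical helpers for this sketch.

```python
from fractions import Fraction

OMEGA = frozenset(range(1, 7))  # assumption: fair six-sided die

def prob(event):
    """P(event), counting equally likely outcomes."""
    return Fraction(len(frozenset(event) & OMEGA), len(OMEGA))

A = {4}              # "the die shows 4"
not_A = OMEGA - A    # every other outcome

print(prob(A), prob(not_A))  # prints: 1/6 5/6
```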

Joint Probability

The probability of 𝐴 and 𝐵 happening simultaneously must be connected to the probability of each one

It can be shown that there are only two ways to calculate it

  • Start with the prob. of \(A\) and then of \(B\) given that \(A\) is true \[ℙ(A,B)=ℙ(A)⋅ℙ(B|A)\]
  • Start with the prob. of \(B\) and then of \(A\) given that \(B\) is true \[ℙ(A,B)=ℙ(B)⋅ℙ(A|B)\]

It must be a multiplication

It can be proven that the only way to combine \(ℙ(A)\) and \(ℙ(B|A)\) to get \(ℙ(A,B)\) is to multiply them.

Both formulas are true, since \(ℙ(A,B)=ℙ(B,A).\) The order in which we write the events is irrelevant.
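The two ways of computing the joint probability can be compared on a small example. This is only a sketch, assuming a fair six-sided die and a hypothetical helper `prob` that counts equally likely outcomes.

```python
from fractions import Fraction

OMEGA = frozenset(range(1, 7))  # assumption: fair six-sided die

def prob(event, given=OMEGA):
    """P(event | given), counting equally likely outcomes."""
    event, given = frozenset(event), frozenset(given)
    return Fraction(len(event & given), len(given))

A = {3, 4, 5, 6}   # "the die shows at least 3"
B = {2, 4, 6}      # "the die shows an even number"

joint = prob(A & B)                  # P(A, B) counted directly
via_a = prob(A) * prob(B, given=A)   # P(A) * P(B|A)
via_b = prob(B) * prob(A, given=B)   # P(B) * P(A|B)
print(joint, via_a, via_b)           # prints: 1/3 1/3 1/3
```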

Probability of 𝐴 or 𝐵

We know how to calculate \(ℙ(A\textrm{ and }B)\) and \(ℙ(\textrm{not }A)\)

We also know De Morgan's law, which swaps ANDs with ORs
\[\textrm{not }(A \textrm{ or }B) = (\textrm{not }A) \textrm{ and }(\textrm{not }B)\]

Therefore we can write

\[ \begin{aligned} ℙ(A \textrm{ or }B) & = 1 - ℙ(\textrm{not }(A \textrm{ or }B))\\ & = 1-ℙ( (\textrm{not }A) \textrm{ and }(\textrm{not }B)) \end{aligned} \]

Using the multiplication rule

\[ \begin{aligned} ℙ(A \textrm{ or }B) & = 1-ℙ( (\textrm{not }A) \textrm{ and }(\textrm{not }B)) \\ & = 1-ℙ(\textrm{not }A)⋅ℙ(\textrm{not }B|\textrm{not }A) \end{aligned} \]

Using the negation rule \[ \begin{aligned} ℙ(A \textrm{ or }B) & = 1-ℙ(\textrm{not }A)⋅(1- ℙ(B|\textrm{not }A)) \\ & = 1-ℙ(\textrm{not }A) + ℙ(\textrm{not }A)⋅ℙ(B|\textrm{not }A) \end{aligned} \]

Using the multiplication rule again

\[ \begin{aligned} ℙ(A \textrm{ or }B) & = 1 -ℙ(\textrm{not }A) + ℙ(\textrm{not }A,B) \\ ℙ(A \textrm{ or }B) & = 1 -(1-ℙ(A)) + ℙ(\textrm{not }A|B)ℙ(B) \\ ℙ(A \textrm{ or }B) & = ℙ(A) + (1-ℙ(A|B))ℙ(B) \\ ℙ(A \textrm{ or }B) & = ℙ(A) + ℙ(B)-ℙ(A|B)ℙ(B) \\ ℙ(A \textrm{ or }B) & = ℙ(A) + ℙ(B)-ℙ(A,B) \end{aligned} \] You need to remember only the last line

The previous lines justify why the last one is always true

Do not count twice

If A and B can happen at the same time, then \(ℙ(A) + ℙ(B)\) counts the intersection twice

So we have to take out the intersection \(ℙ(A,B)\) \[ℙ(A \textrm{ or }B) = ℙ(A) + ℙ(B)-ℙ(A,B)\]
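The double counting can be seen on a concrete case. The sketch below assumes a fair six-sided die, and `prob` is a hypothetical counting helper, so it illustrates the rule rather than proving it.

```python
from fractions import Fraction

OMEGA = frozenset(range(1, 7))  # assumption: fair six-sided die

def prob(event):
    """P(event), counting equally likely outcomes."""
    return Fraction(len(frozenset(event) & OMEGA), len(OMEGA))

A = {2, 4, 6}   # "even"
B = {4, 5, 6}   # "at least 4"

direct = prob(A.union(B))                    # count the outcomes of "A or B"
naive = prob(A) + prob(B)                    # counts outcomes 4 and 6 twice
corrected = prob(A) + prob(B) - prob(A & B)  # take the intersection out once
print(direct, naive, corrected)              # prints: 2/3 1 2/3
```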

It gets complicated

If there are three compatible events, things get messy

\[\begin{aligned} & ℙ(A \textrm{ or }B \textrm{ or }C) \\ & = ℙ(A) + ℙ(B \textrm{ or }C)-ℙ(A,(B \textrm{ or }C)) \\ & = ℙ(A) + ℙ(B) + ℙ(C)-ℙ(B,C) - ℙ(A,B \textrm{ or }A,C) \\ & = ℙ(A) + ℙ(B) + ℙ(C)-ℙ(B,C) - (ℙ(A,B) + ℙ(A,C) - ℙ(A,B,C)) \\ & = ℙ(A) + ℙ(B) + ℙ(C)-ℙ(B,C) - ℙ(A,B) - ℙ(A,C) + ℙ(A,B,C) \end{aligned} \]

It gets worse with more events

There is a better way

Using De Morgan's rule

\[\begin{aligned} & ℙ(A \textrm{ or }B \textrm{ or }C) \\ & = 1 - ℙ((\textrm{not }A) \textrm{ and }(\textrm{not }B) \textrm{ and }(\textrm{not }C))\\ & = 1 - ℙ(\textrm{not }A)⋅ℙ(\textrm{not }B | \textrm{not }A)⋅ℙ(\textrm{not }C | \textrm{not }A, \textrm{not }B)\\ & = 1 - (1-ℙ(A))⋅(1-ℙ(B | \textrm{not }A))⋅(1-ℙ(C | \textrm{not }A, \textrm{not }B)) \end{aligned} \]

This is often easier to calculate
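Both routes to the probability of "A or B or C" can be checked against a direct count. The example assumes a fair six-sided die. The three events and the helper `prob` are made up for illustration.

```python
from fractions import Fraction

OMEGA = frozenset(range(1, 7))  # assumption: fair six-sided die

def prob(event, given=OMEGA):
    """P(event | given), counting equally likely outcomes."""
    event, given = frozenset(event), frozenset(given)
    return Fraction(len(event & given), len(given))

A, B, C = {2, 4, 6}, {4, 5, 6}, {1, 2}
not_A, not_B, not_C = OMEGA - A, OMEGA - B, OMEGA - C

# Count the outcomes of "A or B or C" directly.
direct = prob(A.union(B, C))

# The long sum with every intersection taken out or put back.
long_sum = (prob(A) + prob(B) + prob(C)
            - prob(B & C) - prob(A & B) - prob(A & C)
            + prob(A & B & C))

# De Morgan plus the multiplication rule: one product of three factors.
product_form = 1 - (prob(not_A)
                    * prob(not_B, given=not_A)
                    * prob(not_C, given=not_A & not_B))

print(direct, long_sum, product_form)  # prints: 5/6 5/6 5/6
```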

Example: Multiple Birthdays

Let's say we have three people, with birthdays \(x_1, x_2\) and \(x_3.\)

The probability that there are at least two people with the same birthday is \[ℙ(x_2=x_1 \textrm{ or }x_3=x_2 \textrm{ or }x_3=x_1)\] which can be rewritten as \[1-ℙ(x_2≠x_1 \textrm{ and }x_3≠x_2 \textrm{ and }x_3≠x_1)\]

Using the multiplication rule

We want to calculate \[1-ℙ(x_2≠x_1 \textrm{ and }x_3≠x_2 \textrm{ and }x_3≠x_1)\] We can split off only the first “and” like this \[1-ℙ(x_2≠x_1)⋅ℙ(x_3≠x_2 \textrm{ and }x_3≠x_1|x_2≠x_1)\] Assuming 365 equally likely birthdays, the second person avoids one day and the third person avoids two, so we have \[1-\frac{364}{365}⋅\frac{363}{365}\]
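The three-person result can be evaluated exactly and checked by brute force on a smaller calendar. This sketch assumes equally likely, independent birthdays. The function names and the 7-day "year" used for the check are illustrative choices.

```python
from fractions import Fraction
from itertools import product

def p_shared_formula(days=365):
    """1 - P(second differs from first) * P(third differs from both)."""
    return 1 - Fraction(days - 1, days) * Fraction(days - 2, days)

def p_shared_brute_force(days):
    """Enumerate every triple of birthdays and count those with a repeat."""
    triples = list(product(range(days), repeat=3))
    shared = sum(1 for t in triples if len(set(t)) < 3)
    return Fraction(shared, len(triples))

# Check the formula against full enumeration on a hypothetical 7-day year.
small_formula = p_shared_formula(7)
small_brute = p_shared_brute_force(7)

p_three = p_shared_formula()   # 365-day year
print(p_three, float(p_three)) # 1093/133225, roughly 0.0082
```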

Exercise

  • What is the probability that, in a group of N people, at least two of them share the same birthday?

  • How many people do we need to have at least 50% probability of at least two of them sharing the same birthday?
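One possible way to check your answers to the exercise is to extend the multiplication rule to N people. This is a sketch assuming 365 equally likely, independent birthdays. The function names `p_shared_birthday` and `smallest_group` are hypothetical.

```python
from fractions import Fraction

def p_shared_birthday(n, days=365):
    """P(at least two of n people share a birthday)."""
    p_all_different = Fraction(1)
    for k in range(n):
        # Person k+1 must avoid the k birthdays already taken.
        p_all_different *= Fraction(days - k, days)
    return 1 - p_all_different

def smallest_group(threshold=Fraction(1, 2), days=365):
    """Smallest n whose shared-birthday probability reaches the threshold."""
    n = 1
    while p_shared_birthday(n, days) < threshold:
        n += 1
    return n

for n in (3, 10, 30):
    print(n, float(p_shared_birthday(n)))
```

Calling `smallest_group()` answers the second question under these assumptions.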

Special case

If A and B are incompatible

If A and B cannot happen at the same time, then \((A \textrm{ and }B)\) is impossible, therefore \(ℙ(A,B)=0\)

In that case (and only in that case) \[ℙ(A \textrm{ or }B) = ℙ(A) + ℙ(B)\]

Splitting a set into pieces

In particular we have \[ℙ(A) = ℙ(A\textrm{ and }(B \textrm{ or }\textrm{not }B)) = ℙ(A,B) + ℙ(A, \textrm{not }B)\] because

  • \((A \textrm{ and }B)\) is incompatible with \((A \textrm{ and }\textrm{not }B)\),
  • \((A \textrm{ and }(B \textrm{ or }\textrm{not }B))\) is equal to \(A\)
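A small check of this split, assuming a fair six-sided die. The helper `prob` and the events are illustrative only.

```python
from fractions import Fraction

OMEGA = frozenset(range(1, 7))  # assumption: fair six-sided die

def prob(event):
    """P(event), counting equally likely outcomes."""
    return Fraction(len(frozenset(event) & OMEGA), len(OMEGA))

A = {4, 5, 6}       # "at least 4"
B = {2, 4, 6}       # "even"
not_B = OMEGA - B   # "odd"

piece_in_b = prob(A & B)           # outcomes of A that are even: 4 and 6
piece_outside_b = prob(A & not_B)  # outcomes of A that are odd: 5
print(prob(A), piece_in_b, piece_outside_b)  # prints: 1/2 1/3 1/6
```

The two pieces do not overlap, so their probabilities add up to \(ℙ(A)\) without any correction term.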

Splitting \(Ω\)

If we partition Ω into 𝑛 subsets \(A_i\), such that they cover all Ω \[\Omega=A_1 ∪ A_2 ∪ … ∪ A_n\] and each pair of different events is mutually incompatible \[A_i ∩ A_j=\emptyset \textrm{ for } i≠j\] then we have \[ℙ(\Omega)=ℙ(A_1) + ℙ(A_2) + … + ℙ(A_n)=1\]

All outcomes

One kind of event is the set containing a single outcome

If \(a_i ∈ Ω\) is an outcome, then \(A_i=\{a_i\}\) is an event

“The experiment outcome is exactly \(a_i\)”

It is easy to see that these events are mutually incompatible and cover all Ω

Thus, \[ℙ(\{a_1\}) + ℙ(\{a_2\}) + … + ℙ(\{a_n\})=1\]
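The same bookkeeping works when outcomes are not equally likely. The sketch below uses a hypothetical loaded die whose weights are invented for illustration. The probability of an event is taken to be the sum over its single outcomes.

```python
from fractions import Fraction

# Hypothetical loaded die: these weights are made up for illustration.
weights = {
    1: Fraction(1, 12), 2: Fraction(1, 12),
    3: Fraction(1, 6),  4: Fraction(1, 6),
    5: Fraction(1, 4),  6: Fraction(1, 4),
}

def prob(event):
    """P(event) as the sum of the probabilities of its single outcomes."""
    return sum((weights[a] for a in event), Fraction(0))

# The single-outcome events {a_i} partition the whole set of outcomes.
total = sum(prob({a}) for a in weights)

# A coarser partition: "even" and "odd" are incompatible and cover everything.
p_even = prob({2, 4, 6})
p_odd = prob({1, 3, 5})
print(total, p_even, p_odd)  # prints: 1 1/2 1/2
```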