Class 25: Probabilities of complex events

Methodology of Scientific Research

Andrés Aravena, PhD

May 3, 2023

Complex events

We are interested in non-trivial events, which are usually combinations of smaller events

For example, we may ask “what is the probability that, in a group of 𝑛 people, at least two people have the same birthday?”

Fortunately, any complex event can be decomposed into simpler events, combined with the connectors and, or, and not

Exercise: decompose the birthday event into simpler ones

Probability of not 𝐴

If the event 𝐴 becomes more and more plausible, then the opposite event not 𝐴 becomes less and less plausible

It can be shown that we always have \[ℙ(\text{not } A) = 1-ℙ(A)\]
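As a small check of the negation rule, the sketch below counts cases on a fair six-sided die. The die, the event "the face is a six", and the helper `prob` are assumptions chosen for illustration; they are not part of the original slides.

```python
from fractions import Fraction

# Assumed sample space: the six faces of a fair die.
faces = range(1, 7)

def prob(event):
    """Probability as favourable cases over total cases."""
    favourable = sum(1 for face in faces if event(face))
    return Fraction(favourable, len(faces))

def is_six(face):
    return face == 6

p_six = prob(is_six)
p_not_six = prob(lambda face: not is_six(face))

print(p_six, p_not_six)  # 1/6 5/6
```

The two values add up to 1, as the rule says they must.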

Probability of 𝐴 and 𝐵

\[ℙ(A\text{ and } B)=\frac{\text{Number of cases where }(A\text{ and } B)\text{ is true}}{\text{Total cases of combinations of }A\text{ and } B}\]

If \(n_A\) and \(n_B\) are the total number of cases for \(A\) and \(B\), then the total number of combined cases is \(n_A⋅n_B\)

In the same way, if \(m_A\) and \(m_B\) are the number of cases where \(A\) and \(B\) are true, respectively, then the number of cases where \((A\text{ and }B)\) is true is \(m_A⋅m_B\)

\[ℙ(A\text{ and } B)=\frac{m_A⋅m_B}{n_A⋅n_B}=\frac{m_A}{n_A}⋅\frac{m_B}{n_B}\]

Interpretation

We could say that \[\frac{m_A}{n_A}=ℙ(A)\qquad\frac{m_B}{n_B}=ℙ(B)\] but we have to be careful: the result of \(A\) may affect \(m_B\) and \(n_B\). It is better to write \[\frac{m_A}{n_A}=ℙ(A)\qquad\frac{m_B}{n_B}=ℙ(B|A)\]

Rewriting the Probability of 𝐴 and 𝐵

\[ℙ(A\text{ and } B)=\frac{m_A}{n_A}⋅\frac{m_B}{n_B}=ℙ(A)⋅ℙ(B|A)\] To simplify, instead of \(ℙ(A\text{ and } B)\) we write \(ℙ(A, B)\)

Thus, we write \[ℙ(A,B)=ℙ(A)⋅ℙ(B|A)\] “Prob that (𝐴 and 𝐵) happens is Prob that 𝐴 happens times Prob that 𝐵 happens given that 𝐴 happens”
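To see why the conditional probability matters, here is a minimal sketch where the first result changes the counts for the second. It assumes a hypothetical 52-card deck with 4 aces and two cards drawn without replacement; the deck and the variable names are illustrative only.

```python
from fractions import Fraction
from itertools import permutations

# Hypothetical deck: 4 aces and 48 other cards (suits do not matter here).
deck = ["ace"] * 4 + ["other"] * 48

# All ordered ways to draw two different cards.
draws = list(permutations(range(len(deck)), 2))

# Direct count of the cases where both cards are aces.
both_aces = sum(1 for i, j in draws if deck[i] == "ace" and deck[j] == "ace")
p_joint = Fraction(both_aces, len(draws))

p_a = Fraction(4, 52)          # A: the first card is an ace
p_b_given_a = Fraction(3, 51)  # B given A: one ace and one card are gone

print(p_joint, p_a * p_b_given_a)  # 1/221 1/221
```

Using the unconditional value 4/52 for the second card would give a different, wrong answer, because drawing the first ace changes both the favourable and the total cases.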

Joint Probability

We know that \((A\text{ and } B)\) is always the same as \((B\text{ and } A)\)

There are two ways to calculate the probability of 𝐴 and 𝐵 happening simultaneously

  • Start with the prob. of \(A\) and then of \(B\) given that \(A\) is true \[ℙ(A,B)=ℙ(A)⋅ℙ(B|A)\]
  • Start with the prob. of \(B\) and then of \(A\) given that \(B\) is true \[ℙ(A,B)=ℙ(B)⋅ℙ(A|B)\]
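Both routes can be compared by counting cases. The sketch below assumes two fair dice, with the illustrative events "the first die is even" and "the sum is at least 10"; the helper `prob` and its `given` parameter are hypothetical names introduced here.

```python
from fractions import Fraction
from itertools import product

# Assumed sample space: all 36 ordered results of two fair dice.
space = list(product(range(1, 7), repeat=2))

def prob(event, given=lambda outcome: True):
    """Conditional probability by counting within the cases where `given` holds."""
    conditioned = [outcome for outcome in space if given(outcome)]
    favourable = sum(1 for outcome in conditioned if event(outcome))
    return Fraction(favourable, len(conditioned))

def a(outcome):
    return outcome[0] % 2 == 0          # first die is even

def b(outcome):
    return outcome[0] + outcome[1] >= 10  # sum is at least 10

joint = prob(lambda outcome: a(outcome) and b(outcome))
via_a = prob(a) * prob(b, given=a)  # start with A, then B given A
via_b = prob(b) * prob(a, given=b)  # start with B, then A given B

print(joint, via_a, via_b)  # 1/9 1/9 1/9
```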

Exercises

  • Prob of getting heads twice when throwing coins
  • Prob of getting 6 and 6 on two dice
  • Prob of getting heads and a 6
  • Prob of getting a green card

Probability of 𝐴 or 𝐵

We know how to calculate \(ℙ(A\text{ and } B)\) and \(ℙ(\text{not } A)\)

We also know De Morgan’s law, which swaps ANDs with ORs
\[\text{not } (A \text{ or } B) = (\text{not } A) \text{ and } (\text{not } B)\]

Therefore we can write

\[ \begin{aligned} ℙ(A \text{ or } B) & = 1 - ℙ(\text{not }(A \text{ or } B))\\ & = 1-ℙ( (\text{not } A) \text{ and } (\text{not } B)) \end{aligned} \]

Using the multiplication rule

\[ \begin{aligned} ℙ(A \text{ or } B) & = 1-ℙ( (\text{not } A) \text{ and } (\text{not } B)) \\ & = 1-ℙ(\text{not } A)⋅ℙ(\text{not } B|\text{not } A) \end{aligned} \]

Using the negation rule \[ \begin{aligned} ℙ(A \text{ or } B) & = 1-ℙ(\text{not } A)⋅(1- ℙ(B|\text{not } A)) \\ & = 1-ℙ(\text{not } A) + ℙ(\text{not } A)⋅ℙ(B|\text{not } A) \end{aligned} \]

Using the multiplication rule again

\[ \begin{aligned} ℙ(A \text{ or } B) & = 1 -ℙ(\text{not } A) + ℙ(\text{not } A,B) \\ ℙ(A \text{ or } B) & = 1 -(1-ℙ(A)) + ℙ(\text{not } A|B)⋅ℙ(B) \\ ℙ(A \text{ or } B) & = ℙ(A) + (1-ℙ(A|B))⋅ℙ(B) \\ ℙ(A \text{ or } B) & = ℙ(A) + ℙ(B)-ℙ(A|B)⋅ℙ(B) \\ ℙ(A \text{ or } B) & = ℙ(A) + ℙ(B)-ℙ(A,B) \end{aligned} \] You need to remember only the last line

The previous lines justify why the last one is always true

Do not count twice

If \(A\) and \(B\) can happen at the same time, then \(ℙ(A) + ℙ(B)\) counts the intersection twice

So we have to take out the intersection \(ℙ(A,B)\) \[ℙ(A \text{ or } B) = ℙ(A) + ℙ(B)-ℙ(A,B)\]
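The double counting is easy to see by counting cases. This sketch assumes a fair die with the illustrative events "the face is even" and "the face is greater than 3", which overlap on the faces 4 and 6.

```python
from fractions import Fraction

# Assumed sample space: the six faces of a fair die.
faces = range(1, 7)

def prob(event):
    """Probability as favourable cases over total cases."""
    return Fraction(sum(1 for face in faces if event(face)), len(faces))

def a(face):
    return face % 2 == 0  # even: 2, 4, 6

def b(face):
    return face > 3       # greater than 3: 4, 5, 6

p_or = prob(lambda face: a(face) or b(face))
p_and = prob(lambda face: a(face) and b(face))

print(prob(a) + prob(b))               # 1, too large: 4 and 6 are counted twice
print(p_or, prob(a) + prob(b) - p_and)  # 2/3 2/3
```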

It gets complicated

If there are three compatible events, things get messy

\[\begin{aligned} ℙ(A \text{ or } B \text{ or } C) & = ℙ(A) + ℙ(B \text{ or } C)-ℙ(A,(B \text{ or } C)) \\ & = ℙ(A) + ℙ(B) + ℙ(C)-ℙ(B,C) - ℙ((A,B) \text{ or } (A,C)) \\ & = ℙ(A) + ℙ(B) + ℙ(C)-ℙ(B,C) - (ℙ(A,B) + ℙ(A,C) - ℙ(A,B,C)) \\ & = ℙ(A) + ℙ(B) + ℙ(C)-ℙ(B,C) - ℙ(A,B) - ℙ(A,C) + ℙ(A,B,C) \end{aligned} \]

It gets worse with more events
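The three-event formula can be checked on a small case. The sketch below assumes one number drawn uniformly from 1 to 30, with the illustrative events "multiple of 2", "multiple of 3" and "multiple of 5"; none of these come from the original slides.

```python
from fractions import Fraction

# Assumed sample space: one number drawn uniformly from 1 to 30.
space = range(1, 31)

def prob(event):
    """Probability as favourable cases over total cases."""
    return Fraction(sum(1 for x in space if event(x)), len(space))

def a(x):
    return x % 2 == 0

def b(x):
    return x % 3 == 0

def c(x):
    return x % 5 == 0

# Direct count of (A or B or C).
direct = prob(lambda x: a(x) or b(x) or c(x))

# Last line of the derivation: singles, minus pairs, plus the triple.
formula = (prob(a) + prob(b) + prob(c)
           - prob(lambda x: b(x) and c(x))
           - prob(lambda x: a(x) and b(x))
           - prob(lambda x: a(x) and c(x))
           + prob(lambda x: a(x) and b(x) and c(x)))

print(direct, formula)  # 11/15 11/15
```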

If A and B are incompatible

If \(A\) and \(B\) cannot happen at the same time, then \((A \text{ and } B)\) is impossible, therefore \(ℙ(A,B)=0\)

In that case (and only in that case) \[ℙ(A \text{ or } B) = ℙ(A) + ℙ(B)\]

Splitting a set into pieces

In particular we have \[ℙ(A) = ℙ(A\text{ and } (B \text{ or } \text{not } B)) = ℙ(A,B) + ℙ(A, \text{not } B)\] because

  • \((A \text{ and } B)\) is incompatible with \((A \text{ and } \text{not } B)\),
  • \((A \text{ and } (B \text{ or } \text{not } B))\) is equal to \(A\)
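A quick numerical check of this split, again under the assumption of a fair die with the illustrative events "even" and "greater than 3":

```python
from fractions import Fraction

# Assumed sample space: the six faces of a fair die.
faces = range(1, 7)

def prob(event):
    """Probability as favourable cases over total cases."""
    return Fraction(sum(1 for face in faces if event(face)), len(faces))

def a(face):
    return face % 2 == 0  # even

def b(face):
    return face > 3       # greater than 3

p_a_and_b = prob(lambda face: a(face) and b(face))          # faces 4 and 6
p_a_and_not_b = prob(lambda face: a(face) and not b(face))  # face 2

print(prob(a), p_a_and_b + p_a_and_not_b)  # 1/2 1/2
```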

Splitting \(Ω\)

If we partition Ω into 𝑛 subsets \(A_i\), such that they cover all of Ω \[\Omega=A_1 ∪ A_2 ∪ … ∪ A_n\] and every pair of different events is incompatible \[A_i ∩ A_j=\emptyset \quad\text{for } i≠j\] then we have \[ℙ(\Omega)=ℙ(A_1) + ℙ(A_2) + … + ℙ(A_n)=1\]
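As an illustration, the possible sums of two fair dice split the sample space into 11 incompatible pieces. The sketch below (an assumed example, not from the slides) confirms that their probabilities add up to 1.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# Assumed sample space: all 36 ordered results of two fair dice.
space = list(product(range(1, 7), repeat=2))

# Each outcome has exactly one sum, so the events "sum = 2", ..., "sum = 12"
# cover the whole space and never overlap.
counts = Counter(first + second for first, second in space)
p = {total: Fraction(cases, len(space)) for total, cases in counts.items()}

print(len(p), sum(p.values()))  # 11 1
```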

There is an easier way

Using De Morgan’s rule

\[\begin{aligned} ℙ(A \text{ or } B \text{ or } C) & = 1 - ℙ((\text{not } A) \text{ and } (\text{not } B) \text{ and } (\text{not } C))\\ & = 1 - ℙ(\text{not } A)⋅ℙ(\text{not } B | \text{not } A)⋅ℙ(\text{not } C | \text{not } A, \text{not } B)\\ & = 1 - (1-ℙ(A))⋅(1-ℙ(B | \text{not } A))⋅(1-ℙ(C | \text{not } A, \text{not } B)) \end{aligned} \]

This is often easier to calculate
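For instance, consider the probability of at least one six in three rolls of a fair die (an assumed example). The rolls do not affect each other, so each conditional probability here is simply 1/6, and the complement route needs only one product.

```python
from fractions import Fraction
from itertools import product

# Assumed sample space: all 216 ordered results of three rolls of a fair die.
rolls = list(product(range(1, 7), repeat=3))

# Direct count of the cases with at least one six.
direct = Fraction(sum(1 for roll in rolls if 6 in roll), len(rolls))

# Complement route: 1 minus the probability of "no six" three times in a row.
p_six = Fraction(1, 6)
via_complement = 1 - (1 - p_six) ** 3

print(direct, via_complement)  # 91/216 91/216
```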

Example: Multiple Birthdays

Let’s say we have three people, with birthdays \(x_1, x_2\) and \(x_3.\)

The probability that there are at least two people with the same birthday is \[ℙ(x_2=x_1 \text{ or } x_3=x_2 \text{ or } x_3=x_1)\] which can be rewritten as \[1-ℙ(x_2≠x_1 \text{ and } x_3≠x_2 \text{ and } x_3≠x_1)\]

Now we only have and combinations

We want to calculate \[1-ℙ(x_2≠x_1 \text{ and } x_3≠x_2 \text{ and } x_3≠x_1)\] We can separate like this (only the first and) \[1-ℙ(x_2≠x_1)⋅ℙ(x_3≠x_2 \text{ and } x_3≠x_1|x_2≠x_1)\] Assuming 365 equally likely birthdays, we have \[1-\frac{364}{365}⋅\frac{363}{365}\]
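The three-person result can be evaluated exactly and cross-checked by brute force on a smaller calendar. In this sketch the function names and the 7-day "year" used for the check are assumptions made for illustration; only the three-person case is covered.

```python
from fractions import Fraction
from itertools import product

def shared_birthday_3(days):
    """Three-person formula: 1 - (days-1)/days * (days-2)/days."""
    return 1 - Fraction(days - 1, days) * Fraction(days - 2, days)

def shared_birthday_3_brute(days):
    """Count every triple of birthdays where at least two coincide."""
    triples = list(product(range(days), repeat=3))
    shared = sum(1 for triple in triples if len(set(triple)) < 3)
    return Fraction(shared, len(triples))

# Cross-check on a hypothetical 7-day year, small enough to enumerate.
print(shared_birthday_3(7), shared_birthday_3_brute(7))  # 19/49 19/49

# The value for 365 equally likely birthdays, a bit under 1%.
print(float(shared_birthday_3(365)))
```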

Exercise

  • What is the probability that, in a group of N people, at least two of them share the same birthday?

  • How many people do we need to have at least 50% probability of at least two of them sharing the same birthday?