# Methodology of Scientific Research

## Complex events

We are interested in non-trivial events, which are usually combinations of smaller events

For example, we may ask “what is the probability that, in a group of $$n$$ people, at least two persons have the same birthday?”

Fortunately, any complex event can be decomposed into simpler events, combined with the connectors "and", "or" and "not"

Exercise: decompose the birthday event into simpler ones

## Probability of not $$A$$

If the event $$A$$ becomes more and more plausible, then the opposite event not $$A$$ becomes less and less plausible

It can be shown that we always have $\mathbb{P}(\text{not } A) = 1-\mathbb{P}(A)$
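As a quick check of the negation rule, the sketch below counts outcomes of a fair six-sided die. The die, the helper `prob` and the event `is_six` are illustrative assumptions, not part of the original material.

```python
from fractions import Fraction

# Assumed sample space: the six equally likely faces of a fair die.
outcomes = range(1, 7)

def prob(event):
    """Fraction of equally likely outcomes for which `event` is true."""
    favorable = sum(1 for x in outcomes if event(x))
    return Fraction(favorable, len(outcomes))

def is_six(x):
    return x == 6

p_a = prob(is_six)                        # P(A)
p_not_a = prob(lambda x: not is_six(x))   # P(not A)
print(p_a, p_not_a)  # 1/6 5/6
```

Using `Fraction` keeps the results exact, so the identity can be compared without rounding errors.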

## Probability of $$A$$ and $$B$$

$\mathbb{P}(A\text{ and } B)=\frac{\text{Number of cases where }(A\text{ and } B)\text{ is true}}{\text{Total cases of combinations of }A\text{ and } B}$

If $$n_A$$ and $$n_B$$ are the total numbers of cases for $$A$$ and $$B$$, then the total number of combined cases is $$n_A\cdot n_B$$

In the same way, if $$m_A$$ and $$m_B$$ are the numbers of cases where $$A$$ and $$B$$ are true, respectively, then the number of cases where $$(A\text{ and }B)$$ is true is $$m_A\cdot m_B$$

$\mathbb{P}(A\text{ and } B)=\frac{m_A\cdot m_B}{n_A\cdot n_B}=\frac{m_A}{n_A}\cdot \frac{m_B}{n_B}$

## Interpretation

We could say that $\frac{m_A}{n_A}=\mathbb{P}(A)\qquad\frac{m_B}{n_B}=\mathbb{P}(B)$ but we have to be careful. The result of $$A$$ may affect $$m_B$$ and $$n_B$$. It is better to write $\frac{m_A}{n_A}=\mathbb{P}(A)\qquad\frac{m_B}{n_B}=\mathbb{P}(B|A)$

## Rewriting the Probability of $$A$$ and $$B$$

$\mathbb{P}(A\text{ and } B)=\frac{m_A}{n_A}\cdot \frac{m_B}{n_B}=\mathbb{P}(A)\cdot \mathbb{P}(B|A)$

To simplify, instead of $$\mathbb{P}(A\text{ and } B)$$ we write $$\mathbb{P}(A, B)$$

Thus, we write $\mathbb{P}(A,B)=\mathbb{P}(A)\cdot \mathbb{P}(B|A)$

“The probability that ($$A$$ and $$B$$) happens is the probability that $$A$$ happens times the probability that $$B$$ happens given that $$A$$ happens”
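To see why the conditional probability matters, here is a small sketch in which the first event changes the counts for the second one. It assumes a 52-card deck with 4 aces and two draws without replacement; the deck and all variable names are illustrative choices.

```python
from fractions import Fraction
from itertools import permutations

# Toy deck: 4 aces and 48 other cards (only "ace or not" matters here).
deck = ["ace"] * 4 + ["other"] * 48

# All ordered ways to draw two different cards, each equally likely.
draws = list(permutations(range(len(deck)), 2))

# A: the first card is an ace. B: the second card is an ace.
first_is_ace = [d for d in draws if deck[d[0]] == "ace"]
both_aces = [d for d in first_is_ace if deck[d[1]] == "ace"]

p_a = Fraction(len(first_is_ace), len(draws))              # P(A)
p_b_given_a = Fraction(len(both_aces), len(first_is_ace))  # P(B|A)
p_a_and_b = Fraction(len(both_aces), len(draws))           # P(A,B)

print(p_a, p_b_given_a, p_a_and_b)  # 1/13 1/17 1/221
```

Once the first ace is drawn, only 3 aces remain among 51 cards, so the second factor is 3/51 = 1/17 rather than 4/52.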

## Joint Probability

We know that $$(A\text{ and } B)$$ is always the same as $$(B\text{ and } A)$$

There are two ways to calculate the probability of $$A$$ and $$B$$ happening simultaneously

• Start with the prob. of $$A$$ and then of $$B$$ given that $$A$$ is true $\mathbb{P}(A,B)=\mathbb{P}(A)\cdot \mathbb{P}(B|A)$
• Start with the prob. of $$B$$ and then of $$A$$ given that $$B$$ is true $\mathbb{P}(A,B)=\mathbb{P}(B)\cdot \mathbb{P}(A|B)$
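Both orders can be compared numerically. The sketch below assumes a fair die, with $$A$$ = "the result is even" and $$B$$ = "the result is at least 5"; these events and the helper `prob` are illustrative, not taken from the slides.

```python
from fractions import Fraction

outcomes = set(range(1, 7))                # assumed: a fair six-sided die
A = {x for x in outcomes if x % 2 == 0}    # even result
B = {x for x in outcomes if x >= 5}        # result is at least 5

def prob(event, given=outcomes):
    """P(event | given), counting equally likely outcomes."""
    return Fraction(len(event & given), len(given))

via_a = prob(A) * prob(B, given=A)   # P(A) * P(B|A)
via_b = prob(B) * prob(A, given=B)   # P(B) * P(A|B)
print(via_a, via_b)  # 1/6 1/6
```

The individual factors differ (1/2 times 1/3 versus 1/3 times 1/2), but the product is the same in both orders.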

## Exercises

• Prob of getting heads twice when throwing coins
• Prob of getting 6 and 6 on two dice
• Prob of getting heads and a 6
• Prob of getting a green card

## Probability of $$A$$ or $$B$$

We know how to calculate $$\mathbb{P}(A\text{ and } B)$$ and $$\mathbb{P}(\text{not } A)$$

We also know De Morgan’s law, which swaps ANDs with ORs
$\text{not } (A \text{ or } B) = (\text{not } A) \text{ and } (\text{not } B)$

Therefore we can write

\begin{aligned} \mathbb{P}(A \text{ or } B) & = 1 - \mathbb{P}(\text{not }(A \text{ or } B))\\ & = 1-\mathbb{P}( (\text{not } A) \text{ and } (\text{not } B)) \end{aligned}
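De Morgan's law can be checked by listing all four truth-value combinations. This is only a minimal sketch, and the variable names are illustrative.

```python
from itertools import product

# Compare both sides of De Morgan's law for every combination of A and B.
checks = []
for a, b in product([False, True], repeat=2):
    left = not (a or b)
    right = (not a) and (not b)
    checks.append(left == right)
    print(a, b, left, right)

print(all(checks))  # True
```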

## Using the multiplication rule

\begin{aligned} \mathbb{P}(A \text{ or } B) & = 1-\mathbb{P}( (\text{not } A) \text{ and } (\text{not } B)) \\ & = 1-\mathbb{P}(\text{not } A)\cdot \mathbb{P}(\text{not } B|\text{not } A) \end{aligned}

Using the negation rule

\begin{aligned} \mathbb{P}(A \text{ or } B) & = 1-\mathbb{P}(\text{not } A)\cdot (1- \mathbb{P}(B|\text{not } A)) \\ & = 1-\mathbb{P}(\text{not } A) + \mathbb{P}(\text{not } A)\cdot \mathbb{P}(B|\text{not } A) \end{aligned}

## Using the multiplication rule again

\begin{aligned} \mathbb{P}(A \text{ or } B) & = 1 -\mathbb{P}(\text{not } A) + \mathbb{P}(\text{not } A,B) \\ \mathbb{P}(A \text{ or } B) & = 1 -(1-\mathbb{P}(A)) + \mathbb{P}(\text{not } A|B)\cdot\mathbb{P}(B) \\ \mathbb{P}(A \text{ or } B) & = \mathbb{P}(A) + (1-\mathbb{P}(A|B))\cdot\mathbb{P}(B) \\ \mathbb{P}(A \text{ or } B) & = \mathbb{P}(A) + \mathbb{P}(B)-\mathbb{P}(A|B)\cdot\mathbb{P}(B) \\ \mathbb{P}(A \text{ or } B) & = \mathbb{P}(A) + \mathbb{P}(B)-\mathbb{P}(A,B) \end{aligned}

You need to remember only the last line

The previous lines justify why the last one is always true

## Do not count twice

If $$A$$ and $$B$$ can happen at the same time, then $$\mathbb{P}(A) + \mathbb{P}(B)$$ counts the intersection twice

So we have to subtract the intersection $$\mathbb{P}(A,B)$$ once $\mathbb{P}(A \text{ or } B) = \mathbb{P}(A) + \mathbb{P}(B)-\mathbb{P}(A,B)$
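The double counting is easy to see on a standard 52-card deck, taking $$A$$ = "the card is a heart" and $$B$$ = "the card is a face card (J, Q or K)". The deck and the event choices are assumptions made for this sketch.

```python
from fractions import Fraction
from itertools import product

ranks = ["A"] + [str(n) for n in range(2, 11)] + ["J", "Q", "K"]
suits = ["hearts", "diamonds", "clubs", "spades"]
deck = list(product(ranks, suits))  # 52 (rank, suit) pairs

hearts = {card for card in deck if card[1] == "hearts"}
faces = {card for card in deck if card[0] in {"J", "Q", "K"}}

def prob(event):
    return Fraction(len(event), len(deck))

naive = prob(hearts) + prob(faces)        # counts J, Q, K of hearts twice
corrected = naive - prob(hearts & faces)  # subtract the intersection once
direct = prob(hearts | faces)             # count the union directly

print(naive, corrected, direct)  # 25/52 11/26 11/26
```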

## It gets complicated

If there are three compatible events, things get messy

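\begin{aligned} & \mathbb{P}(A \text{ or } B \text{ or } C) \\ & = \mathbb{P}(A) + \mathbb{P}(B \text{ or } C)-\mathbb{P}(A,(B \text{ or } C)) \\ & = \mathbb{P}(A) + \mathbb{P}(B) + \mathbb{P}(C)-\mathbb{P}(B,C) - \mathbb{P}((A,B) \text{ or } (A,C)) \\ & = \mathbb{P}(A) + \mathbb{P}(B) + \mathbb{P}(C)-\mathbb{P}(B,C) - (\mathbb{P}(A,B) + \mathbb{P}(A,C) - \mathbb{P}(A,B,C)) \\ & = \mathbb{P}(A) + \mathbb{P}(B) + \mathbb{P}(C)-\mathbb{P}(B,C) - \mathbb{P}(A,B) - \mathbb{P}(A,C) + \mathbb{P}(A,B,C) \end{aligned}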
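The three-event formula can be verified on a small example. The sketch assumes a number drawn uniformly from 1 to 30, with $$A$$, $$B$$ and $$C$$ meaning "divisible by 2", "divisible by 3" and "divisible by 5"; this setup is illustrative only.

```python
from fractions import Fraction

# Assumed sample space: a number drawn uniformly from 1..30.
omega = set(range(1, 31))
A = {x for x in omega if x % 2 == 0}
B = {x for x in omega if x % 3 == 0}
C = {x for x in omega if x % 5 == 0}

def prob(event):
    return Fraction(len(event), len(omega))

direct = prob(A | B | C)
formula = (prob(A) + prob(B) + prob(C)
           - prob(B & C) - prob(A & B) - prob(A & C)
           + prob(A & B & C))
print(direct, formula)  # 11/15 11/15
```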

It gets worse with more events

## If A and B are incompatible

If $$A$$ and $$B$$ cannot happen at the same time, then $$(A \text{ and } B)$$ is impossible, therefore $$\mathbb{P}(A,B)=0$$

In that case (and only in that case) $\mathbb{P}(A \text{ or } B) = \mathbb{P}(A) + \mathbb{P}(B)$

## Splitting a set into pieces

In particular we have $\mathbb{P}(A) = \mathbb{P}(A\text{ and } (B \text{ or } \text{not } B)) = \mathbb{P}(A,B) + \mathbb{P}(A, \text{not } B)$ because

• $$(A \text{ and } B)$$ is incompatible with $$(A \text{ and } \text{not } B)$$,
• $$(A \text{ and } (B \text{ or } \text{not } B))$$ is equal to $$A$$
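The split can be illustrated with two dice. In this sketch $$A$$ = "the sum is 7" and $$B$$ = "the first die is even"; both events and the function names are assumptions chosen for the example.

```python
from fractions import Fraction
from itertools import product

# All 36 equally likely results of rolling two fair dice.
rolls = list(product(range(1, 7), repeat=2))

def prob(event):
    return Fraction(sum(1 for r in rolls if event(r)), len(rolls))

def sum_is_seven(r):
    return r[0] + r[1] == 7

def first_is_even(r):
    return r[0] % 2 == 0

p_a = prob(sum_is_seven)
p_a_b = prob(lambda r: sum_is_seven(r) and first_is_even(r))
p_a_not_b = prob(lambda r: sum_is_seven(r) and not first_is_even(r))
print(p_a, p_a_b, p_a_not_b)  # 1/6 1/12 1/12
```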

## Splitting $$\Omega$$

If we partition $$\Omega$$ into $$n$$ subsets $$A_i$$, such that they cover all of $$\Omega$$ $\Omega=A_1 \cup A_2 \cup \ldots \cup A_n$ and each pair of events is mutually incompatible $A_i \cap A_j=\emptyset \text{ for } i \neq j$ then we have $\mathbb{P}(\Omega)=\mathbb{P}(A_1) + \mathbb{P}(A_2) + \ldots + \mathbb{P}(A_n)=1$

## There is an easier way

Using De Morgan’s rule

\begin{aligned} & \mathbb{P}(A \text{ or } B \text{ or } C) \\ & = 1 - \mathbb{P}((\text{not } A) \text{ and } (\text{not } B) \text{ and } (\text{not } C))\\ & = 1 - \mathbb{P}(\text{not } A)\cdot \mathbb{P}(\text{not } B | \text{not } A)\cdot \mathbb{P}(\text{not } C | \text{not } A, \text{not } B)\\ & = 1 - (1-\mathbb{P}(A))\cdot (1-\mathbb{P}(B | \text{not } A))\cdot (1-\mathbb{P}(C | \text{not } A, \text{not } B)) \end{aligned}

This is often easier to calculate
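As an illustration of the complement approach, the sketch below computes the probability of drawing at least one ace in three draws without replacement. Here $$A$$, $$B$$ and $$C$$ are taken to mean "the first, second, third card is an ace"; the 52-card deck with 4 aces is an assumption of the example.

```python
from fractions import Fraction
from itertools import combinations

# Toy deck: 4 aces and 48 other cards.
deck = ["ace"] * 4 + ["other"] * 48

# P(not A) * P(not B | not A) * P(not C | not A, not B)
p_no_ace = Fraction(48, 52) * Fraction(47, 51) * Fraction(46, 50)
p_at_least_one = 1 - p_no_ace

# Brute-force check over all unordered three-card hands.
hands = list(combinations(range(len(deck)), 3))
with_ace = sum(1 for h in hands if any(deck[i] == "ace" for i in h))
p_brute = Fraction(with_ace, len(hands))

print(p_at_least_one, p_brute)  # 1201/5525 1201/5525
```

Only three simple factors are needed, whereas the direct formula would require seven terms.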

## Example: Multiple Birthdays

Let’s say we have three people, with birthdays $$x_1, x_2$$ and $$x_3.$$

The probability that there are at least two people with the same birthday is $\mathbb{P}(x_2=x_1 \text{ or } x_3=x_2 \text{ or } x_3=x_1)$ which can be rewritten as $1-\mathbb{P}(x_2\neq x_1 \text{ and } x_3\neq x_2 \text{ and } x_3\neq x_1)$

## Now we only have "and" combinations

We want to calculate $1-\mathbb{P}(x_2\neq x_1 \text{ and } x_3\neq x_2 \text{ and } x_3\neq x_1)$

We can separate like this (only the first "and") $1-\mathbb{P}(x_2\neq x_1)\cdot \mathbb{P}(x_3\neq x_2 \text{ and } x_3\neq x_1|x_2\neq x_1)$

Assuming 365 equally likely birthdays, we have $1-\frac{364}{365}\cdot \frac{363}{365}$
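The three-person result can be evaluated exactly, and the same formula can be cross-checked by brute force on a smaller calendar. This is a sketch under the assumption of equally likely birthdays; the function names and the 7-day test calendar are hypothetical choices.

```python
from fractions import Fraction
from itertools import product

def p_shared_three(days=365):
    """1 - P(x2 != x1) * P(x3 != x2 and x3 != x1 | x2 != x1)."""
    return 1 - Fraction(days - 1, days) * Fraction(days - 2, days)

def p_shared_three_brute(days):
    """Count all triples of birthdays that contain at least one repeat."""
    triples = list(product(range(days), repeat=3))
    shared = sum(1 for t in triples if len(set(t)) < 3)
    return Fraction(shared, len(triples))

# Cross-check the formula on a small 7-day calendar (343 triples).
small_formula = p_shared_three(7)
small_brute = p_shared_three_brute(7)

exact = p_shared_three()
print(small_formula, small_brute)  # 19/49 19/49
print(exact, float(exact))         # 1093/133225, roughly 0.0082
```

Enumerating all triples for 365 days would mean about 48 million cases, which is why the check uses a small calendar instead.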

## Exercise

• What is the probability that, in a group of N people, at least two of them share the same birthday?

• How many people do we need to have at least 50% probability of at least two of them sharing the same birthday?