The set of all possible outcomes is often called Ω

An event \(A\) can be seen as the set of all outcomes that make the event true

For example, \[\textrm{Fever}=\{\textrm{Temp}>37.5^\circ\textrm{C}\}\]

An **event** will become either *true* or *false* after an **experiment**

For example, a die either shows a 4 or it does not

We want to give a value to our rational belief that the event will become true after the experiment

The numeric value is called **Probability**

It is useful to think that the probability of an event is the area in the drawing

The total area of Ω is 1

Usually we do not know the shape of \(A\)

Our rational beliefs depend on our knowledge

If we represent our knowledge (or hypothesis) by \(Z\), then the probability of an event \(A\) is written as \[\mathbb{P}(A|Z)\] We read "the probability of event \(A\), given that we know \(Z\)"

For example, "the probability that we get a 4, given that the die is symmetrical"

The order is relevant \[\mathbb{P}(A|Z)\neq\mathbb{P}(Z|A)\] There are two events, \(A\) and \(Z\)

The one written after `|` is what we assume to be true

The one written before `|` is what we are asking for

One we know, the other we do not

Now outcomes are limited only to the \(Z\) region

\(\mathbb{P}(A|Z)\) is the area of the part of \(A\) that lies inside \(Z\), measured with respect to the area of \(Z\) instead of Ω

The shape of \(Z\) is often unknown
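As a small sketch of this idea, the Python snippet below replaces areas by counts of equally likely outcomes. It assumes one roll of a symmetrical six-sided die. The helper `prob` and the events `four` and `even` are names invented here for illustration.

```python
from fractions import Fraction

# Assumption: one roll of a symmetrical six-sided die, all outcomes equally likely.
OMEGA = frozenset({1, 2, 3, 4, 5, 6})

def prob(event, given=OMEGA):
    """P(event | given): the part of `given` that is also in `event`,
    measured relative to the size of `given` instead of OMEGA."""
    event, given = set(event), set(given)
    return Fraction(len(event & given), len(given))

four = {4}          # event "we get a 4"
even = {2, 4, 6}    # knowledge "the result is even"

print(prob(four))              # 1/6, measured against all of OMEGA
print(prob(four, given=even))  # 1/3, measured against the region `even`
print(prob(even, given=four))  # 1, so the order matters
```

Counting only works here because every outcome is assumed equally likely. With an unknown shape we cannot count, but the ratio idea stays the same.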

If, given our knowledge \(Z\), the event \(B\) is more plausible than the event \(A\), then \[\mathbb{P}(A|Z)\leq\mathbb{P}(B|Z)\]

For example, the probability that we get either 4, 5 or 6 is greater than the probability that we get a 4, given that the die is symmetrical \[\mathbb{P}(\{4\}|Z)\leq\mathbb{P}(\{4,5,6\}|Z)\]

On the other hand, if we get new information, the probabilities may change

The same event \(A\) may be more plausible under a new hypothesis \(Y\) than under the initial hypothesis \(Z\)

Then \[\mathbb{P}(A|Z)\leq\mathbb{P}(A|Y)\]

It has been proven that probabilities, as measures of rational belief, must follow the rules below

A probability is a number between 0 and 1 inclusive \[\mathbb{P}(A) \geq 0\textrm{ and }\mathbb{P}(A)\leq 1\]

The probability of a sure event is 1 \[\mathbb{P}(\textrm{True}) = 1\]

The probability of an impossible event is 0 \[\mathbb{P}(\textrm{False}) = 0\]

We are interested in non-trivial events, that are usually combinations of smaller events

For example, we may ask "what is the probability that, in a group of \(N\) people, at least two persons have the same birthday"

Fortunately, any complex event can be decomposed into simpler events, combined with **and**, **or** and **not** connectors

Exercise: decompose the *birthday* event into simpler ones

If the event \(A\) becomes more and more plausible, then the opposite event **not** \(A\) becomes less and less plausible

It can be shown that we always have \[\mathbb{P}(\textrm{not }A) = 1-\mathbb{P}(A)\]

The probability of \(A\) *and* \(B\) happening simultaneously must be connected to the probability of each one

It can be shown that there are only two ways to calculate it

- Start with the prob. of \(A\) and then of \(B\) given that \(A\) is true \[\mathbb{P}(A,B)=\mathbb{P}(A)\cdot \mathbb{P}(B|A)\]
- Start with the prob. of \(B\) and then of \(A\) given that \(B\) is true \[\mathbb{P}(A,B)=\mathbb{P}(B)\cdot \mathbb{P}(A|B)\]

It can be proven that the only way to combine \(\mathbb{P}(A)\) and \(\mathbb{P}(B|A)\) to get \(\mathbb{P}(A,B)\) is to multiply them.

Both are true, since \(\mathbb{P}(A,B)=\mathbb{P}(B,A).\) The order that we write them is irrelevant.
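The two orders can be checked numerically. The sketch below assumes a symmetrical six-sided die and computes probabilities by counting outcomes. The helper `prob` and the events `A` and `B` are chosen here only for illustration.

```python
from fractions import Fraction

# Assumption: a symmetrical six-sided die, all outcomes equally likely.
OMEGA = frozenset(range(1, 7))

def prob(event, given=OMEGA):
    """P(event | given), computed by counting equally likely outcomes."""
    event, given = set(event), set(given)
    return Fraction(len(event & given), len(given))

A = {2, 4, 6}   # "the result is even"
B = {4, 5, 6}   # "the result is greater than 3"

joint = prob(A & B)                 # P(A, B) counted directly
via_a = prob(A) * prob(B, given=A)  # P(A) * P(B|A)
via_b = prob(B) * prob(A, given=B)  # P(B) * P(A|B)
print(joint, via_a, via_b)          # 1/3 1/3 1/3
```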

We know how to calculate \(\mathbb{P}(A\textrm{ and }B)\) and \(\mathbb{P}(\textrm{not }A)\)

We also know De Morgan's law, which swaps ANDs with ORs

\[\textrm{not }(A \textrm{ or }B) = (\textrm{not }A) \textrm{ and }(\textrm{not }B)\]

Therefore we can write

\[ \begin{aligned} \mathbb{P}(A \textrm{ or }B) & = 1 - \mathbb{P}(\textrm{not }(A \textrm{ or }B))\\ & = 1-\mathbb{P}( (\textrm{not }A) \textrm{ and }(\textrm{not }B)) \end{aligned} \]

\[ \begin{aligned} \mathbb{P}(A \textrm{ or }B) & = 1-\mathbb{P}( (\textrm{not }A) \textrm{ and }(\textrm{not }B)) \\ & = 1-\mathbb{P}(\textrm{not }A)\cdot \mathbb{P}(\textrm{not }B|\textrm{not }A) \end{aligned} \]

Using the negation rule \[ \begin{aligned} \mathbb{P}(A \textrm{ or }B) & = 1-\mathbb{P}(\textrm{not }A)\cdot (1- \mathbb{P}(B|\textrm{not }A)) \\ & = 1-\mathbb{P}(\textrm{not }A) + \mathbb{P}(\textrm{not }A)\cdot \mathbb{P}(B|\textrm{not }A) \end{aligned} \]

\[ \begin{aligned} \mathbb{P}(A \textrm{ or }B) & = 1 -\mathbb{P}(\textrm{not }A) + \mathbb{P}(\textrm{not }A,B) \\ \mathbb{P}(A \textrm{ or }B) & = 1 -(1-\mathbb{P}(A)) + \mathbb{P}(\textrm{not }A|B)\mathbb{P}(B) \\ \mathbb{P}(A \textrm{ or }B) & = \mathbb{P}(A) + (1-\mathbb{P}(A|B))\mathbb{P}(B) \\ \mathbb{P}(A \textrm{ or }B) & = \mathbb{P}(A) + \mathbb{P}(B)-\mathbb{P}(A|B)\mathbb{P}(B) \\ \mathbb{P}(A \textrm{ or }B) & = \mathbb{P}(A) + \mathbb{P}(B)-\mathbb{P}(A,B) \end{aligned} \] You need to remember only the last line

The previous lines justify *why* the last one is always true

If \(A\) and \(B\) can happen at the same time, then \(\mathbb{P}(A) + \mathbb{P}(B)\) counts the intersection twice

So we have to take out the intersection \(\mathbb{P}(A,B)\) \[\mathbb{P}(A \textrm{ or }B) = \mathbb{P}(A) + \mathbb{P}(B)-\mathbb{P}(A,B)\]
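A quick numerical check of this rule, again as a sketch that assumes a symmetrical six-sided die. The names `prob`, `A` and `B` are illustrative. The same number is obtained by counting the union directly, by the sum rule, and by De Morgan's law.

```python
from fractions import Fraction

# Assumption: a symmetrical six-sided die, all outcomes equally likely.
OMEGA = frozenset(range(1, 7))

def prob(event, given=OMEGA):
    """P(event | given), computed by counting equally likely outcomes."""
    event, given = set(event), set(given)
    return Fraction(len(event & given), len(given))

A = {2, 4, 6}   # "the result is even"
B = {4, 5, 6}   # "the result is greater than 3"

direct = prob(A | B)                         # count the union directly
sum_rule = prob(A) + prob(B) - prob(A & B)   # remove the double-counted overlap
not_a, not_b = OMEGA - A, OMEGA - B
de_morgan = 1 - prob(not_a & not_b)          # 1 - P(not A and not B)

print(direct, sum_rule, de_morgan)           # 2/3 2/3 2/3
```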

If there are three compatible events, things get messy

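\[\begin{aligned} & \mathbb{P}(A \textrm{ or }B \textrm{ or }C) \\ & = \mathbb{P}(A) + \mathbb{P}(B \textrm{ or }C)-\mathbb{P}(A,(B \textrm{ or }C)) \\ & = \mathbb{P}(A) + \mathbb{P}(B) + \mathbb{P}(C)-\mathbb{P}(B,C) - \mathbb{P}(A,B \textrm{ or }A,C) \\ & = \mathbb{P}(A) + \mathbb{P}(B) + \mathbb{P}(C)-\mathbb{P}(B,C) - (\mathbb{P}(A,B) + \mathbb{P}(A,C) - \mathbb{P}(A,B,C)) \\ & = \mathbb{P}(A) + \mathbb{P}(B) + \mathbb{P}(C)-\mathbb{P}(B,C) - \mathbb{P}(A,B) - \mathbb{P}(A,C) + \mathbb{P}(A,B,C) \end{aligned} \]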

It gets worse with more events

Using De Morgan's rule

\[\begin{aligned} & \mathbb{P}(A \textrm{ or }B \textrm{ or }C) \\ & = 1 - \mathbb{P}((\textrm{not }A) \textrm{ and }(\textrm{not }B) \textrm{ and }(\textrm{not }C))\\ & = 1 - \mathbb{P}(\textrm{not }A)\cdot \mathbb{P}(\textrm{not }B | \textrm{not }A)\cdot \mathbb{P}(\textrm{not }C | \textrm{not }A, \textrm{not }B)\\ & = 1 - (1-\mathbb{P}(A))\cdot (1-\mathbb{P}(B | \textrm{not }A))\cdot (1-\mathbb{P}(C | \textrm{not }A, \textrm{not }B)) \end{aligned} \]

This is often easier to calculate
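To see that the long expansion and the De Morgan chain agree, here is a sketch on a symmetrical six-sided die (an assumption), with three events `A`, `B` and `C` picked arbitrarily for illustration.

```python
from fractions import Fraction

# Assumption: a symmetrical six-sided die, all outcomes equally likely.
OMEGA = frozenset(range(1, 7))

def prob(event, given=OMEGA):
    """P(event | given), computed by counting equally likely outcomes."""
    event, given = set(event), set(given)
    return Fraction(len(event & given), len(given))

A = {2, 4, 6}   # "the result is even"
B = {4, 5, 6}   # "the result is greater than 3"
C = {1, 2}      # "the result is less than 3"

direct = prob(A | B | C)

# Expansion with seven terms.
expansion = (prob(A) + prob(B) + prob(C)
             - prob(B & C) - prob(A & B) - prob(A & C)
             + prob(A & B & C))

# De Morgan chain: 1 - P(not A) * P(not B | not A) * P(not C | not A, not B).
not_a, not_b, not_c = OMEGA - A, OMEGA - B, OMEGA - C
chain = 1 - (prob(not_a)
             * prob(not_b, given=not_a)
             * prob(not_c, given=not_a & not_b))

print(direct, expansion, chain)   # 5/6 5/6 5/6
```

The chain needs one factor per event, while the number of terms in the expansion grows quickly as events are added.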

Let's say we have three people, with birthdays \(x_1, x_2\) and \(x_3.\)

The probability that there are at least two people with the same birthday is \[\mathbb{P}(x_2=x_1 \textrm{ or }x_3=x_2 \textrm{ or }x_3=x_1)\] which can be rewritten as \[1-\mathbb{P}(x_2\neq x_1 \textrm{ and }x_3\neq x_2 \textrm{ and }x_3\neq x_1)\]

We want to calculate \[1-\mathbb{P}(x_2\neq x_1 \textrm{ and }x_3\neq x_2 \textrm{ and }x_3\neq x_1)\] We can separate like this (only the first **and**) \[1-\mathbb{P}(x_2\neq x_1)\cdot \mathbb{P}(x_3\neq x_2 \textrm{ and }x_3\neq x_1|x_2\neq x_1)\] Assuming 365 possible birthdays, we have \[1-\frac{364}{365}\cdot \frac{363}{365}\]
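The three-person result can be evaluated and sanity-checked with a short sketch. It assumes that every day is equally likely and that birthdays are independent. The function names are made up for this example. Listing all triples for 365 days would be slow, so the brute-force check uses a hypothetical 7-day calendar.

```python
from fractions import Fraction
from itertools import product

def p_shared_three(days=365):
    """1 - P(x2 != x1) * P(x3 != x2 and x3 != x1 | x2 != x1)."""
    return 1 - Fraction(days - 1, days) * Fraction(days - 2, days)

def p_shared_three_by_counting(days):
    """Brute force: count the triples in which at least two birthdays coincide."""
    triples = list(product(range(days), repeat=3))
    shared = sum(1 for x1, x2, x3 in triples
                 if x2 == x1 or x3 == x2 or x3 == x1)
    return Fraction(shared, len(triples))

print(p_shared_three())          # 1093/133225
print(float(p_shared_three()))   # roughly 0.0082
print(p_shared_three(7) == p_shared_three_by_counting(7))   # True
```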

What is the probability that, in a group of N people, at least two of them share the same birthday?

How many people do we need to have at least 50% probability of at least two of them sharing the same birthday?
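One possible way to check your answers is the sketch below. It extends the chain of "not" factors from three people to \(N\) people, assuming 365 equally likely and independent birthdays. The function names `p_shared_birthday` and `smallest_group` are invented here.

```python
from fractions import Fraction

def p_shared_birthday(n, days=365):
    """P(at least two of n people share a birthday) = 1 - P(all different)."""
    p_all_different = Fraction(1)
    for k in range(n):
        # Person k+1 must avoid the k birthdays that are already taken.
        p_all_different *= Fraction(days - k, days)
    return 1 - p_all_different

def smallest_group(threshold=Fraction(1, 2), days=365):
    """Smallest n whose probability of a shared birthday reaches `threshold`."""
    n = 1
    while p_shared_birthday(n, days) < threshold:
        n += 1
    return n

print(smallest_group())               # 23
print(float(p_shared_birthday(23)))   # roughly 0.507
```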

If \(A\) and \(B\) cannot happen at the same time, then \((A \textrm{ and }B)\) is impossible, therefore \(\mathbb{P}(A,B)=0\)

In that case (and only in that case) \[\mathbb{P}(A \textrm{ or }B) = \mathbb{P}(A) + \mathbb{P}(B)\]

In particular we have \[\mathbb{P}(A) = \mathbb{P}(A\textrm{ and }(B \textrm{ or }\textrm{not }B)) = \mathbb{P}(A,B) + \mathbb{P}(A, \textrm{not }B)\] because

- \((A \textrm{ and }B)\) is incompatible with \((A \textrm{ and }\textrm{not }B)\),
- \((A \textrm{ and }(B \textrm{ or }\textrm{not }B))\) is equal to \(A\)
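The split of \(A\) into two incompatible pieces can be verified on a small example. This sketch assumes a symmetrical six-sided die. The names `prob`, `A` and `B` are illustrative.

```python
from fractions import Fraction

# Assumption: a symmetrical six-sided die, all outcomes equally likely.
OMEGA = frozenset(range(1, 7))

def prob(event):
    """P(event), computed by counting equally likely outcomes."""
    return Fraction(len(set(event) & OMEGA), len(OMEGA))

A = {2, 4, 6}     # "the result is even"
B = {4, 5, 6}     # "the result is greater than 3"
not_b = OMEGA - B

piece_1 = A & B       # A and B
piece_2 = A & not_b   # A and not B

print(piece_1 & piece_2)                       # set(), the pieces never overlap
print(prob(piece_1), prob(piece_2), prob(A))   # 1/3 1/6 1/2
```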

If we partition Ω into \(n\) subsets \(A_i\), such that they cover all Ω \[\Omega=A_1 \cup A_2 \cup \ldots \cup A_n\] and every pair of events is mutually incompatible \[A_i \cap A_j=\emptyset \textrm{ for } i\neq j\] then we have \[\mathbb{P}(\Omega)=\mathbb{P}(A_1) + \mathbb{P}(A_2) + \ldots + \mathbb{P}(A_n)=1\]

One kind of event is the set that contains a single outcome

If \(a_i \in \Omega\) is an outcome, then \(A_i=\{a_i\}\) is an event

"The experiment outcome is exactly \(a_i\)"

It is easy to see that these events are mutually incompatible and cover all Ω

Thus, \[\mathbb{P}(\{a_1\}) + \mathbb{P}(\{a_2\}) + \ldots + \mathbb{P}(\{a_n\})=1\]
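As an illustration, the sketch below describes a hypothetical loaded die by the probability of each single-outcome event. The numbers are made up. Because single outcomes are mutually incompatible, the probability of any larger event is the sum of the outcomes it contains.

```python
from fractions import Fraction

# Hypothetical loaded die: probability of each single-outcome event {a_i}.
# These values are invented for illustration only.
p_outcome = {
    1: Fraction(1, 10),
    2: Fraction(1, 10),
    3: Fraction(1, 10),
    4: Fraction(1, 10),
    5: Fraction(1, 10),
    6: Fraction(1, 2),
}

# The single outcomes partition the whole sample space, so they must add up to 1.
total = sum(p_outcome.values())
print(total)   # 1

def prob(event):
    """Add up the incompatible single-outcome events that make `event` true."""
    return sum((p_outcome[a] for a in event), Fraction(0))

print(prob({4, 5, 6}))   # 7/10
```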