- we want to predict the future
- discuss what can happen in the future
- talk about what could have happened in the past
- reason about how we got to this present

Nature has rules. Universal and permanent rules

Whatever happens in the future is the result of applying the rules to the current state of the universe

\[\text{State}_{t+1} = F(\text{State}_t, \text{Parameters})\]

We just need to follow the logical consequences

If we launch a ball, and we know the angle and speed, then we can predict where it will fall
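The ball example can be sketched in a few lines of Python. This is only an illustration under strong assumptions that are not in the notes: flat ground, no air resistance, constant gravity, and a hypothetical launch at 10 m/s and 45°. The function `step` plays the role of the rule \(F\), and the state is the position and velocity of the ball.

```python
import math

G = 9.81  # gravitational acceleration in m/s^2 (assumed constant)

def step(state, dt):
    """One application of the rule F: advance the state by a small time dt."""
    x, y, vx, vy = state
    return (x + vx * dt, y + vy * dt, vx, vy - G * dt)

def landing_distance(speed, angle_deg, dt=1e-4):
    """Apply the rule again and again until the ball returns to the ground."""
    angle = math.radians(angle_deg)
    state = (0.0, 0.0, speed * math.cos(angle), speed * math.sin(angle))
    while True:
        new_state = step(state, dt)
        if new_state[1] < 0:  # the next step would go below ground level
            return state[0]
        state = new_state

def closed_form_distance(speed, angle_deg):
    """Textbook range formula for the same idealized situation."""
    return speed ** 2 * math.sin(2 * math.radians(angle_deg)) / G

simulated = landing_distance(10.0, 45.0)
exact = closed_form_distance(10.0, 45.0)
print(f"simulated: {simulated:.2f} m, formula: {exact:.2f} m")
```

Both numbers agree closely, because the state and the parameters are known exactly in this toy world.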

We can launch a rocket and land on the Moon

We can put a satellite in orbit to observe the Earth, find our position using GPS, and watch TV from other countries

We can build a plane that can fly and carry us to other countries

If the world is deterministic, and we know

- all the rules
- all the parameters with infinite precision
- the current state of the world with infinite precision

then we can predict everything that will happen

and everything that has happened before

We just need to use *logic*

We do not know all the rules

Among the rules that we know, some

- have complex solutions. They are hard to calculate
- depend on parameters that we do not know
- give very different results when parameters change a little bit
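The last point can be seen with the logistic map, a standard toy rule that is not part of these notes and is used here only as an illustration. The parameter values `3.9` and `3.9 + 1e-9` and the starting state `0.2` are arbitrary choices.

```python
def logistic_trajectory(r, x0=0.2, steps=200):
    """Apply the rule x -> r * x * (1 - x) repeatedly, keeping every state."""
    xs = [x0]
    for _ in range(steps):
        xs.append(r * xs[-1] * (1.0 - xs[-1]))
    return xs

# Same rule, same initial state, parameters differing by one part in a billion
a = logistic_trajectory(3.9)
b = logistic_trajectory(3.9 + 1e-9)

gaps = [abs(p - q) for p, q in zip(a, b)]
early_gap = gaps[5]      # difference after 5 steps
largest_gap = max(gaps)  # largest difference seen over the whole run

print(f"after 5 steps: {early_gap:.2e}, largest gap: {largest_gap:.2f}")
```

The two trajectories start almost identical and end up unrelated, even though the rule is fully deterministic.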

Since we have *imperfect knowledge*, we must deal with
*degrees of certainty*

How much we believe some predicate is true

We want to give a numeric value to the chances that our experiment is successful

We want to compare the chances of *success* versus
*failure*

We will call *experiment* any procedure that generates a result
which we do not know before doing it

This includes “natural experiments” and observations of nature

An **experiment** produces a single
**outcome**

We do not know the outcome until we perform the experiment

If we knew the outcome before doing the experiment, we would not be doing it

It is useful to give a name to the set of all possible outcomes

We will call it \(Ω\)

- Your temperature is 38.2°C
- Your average grade is 85%
- Rainfall is 2 mm in the last hour
- Plant height is 30 cm
- Lottery winner has number 12345678

Exercise: What is \(Ω\) in each case?

An **event** is a yes-no question that will be answered
by the experiment

Having fever is an event. Thermometer showing 38.2°C is an outcome.

- You have fever, or not
- You finish your Master’s degree, or not
- Rain falls, or not
- Plant grows
- You win the lottery
- Your experiment gives the expected outcome
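The difference between outcome and event can be written down as code: the outcome is a value, and the event is a yes-no function of that value. The sketch below assumes a fever threshold of 37.5°C, the value used later in these notes; the function name `has_fever` is only illustrative.

```python
FEVER_THRESHOLD_C = 37.5  # assumed threshold, in degrees Celsius

def has_fever(temperature_c):
    """The event 'you have fever': a yes-no question about the outcome."""
    return temperature_c > FEVER_THRESHOLD_C

outcome = 38.2             # what the thermometer shows (the outcome)
print(has_fever(outcome))  # the answer to the yes-no question (the event)
```

Many different outcomes give the same answer to the question, which is why an event is usually coarser than an outcome.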

An **event** will become either *true* or
*false* after an **experiment**

For example, a dice can show a 4 or not

We want to give a value to our rational belief that the event will become true after the experiment

The numeric value is called **Probability**

Most people are familiar with the naive idea of Probability

\[ℙ(A)=\frac{\text{Number of cases where }A\text{ is true}}{\text{Total number of cases}}\]
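As a minimal sketch, the counting formula can be coded directly. It assumes that the list of cases is given and that all cases count the same, which is exactly the delicate part. The helper name `naive_probability` and the event "the outcome is greater than 4" are illustrative choices.

```python
from fractions import Fraction

def naive_probability(event, cases):
    """Number of cases where the event is true over the total number of cases."""
    cases = list(cases)
    favourable = [c for c in cases if event(c)]
    return Fraction(len(favourable), len(cases))

dice_cases = range(1, 7)  # assumed cases: the six faces of a dice
p_high = naive_probability(lambda x: x > 4, dice_cases)
print(p_high)
```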

This is a useful first approach, but it is easy to get confused

It is not obvious which are the cases

For example, if you throw a dice, what is the probability of getting a 6?

We have to be careful: new information may change our confidence

For example, if we learn that the dice outcome is an even number, what is the probability of getting a 6?

What if we learn that the outcome is an odd number?

Probabilities

- reflect what we know
- represent our rational confidence in future events

They are *subjective*, because different subjects may have
different knowledge

But they are *not arbitrary*.

We must use all the available information, and follow all the rules

We will use capital letters to represent *events*. For
example

\(A\): The dice outcome is 6

\(B\): The dice outcome is even

The probability of \(A\), given that we know \(B\), is \[ℙ(A|B)\]

This is called **conditional probability**
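For the dice, a conditional probability can be computed by keeping only the cases that agree with what we know. This sketch assumes a fair six-sided dice, so that counting cases is legitimate; the function name `conditional` is illustrative.

```python
from fractions import Fraction

OMEGA = set(range(1, 7))              # all outcomes of the dice
A = {6}                               # A: the dice outcome is 6
B = {x for x in OMEGA if x % 2 == 0}  # B: the dice outcome is even
ODD = OMEGA - B                       # the dice outcome is odd

def conditional(event, given):
    """P(event | given), counting only the cases where 'given' is true."""
    return Fraction(len(event & given), len(given))

print(conditional(A, B))    # we know the outcome is even
print(conditional(A, ODD))  # we know the outcome is odd
```

Knowing that the outcome is even makes a 6 more plausible than before; knowing that it is odd makes it impossible.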

We always evaluate probabilities based on what we know

If the background knowledge is well known, and does not change, we sometimes write \[ℙ(A)\]

This is to simplify notation.

But do not forget that there is an implicit context.

The order is relevant. In general, \[ℙ(A|Z)≠ℙ(Z|A)\] There are two events, 𝐴 and 𝑍

- The one written after `|` is what we assume to be true
- The one written before `|` is what we are asking for

One we know, the other we do not
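A small worked case shows how different the two numbers can be. Assume a fair dice, with \(A\): the outcome is 6, and \(Z\): the outcome is even. Then \[ℙ(A|Z)=\frac 1 3 \qquad\text{while}\qquad ℙ(Z|A)=1\]

If we know that the outcome is even, 6 is one of three equally plausible cases. If we know that the outcome is 6, then it is certainly even.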

The set of all possible outcomes is often called Ω

An event 𝐴 can be seen as the subset of all outcomes that make the event true

For example, \[\text{Fever}=\{\text{Temp}>37.5°\text{C}\}\]

It is useful to think that the probability of an event is the area in the drawing

The total area of Ω is 1

Usually we do not know the shape of 𝐴

Our rational beliefs depend on our knowledge

If we represent our knowledge (or hypothesis) by 𝑍, then the probability of an event 𝐴 is written as \[ℙ(A|Z)\] We read this as “the probability of event 𝐴, given that we know 𝑍”

For example, “the probability that we get a 4, given that the dice is symmetrical”

Now outcomes are limited only to the 𝑍 region

We measure the area of the part of 𝐴 that lies inside 𝑍, relative to the area of 𝑍 instead of Ω. That ratio is \(ℙ(A|Z)\)

The shape of 𝑍 is often unknown
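When the shapes are known, the area picture can be checked numerically. The sketch below is a toy case with invented shapes: Ω is the unit square, 𝑍 is its left half, and 𝐴 is the triangle below the diagonal. Areas are approximated by counting points on a regular grid.

```python
N = 100  # grid resolution (assumed; larger values give better area estimates)

def centres(n):
    """Centres of n equal cells covering the interval from 0 to 1."""
    return [(i + 0.5) / n for i in range(n)]

points = [(x, y) for x in centres(N) for y in centres(N)]  # covers Omega

def in_a(p):
    return p[1] < p[0]  # hypothetical event A: below the diagonal

def in_z(p):
    return p[0] < 0.5   # hypothetical knowledge Z: left half of the square

def area(region):
    """Fraction of grid points inside the region; the area of Omega is 1."""
    return sum(1 for p in points if region(p)) / len(points)

p_a = area(in_a)
p_z = area(in_z)
p_a_and_z = area(lambda p: in_a(p) and in_z(p))
p_a_given_z = p_a_and_z / p_z  # area of A inside Z, relative to the area of Z

print(f"P(A) ~ {p_a:.3f}, P(A|Z) ~ {p_a_given_z:.3f}")
```

In this toy case, restricting attention to 𝑍 roughly halves the probability of 𝐴, because most of the triangle lies in the right half of the square.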

It has been proven that probabilities must obey the following rules

A probability is a number between 0 and 1 inclusive \[ℙ(A) ≥ 0\text{ and } ℙ(A)≤1\]

The probability of a sure event is 1 \[ℙ(\text{True}) = 1\]

The probability of an impossible event is 0 \[ℙ(\text{False}) = 0\]

We are interested in non-trivial events, that are usually combinations of smaller events

For example, we may ask “what is the probability that, in a group of 𝑛 people, at least two people have the same birthday?”

Fortunately, any complex event can be decomposed into simpler events,
combined with **and**, **or** and
**not** connectors

Exercise: decompose the *birthday* event into simpler ones
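The birthday question can also be explored by repeating the experiment many times on a computer. This is a rough sketch, not a derivation. It assumes 365 equally likely birthdays, independent people, and no leap years; the function names and the seed are arbitrary.

```python
import random

def shared_birthday(n, rng):
    """One experiment: draw n birthdays and ask whether any two coincide."""
    birthdays = [rng.randrange(365) for _ in range(n)]
    return len(set(birthdays)) < n

def estimate(n, trials=20000, seed=0):
    """Fraction of simulated groups of n people with a shared birthday."""
    rng = random.Random(seed)
    hits = sum(shared_birthday(n, rng) for _ in range(trials))
    return hits / trials

print(estimate(23))
```

For a group of 23 people the estimate should come out close to one half, which many people find surprising.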

If the event 𝐴 becomes more and more plausible, then the opposite
event **not** 𝐴 becomes less and less plausible

It can be shown that we always have \[ℙ(\text{not } A) = 1-ℙ(A)\]

Therefore \[ℙ(A) + ℙ(\text{not } A) = 1\]
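The rule can be checked exhaustively in a small case. The sketch below assumes a fair dice, so probabilities are obtained by counting, and it goes through every possible event, that is, every subset of Ω.

```python
from fractions import Fraction
from itertools import combinations

OMEGA = frozenset(range(1, 7))  # outcomes of a dice, assumed equally likely

def prob(event):
    """Probability of an event by counting cases."""
    return Fraction(len(event), len(OMEGA))

def all_events(omega):
    """Every subset of omega, from the impossible event to the sure event."""
    items = sorted(omega)
    for size in range(len(items) + 1):
        for subset in combinations(items, size):
            yield frozenset(subset)

checked = 0
for a in all_events(OMEGA):
    not_a = OMEGA - a  # the opposite event
    assert prob(not_a) == 1 - prob(a)
    checked += 1

print(checked)  # number of events checked
```

This is a check in one example, not a proof of the general rule.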

Tossing a coin is an experiment where \(Ω=\{"H","T"\}\)

Let \(X\) be the outcome, and let’s take \(A\) to be *the
outcome is “H”*, so that **not** \(A\) is *the outcome is “T”*

\[ℙ(X="H") + ℙ(X="T") = 1\]

Without more information (or hypothesis) we cannot know more

If we do not have any reason to believe that one side of the coin has more chance than the other, then we assume that both sides have equal chances

If all alternatives are symmetric, then the probabilities are equal \[ℙ(X="H")= ℙ(X="T")\] Therefore \[ℙ(X="H")= ℙ(X="T")=\frac 1 2\]

Let’s consider two different events \(A\) and \(B\)

(for instance, if \(X\) is the result of a dice, “\(X>3\)” and “\(X\) is
even”)

\[\underbrace{B}_{m} = B\text{ and }(A\text{ or not }A) = \underbrace{(B\text{ and }A)}_{m_1}\text{ or }\underbrace{(B\text{ and not }A)}_{m_2}\] Here \(m\), \(m_1\) and \(m_2\) are the numbers of cases where each event is true.

We see that \(m=m_1+m_2\) because, for an outcome where \(B\) is true, we have either “\((B\text{ and }A)\) is true” or “\((B\text{ and not }A)\) is true”, but never both
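The decomposition can be checked on the dice example, with \(A\): “X>3” and \(B\): “X is even”. In this sketch events are Python sets of outcomes, **and** is set intersection, and **not** is the complement in Ω.

```python
OMEGA = set(range(1, 7))              # outcomes of the dice
A = {x for x in OMEGA if x > 3}       # A: X > 3
B = {x for x in OMEGA if x % 2 == 0}  # B: X is even
NOT_A = OMEGA - A                     # not A

b_and_a = B & A          # outcomes where B and A are both true
b_and_not_a = B & NOT_A  # outcomes where B is true and A is false

m, m1, m2 = len(B), len(b_and_a), len(b_and_not_a)

print(sorted(b_and_a), sorted(b_and_not_a))
print(m, m1, m2)
```

The two pieces have no outcome in common and together they rebuild \(B\), which is why the counts add up.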

Exercise: show that for a fair dice the probability of each side is 1/6.