May 11, 2018

Wishlist for extended logic

according to Jaynes

  1. Plausibility should be a real number
  2. Qualitative Correspondence with common sense
  3. Consistency

Plausibility of \((A\wedge B\vert Z)\)

There are two ways to see if the compound predicate \(A\) AND \(B\) is true, given \(Z\)

  • First we decide if \((A\vert Z)\) is true, then we see if \((B\vert A\wedge Z)\) is true
  • First we decide if \((B\vert Z)\) is true, then we see if \((A\vert B\wedge Z)\) is true

Both ways must be equivalent since \((A\wedge B)\Leftrightarrow (B\wedge A)\)

There must be a function for it

The plausibility \((A\wedge B\vert Z)\) depends on the plausibilities

  • \((A\vert Z)\) and \((B\vert A\wedge Z)\), or
  • \((B\vert Z)\) and \((A\vert B\wedge Z)\)

Thus, there must be a function \(F:\mathbb R^2\to\mathbb R\) such that

  • \((A\wedge B\vert Z) = F\left((A\vert Z), (B\vert A\wedge Z)\right)\)
  • \((A\wedge B\vert Z) = F\left((B\vert Z), (A\vert B\wedge Z)\right)\)

They should be the same since \((A\wedge B)\Leftrightarrow (B\wedge A)\)

If \(F(.,.)\) exists, then \(w(.)\) exists

Jaynes proves that, under those conditions, there must be at least one function \(w:\mathbb R\to\mathbb R\) such that

  • \(w(A\wedge B\vert Z) = w(A\vert Z)\cdot w(B\vert A\wedge Z)\)
  • \(w(A\wedge B\vert Z) = w(B\vert Z)\cdot w(A\vert B\wedge Z)\)

This is called the multiplication rule, also known as the product rule

There may be several alternative \(w(.)\)

Notice that the theorem only proves the existence of \(w(.)\), but it does not tell us how to find it

Moreover, it does not even say if \(w(.)\) is unique

In fact, it is not unique. There are many possible \(w(.)\)

All of them must follow the multiplication rule
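A quick numerical check makes the non-uniqueness concrete. The sketch below assumes made-up values for \(w(A\vert Z)\) and \(w(B\vert A\wedge Z)\) that satisfy the multiplication rule, and tests whether their powers \(w^k\) satisfy it too. The names `rescale` and `rule_holds` are hypothetical helpers, not part of Jaynes' text.

```python
import math

# Hypothetical plausibility values, chosen only for illustration
w_A = 0.5                  # w(A|Z)
w_B_given_A = 0.4          # w(B|A∧Z)
w_AB = w_A * w_B_given_A   # w(A∧B|Z), built from the multiplication rule

def rescale(w, k):
    """Return w**k, a candidate alternative plausibility scale."""
    return w ** k

def rule_holds(k):
    """Check the multiplication rule after rescaling all three values."""
    left = rescale(w_AB, k)
    right = rescale(w_A, k) * rescale(w_B_given_A, k)
    return math.isclose(left, right)

# One boolean per exponent k
print([rule_holds(k) for k in (0.5, 1, 2, 3)])
```

Any positive power of a valid \(w(.)\) passes the same check, which is why the multiplication rule alone cannot single out one scale.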

If \(A\) is true, given \(Z\), then …

If \(B\) and \(Z\) are not contradictory, and \(Z\Rightarrow A\), then

  • Knowing \(B\) cannot change anything about \(A\), so \((A\vert Z)=(A\vert B\wedge Z)\)
  • \(A\) doesn’t give any new info about \(B\), so \((B\vert Z)=(A\wedge B\vert Z)\)

We will replace these values in the multiplication formula:

\[w(A\wedge B\vert Z) = w(B\vert Z)\cdot w(A\vert B\wedge Z)\]

After we replace them, we get

\[w(B\vert Z)= w(B\vert Z)\cdot w(A\vert Z) \quad\forall B\]

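This holds for every \(B\), in particular for those where \(w(B\vert Z)\) is neither zero nor infinite, so we can cancel \(w(B\vert Z)\)

Therefore, if \(A\vert Z\) is true, then \(w(A\vert Z)=1\)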

If \(A\) is false, given \(Z\), then …

If \(B\) and \(Z\) are not contradictory, and \(Z\Rightarrow\neg A\), then

  • \((A\vert Z)=(A\vert B\wedge Z)\) since \(A\) is still false
  • \((A\wedge B\vert Z)=(A\vert Z)\) since \(A\wedge B\) is also false

Using these facts in the multiplication formula

\[w(A\wedge B\vert Z) = w(B\vert Z)\cdot w(A\vert B\wedge Z)\]

we get

\[w(A\vert Z)= w(B\vert Z)\cdot w(A\vert Z)\quad\forall B\]

Therefore, if \(A\vert Z\) is false, then \(w(A\vert Z)=0\) or \(w(A\vert Z)=\infty\)

By convention we choose \(w(A\vert Z)=0\), so that \(0\leq w(.)\leq 1\)

Plausibility of \((\neg A\vert Z)\)

Naturally, the plausibility of \((\neg A\vert Z)\) depends on the plausibility of \((A\vert Z)\). So there must be a function \(S:\mathbb R\to\mathbb R\) such that

\[w(\neg A\vert Z)=S(w(A\vert Z))\]

In particular \(S(0)=1\) and \(S(1)=0\).

Since \(\neg (\neg A)=A\) we must have \(w(A\vert Z)=S(S(w(A\vert Z)))\).

A family of solutions

Jaynes proves that the only possible solutions are of the form

\[S(x)=(1-x^m)^{1/m}\]

for any \(m>0\). Therefore

\[w(\neg A\vert Z)=S(w(A\vert Z))=(1-w(A\vert Z)^m)^{1/m}\]

and thus

\[w(A\vert Z)^m+ w(\neg A\vert Z)^m=1\]
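The properties required of \(S\) can be checked numerically for a few values of \(m\). This is only a sanity check under floating-point arithmetic, not a proof. The helper `is_consistent` and the sample points are assumptions made for this example.

```python
import math

def S(x, m):
    """Negation function S(x) = (1 - x**m)**(1/m), for 0 <= x <= 1 and m > 0."""
    return (1.0 - x ** m) ** (1.0 / m)

def is_consistent(x, m):
    """True if S(S(x)) returns x and x**m + S(x)**m equals 1, up to rounding."""
    back = math.isclose(S(S(x, m), m), x, rel_tol=1e-9)
    total = math.isclose(x ** m + S(x, m) ** m, 1.0, rel_tol=1e-9)
    return back and total

for m in (0.5, 1.0, 2.0, 3.0):
    ok = all(is_consistent(x, m) for x in (0.1, 0.3, 0.7, 0.9))
    # Columns: m, S(0), S(1), and whether both properties held at the sample points
    print(m, S(0.0, m), S(1.0, m), ok)
```

With \(m=1\) this is the familiar \(S(x)=1-x\). Other values of \(m\) give different but equally consistent scales.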

Rewriting the product rule

The original product rule is

\[w(A\wedge B\vert Z) = w(A\vert Z)\cdot w(B\vert A\wedge Z)\]

If that is true, then we can also write

\[w(A\wedge B\vert Z)^m = w(A\vert Z)^m\cdot w(B\vert A\wedge Z)^m\]

So now all the rules depend only on \(w(.)^m\)

Choosing a better name for \(w(.)^m\)

Instead of writing \(w(A\vert Z)^m\) we will write \(\Pr(A\vert Z)\)

We call it the probability of \(A\) given \(Z\). Its rules are:

  • \(\Pr(A\vert Z)\) grows when plausibility grows
  • \(\Pr(A\wedge B\vert Z) = \Pr(A\vert Z)\cdot \Pr(B\vert A\wedge Z)\)
  • \(\Pr(A\vert Z)+ \Pr(\neg A\vert Z)=1\)
  • If \(Z\Rightarrow A\) then \(\Pr(A\vert Z)=1\)

Notice that most books write \(\Pr(A)\) instead of \(\Pr(A\vert Z)\). We prefer to make the context \(Z\) explicit.

Some consequences

If \(Z\Rightarrow A\) then \(\Pr(A\vert Z)=1\), and since \(\Pr(A\vert Z)+\Pr(\neg A\vert Z)=1\) we get \(\Pr(\neg A\vert Z)=0\)

If \(\Pr(B\vert A\wedge Z)=\Pr(B\vert Z)\) then we say that \(B\) is independent of \(A\) given \(Z\)

In that case \(\Pr(A\wedge B\vert Z) = \Pr(A\vert Z)\cdot \Pr(B\vert Z)\)

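The multiplication rule also gives \(\Pr(A\wedge B\vert Z) = \Pr(B\vert Z)\cdot \Pr(A\vert B\wedge Z)\), so \(\Pr(A\vert B\wedge Z)=\Pr(A\vert Z)\) whenever \(\Pr(B\vert Z)\neq 0\)

Therefore we also have that \(A\) is independent of \(B\) given \(Z\)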
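A small counting example illustrates the symmetry of independence. The context \(Z\) assumed here is "two fair six-sided dice are rolled", so all 36 outcomes are equally likely. The events `A` and `B` and the helper `pr` are invented for this sketch.

```python
from fractions import Fraction
from itertools import product

# Assumed context Z: two fair six-sided dice, 36 equally likely outcomes
outcomes = list(product(range(1, 7), repeat=2))

def pr(event, given=lambda o: True):
    """Pr(event | given ∧ Z), computed by counting outcomes."""
    context = [o for o in outcomes if given(o)]
    return Fraction(sum(1 for o in context if event(o)), len(context))

A = lambda o: o[0] % 2 == 0   # first die is even
B = lambda o: o[1] > 4        # second die shows 5 or 6
A_and_B = lambda o: A(o) and B(o)

print(pr(B, given=A) == pr(B))        # B is independent of A
print(pr(A_and_B) == pr(A) * pr(B))   # the product factorizes
print(pr(A, given=B) == pr(A))        # A is independent of B
```

Using `Fraction` keeps the comparisons exact, so the equalities are not affected by rounding.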

Bayes’ theorem

Since \[\Pr(A\wedge B\vert Z) = \Pr(A\vert Z)\cdot \Pr(B\vert A\wedge Z) = \Pr(B\vert Z)\cdot \Pr(A\vert B\wedge Z)\] we can write \[\Pr(B\vert A\wedge Z) = \frac{\Pr(B\vert Z)\cdot \Pr(A\vert B\wedge Z)}{\Pr(A\vert Z)}\] except, of course, when \(\Pr(A\vert Z)=0\)
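Bayes' theorem can be verified by counting on a small example. As an assumption for this sketch, the context \(Z\) is again "two fair six-sided dice are rolled", \(A\) is "the sum is 8" and \(B\) is "the first die shows 6". The helper `pr` is hypothetical.

```python
from fractions import Fraction
from itertools import product

# Assumed context Z: two fair six-sided dice, 36 equally likely outcomes
outcomes = list(product(range(1, 7), repeat=2))

def pr(event, given=lambda o: True):
    """Pr(event | given ∧ Z), computed by counting outcomes."""
    context = [o for o in outcomes if given(o)]
    return Fraction(sum(1 for o in context if event(o)), len(context))

A = lambda o: o[0] + o[1] == 8   # the sum is 8
B = lambda o: o[0] == 6          # the first die shows 6

direct = pr(B, given=A)                       # Pr(B|A∧Z) by counting
via_bayes = pr(B) * pr(A, given=B) / pr(A)    # right-hand side of Bayes' theorem

print(direct, via_bayes, direct == via_bayes)
```

Both routes give the same value because they are two factorizations of the same \(\Pr(A\wedge B\vert Z)\).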

Combining everything, we get the sum rule

\[\begin{aligned} \Pr(A\vee B\vert Z) &= 1- \Pr(\neg A\wedge \neg B\vert Z)\\ &=1-\Pr(\neg A\vert Z)\Pr(\neg B\vert \neg A\wedge Z)\\ &=1-\Pr(\neg A\vert Z)\left(1-\Pr(B\vert \neg A\wedge Z)\right)\\ &=1-\Pr(\neg A\vert Z)+\Pr(\neg A\vert Z)\Pr(B\vert \neg A\wedge Z)\\ &=\Pr(A\vert Z)+\Pr(\neg A\wedge B\vert Z)\\ &=\Pr(A\vert Z)+\Pr(B\vert Z)\Pr(\neg A\vert B\wedge Z)\\ &=\Pr(A\vert Z)+\Pr(B\vert Z)(1-\Pr(A\vert B\wedge Z))\\ &=\Pr(A\vert Z)+\Pr(B\vert Z)-\Pr(A\vert B\wedge Z)\Pr(B\vert Z)\\ &=\Pr(A\vert Z)+\Pr(B\vert Z)-\Pr(A\wedge B\vert Z) \end{aligned}\]
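The first and last lines of the derivation can be compared on a small table. The joint probabilities below are made up, and the helper `pr` simply adds the entries of the four mutually exclusive cases, which is itself an application of the sum rule, so this is an illustration rather than an independent proof.

```python
from fractions import Fraction

# Hypothetical joint probabilities Pr(A=a ∧ B=b | Z); the numbers are made up
joint = {
    (True, True): Fraction(1, 10),
    (True, False): Fraction(3, 10),
    (False, True): Fraction(2, 10),
    (False, False): Fraction(4, 10),
}

def pr(event):
    """Add the joint probabilities of the (a, b) cases where event holds."""
    return sum((p for (a, b), p in joint.items() if event(a, b)), Fraction(0))

# First line of the derivation: 1 - Pr(¬A ∧ ¬B | Z)
first_line = 1 - pr(lambda a, b: (not a) and (not b))
# Last line: Pr(A|Z) + Pr(B|Z) - Pr(A ∧ B | Z)
last_line = pr(lambda a, b: a) + pr(lambda a, b: b) - pr(lambda a, b: a and b)
# Direct evaluation of Pr(A ∨ B | Z)
direct = pr(lambda a, b: a or b)

print(first_line == last_line == direct)
```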