Class 6: Statistical error

March 11, 2020

Significant figures

Dinosaur Bones

Some tourists in the Museum of Natural History are marveling at some dinosaur bones. One of them asks the guard, “Can you tell me how old the dinosaur bones are?”

The guard replies, “They are 3 million, four years, and six months old.”

“That’s an awfully exact number,” says the tourist. “How do you know their age so precisely?”

The guard answers, “Well, the dinosaur bones were three million years old when I started working here, and that was four and a half years ago.”

Being honest

Let’s be honest about what we know and what we do not know

We write the values that have real meaning

3 million years means (3 ± 0.5) ⨉ 10⁶ years

Adding 4.5 years is meaningless

Significant Figures

Which figures are significant?

All non-zero digits are significant
- In 1234 all digits are significant
- Same in 12.34 and 1.234
Zeros between non-zero digits are significant
- 1204 has four significant digits

Which figures are significant?

Zeros to the left are not significant
- Four significant digits in 0.0001234
Zeros to the right are not significant unless there is a decimal point
- 12340 has four significant digits
- 123.40 has five significant digits
- 1234.0 has five significant digits
- 12340. has five significant digits

Be aware of notation

Notice that 1234.0 is not the same as 1234

Also, 12340. is not the same as 12340

This is a convention but not everybody uses it

It is much safer to use scientific notation

Using scientific notation

It is safer to represent numbers in scientific notation \[ \begin{align} 1234.0 & = 1.2340\cdot 10^3\\ 1234 & = 1.234 \cdot 10^3\\ 12340. & = 1.2340 \cdot 10^4\\ 12340 & = 1.234 \cdot 10^4 \end{align} \]

All digits in the mantissa are significant

(mantissa is the number being multiplied by 10 to the power of the exponent)
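As a small illustration, Python's format specification can print a number in scientific notation with a chosen number of significant figures. The helper name `sci` is made up for this sketch; it only wraps the standard `e` format type.

```python
def sci(value, sig_figs):
    """Format value in scientific notation with sig_figs significant digits."""
    # The 'e' format prints one digit before the decimal point,
    # so we ask for sig_figs - 1 digits after it.
    return f"{value:.{sig_figs - 1}e}"

print(sci(1234.0, 5))   # five significant figures
print(sci(1234, 4))     # four significant figures
print(sci(12340, 4))    # the trailing zero is not written, so it is not claimed
```

The number of significant figures is not stored in the number itself. We have to state it when we write the result.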

Writing our result

Last week we measured a volume
Round the uncertainty to a single figure
- Instead of 2013.765 ± 78.345 write 2013.765 ± 80
Then round the main value to the last well known place
- Instead of 2013.765 ± 80 write 2010 ± 80
The digits “3.765” were a white lie. Let’s not fool ourselves
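The two rounding steps can be sketched in Python. The function name `round_result` is hypothetical, and the sketch assumes a strictly positive uncertainty.

```python
import math

def round_result(value, uncertainty):
    """Round the uncertainty to one significant figure,
    then round the value to the same decimal place."""
    if uncertainty <= 0:
        raise ValueError("uncertainty must be positive")
    # Position of the leading digit of the uncertainty (1 means tens, -2 means hundredths)
    exponent = math.floor(math.log10(uncertainty))
    rounded_unc = round(uncertainty, -exponent)
    # Rounding can carry to the next place (for example 96 becomes 100),
    # so the leading digit position is computed again.
    exponent = math.floor(math.log10(rounded_unc))
    rounded_unc = round(rounded_unc, -exponent)
    rounded_val = round(value, -exponent)
    return rounded_val, rounded_unc

value, unc = round_result(2013.765, 78.345)
print(f"{value:.0f} ± {unc:.0f}")
```

With the numbers from the slide this prints the value rounded to the tens place, because the first significant digit of the uncertainty is in the tens place.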

Sources of uncertainty

The easy rule

Last class we saw the easy rule for error propagation

\[ \begin{align} (x ± 𝚫x) + (y ± 𝚫y) & = (x+y) ± (𝚫x+𝚫y)\\ (x ± 𝚫x) - (y ± 𝚫y) & = (x-y) ± (𝚫x+𝚫y)\\ (x ± 𝚫x\%) \times (y ± 𝚫y\%)& =xy ± (𝚫x\% + 𝚫y\%)\\ (x ± 𝚫x\%) ÷ (y ± 𝚫y\%)& =x/y ± (𝚫x\% + 𝚫y\%)\end{align} \]

Here \(𝚫x\%\) represents the relative uncertainty, that is \(𝚫x/x\)

We use absolute uncertainty for + and -, and relative uncertainty for ⨉ and ÷

General formula

Assuming that the errors are small compared to the main value, we can find the error for any “reasonable” function

Taylor’s Theorem says that, for any twice-differentiable function \(f,\) we have \[f(x±𝚫x) = f(x) ± \frac{df}{dx}(x)\cdot 𝚫x + \frac{d^2f}{dx^2}(x+\varepsilon)\cdot \frac{𝚫x^2}{2}\] for some \(\varepsilon\) between \(0\) and \(±𝚫x.\) When \(𝚫x\) is small, the last term is much smaller than the others and we can ignore it.

Examples

\[\begin{align} (x ±𝚫x)^2& = x^2 ± 2x\cdot𝚫x\\ & = x^2 ± 2x^2\cdot\frac{𝚫x}{x} \\ & = x^2 ± 2𝚫x\% \end{align}\] and \[\begin{align} \sqrt{x ±𝚫x}& = \sqrt x ± \frac{1}{2\sqrt x}\cdot 𝚫x\\ & = \sqrt x ± \frac{1}{2}\sqrt x\cdot \frac{𝚫x}{x}\\ & = \sqrt x ± \frac{1}{2}𝚫x\% \end{align}\]
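The two examples can be checked numerically. This is only a sketch: the helper `propagate` is a made-up name, it estimates the derivative with a central difference instead of an exact formula, and the input 4 ± 0.1 is a hypothetical measurement.

```python
import math

def propagate(f, x, dx, h=1e-6):
    """First-order uncertainty of f(x) for x ± dx.
    The derivative is approximated by a central difference with step h."""
    derivative = (f(x + h) - f(x - h)) / (2 * h)
    return f(x), abs(derivative) * dx

x, dx = 4.0, 0.1          # relative uncertainty 2.5%
for name, f in (("square", lambda v: v ** 2), ("sqrt", math.sqrt)):
    value, unc = propagate(f, x, dx)
    print(f"{name}: {value:.4f} ± {unc:.4f}  (relative {100 * unc / value:.2f}%)")
```

The relative uncertainty doubles for the square and halves for the square root, as in the formulas above.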

Probabilistic uncertainty propagation

These rules are “pessimistic”. They give the worst case

In general the “errors” can be positive or negative, and they tend to compensate

(This is valid only if the errors are independent)

In this case we can analyze uncertainty using the rules of probabilities

Variance quantifies uncertainty

In this case, the value \(𝚫x\) will represent the standard deviation of the measurement

The standard deviation is the square root of the variance

Then, we combine variances using the rule

“The variance of a sum is the sum of the variances”

(Again, this is valid only if the errors are independent)

The probabilistic rule

\[ \begin{align} (x ± 𝚫x) + (y ± 𝚫y) & = (x+y) ± \sqrt{𝚫x^2+𝚫y^2}\\ (x ± 𝚫x) - (y ± 𝚫y) & = (x-y) ± \sqrt{𝚫x^2+𝚫y^2}\\ (x ± 𝚫x\%) \times (y ± 𝚫y\%)& =x y ± \sqrt{𝚫x\%^2+𝚫y\%^2}\\ \frac{x ± 𝚫x\%}{y ± 𝚫y\%} & =\frac{x}{y} ± \sqrt{𝚫x\%^2+𝚫y\%^2} \end{align} \]
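A sketch of the probabilistic rule, next to the worst-case sum for comparison. Function names and numbers are made up for illustration, and the errors are assumed to be independent.

```python
import math

def add_quadrature(x, dx, y, dy):
    """(x ± dx) + (y ± dy) for independent errors: absolute uncertainties add in quadrature."""
    return x + y, math.hypot(dx, dy)

def mul_quadrature(x, dx, y, dy):
    """(x ± dx) * (y ± dy) for independent errors: relative uncertainties add in quadrature."""
    value = x * y
    relative = math.hypot(dx / x, dy / y)
    return value, abs(value) * relative

# Hypothetical measurements: 10 ± 0.5 and 20 ± 1
total, unc = add_quadrature(10.0, 0.5, 20.0, 1.0)
print(f"probabilistic: {total} ± {unc:.3f}")
print(f"worst case:    {total} ± {0.5 + 1.0:.3f}")
```

The probabilistic uncertainty is never larger than the worst-case one, because \(\sqrt{a^2+b^2} \le a+b\) for non-negative \(a\) and \(b\).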

Confidence interval for the measurement

When using probabilistic rules we need to multiply the standard deviation by a constant k, associated with the confidence level

In most cases (but not all), the uncertainty follows a Normal distribution. In that case

\(k=1.96\) corresponds to 95% confidence
\(k=2.00\) corresponds to 95.4% confidence
\(k=2.58\) corresponds to 99% confidence
\(k=3.00\) corresponds to 99.7% confidence
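The relation between \(k\) and the confidence level can be computed with `statistics.NormalDist` from the Python standard library. This sketch assumes a Normal distribution and a two-sided interval; the function names are made up.

```python
from statistics import NormalDist

STANDARD_NORMAL = NormalDist()  # mean 0, standard deviation 1

def confidence_for(k):
    """Probability that a Normal value lies within k standard deviations of the mean."""
    return 2 * STANDARD_NORMAL.cdf(k) - 1

def k_for(confidence):
    """Coverage factor k that gives the requested two-sided confidence level."""
    return STANDARD_NORMAL.inv_cdf((1 + confidence) / 2)

for k in (1.96, 2.0, 3.0):
    print(f"k = {k:.2f} -> {100 * confidence_for(k):.2f}% confidence")
print(f"99% confidence -> k = {k_for(0.99):.3f}")
```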

Not all uncertainties are alike

Last class we only considered one kind of uncertainty: the instrument resolution

This is a “one time” error

We notice it immediately
It does not change if we measure again

Visually

Noise

There are other sources of uncertainty: noise

When the instrument resolution is good, we observe that the measured values change on every read

In many cases this is due to thermal effects, or other sources of noise

Usually the variability follows a Normal distribution

Example

Distribution of thermal noise

Real vs. measured data

Discretization uncertainty

Combined

How to combine

The exact distribution is hard to calculate

International standards suggest using computer simulation

They recommend Monte Carlo methods

(what we did here)
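A minimal Monte Carlo sketch of this idea, not the simulation used for the plots. It assumes a hypothetical instrument with Normal noise of standard deviation 1.0 that rounds each reading to a display step of 1.0. The result is compared with the quadrature estimate, using the fact that a rectangular distribution of half-width \(a\) has standard deviation \(a/\sqrt{3}\).

```python
import math
import random
import statistics

def simulate_readings(true_value, noise_sd, resolution, n, seed=0):
    """Simulate n readings: Normal noise first, then rounding to the display step."""
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    readings = []
    for _ in range(n):
        noisy = rng.gauss(true_value, noise_sd)                   # thermal noise
        readings.append(round(noisy / resolution) * resolution)  # discretization
    return readings

# Hypothetical values, chosen only for illustration
readings = simulate_readings(true_value=20.3, noise_sd=1.0, resolution=1.0, n=100_000)
simulated_sd = statistics.stdev(readings)

# Quadrature estimate: noise combined with a rectangle of half-width resolution / 2
half_width = 1.0 / 2
predicted_sd = math.sqrt(1.0 ** 2 + half_width ** 2 / 3)

print(f"simulated: {simulated_sd:.3f}   predicted: {predicted_sd:.3f}")
```

The quadrature estimate is a good approximation here because the noise is as large as the display step. When the noise is much smaller than the step, the two effects are no longer independent and the simulation is the safer tool.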

Standard deviation of discretization

Standard deviation of rectangular distribution is \[u=\frac{a}{\sqrt{3}}\] when the width of the rectangle is \(2a\)
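For reference, this value follows from the variance of a uniform distribution on \([-a, a]\), which has density \(1/(2a)\) and mean zero: \[ \begin{align} u^2 & = \int_{-a}^{a} x^2 \cdot \frac{1}{2a}\,dx = \frac{1}{2a}\cdot\frac{2a^3}{3} = \frac{a^2}{3}\\ u & = \frac{a}{\sqrt{3}} \end{align} \] Assuming the instrument rounds to the nearest step of size \(r\), the half-width is \(a=r/2\) and so \(u = r/\sqrt{12}\)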

Standard deviation of thermal noise

Standard deviation of noise can be estimated from the data: \[s=\sqrt{\frac{1}{n-1}\sum_i (x_i - \bar x)^2}\]
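The formula translates directly into Python. The readings below are hypothetical, and the function name `sample_sd` is made up. The result is compared with `statistics.stdev` from the standard library, which uses the same \(n-1\) denominator.

```python
import math
import statistics

def sample_sd(values):
    """Sample standard deviation with the n - 1 denominator."""
    n = len(values)
    mean = sum(values) / n
    return math.sqrt(sum((x - mean) ** 2 for x in values) / (n - 1))

readings = [20.1, 19.8, 20.4, 20.0, 19.7]   # hypothetical repeated readings
print(sample_sd(readings))
print(statistics.stdev(readings))           # same value from the standard library
```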

Standard deviation of average

If the measurements are random, their average is also random

It has the same mean but less variance

The standard error of the average of \(n\) samples is \[\frac{s}{\sqrt{n}}\]
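A small simulation can illustrate this. The sketch below assumes Normal noise with a known standard deviation, and all the numbers are hypothetical. It computes many averages of \(n\) readings and compares their spread with the noise level divided by \(\sqrt{n}\).

```python
import math
import random
import statistics

rng = random.Random(1)   # fixed seed so the run is reproducible
noise_sd = 2.0           # hypothetical noise of a single reading
n = 25                   # readings per average
repeats = 4000           # number of simulated averages

averages = [
    statistics.fmean([rng.gauss(50.0, noise_sd) for _ in range(n)])
    for _ in range(repeats)
]
observed = statistics.stdev(averages)   # spread of the averages
expected = noise_sd / math.sqrt(n)      # standard error formula

print(f"observed: {observed:.3f}   expected: {expected:.3f}")
```

In a real experiment we do not know the true noise level, so we use the estimate \(s\) from the data in its place.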

More samples give higher precision, but not better accuracy