November 29, 2018

With the experimental data from the big coil we found the linear model \[\text{n\_marbles}=A+B\cdot \text{length}\] and we compared it with the formula from Hooke’s Law \[\text{force}=K\cdot(L-\text{length})\] where \(L\) is the natural length of the spring

Each marble has mass \(m\). The force points down, so \[-m g\cdot\text{n\_marbles}=K\cdot(L-\text{length})\] which can be rewritten as \[\text{n\_marbles}=\underbrace{-\frac{K}{m g}\cdot L}_{A} + \underbrace{\frac{K}{m g}}_{B}\cdot\text{length} \]
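To make the `coef(model)` calls below concrete, here is a minimal sketch of how this model is fitted in R; the data frame name `coil`, its columns, and the numbers are placeholders, not the class measurements:

```r
# Placeholder measurements: spring length (cm) for each number of marbles
coil <- data.frame(
  length    = c(76, 81, 86, 92, 97),   # invented values, for illustration only
  n_marbles = c(0, 1, 2, 3, 4)
)

# Fit the linear model n_marbles = A + B * length
model <- lm(n_marbles ~ length, data=coil)
coef(model)   # coef(model)[1] is A, coef(model)[2] is B
```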

In this case it is easier to use *centimeters*, *grams* and *seconds*

Thus, force is measured in *dyne*, and length in *cm*

Looking at Hooke’s law \[\text{force}=K\cdot(L-\text{length})\] we can see that \(K\) is measured in *dyne/cm*

Our model is \[\text{n\_marbles}=\underbrace{-\frac{K}{m g}\cdot L}_{\texttt{coef(model)[1]}} + \underbrace{\frac{K}{m g}}_{\texttt{coef(model)[2]}}\cdot\text{length}\] therefore \[K=\texttt{coef(model)[2]}\cdot m\cdot g\]

Taking the mass of each marble as 20 g, and using \(g=980\) cm/s² in these units, we calculate the **elasticity constant** as

```r
coef(model)[2] * 20 * 980
```

```
length 
  3750 
```

The units are *dyne/cm*.

The label `length` comes from `coef(model)[2]`. The value is equivalent to what we found before, and this time the units are correct.

This is the same as last class. Since

\[\texttt{coef(model)[1]}=-\frac{K}{m g}\cdot L = -\texttt{coef(model)[2]}\cdot L \] we can calculate \(L\) as \[L=-\frac{\texttt{coef(model)[1]}}{\texttt{coef(model)[2]}} = 75.922\]
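In R this is a one-liner:

```r
# Natural length of the spring, in cm
-coef(model)[1] / coef(model)[2]
```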

We need very little math for our course: arithmetic, algebra, and logarithms

If \(x=p^m\) then \[\log_p(x) = m\]

If we use another base, for example \(q\), then \[\log_q(x) = m\cdot\log_q(p)\]

So if we use different bases, there is only a scale factor
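We can check this in R; for example, with \(x=8\) and \(p=2\):

```r
log(8, base=2)    # 3, since 8 = 2^3
log(8) / log(2)   # same value: changing the base only rescales
```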

The easiest one is the *natural logarithm*

- Logarithms only work with positive numbers, not with 0
- If \(x=p\cdot q\) then \[\log(x)=\log(p)+\log(q)\]
- If \(\log(x)=m\) then \(x=\exp(m)\)
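These properties are easy to verify numerically:

```r
log(0)                                   # -Inf: log() needs positive numbers
log(2 * 3) == log(2) + log(3)            # may fail in the last bit of rounding
all.equal(log(2 * 3), log(2) + log(3))   # TRUE: the robust comparison
exp(log(5))                              # 5: exp() undoes log()
```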

Basic linear model \[y=A+B\cdot x\]

Exponential \[y=I\cdot R^x\qquad\log(y)=\log(I)+\log(R)\cdot x\]

Power of \(x\) \[y=C\cdot x^E\qquad\log(y)=\log(C)+E\cdot\log(x)\]
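Each shape corresponds to a different placement of `log()` in the `lm()` call; a sketch, assuming a data frame `d` with columns `x` and `y` (both names are placeholders):

```r
m_linear <- lm(y ~ x, data=d)             # y = A + B*x
m_expo   <- lm(log(y) ~ x, data=d)        # log(y) = log(I) + log(R)*x
m_power  <- lm(log(y) ~ log(x), data=d)   # log(y) = log(C) + E*log(x)
```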

The easiest way to decide is to draw several plots, placing `log()` in different places, and seeing which one looks most like a straight line

For example, let’s analyze data from Kleiber’s Law (Physiological Reviews 1947 27:4, 511-541)

The following table shows a summary; the complete table has 26 animals

animal | kg | kcal |
---|---|---|
Mouse | 0.021 | 3.6 |
Rat | 0.282 | 28.1 |
Guinea pig | 0.410 | 35.1 |
Rabbit | 2.980 | 167.0 |
Cat | 3.000 | 152.0 |
Macaque | 4.200 | 207.0 |
Dog | 6.600 | 288.0 |
Goat | 36.0 | 800 |
Chimpanzee | 38.0 | 1090 |
Sheep ♂ | 46.4 | 1254 |
Sheep ♀ | 46.8 | 1330 |
Woman | 57.2 | 1368 |
Cow | 300.0 | 4221 |
Young cow | 482.0 | 7754 |
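To reproduce the commands below, the table can be loaded as a data frame (the name `kleiber` matches the code that follows):

```r
kleiber <- data.frame(
  animal = c("Mouse", "Rat", "Guinea pig", "Rabbit", "Cat", "Macaque", "Dog",
             "Goat", "Chimpanzee", "Sheep ♂", "Sheep ♀", "Woman", "Cow", "Young cow"),
  kg   = c(0.021, 0.282, 0.410, 2.980, 3.000, 4.200, 6.600,
           36.0, 38.0, 46.4, 46.8, 57.2, 300.0, 482.0),
  kcal = c(3.6, 28.1, 35.1, 167.0, 152.0, 207.0, 288.0,
           800, 1090, 1254, 1330, 1368, 4221, 7754)
)
```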

```r
plot(kcal ~ kg, data=kleiber)             # linear axes
plot(log(kcal) ~ kg, data=kleiber)        # semi-log
plot(log(kcal) ~ log(kg), data=kleiber)   # log-log
```
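To compare them at a glance, the three plots can be drawn side by side (a sketch using base graphics):

```r
op <- par(mfrow=c(1, 3))                  # three panels in one row
plot(kcal ~ kg, data=kleiber)
plot(log(kcal) ~ kg, data=kleiber)
plot(log(kcal) ~ log(kg), data=kleiber)
par(op)                                   # restore the previous layout
```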

The plot that looks most like a straight line is the log-log plot.

Therefore we need a log-log model.

Depending on the context, we may want to use different versions of *semi-log* and *log-log* plots

For understanding the data, we do

```r
plot(log(kcal) ~ kg, data=kleiber)
```

For publishing in a paper, we do

```r
plot(kcal ~ kg, data=kleiber, log="y")
```

Each `log()` transformation has an equivalent using logarithmic axes:

```r
plot(log(kcal) ~ kg, data=kleiber)        # semi-log, transformed values
plot(kcal ~ kg, data=kleiber, log="y")    # semi-log, logarithmic axis

plot(log(kcal) ~ log(kg), data=kleiber)   # log-log, transformed values
plot(kcal ~ kg, data=kleiber, log="xy")   # log-log, logarithmic axes
```

Since the log-log plot is the straightest, we fit a log-log model:

```r
model <- lm(log(kcal) ~ log(kg), data=kleiber)
coef(model)
```

```
(Intercept)     log(kg) 
      4.206       0.756 
```

If \[\log(kcal)=4.206 + 0.756\cdot \log(kg)\] then \[kcal=\exp(4.206) \cdot kg^{0.756}\]

Therefore:

- The average energy consumption of a 1 kg animal is \(\exp(4.206) = 67.072\) kcal
- Energy consumption scales as \(kg^{0.756}\): when the mass grows by 1%, consumption grows by about 0.756%

*“An animal’s metabolic rate scales to the ¾ power of the animal’s mass”.*

Google it

Models are the essence of scientific research

They provide us with two important things

- An explanation for the observed patterns of nature
- A method to predict what will happen in the future

```r
predict(model, newdata)
```

where `newdata` is a data frame with column names corresponding to the independent variables.

If we omit `newdata`, the prediction uses the original `data` as `newdata`:

```r
predict(model) == predict(model, newdata=data)
```
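For example, with the Kleiber model (the masses below are arbitrary choices):

```r
# The model formula computes log(kg) itself, so newdata only needs a kg column.
# The predictions come out on the log(kcal) scale.
predict(model, newdata=data.frame(kg=c(1, 10)))
```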

animal | kg | kcal | predicted |
---|---|---|---|
Mouse | 0.021 | 3.6 | 1.28 |
Rat | 0.282 | 28.1 | 3.25 |
Guinea pig | 0.410 | 35.1 | 3.53 |
Rabbit | 2.980 | 167.0 | 5.03 |
Cat | 3.000 | 152.0 | 5.04 |
Macaque | 4.200 | 207.0 | 5.29 |
Dog | 6.600 | 288.0 | 5.63 |
Goat | 36.0 | 800 | 6.92 |
Chimpanzee | 38.0 | 1090 | 6.96 |
Sheep ♂ | 46.4 | 1254 | 7.11 |
Sheep ♀ | 46.8 | 1330 | 7.11 |
Woman | 57.2 | 1368 | 7.26 |
Cow | 300.0 | 4221 | 8.52 |
Young cow | 482.0 | 7754 | 8.88 |

We want to predict the metabolic rate, depending on the weight

The independent variable is \(kg\), the dependent variable is \(kcal\)

But our model uses only \(\log(kg)\) and \(\log(kcal)\)

So we have to undo the logarithm, using \(\exp()\)

```r
predicted_kcal <- exp(predict(model))
```

animal | kg | kcal | predicted |
---|---|---|---|
Mouse | 0.021 | 3.6 | 3.62 |
Rat | 0.282 | 28.1 | 25.76 |
Guinea pig | 0.410 | 35.1 | 34.19 |
Rabbit | 2.980 | 167.0 | 153.11 |
Cat | 3.000 | 152.0 | 153.89 |
Macaque | 4.200 | 207.0 | 198.46 |
Dog | 6.600 | 288.0 | 279.29 |
Goat | 36.0 | 800 | 1007 |
Chimpanzee | 38.0 | 1090 | 1049 |
Sheep ♂ | 46.4 | 1254 | 1220 |
Sheep ♀ | 46.8 | 1330 | 1228 |
Woman | 57.2 | 1368 | 1429 |
Cow | 300.0 | 4221 | 5001 |
Young cow | 482.0 | 7754 | 7157 |
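The same mechanism works for animals outside the table; for example, for a hypothetical 10 kg animal:

```r
# exp() undoes the log in the model, giving the prediction in kcal
exp(predict(model, newdata=data.frame(kg=10)))
```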

## Visually

We can draw the fitted model on top of the data, both in log scale and in natural scale:

```r
plot(log(kcal) ~ log(kg), data=kleiber)
lines(predict(model) ~ log(kg), data=kleiber)
```

```r
plot(kcal ~ kg, data=kleiber, log="xy")
lines(exp(predict(model)) ~ kg, data=kleiber)
```

## Moore’s Law

An idea that originated around 1970 with Gordon Moore, co-founder of Intel

The simple version of this law states that processor speeds will double every two years

More specifically, it says that the number of transistors on an affordable CPU would double every two years

(see paper)

```r
plot(count ~ Date, data=trans)        # linear axes
plot(log(count) ~ Date, data=trans)   # semi-log
```

We get a straight line on the semi-log plot; that is, `log(y)` versus `x`:

\[\log(y)=\log(I)+\log(R)\cdot x\] In this case the original relation is \[y=I\cdot R^x\]

```r
model <- lm(log(count) ~ Date, data=trans)
exp(coef(model))
```

```
(Intercept)        Date 
  7.83e-295    1.41e+00 
```

```r
plot(count ~ Date, data=trans, log="y")
lines(exp(predict(model)) ~ Date, data=trans)
```

Every year the number of transistors grows by a factor of

```r
exp(coef(model)[2])
```

```
Date 
1.41 
```

Since \(1.41^2\approx 2\), the count doubles every two years, as Moore predicted.
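Equivalently, we can solve \(R^t=2\) for the doubling time:

```r
# coef(model)[2] is log(R), so the doubling time is log(2) / log(R)
log(2) / coef(model)[2]   # about 2 years
```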

- 1992 Russo-American moratorium on nuclear testing
- 1996 Computer simulations to design new weapons
- Needed more computing power than could be delivered by any existing machine

- The USA designed *ASCI Red*, the first supercomputer doing over one teraflop
- A ‘flop’ is a floating point operation (multiplication) per second
- A teraflop is \(10^{12}\) flops

- In 1997, ASCI Red did 1.8 teraflops
- It was the most powerful supercomputer in the world until about the end of 2000.

In his book, John Lanchester says

"I was playing on Red only yesterday – I wasn’t really, but I did have a go on a machine that can process 1.8 teraflops.

"This Red equivalent is called the PS3: it was launched by Sony in 2005 and went on sale in 2006.

"Red was [the size of] a tennis court, used as much electricity as 800 houses, and cost US$55 million. The PS3 fits under the TV, runs off a normal power socket, and you can buy one for £200.

"[In 10 years], a computer able to process 1.8 teraflops went from being something that could only be made by the world’s richest government […], to something a teenager could expect [as a gift].