Real | Near | About |
---|---|---|
π | 22/7 | 3 |
\(\sqrt{10}\) | 3.16 | 3 |
seconds/day | 86400 | 1E5 |
weeks/year | 52 | 50 |
days/year | 360 | 1000/3 |
E. coli genome | 4.5E6 bp | 5E6 bp |
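To see how much each rounding costs, here is a minimal sketch that compares the "Near" and "About" values with a reference value. The reference values (365 days per year, 365/7 weeks per year, and so on) are assumptions added for illustration; the E. coli row is left out because the table gives no exact value for it.

```python
import math

# name: (reference value, "near" value, "about" value).
# The reference values are assumptions for this sketch, not part of the table.
approximations = {
    "pi": (math.pi, 22 / 7, 3),
    "sqrt(10)": (math.sqrt(10), 3.16, 3),
    "seconds/day": (24 * 60 * 60, 86400, 1e5),
    "weeks/year": (365 / 7, 52, 50),
    "days/year": (365, 360, 1000 / 3),
}


def relative_error(exact, approx):
    """Return the relative error |approx - exact| / exact."""
    return abs(approx - exact) / exact


for name, (exact, near, about) in approximations.items():
    print(f"{name:12s} near: {relative_error(exact, near):6.1%}  "
          f"about: {relative_error(exact, about):6.1%}")
```

Under these assumptions every "About" value stays within roughly 16% of the reference, which is usually good enough for an order-of-magnitude estimate.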

To identify the function of a gene, one strategy is to

- compare it to similar genes, and
- identify the polymorphisms

This is called *Multiple Sequence Alignment*

Aligning \(N\) sequences of length \(m\) requires \(m^N\) individual comparisons

To fix ideas, let’s assume that \(m=1000\)

(That is a typical size for a bacterial gene)

Now assume that the computer can do one million comparisons each second

What is the time needed to align \(N\) sequences?

How many seconds will it take to align 2, 4, 8, and 12 sequences?

Under these hypotheses we have this table

\(N\) | Seconds | In words |
---|---|---|
2 | \(10^0\) | 1 second |
4 | \(10^6\) | 1 million seconds |
8 | \(10^{18}\) | 1 quintillion seconds (1 trillion on the long scale) |
12 | \(10^{30}\) | a lot of time |
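The "Seconds" column can be reproduced with a short sketch. The function name `alignment_seconds` and its parameter names are made up for this illustration; the defaults are the values assumed above (\(m=1000\), one million comparisons per second).

```python
def alignment_seconds(n_sequences, length=1000, comparisons_per_second=1e6):
    """Seconds needed to do length**n_sequences comparisons."""
    return length ** n_sequences / comparisons_per_second


for n in (2, 4, 8, 12):
    print(f"N = {n:2d}: {alignment_seconds(n):.0e} seconds")
```

Changing `length` or `comparisons_per_second` shows how sensitive the result is to each hypothesis.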

Translate these numbers to days, years, etc.

(Approximate answers are OK)

How do these numbers change if \(m\) changes?

What happens if the computers are 1000 times faster?

What is the largest multiple alignment that you can do in your life?

What is the largest number of sequences that can be aligned?

This is clearly too expensive, so we need heuristics

(i.e. solving a similar but simpler problem)

There are fast Multiple Sequence Alignment methods, but they are approximate

We are doing something similar with our way of calculating

Write down the answer.

In other words, come up with a reasonably close solution.

If you can’t estimate the answer, break the problem into smaller pieces and estimate the answer for each one

We have already done the first approach. Today we do the second

This part follows the text of “Guesstimation: Solving the World’s Problems on the Back of a Cocktail Napkin” by Lawrence Weinstein and John A. Adam

How many piano tuners are there in Los Angeles (or your own city)?

This is a classic example that originated with Enrico Fermi (in the 1930s)

It is used at the beginning of many physics courses, because

- it requires the methods and reasoning used in science
- it does not need any physics concepts

This is a complicated problem. We cannot just estimate the answer

To solve this, we need to break down the problem

We need to estimate

- how many pianos there are in Los Angeles
- how many pianos each tuner can care for

**How would you do it?**

To estimate the number of pianos, we need to estimate

- the population of the city
- the proportion of people that own a piano
- the number of schools, churches, etc. that also have pianos

To estimate how many pianos each tuner can care for, we need to estimate

- how often each piano is tuned
- how much time it takes to tune a piano
- how much time a piano tuner spends tuning pianos

- The population of the city must be much less than \(10^8\)
  - since the population of the USA is \(3 \times 10^8\)
- It must be much more than \(10^6\)
  - since that is the size of an ordinary big city
- We estimate it at \(10^7\)
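One common way to turn a lower and an upper bound into a single estimate is the geometric mean, which picks the middle on a logarithmic scale. The text does not say how \(10^7\) was chosen, so treating it as a geometric mean is an assumption of this sketch, and `geometric_mean` is a helper written for the illustration.

```python
import math


def geometric_mean(lower, upper):
    """Geometric mean of two positive bounds: the midpoint on a log scale."""
    return math.sqrt(lower * upper)


# Bounds from the text: more than 1e6 and less than 1e8 people.
population = geometric_mean(1e6, 1e8)
print(f"population estimate: {population:.0e}")
```

The arithmetic mean of the same bounds would be about \(5 \times 10^7\), which sits much closer to the upper bound than to the lower one.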

Pianos will be owned by individuals, schools, and houses of worship

- About 10% of the population plays a musical instrument
  - it’s surely more than 1% and less than 100%
- At most 10% of musicians play the piano
  - not all of them own a piano
- The proportion that own a piano is probably 2–3% of the musicians
  - this would be \(2 \times 10^{-3}\) of the population
- There is about one house of worship per thousand people
  - each of those will have a piano
- There is about one school per 500 students
  - or about 1 per 1000 population
  - each of those will have a piano

This gives us about \(4 \times 10^{-3}\) pianos per person

Thus, the number of pianos will be about \(10^7 \times 4 \times 10^{-3} = 4 \times 10^4\)

Pianos will be tuned less than once per month and more than once per decade

We’ll estimate once per year.

It must take much more than 30 minutes and less than one day to tune a piano (assuming that it is not too badly out of tune)

We’ll estimate 2 hours

Another way to look at it is that there are 88 keys

- At 1 minute per key, it will take about 1.5 hours
- At 2 minutes per key, it will take about 3 hours

A full-time worker works

- 8 hours per day
- 5 days per week
- 50 weeks per year

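which gives 8 × 5 × 50 = 2000 hours per year

At 2 hours per piano, in 2000 hours she can tune about 1000 pianos

With \(4 \times 10^4\) pianos, each tuned once per year, the city needs about \(4 \times 10^4 / 10^3 = 40\) piano tuners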
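The whole chain of estimates can be put together in a few lines. This is only a sketch using the values estimated above; the variable names are chosen for the illustration, and changing any of them shows how the final number moves.

```python
# Estimates taken from the steps above.
population = 1e7                 # people in the city
pianos_per_person = 4e-3         # individuals + houses of worship + schools
tunings_per_piano_per_year = 1   # each piano is tuned once per year
hours_per_tuning = 2             # time to tune one piano

# Working time of one full-time tuner: 8 h/day, 5 days/week, 50 weeks/year.
hours_per_year = 8 * 5 * 50

pianos = population * pianos_per_person
tunings_needed = pianos * tunings_per_piano_per_year
tunings_per_tuner = hours_per_year / hours_per_tuning
tuners = tunings_needed / tunings_per_tuner

print(f"pianos: {pianos:.0f}")
print(f"pianos per tuner per year: {tunings_per_tuner:.0f}")
print(f"piano tuners: {tuners:.0f}")
```

Each input is only good to within a factor of a few, so the result should be read as "tens of tuners", not as an exact count.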

Do you think these values are still valid?

How do you think these values changed?

Why?

In 1950, at the Los Alamos National Laboratory, four scientists (Emil Konopinski, Edward Teller, Herbert York, and Enrico Fermi) had a casual conversation about flying saucers during lunch

This quickly turned into a discussion about the possibility of sophisticated societies populating the universe

During the discussion, Enrico Fermi came out with this casual remark

“Where is everybody?”

Herbert York wrote in 1984 that Fermi “followed up with a series of

- calculations on the probability of earth-like planets,
- the probability of life given an earth,
- the probability of humans given life,
- the likely rise and duration of high technology
- and so on

He concluded on the basis of such calculations that we ought to have been visited long ago and many times over”

\[N = R_* \cdot f_\mathrm{p} \cdot n_\mathrm{e} \cdot f_\mathrm{l} \cdot f_\mathrm{i} \cdot f_\mathrm{c} \cdot L\]

where

- \(N\) = the number of civilizations in our galaxy with which communication might be possible
- \(R_{*}\) = the average rate of star formation in our galaxy
- \(f_{p}\) = the fraction of those stars that have planets
- \(n_{e}\) = the average number of planets that can potentially support life per star that has planets

- \(f_{l}\) = the fraction of planets that could support life that actually develop life at some point
- \(f_{i}\) = the fraction of planets with life that actually go on to develop civilizations
- \(f_{c}\) = the fraction of civilizations that develop a technology that releases detectable signs of their existence into space
- \(L\) = the length of time for which such civilizations release detectable signals into space

- \(R_{*}\) = 1 yr\(^{-1}\) (1 star formed per year, on average)
- \(f_{p}\) = 0.2 to 0.5 (1/5 to 1/2 of all stars will have planets)
- \(n_{e}\) = 1 to 5 (stars with planets will have between 1 and 5 planets capable of developing life)
- \(f_{l}\) = 1 (100% of these planets will develop life)
- \(f_{i}\) = 1 (100% of which will develop intelligent life)
- \(f_{c}\) = 0.1 to 0.2 (10–20% of which will be able to communicate)
- \(L\) = 1000 to 100,000,000 years (communicative civilizations will last somewhere between 1000 and 100,000,000 years)

With the lowest initial guesses we get a minimum \(N\) of 20

The highest guesses give a maximum of 50,000,000

This varies *a lot* depending on the hypotheses
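Both extremes can be checked by multiplying the factors directly. A minimal sketch, where the function name `drake` and its argument names are chosen for the illustration:

```python
def drake(r_star, f_p, n_e, f_l, f_i, f_c, lifetime):
    """Product of the seven factors of the equation above."""
    return r_star * f_p * n_e * f_l * f_i * f_c * lifetime


# Lowest and highest guesses from the list above.
n_min = drake(1, 0.2, 1, 1, 1, 0.1, 1_000)
n_max = drake(1, 0.5, 5, 1, 1, 0.2, 100_000_000)

print(f"minimum N: {n_min:.0f}")
print(f"maximum N: {n_max:.0f}")
print(f"ratio max/min: {n_max / n_min:.1e}")
```

Most of the spread comes from \(L\), whose range alone covers five orders of magnitude.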

**Question:** How can we know the range of values for \(N\)?