Blog of Andrés Aravena

Comments about the Final Exam

01 June 2019

Why did some people get all the answers right, while others got nothing? I think it is directly correlated with handing in the homework. The people who did the homework were prepared for this kind of question. Of course, the exam had different questions, but the logic and philosophy are the same as in the homework. Here I explain how.

The first question was probably the hardest, since it required combining two populations. We discussed this in class when we spoke about horizontal gene transfer. We showed an example where genes from two organisms were mixed, and we drew a histogram of the combined set of genes. Then we said that it was the same as the score of an exam where you studied only half of the subjects. Moreover, I said in that class “this is very much like an exam question, we will probably ask it”.

It was also similar to the epilepsy homework, where we had to count how many TRUE cases there were for different group sizes. This result was combined with the dolmuş, ferry, and tramway homework: we had to sum all the partial results and find the interval that contained most of the simulation results.
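A minimal sketch of this kind of simulation, not the exam answer itself. All the numbers are assumptions made for illustration: 10 questions on studied subjects answered correctly with probability 0.8, 10 questions on unstudied subjects with probability 0.2, and a 95% interval standing in for "most of the results". The name `simulate_score` is hypothetical.

```r
set.seed(1)          # only to make the sketch reproducible
nq <- 10             # questions in each half of the exam (assumed)
n_sim <- 1000        # number of simulated exams (assumed)

simulate_score <- function(nq) {
  # TRUE means a correct answer
  studied   <- sample(c(TRUE, FALSE), size=nq, replace=TRUE, prob=c(0.8, 0.2))
  unstudied <- sample(c(TRUE, FALSE), size=nq, replace=TRUE, prob=c(0.2, 0.8))
  # count the TRUE cases in each part and sum the partial results
  sum(studied) + sum(unstudied)
}

scores <- replicate(n_sim, simulate_score(nq))
table(scores)        # a text version of the histogram

# interval that contains about 95% of the simulated scores
interval <- quantile(scores, probs=c(0.025, 0.975))
interval
```

The counting only works because each answer is a logical value, so `sum()` can add up the TRUE cases.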

The most common error that people made here was to write something like

### WRONG ###
sample(c("TRUE", "FALSE"), size=nq, replace=TRUE, prob=c(0.8, 0.2))

This does not work, because "TRUE" is not TRUE. When we write quotation marks "", the value is text, and we cannot sum it or calculate its average. The exam statement is clear: the answer must be TRUE or FALSE. There are no quotation marks. There is no "TRUE" or "FALSE". The correct code is something like

sample(c(TRUE, FALSE), size=nq, replace=TRUE, prob=c(0.8, 0.2))
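The difference can be checked directly in the console. This is a small sketch where `nq <- 10` is an assumed value and `text_answers` is a hypothetical name; it shows that logical values can be counted and averaged, while the text version cannot.

```r
nq <- 10   # hypothetical number of questions

answers <- sample(c(TRUE, FALSE), size=nq, replace=TRUE, prob=c(0.8, 0.2))
class(answers)        # "logical"
sum(answers)          # how many TRUE values there are
mean(answers)         # proportion of TRUE values

# the same values written as text, as in the wrong version
text_answers <- as.character(answers)
class(text_answers)   # "character"
# sum(text_answers) stops with an error, because text cannot be added
# mean(text_answers) gives NA and a warning
```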

This is basic, and everybody should know it from the previous course. The two values are even shown in different colors. Making this mistake is like confusing DNA with RNA, or lactose with lactase. It is either ignorance of basic facts or lack of professionalism. In either case, people who keep making these mistakes are going to be lousy molecular biologists.

Second question

This was essentially the same as the last two homework assignments. To make your life simpler, there was no need to use seqinr or download big files. If you attended the last three classes and wrote the important formulas in your notebook, it was easy. If you had solved the homework, it was very easy, since it was exactly the same formula with different values.

Third question

This was exactly the same question as the “same birthday” homework. The only change was that, instead of 365 days, we had 1000 “days”.

Many people got confused in this question. The simple version is this: create a vector of size num_people with numbers between 0 and 999. Each person gets a single number.

The biological interpretation (which confused many students, but which is not necessary for the answer) is that each number from 0 to 999 can be written with three digits, from 000 to 999. Each digit has ten possible values, each one with the same probability. This is the case when there are 3 loci, each one taking one of 10 alleles. As usual, we can forget about all the biochemical characteristics of the genome, and just represent it with a number.
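The three-loci picture can be written down as a short sketch. The names `locus1`, `locus2`, `locus3` and `genotype`, and the group size of 5, are hypothetical and chosen only for illustration. Each locus gets one of ten alleles, and the three digits are read together as one number.

```r
num_people <- 5   # hypothetical group size

# one allele (a digit from 0 to 9) per locus, for each person
locus1 <- sample(0:9, size=num_people, replace=TRUE)
locus2 <- sample(0:9, size=num_people, replace=TRUE)
locus3 <- sample(0:9, size=num_people, replace=TRUE)

# read the three digits as a single number from 000 to 999
genotype <- 100*locus1 + 10*locus2 + locus3
genotype
```

Since every combination of three digits is equally likely, this gives the same kind of vector as drawing one number per person directly.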

The code to create this vector is something like this

sample(0:999, size=num_people, replace=TRUE)

but many people got confused and wrote something like this

### WRONG ###
sample(num_people, size=1000, replace=TRUE)

It is essential not to confuse the meaning of each part of the functions we use. Again, it is the same as following a laboratory protocol. If you confuse, you lose.
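A quick way to see what each argument means is to look at the size and range of both results. In this sketch `num_people <- 30` is an assumed value, `right` and `wrong` are hypothetical names, and the values are drawn from 0:999 so that there are exactly 1000 possible numbers.

```r
num_people <- 30   # hypothetical group size

# first argument: the values to choose from; size: how many to draw
right <- sample(0:999, size=num_people, replace=TRUE)
wrong <- sample(num_people, size=1000, replace=TRUE)

length(right)   # num_people values, one per person
range(right)    # all inside 0..999

length(wrong)   # 1000 values, not one per person
range(wrong)    # all inside 1..num_people
```

When the first argument is a single number n, `sample()` draws from 1:n, so the wrong version creates 1000 "people" who can only take num_people different values.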

The rest of the question is identical to the homework, changing only the function names. This is just one of the many cases where the solution of one problem can also solve another. And that is the most important thing that you need to learn in these courses.
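As a reminder of the shape of that homework, here is a minimal sketch. It assumes the question asks for the probability that at least two people in a group share the same number. The function name `shared_number`, the group size of 40, and the number of simulations are all hypothetical.

```r
set.seed(1)          # only to make the sketch reproducible
num_people <- 40     # hypothetical group size
n_sim <- 10000       # number of simulated groups (assumed)

# TRUE when at least two people in one simulated group share a number
shared_number <- function(num_people) {
  genotypes <- sample(0:999, size=num_people, replace=TRUE)
  any(duplicated(genotypes))
}

results <- replicate(n_sim, shared_number(num_people))
simulated <- mean(results)   # proportion of groups with a shared number

# exact value from the birthday formula with 1000 "days", for comparison
exact <- 1 - prod((1000 - 0:(num_people - 1)) / 1000)
c(simulated=simulated, exact=exact)
```

Replacing 0:999 with 1:365 gives the original birthday version, which is the whole point: the same solution works for both problems.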


Do all the homework. Talk to your classmates, but find your own solution. The thing to learn is not any particular answer; it is how to make your own answer.

Think of a cooking course. Do you want to learn how to prepare the same dishes that everybody else can prepare (and become an underpaid, easy-to-replace cook)? Or do you want to learn how to create new dishes that nobody has tasted before (and become a successful and famous chef)?

Find your own answers. If you do not find them, find your own questions.