Blog of Andrés Aravena
CMB1: Some comments about the Exercise for Midterms

06 November 2017

As you remember, I gave you an exercise to prepare you for the midterm exam of Computing in Molecular Biology 1. A few (very few) of you have sent me your answers. Some people asked questions about parts 3 and 8, which I think are interesting for all students. Here are my comments.

Question 3 is about general knowledge of computers. You need to know how many pictures and how much music you can fit on your cellphone, on the SD card of a camera, on a USB stick or in an email attachment. First you have to know that storage (for example, inside the cellphone) is measured in bytes, maybe with the prefix kilo, mega or giga. Interestingly, transmission speed is measured in bits per second, so when you buy a cellphone you get memory in gigabytes and the internet plan (3G, 4G, LTE, etc.) in megabits per second. Since one byte is 8 bits, you have to convert between the two units before comparing them.
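To see why the difference between bits and bytes matters, here is a small sketch in R. The file size and the connection speed are made-up values chosen only for illustration; they are not part of the exercise.

```r
# Hypothetical values, for illustration only
file_megabytes <- 100        # size of a file, in megabytes
speed_megabits <- 20         # connection speed, in megabits per second

# One byte is 8 bits, so convert the file size to megabits first
file_megabits <- file_megabytes * 8

# Time needed to transfer the file, in seconds
seconds_needed <- file_megabits / speed_megabits
seconds_needed
```

If you forget the factor of 8, you will think the transfer is eight times faster than it really is.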

In the question about pixels, the key word is “grayscale”, which means black, white and shades of grey. For us it means that each pixel takes one byte. If the image were not grayscale but “color”, then we would need 3 bytes for each pixel: one each for red, green and blue.

You also have to know that “mega” means one million. Therefore “12 megapixels” means “12 million pixels”, which at one byte per pixel means “12 million bytes”, that is, “12 megabytes”.
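The same reasoning can be written as a few lines of R. This is only a sketch: it assumes an uncompressed image with 1 byte per grayscale pixel and 3 bytes per color pixel, and the variable names are my own choice.

```r
# Uncompressed image size, assuming 1 byte per grayscale pixel
# and 3 bytes per color pixel
mega   <- 1e6                 # "mega" means one million
pixels <- 12 * mega           # 12 megapixels

gray_bytes  <- pixels * 1     # grayscale: 1 byte per pixel
color_bytes <- pixels * 3     # color: 3 bytes per pixel

gray_bytes / mega             # size in megabytes, grayscale
color_bytes / mega            # size in megabytes, color
```

Real photo files are usually smaller because they are compressed, which this calculation ignores.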

In the question about low quality audio, you start with the fact that “ten minutes” is 600 seconds. You have 8000 values (“samples”) each second, each one with 8 bits (i.e. 1 byte). Then you multiply carefully: 600 seconds × 8000 samples per second × 1 byte per sample = 4,800,000 bytes, or 4.8 megabytes. The same logic applies to the high quality stereo audio, but with different values.
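The multiplication can be checked in R with a small helper function. Note that audio_bytes() is a hypothetical function written for this post, not something that comes with R, and it assumes uncompressed audio.

```r
# Size in bytes of uncompressed audio.
# audio_bytes() is a hypothetical helper, not part of R.
audio_bytes <- function(seconds, samples_per_second, bits_per_sample, channels = 1) {
  bytes_per_sample <- bits_per_sample / 8    # 8 bits = 1 byte
  seconds * samples_per_second * bytes_per_sample * channels
}

ten_minutes <- 10 * 60                       # 600 seconds

# Low quality audio: 8000 samples per second, 8 bits per sample, one channel
low_quality <- audio_bytes(ten_minutes, 8000, 8)
low_quality                                  # size in bytes
low_quality / 1e6                            # size in megabytes
```

For the high quality stereo case you would call the same function with the values given in the exercise and channels = 2.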

This question is not hard, except that there are some hidden clues: “grayscale” means 1 byte per pixel, and “stereo” means Left and Right audio channels, so you have 2 times the amount of data. In the same way, “8 bits per sample” means 1 byte per sample, and “16 bits per sample” means 2 bytes per sample. These clues are easy to spot if you often read about science or technology (which you should do if you want to be a good scientist).

In question 8 you need to combine two parts. First, the expression longley$Armed.Forces > 300 will give you a logical vector with TRUE values in the positions where the variable Armed.Forces is greater than 300 (and FALSE where the value is less than or equal to 300). Then you use that logical vector as an index for the longley$Year vector.
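The two parts can be tried step by step in R. The longley data frame comes with R; the name is_large is just one I chose for the intermediate logical vector.

```r
# longley is a data frame that ships with R (package "datasets")
is_large <- longley$Armed.Forces > 300   # logical vector, one value per row
is_large

# Use the logical vector as an index: keep only the years where it is TRUE
longley$Year[is_large]

# The same thing in a single line
longley$Year[longley$Armed.Forces > 300]
```

Looking at is_large by itself before using it as an index is a good way to check that your question is the one you wanted to ask.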

This is the most important case. We will use this strategy very often. You “ask” a yes-no question, that is, you compare a vector to a fixed value, and you get a logical vector as the answer. Then you use that logical vector as an index of the original vector, or of the rows of a data frame, or of any other vector of the same length. Please study this case well.
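As a sketch of the other uses mentioned here, the same logical vector can select rows of the data frame or elements of another column. The choice of the Unemployed column is only an example.

```r
# Ask the yes-no question once and keep the answer
is_large <- longley$Armed.Forces > 300

# Index the rows of the data frame (note the comma: rows first, then columns)
longley[is_large, ]

# Index another vector of the same length
longley$Unemployed[is_large]

# sum() counts the TRUE values, that is, how many rows answer "yes"
sum(is_large)
```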