Homework after Class 16

16 April 2017. Deadline: Monday, 1 May, 14:00.

This week's homework is to replicate the tables and graphics of the Comparative Genometrics website, which provides precomputed statistics for the DNA sequences of several thousand bacteria.

Please take a look at the page for E. coli K-12 on Comparative Genometrics. You will see that the graphics are built from the table CP009685.txt.

Please write R code that reads the E. coli genome and produces a table equivalent to CP009685.txt. Looking at the table, you can see that the step size is 1000 nt, that the column pos is the average of each window's start and end, and that the columns nA, nC, nG and nT are the output of table() for that window. The columns GCsk and TAsk are easy to calculate from those four counts.
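The windowing described above can be sketched on a toy sequence as follows. This is only an illustration, not the reference solution: the function name window_stats, its arguments, and the 12-letter toy sequence are made up here, and the skew formulas (G-C)/(G+C) and (T-A)/(T+A) are the usual definitions, assumed rather than taken from the website. Check them, and the exact convention for pos, against CP009685.txt.

```r
# Sketch: per-window base counts and skews, base R only.
# `window_stats`, `dna` and `step` are hypothetical names.
window_stats <- function(dna, step = 1000) {
  # dna: character vector with one letter per element
  starts <- seq(1, length(dna), by = step)
  rows <- lapply(starts, function(s) {
    e <- min(s + step - 1, length(dna))  # last window may be shorter
    # factor() with fixed levels keeps a zero count for absent bases
    n <- table(factor(toupper(dna[s:e]), levels = c("A", "C", "G", "T")))
    data.frame(pos = (s + e) / 2,
               nA = n[["A"]], nC = n[["C"]],
               nG = n[["G"]], nT = n[["T"]])
  })
  tab <- do.call(rbind, rows)
  # Assumed skew definitions; a window with no G and no C gives NaN
  tab$GCsk <- (tab$nG - tab$nC) / (tab$nG + tab$nC)
  tab$TAsk <- (tab$nT - tab$nA) / (tab$nT + tab$nA)
  tab
}

# Toy sequence split into single letters, windows of 6 nt
toy <- strsplit("AAGGGCTTACGT", "")[[1]]
result <- window_stats(toy, step = 6)
print(result)
```

For the real genome, dna would be the vector of letters read from the sequence file and step would stay at its default of 1000.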

You will need to research and understand how the columns cGCsk and cTAsk are built. The function cumsum() may be useful, but you can get the same result with a for loop.
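To see that cumsum() and a for loop give the same answer, here is a small sketch on made-up numbers. It assumes that a cumulative column is simply the running sum of the per-window values, which you should confirm against the website's table; the vector GCsk below is invented for illustration and the variable names are not from the original table code.

```r
# Toy per-window skew values (made up for illustration)
GCsk <- c(0.5, 0.0, -0.25, 0.25)

# Running sum with cumsum()
cGCsk <- cumsum(GCsk)

# The same running sum with an explicit loop
cGCsk_loop <- numeric(length(GCsk))
total <- 0
for (i in seq_along(GCsk)) {
  total <- total + GCsk[i]
  cGCsk_loop[i] <- total
}

print(cGCsk)
print(cGCsk_loop)
```

Both vectors should be identical; cumsum() is shorter and avoids the bookkeeping of the loop.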

P.S. Can you write a function that returns the reverse complement of a DNA sequence?
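One possible approach, given only as a sketch: complement every base with chartr() and then reverse the order of the letters. The function name reverse_complement is made up here, and the sketch assumes the sequence is stored as a single string containing only A, C, G and T (upper or lower case); any other letter is left unchanged.

```r
# Sketch: reverse complement of DNA stored as character strings.
# `reverse_complement` is a hypothetical name.
reverse_complement <- function(dna) {
  # chartr() swaps each base for its complement, letter by letter
  comp <- chartr("ACGTacgt", "TGCAtgca", dna)
  # split each string into letters, reverse them, and glue them back
  vapply(strsplit(comp, ""),
         function(x) paste(rev(x), collapse = ""),
         character(1))
}

rc <- reverse_complement("AAGGGCT")
print(rc)
```

A quick sanity check is that applying the function twice returns the original sequence.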

