This week's homework is to replicate the tables and graphics of the Comparative Genometrics website, which provides precomputed statistics for the DNA sequences of several thousand bacteria.
Please take a look at the E. coli K-12 page on Comparative Genometrics. You can see that the graphics are based on the table CP009685.txt.
Please write R code to read the genome of E. coli and produce a table equivalent to CP009685.txt.
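As a starting point for reading the genome, here is a minimal sketch in base R. It assumes the genome is stored as a FASTA file with a single record; the function name read_genome and the demo sequence are hypothetical, and a package-based FASTA reader would work just as well.

```r
# Minimal FASTA reader in base R.
# Assumption: the file holds one record (one ">" header followed by sequence lines).
read_genome <- function(path) {
  lines <- readLines(path)
  # Drop header lines, which start with ">"
  seq_lines <- lines[!startsWith(lines, ">")]
  # Join the sequence lines and split them into single upper-case letters
  toupper(strsplit(paste(seq_lines, collapse = ""), "")[[1]])
}

# Self-contained demonstration with a temporary file
# (made-up sequence, not real E. coli data)
tmp <- tempfile(fileext = ".fasta")
writeLines(c(">demo sequence", "acgtac", "GGTTAA"), tmp)
genome <- read_genome(tmp)
length(genome)
```

The result is a character vector with one base per element, which is a convenient input for table().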
Note that the step size is 1000 nt, the column pos is the average of start and end, the columns nA, nC, nG and nT come from the output of table(), and GCsk and TAsk are easy to calculate from those counts.
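The per-window columns could be built along the following lines. This is only a sketch under several assumptions that you should check against CP009685.txt: windows do not overlap, a trailing partial window is dropped, positions are 1-based, and the skews are defined as (G - C)/(G + C) and (T - A)/(T + A). The function name window_stats and the demo sequence are hypothetical.

```r
# Count bases and compute skews in consecutive windows.
# Assumptions (verify against the reference table): non-overlapping windows,
# 1-based positions, trailing partial window dropped, genome at least one
# window long, skew = (G - C)/(G + C) and (T - A)/(T + A).
window_stats <- function(genome, step = 1000) {
  bases <- c("A", "C", "G", "T")
  starts <- seq(1, length(genome) - step + 1, by = step)
  ends <- starts + step - 1
  # One column of counts per window; factor() with fixed levels
  # keeps a zero count for bases that are absent from a window
  counts <- vapply(starts, function(s) {
    window <- genome[s:(s + step - 1)]
    as.vector(table(factor(window, levels = bases)))
  }, integer(4))
  rownames(counts) <- bases
  nA <- counts["A", ]
  nC <- counts["C", ]
  nG <- counts["G", ]
  nT <- counts["T", ]
  # A window with no G and no C (or no T and no A) gives NaN for that skew
  data.frame(start = starts, end = ends, pos = (starts + ends) / 2,
             nA = nA, nC = nC, nG = nG, nT = nT,
             GCsk = (nG - nC) / (nG + nC),
             TAsk = (nT - nA) / (nT + nA))
}

# Made-up 12-nt sequence with a window of 4, for illustration only
genome <- strsplit("AACGGGTTACGT", "")[[1]]
stats <- window_stats(genome, step = 4)
stats
```

For the real genome, call the function with step = 1000 and compare the first few rows with the reference table to confirm the sign convention and the position offsets.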
You have to research and understand how to make the columns cGCsk and cTAsk. The function cumsum() may be useful, but you can do the same with a for loop.
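To illustrate the hint, the sketch below shows that cumsum() and a for loop give the same running total. It assumes that cGCsk is simply the running sum of the per-window GCsk values, which you should confirm against the reference table; the input values are made up.

```r
# Made-up per-window skew values, for illustration only
GCsk <- c(0.10, -0.05, 0.20, -0.30)

# Running total with cumsum()
cGCsk <- cumsum(GCsk)

# The same running total with an explicit for loop
cGCsk_loop <- numeric(length(GCsk))
total <- 0
for (i in seq_along(GCsk)) {
  total <- total + GCsk[i]
  cGCsk_loop[i] <- total
}

# Compare the two results (all.equal tolerates floating-point rounding)
all.equal(cGCsk, cGCsk_loop)
```

The same pattern applies to TAsk and cTAsk.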
Good luck with your work.
PS. Can you make a function to produce the reverse complement of a DNA sequence?
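One possible approach, given here as a sketch rather than the expected answer: complement each base with chartr(), then reverse the order of the characters. The function name reverse_complement is hypothetical, and the function assumes its input is a single character string.

```r
# Reverse complement of a DNA sequence given as one character string.
reverse_complement <- function(dna) {
  # chartr() swaps each base for its complement;
  # any other character (for example N) is left unchanged
  complement <- chartr("ACGTacgt", "TGCAtgca", dna)
  # Split into single characters, reverse their order, and join again
  paste(rev(strsplit(complement, "")[[1]]), collapse = "")
}

reverse_complement("AACGT")  # "ACGTT"
```

If the genome is stored as a vector of single letters instead, the same idea works with rev() applied directly to the complemented vector.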