The homework questions for March 15, 2016 are these:
Genes and Proteins
- What is the genetic code? How was it discovered? Why was George Gamow wrong?
- Given the FASTA file of a prokaryotic protein-coding gene, how can you get the sequence of the protein?
- Write a function to transform the sequence of a gene into the corresponding protein.
- How can we combine a FASTA file and a GFF file to get the gene sequences?
- How can we read a GFF file in R?
- What is an ORF? What is a CDS?
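As a starting point for the translation question above, here is a minimal sketch in base R. The names `genetic_code` and `translate_gene` are made up for this illustration and are not part of the class template. The sketch assumes the gene is given as a single DNA string on the coding strand, starting at the first base of the start codon. It uses the standard genetic code only, so it does not handle alternative start codons such as GTG or TTG, which bacteria read as methionine at the start of a gene.

```r
# Standard genetic code, built from the T, C, A, G ordering of the
# three codon positions (first base changes slowest, third fastest).
bases <- c("T", "C", "A", "G")
codons <- paste0(rep(bases, each = 16),
                 rep(rep(bases, each = 4), times = 4),
                 rep(bases, times = 16))
amino_acids <- strsplit(
  "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG",
  "")[[1]]
genetic_code <- setNames(amino_acids, codons)

# Hypothetical helper: translate a DNA string codon by codon,
# stopping at the first stop codon.
translate_gene <- function(dna) {
  dna <- toupper(dna)
  n_codons <- nchar(dna) %/% 3          # ignore trailing incomplete codon
  if (n_codons == 0) return("")
  starts <- seq(1, by = 3, length.out = n_codons)
  triplets <- substring(dna, starts, starts + 2)
  protein <- genetic_code[triplets]     # lookup by codon name
  protein[is.na(protein)] <- "X"        # unknown codons (e.g. containing N)
  stop_pos <- which(protein == "*")
  if (length(stop_pos) > 0) {
    protein <- protein[seq_len(stop_pos[1] - 1)]
  }
  paste(protein, collapse = "")
}

translate_gene("ATGGCCAAATAA")
```

The lookup table is a named character vector, so translating all codons is a single indexing operation rather than a loop.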
Write an HTML document (using R Markdown) describing the location of the replication origin (ori) of E. coli (strain K12, accession NC_000913.3). Then make a copy and apply it to Acidithiobacillus ferrooxidans strain ATCC 23270.
You can use the same function three times in each document with different parameters to draw the GC skew for windows of length 1,000, 10,000 and 100,000 bp. A good starting point is the template given in class.
- How would you validate this result experimentally?
- How do we find the exact position of the replication origin?
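The idea of one function called with different window lengths could be sketched as follows. This is not the class template: `gc_skew` is a hypothetical name, the sketch assumes the genome is already loaded as one long character string, and it uses non-overlapping windows with the usual definition of skew, (G - C) / (G + C).

```r
# Hypothetical helper: GC skew in consecutive, non-overlapping windows.
# dna    : the whole sequence as a single character string
# window : window length in bases
gc_skew <- function(dna, window) {
  bases <- strsplit(toupper(dna), "")[[1]]
  stopifnot(length(bases) >= window)
  # Start position of every complete window; a shorter last piece is dropped.
  starts <- seq(1, length(bases) - window + 1, by = window)
  skew <- sapply(starts, function(s) {
    w <- bases[s:(s + window - 1)]
    g <- sum(w == "G")
    c <- sum(w == "C")
    if (g + c == 0) 0 else (g - c) / (g + c)
  })
  data.frame(position = starts, skew = skew)
}

# Toy example: ten G followed by ten C, windows of length 5.
toy <- paste0(strrep("G", 10), strrep("C", 10))
gc_skew(toy, 5)

# With a real genome string (here assumed to be in a variable `genome`):
# for (w in c(1000, 10000, 100000)) {
#   s <- gc_skew(genome, w)
#   plot(s$position, s$skew, type = "l", main = paste("window =", w))
# }
```

In the toy example the skew is 1 in the first two windows and -1 in the last two, so the sign change sits at position 11. On a real genome the signal is noisy for small windows; plotting `cumsum(s$skew)` is one common way to make the change of sign easier to see.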
Describe the other apply functions: