Blog of Andrés Aravena
CMB2:

Homework 3

22 March 2021. Deadline: Friday, 26 March, 8:59.

In class 4 we learned how to calculate the GC content of each gene. Now we will calculate the GC skew. If \(G\) and \(C\) are the numbers of guanines and cytosines in a sequence, the GC skew is defined as

\[\frac{G-C}{G+C}\]
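
As a quick check of the formula, here is a minimal sketch in R on a made-up sequence. The name toy_seq is hypothetical, and storing the sequence as a vector of single lowercase letters is an assumption about the data format, not something fixed by this homework.

```r
# Toy sequence stored as a vector of single letters (made-up data, not a real gene)
toy_seq <- c("g", "g", "g", "c", "a", "t", "g", "c")

# Count the two nucleotides that appear in the formula
n_G <- sum(toy_seq == "g")   # number of guanines
n_C <- sum(toy_seq == "c")   # number of cytosines

# GC skew = (G - C) / (G + C)
toy_skew <- (n_G - n_C) / (n_G + n_C)
toy_skew
```

This toy sequence has 4 G and 2 C, so the skew is 2/6, about 0.33. The value always lies between -1 (only C) and 1 (only G), and it is undefined when the sequence has no G and no C.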

  1. Draw a scatter plot of the GC content and GC skew of each E. coli gene. Put GC content on the horizontal axis and GC skew on the vertical axis.

    • Write the function calculate_GC_skew() and use sapply() to apply it over each gene.
    • Do the same for GC content.
  2. Calculate the AT skew of each gene, defined analogously to the GC skew as \((A-T)/(A+T)\). Draw a scatter plot of GC skew and AT skew.

    • You should create a new function calculate_AT_skew() and use sapply() again.
  3. The DNA sequences in FASTA files represent only one of the two strands. Often we need to know the other strand.

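    To do so we need to calculate the reverse-complement sequence. That is, a sequence where the first letter is the complement of the last letter in the original sequence, the second letter is the complement of the second-to-last letter, and so on. For example, the reverse complement of “AACG” is “CGTT”.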

    The complement of “A” is “T”, and vice versa. The complement of “C” is “G”, and the other way around.

    Please write, in English, a detailed plan to get the reverse complement of a DNA sequence.
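
The sapply() pattern asked for in questions 1 and 2 can be sketched on toy data as follows. The list toy_genes and the functions count_G() and count_C() are hypothetical names used only for illustration. Your real genes come from the FASTA file used in class, and your own functions will return skews rather than simple counts.

```r
# Hypothetical toy data: a named list with one vector of letters per gene
toy_genes <- list(
  geneA = c("a", "t", "g", "g", "c"),
  geneB = c("g", "g", "g", "t"),
  geneC = c("c", "c", "a", "g")
)

# Illustrative functions: each takes ONE sequence and returns ONE number
count_G <- function(sequence) {
  sum(sequence == "g")
}
count_C <- function(sequence) {
  sum(sequence == "c")
}

# sapply() calls the function once per gene and collects the results
# in a named numeric vector, one value per gene
g_per_gene <- sapply(toy_genes, count_G)
c_per_gene <- sapply(toy_genes, count_C)

# Scatter plot: one point per gene
plot(g_per_gene, c_per_gene,
     xlab = "Number of G", ylab = "Number of C")
```

The point of the pattern is that the function only needs to know how to handle a single gene. The sapply() call takes care of repeating it over all genes, so the same two lines work for three toy genes or for every gene in a genome.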


Originally published at https://anaraven.bitbucket.io/blog/2021/cmb2/homework-3.html