Next week's class will cover the following topics:
- What is a Transcription Factor (TF)?
- What is a Binding Site (BS)?
- What is a Motif? What is a Regular Expression?
- What is a Position Specific Score Matrix?
- How do we find Transcription Factor Binding Sites?
The following programs were meant to be developed in class today, but we ran out of time.
(This question is hard and was replaced by question 1 of the following week’s homework.) Write a function that finds, in a bacterial genome, all ORFs longer than a minimum size. The inputs are:
- genome: a vector of characters
- min.size: a number
The output should be a list of vectors of characters, one per ORF. Can you tell which ORFs are CDS?
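One possible way to approach the ORF question is sketched below in Python; it is not the course's reference solution. Several choices are assumptions not stated in the question: an ORF is taken to run from an `ATG` through the first in-frame stop codon (inclusive), both strands are scanned in all three frames, "larger than a minimum size" is read as a length in nucleotides of at least `min_size`, and the names `find_orfs`, `orfs_in_strand`, and `reverse_complement` are hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    # Bases other than A, C, G, T are mapped to N.
    return "".join(COMPLEMENT.get(base, "N") for base in reversed(seq))


def orfs_in_strand(seq, min_size):
    """Yield ORFs (ATG through the first in-frame stop) in the three frames."""
    for frame in range(3):
        start = None  # position of the first ATG seen since the last stop
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i
            elif start is not None and codon in STOP_CODONS:
                # Assumption: size is counted in nucleotides, stop included.
                if i + 3 - start >= min_size:
                    yield seq[start:i + 3]
                start = None


def find_orfs(genome, min_size):
    """Return a list of ORFs, each as a list of characters, from both strands."""
    seq = "".join(genome).upper()
    found = list(orfs_in_strand(seq, min_size))
    found += list(orfs_in_strand(reverse_complement(seq), min_size))
    return [list(orf) for orf in found]
```

For example, `find_orfs(list("CCATGAAATAGCC"), 6)` returns a single ORF spelling `ATGAAATAG`. This sketch treats the genome as linear, so an ORF that spans the origin of a circular chromosome would be missed.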
Write a function that reads a GFF file and a FASTA file containing a genome, and writes a new FASTA file with the amino acid sequences of the proteins encoded by the CDS. The inputs are:
- The name of the GFF file
- The name of the FASTA file with the genome
- The name of the new FASTA file that will contain the protein sequences
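The three inputs above could be wired together roughly as follows. This is a minimal Python sketch under stated assumptions, not a definitive solution: the GFF file is tab-separated with nine columns and 1-based inclusive coordinates, each `CDS` row is translated on its own (multi-row CDS are not joined), the standard genetic code is used and alternative start codons are not rewritten to `M`, and the names `read_fasta`, `translate`, and `write_proteins` are hypothetical.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def read_fasta(path):
    """Return a dict mapping each record id to its upper-case sequence."""
    records, name = {}, None
    with open(path) as handle:
        for line in handle:
            line = line.strip()
            if line.startswith(">"):
                name = line[1:].split()[0]  # id is the first word of the header
                records[name] = []
            elif line and name is not None:
                records[name].append(line.upper())
    return {name: "".join(parts) for name, parts in records.items()}


def translate(dna):
    """Translate codon by codon; unknown codons become X, a final stop is dropped."""
    protein = "".join(CODON_TABLE.get(dna[i:i + 3], "X")
                      for i in range(0, len(dna) - 2, 3))
    return protein[:-1] if protein.endswith("*") else protein


def write_proteins(gff_path, genome_path, out_path):
    """Write one protein record per CDS row of the GFF file."""
    genome = read_fasta(genome_path)
    with open(gff_path) as gff, open(out_path, "w") as out:
        for line in gff:
            if line.startswith("##FASTA"):
                break  # an embedded sequence section ends the annotation
            fields = line.rstrip("\n").split("\t")
            if line.startswith("#") or len(fields) < 9 or fields[2] != "CDS":
                continue
            seqid, strand = fields[0], fields[6]
            start, end = int(fields[3]), int(fields[4])
            if seqid not in genome:
                continue
            dna = genome[seqid][start - 1:end]  # 1-based inclusive to slice
            if strand == "-":
                dna = dna.translate(COMPLEMENT)[::-1]
            attrs = dict(p.split("=", 1) for p in fields[8].split(";") if "=" in p)
            # Assumption: use the ID attribute as the name, else the coordinates.
            name = attrs.get("ID", f"{seqid}_{start}_{end}")
            out.write(f">{name}\n{translate(dna)}\n")
```

A call such as `write_proteins("genome.gff", "genome.fasta", "proteins.fasta")` (file names are placeholders) writes each protein on a single unwrapped line; wrapping at a fixed width is left out to keep the sketch short.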