Blog of Andrés Aravena
Bioinfo:

Homework 1

19 September 2019. Deadline: Tuesday, 24 September, 13:00.

One of the basic problems we want to address in this course is finding a pattern (such as a word, a motif, a gene, or a protein domain) in a larger text (such as a novel, a genome, or a protein). For example, we would like to know where the word Sancho occurs in the file Don_Quixote.txt.

Your mission is to write a function (in any reasonable computer language) that takes two inputs, pattern and text, and returns the set of locations where pattern occurs in text.

For example, if pattern="BR" and text="ABRACADABRA", then your function should return 2 and 9. (In some languages, such as C++, Java and Python, indices start at 0, so in that case the result is 1 and 8.)
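To make the expected behavior concrete, here is a minimal sketch in Python of the simplest possible approach: try every starting position and compare. The name find_occurrences and the choice to return an empty list for an empty pattern are assumptions made for this illustration, not part of the assignment, and your own solution may look quite different.

```python
def find_occurrences(pattern, text):
    """Return the 0-based start positions where pattern occurs in text."""
    positions = []
    m = len(pattern)
    if m == 0:
        # Assumption for this sketch: an empty pattern matches nowhere.
        return positions
    # Slide a window of length m over the text and compare at each position.
    for start in range(len(text) - m + 1):
        if text[start:start + m] == pattern:
            positions.append(start)
    return positions


print(find_occurrences("BR", "ABRACADABRA"))  # [1, 8]
```

Since Python indices start at 0, the result is 1 and 8. Overlapping occurrences are reported too: searching for "AA" in "AAAA" gives 0, 1 and 2.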

  1. Write the function, and test it with a FASTA file and with Don_Quixote.txt. Try several patterns.
  2. How long does your function take to find all matching places? What factors affect the execution time?
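One possible way to approach the timing question is sketched below in Python. The helper names find_all and time_search are hypothetical, the search itself is delegated to Python's built-in str.find rather than written by hand, and the text is synthetic so that the example runs without any input file. Replace the synthetic text with the contents of your FASTA file or Don_Quixote.txt to measure real cases.

```python
import time


def find_all(pattern, text):
    """Hypothetical helper: 0-based positions of pattern in text,
    overlapping matches included, using the built-in str.find."""
    positions = []
    if not pattern:
        return positions
    i = text.find(pattern)
    while i != -1:
        positions.append(i)
        i = text.find(pattern, i + 1)
    return positions


def time_search(pattern, text, repeats=5):
    """Return the best wall-clock time (seconds) over several runs."""
    best = float("inf")
    for _ in range(repeats):
        t0 = time.perf_counter()
        find_all(pattern, text)
        best = min(best, time.perf_counter() - t0)
    return best


# Synthetic DNA-like text of increasing length (an assumption for this demo).
for n in (10_000, 100_000, 1_000_000):
    text = "ACGT" * (n // 4)
    print(n, time_search("GTAC", text))
```

Taking the best of several runs reduces the noise from other processes. Varying the text length, the pattern length, and the number of matches one at a time is a simple way to see which factors matter.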


Originally published at https://anaraven.bitbucket.io/blog/2019/bioinfo/homework-1.html