Welcome

to “Computing for Molecular Biology 1”

We will learn about

  • computing
  • data science
  • descriptive statistics
  • documenting results

and maybe (if we have time)

  • how computers work
  • how the internet works

My name is Andrés Aravena

Even if my passport says

Andrés Octavio Aravena Duarte

I have a short and a long version of my name

Türkçe bilmiyorum (I do not speak Turkish)

I am

  • Assistant Professor at the Molecular Biology and Genomics Department
  • Mathematical Engineer, U. of Chile
  • PhD Informatics, U Rennes 1, France
  • PhD Mathematical Modeling, U. of Chile
  • not a Biologist
  • but an Applied Mathematician who can speak “biologist language”

I’ve worked on

  • Big and small computers
  • Telecommunication Networks
  • Between 2003 and 2014 I was the chief research engineer
    • of the main bioinformatics group
    • at the top research center (CMM)
    • of the top university (University of Chile)
    • in my country

I come from Chile

Chile

Small country of ~17 million people

University rankings are similar to Turkish ones

Spanish colony 500 years ago (so the language is Spanish)

Independent Republic 200 years ago

First Latin American country to recognize the Turkish Republic

Everyday life is very similar to Turkey’s

Chileans like Turkish soap operas

The most successful soap opera last year was Bin Bir Gece

Latin America in Turkey

Foreigners enrich their host countries. Just look at the food:

  • Corn is from North and South America
    • Spanish name is maíz
  • Tomato is Mexican: tomates
  • Potato is from Chile and Peru: patatas

Diversity increases opportunities

I am American

USA is not America

America is the continent, from Patagonia to Alaska

The USA is one country in part of North America

The same way as Australia is not Oceania

English is not my native language

If you don’t understand me, just ask

The only silly question is the one not asked

If you don’t understand, probably you are not the only one

Ask for help, from me or your classmates

How this course works

The Goal

The purpose of this course is to learn to think with data

Not to study

Not to teach

But to learn

What is learning?

To learn is to change our behavior rules

We think differently, we act differently

We think with two strategies:

  • analytical
  • automatic

So learning has to be in two stages

How do we learn?

We learn what we do, not what we study

Like riding a bike

Can you learn to dance only by reading Wikipedia?

Some theory, a lot of practice

Failure is expected. We learn from failure. Try judo

Why is learning hard?

Not always. Kids learn easily.

We have difficulty learning new things because we have to unlearn old habits.

So we will need to make an extra effort

Cognitive styles

how people remember

According to cognitive research, there are three main ways of remembering

  • A few people remember what they see
  • Some people remember what they hear
  • Most people remember what they do
    • Kinesthetic sense: body position

Moreover, logic is tightly connected with language

Note taking

Google: Cornell Method

Focusing

It’s easy to get distracted,
especially when we don’t like the task

We learn by getting out of our “comfort zone”

Learning is uncomfortable, even annoying

We have to push ourselves to focus

Pomodoro technique

  • set a timer for 25 minutes
  • force yourself to focus during that period
  • no email, no Facebook, no toilet, no coffee/cigarettes
  • in case of interruption, restart from zero
  • when the bell rings STOP working for 5 minutes
  • every 3 Pomodoros, take a 20 min break

We will use this technique in this course

How memory works

We have roughly three stages of memory

  • very short term memory (seconds)
  • medium term memory (hours)
  • long term memory (years)

Transition between medium and long term memory happens when we sleep (and dream)

Ideally we need 2 sessions per week

Inverted class

there is nothing new under the sun

All the ideas we will discuss are available for free online

I understand you will learn from the web

I encourage you to do so

and then come and teach me something new

Learning strategy

my proposal

Attend classes regularly (always!)

Bring a notebook and a pen
Write down by hand what we discuss, and your own questions
Summarize at the end of the class (Cornell Method)

Speak with your classmates

Sleep well (but not during the class)

I will give a subject every week.
You have to research and present it in 25 min

Online forum

participation is not optional

Why computers?

for Molecular Biology and Genetics

Computers are rule changers

Modern computers were created to solve math equations

Then they were used to handle big databases

They became cheap and are now found everywhere

They became communication tools

They transformed society and science

How many computers do you use?

  • Cellphone
  • TV
  • Cable decoder
  • Microwave oven
  • Washing machine
  • Car motor
  • Metro
  • Elevator
  • Notebook

Computers transformed

  • the banking industry
  • the air travel industry
  • manufacturing
  • cars
  • movies
  • Science

Four Paradigms of Science

1 Empirical

  • observation of isolated facts
  • description of related facts
  • e.g. Botany

2 Theoretical

  • Abstract models and theories
  • Usually expressed in mathematical formulas
  • Correct predictions validate the models
  • e.g. Mendel’s laws of inheritance

3 Simulation Based

  • Models that cannot be expressed in formulas
  • Formulas that cannot be solved
  • e.g. Protein structure prediction

4 Data Based

  • Discovering patterns hidden in data
  • Huge volumes of data
  • Complex interactions
  • e.g. Bioinformatics

Computers

What does “computer” mean?

A computer is a counter

Originally it was a person who did calculations

Sometimes with the help of mechanical devices

During the 2nd World War people invented electronic computers

So, computers are devices handling numbers

A Computer

“but I don’t use numbers …”

Don’t worry

Using numbers we can represent other things

In my country kids replace the vowels A, E, I, O, U with the numbers 1, 2, 3, 4, 5

Then they say H2LL4 (they are just kids)

Using the same idea we can represent any text

Notice that we have represented sounds by signs for centuries
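
The kids' game can be sketched in a few lines of Python. This is only an illustration: the names `VOWEL_TO_DIGIT`, `kids_code`, `to_numbers` and `from_numbers` are made up for this example, and the second half assumes we use Unicode code points (Python's built-in `ord` and `chr`) as the numbers for each character.

```python
# Vowel table from the children's game above (hypothetical names).
VOWEL_TO_DIGIT = {"A": "1", "E": "2", "I": "3", "O": "4", "U": "5"}


def kids_code(text):
    """Replace each vowel with its digit; leave other characters unchanged."""
    return "".join(VOWEL_TO_DIGIT.get(ch, ch) for ch in text.upper())


def to_numbers(text):
    """Represent every character by a number (its Unicode code point)."""
    return [ord(ch) for ch in text]


def from_numbers(numbers):
    """Turn a list of code points back into text."""
    return "".join(chr(n) for n in numbers)


print(kids_code("hello"))                   # H2LL4
print(to_numbers("HELLO"))                  # [72, 69, 76, 76, 79]
print(from_numbers([72, 69, 76, 76, 79]))   # HELLO
```

The first function only hides the vowels; the last two show that every character can become a number and come back without loss.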

Numbers can represent other things

There are three things in the Universe

  • Matter
  • Energy
  • Information

Information can be put in digital (numeric) form

Numbers can represent a lot of things

  • Images
  • Audio
  • Movies

not yet

  • smell
  • taste
  • touch

What can a modern computer do?

Computers handle numbers

Numbers represent information

Computers can transform and transfer information

So, what is a computer?

Computer: (English) counter, calculator

Ordinateur: (French) sorter, gives order to and handles data

Bilgisayar: (Turkish) Information/Data counter

What do you do with a computer?

Do you have a computer at home?

What do you use it for?

What can a computer do?

  • calculate formulas
  • solve (some) equations
  • store and retrieve huge quantities of data
  • find patterns in data
  • find data matching a pattern
  • transform data in useful ways
  • compress data
  • move data at low cost without distortion
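
Two items from this list, finding data that matches a pattern and compressing data, can be tried with Python's standard `re` and `zlib` modules. This is a small sketch: the sequence is made up for the example (it is not real data), and the motif `GAATTC` is just one pattern to search for.

```python
import re
import zlib

# Made-up, repetitive DNA-like text: 12 letters repeated 100 times.
sequence = "ACGTGAATTCAC" * 100

# Find data matching a pattern: start positions of the motif GAATTC.
positions = [match.start() for match in re.finditer("GAATTC", sequence)]

# Compress the data, then recover it without distortion.
raw = sequence.encode("ascii")
packed = zlib.compress(raw)
restored = zlib.decompress(packed)

print(len(positions), "matches, first at position", positions[0])
print(len(raw), "bytes before,", len(packed), "bytes after compression")
print("recovered exactly:", restored == raw)
```

Repetitive data compresses very well; real sequences are less regular, so the gain would be smaller.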

Let’s play “computer”

Solving an equation

The first use of electronic computers was to solve ballistic equations

The same approach enabled landing on the moon

Let’s find the value \(x\) that satisfies \[24x^3-70x^2+19x+15=0\]

Naming the formula

Let us put a name to the formula. Let’s call it \(f(x)\). \[f(x) = 24x^3-70x^2+19x+15\]

We want to find \(x\) that makes \(f(x)=0.\) We can write \[f(x) = (24x^2-70x+19)x+15\] or even \[f(x) = ((24x-70)x+19)x+15\]
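
As a quick check, the expanded form and the fully nested form should give the same value for any \(x\). A minimal sketch in Python, with hypothetical function names `f_expanded` and `f_nested`:

```python
def f_expanded(x):
    """f(x) written with powers: 24x^3 - 70x^2 + 19x + 15."""
    return 24 * x**3 - 70 * x**2 + 19 * x + 15


def f_nested(x):
    """The same f(x), using only multiplications by x and additions."""
    return ((24 * x - 70) * x + 19) * x + 15


# Both forms agree on a few integer test values.
for x in [-2, -1, 0, 1, 2, 3]:
    assert f_expanded(x) == f_nested(x)

print(f_nested(2))   # -35
```

The nested form needs no powers, only "multiply by \(x\)" and "add a constant", which is what makes it easy to compute step by step.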

Computing f(x) given x

  • Take a piece of paper and write \(x\) in the first line
  • Write 24
  • Multiply the last two numbers
  • Add -70
  • Write \(x\) (from the first line)
  • Multiply the last two numbers
  • Add 19
  • Write \(x\) (from the first line)
  • Multiply the last two numbers
  • Add 15
  • Compare to 0
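
The steps above can be written down almost literally in Python. In this sketch the list `paper` plays the role of the sheet of paper, and the name `compute_on_paper` is made up for the example.

```python
def compute_on_paper(x):
    """Follow the paper-and-pencil steps and return (paper, is_zero)."""
    paper = [x]                            # write x in the first line
    paper.append(24)                       # write 24
    paper.append(paper[-1] * paper[-2])    # multiply the last two numbers
    paper.append(paper[-1] + (-70))        # add -70
    paper.append(paper[0])                 # write x (from the first line)
    paper.append(paper[-1] * paper[-2])    # multiply the last two numbers
    paper.append(paper[-1] + 19)           # add 19
    paper.append(paper[0])                 # write x (from the first line)
    paper.append(paper[-1] * paper[-2])    # multiply the last two numbers
    paper.append(paper[-1] + 15)           # add 15
    is_zero = paper[-1] == 0               # compare to 0
    return paper, is_zero


paper, is_zero = compute_on_paper(2)
print(paper)     # [2, 24, 48, -22, 2, -44, -25, 2, -50, -35]
print(is_zero)   # False
```

Each line of the function is one of the simple steps; nothing more complicated than writing, multiplying, adding and comparing is needed.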

Let us become a computer

I will assign roles to each one

I will write \(x\) in a sheet and give it to the first person

Each one does a single task and delivers to the next person

I will collect all results and tabulate them
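
Collecting and tabulating the results can also be sketched in Python. The search for solutions below uses bisection (repeatedly halving an interval where \(f\) changes sign); that method is an assumption of this example, not necessarily the one used in class, and the names `f`, `table`, `brackets` and `bisect` are hypothetical.

```python
def f(x):
    """The formula from the exercise, in nested form."""
    return ((24 * x - 70) * x + 19) * x + 15


# Tabulate f(x) for a few integer values of x.
table = [(x, f(x)) for x in range(-2, 5)]
for x, y in table:
    print(x, y)

# A solution lies wherever f changes sign between two neighbours.
brackets = [
    (a, b)
    for (a, fa), (b, fb) in zip(table, table[1:])
    if fa * fb < 0
]


def bisect(lo, hi, tol=1e-9):
    """Halve the interval [lo, hi] until it is shorter than tol."""
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if f(lo) * f(mid) <= 0:
            hi = mid      # the sign change is in the left half
        else:
            lo = mid      # the sign change is in the right half
    return (lo + hi) / 2


roots = [bisect(a, b) for a, b in brackets]
print(brackets)
print([round(r, 6) for r in roots])
```

For this formula the table changes sign three times, and the three approximate solutions are close to \(-1/3\), \(3/4\) and \(5/2\).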

What happened?

We solved a complex mathematical question using a simple set of rules

  • write
  • multiply
  • add
  • compare

This decomposition into simple steps is called a program

Parts of a computer

In this exercise we used

  • memory (paper)
  • arithmetic/logic units (you: adding, multiplying, deciding)
  • input/output (me)

Programs

Many different questions can be solved with the same rules

It is just a matter of changing the program

The first electromechanical computers were like us:
a sequence of devices, each one feeding the next

Changing the program required physically changing parts

Stored program

The key step

John von Neumann realized that the set of steps can be also stored in memory
(coded as numbers, obviously)

We only need to include

  • a pointer to the current instruction
  • a system to decide which arithmetic/logic rule to apply

This is called the Central Processing Unit (CPU)
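
The stored-program idea can be illustrated with a toy machine in Python. Everything here is invented for the example: the instruction codes (`LOAD`, `MUL_X`, `ADD`, `HALT`), the two-numbers-per-instruction layout, and the function `run`. Real CPUs are far more elaborate, but the two ingredients are the same: a pointer to the current instruction and a rule that decides what to do with it.

```python
# Invented instruction codes: the program itself is just a list of numbers.
HALT, LOAD, MUL_X, ADD = 0, 1, 2, 3


def run(memory, x):
    """Execute the program stored in memory and return the final result."""
    pointer = 0       # pointer to the current instruction
    result = 0        # the number we are currently working on
    while True:
        code, value = memory[pointer], memory[pointer + 1]
        pointer += 2                  # move to the next instruction
        if code == HALT:
            return result
        elif code == LOAD:
            result = value            # write a constant
        elif code == MUL_X:
            result = result * x       # multiply by x (value is ignored)
        elif code == ADD:
            result = result + value   # add a constant
        else:
            raise ValueError(f"unknown instruction code {code}")


# Program for f(x) = ((24x - 70)x + 19)x + 15, stored as plain numbers.
program_f = [LOAD, 24, MUL_X, 0, ADD, -70, MUL_X, 0, ADD, 19,
             MUL_X, 0, ADD, 15, HALT, 0]

# A different program on the same machine: g(x) = 2x + 1.
program_g = [LOAD, 2, MUL_X, 0, ADD, 1, HALT, 0]

print(run(program_f, 2))   # -35
print(run(program_g, 5))   # 11
```

The machine (`run`) never changes; only the numbers in memory do. Swapping `program_f` for `program_g` gives a different calculation on the same hardware.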

Hardware and Software

Since old times physical tools are called hardware

That includes all the physical parts of the computer
(what you can kick)

Programs determine the function of the computer, but they are not “physical”.

That is software (what you can only insult)

Biological analogy

All cell components are hardware

The sequence of the DNA is the software

In summary

What is a computer?

It is a general-purpose device that can

  • read, process and write numbers
    • (and things that can be represented by numbers)
    • to and from the memory
  • following a program stored also in the memory
    • many simple steps

Changing the program changes the purpose of the machine

In the Next Chapter

we will see …

  • How information is coded in numbers
  • How these numbers are stored and organized
  • How we interact with computers
  • Start using a specific tool: RStudio

Homework

  • Prepare a presentation about NCBI
  • Install R and RStudio
  • Register in the Google Group