Class 1: Introduction to Computational Thinking

Computing for Molecular Biology 2

Andrés Aravena, PhD

12 March 2021

Welcome to
“Computing in Molecular Biology 2”

What is this about?

The previous course was “Introduction to Data Science”

  • also known as “Computers are not typewriters”

This course is “Scientific Computing”

  • a.k.a. “Computational Thinking”
  • a.k.a. “Quantitative Thinking”

Why?

Why computers?

for Molecular Biology and Genetics

Computers are rule changers

Computers are essential tools for Molecular Biologists

  • They control the instruments

  • They help us to understand the results

  • They help us to design the experiments

We will focus on the last 2 items

Software as a Scientific Tool

"Scientists spend an increasing amount of time building and using software.

However, most scientists are never taught how to do this efficiently"

Software as a Scientific Tool

“Software is as important to modern scientific research as telescopes and test tubes”

Software as a Scientific Tool

“…recent studies have found that scientists typically spend more than 30% of their time developing software…”

Software as a Scientific Tool

“We believe that software is just another kind of experimental apparatus and should be built, checked, and used as carefully as any physical apparatus”

Software skills are important

"However, most scientists do not know how reliable their software is.

This can lead to serious errors impacting the central conclusions of published research"

Software skills are important

“Recent high-profile retractions, technical comments, and corrections because of errors in computational methods include papers in Science, PNAS, the Journal of Molecular Biology, Ecology Letters, the Journal of Mammalogy, Journal of the American College of Cardiology, Hypertension, and The American Economic Review”.

Who said so?

Wilson et al. “Best Practices for Scientific Computing.” PLoS Biology 12(1) (2014)

  • University of Ontario Institute of Technology, Canada
  • Michigan State University, USA
  • Space Telescope Science Institute, USA
  • University of Toronto, Canada
  • Monterey Bay Aquarium Research Institute, USA
  • University of California Berkeley, USA
  • University of British Columbia, Canada
  • Queen Mary University of London, United Kingdom
  • University College London, United Kingdom
  • University of California Davis, USA

Quantitative Methods

Harvard Medical School

Modern biology increasingly requires computational and quantitative methods to collect, process, and analyze data, as well as to understand and predict the behavior of complex systems.

Quantitative Methods

Harvard Medical School

Whereas throughout much of the 20th century computational and mathematical biology were niche disciplines, their methods are now becoming an integral part of the practice of biology across all fields.

Read more in the paper

Stefan et al. “The Quantitative Methods Boot Camp: Teaching Quantitative Thinking and Computing Skills to Graduate Students in the Life Sciences”. PLoS Computational Biology 11, 1–12 (2015).

Learning Goals and Objectives

The authors say:

“We broadly categorize these goals into three domains”

  • thinking
  • doing
  • feeling

The authors say:

Developing practical programming skills (“doing”) is of limited use if one does not also develop both the ability to think about problems algorithmically (“thinking”) and a positive attitude towards computing (“feeling”).

Thinking

Students should be able to

  • recognize when to use computational methods
  • analyze a problem to find a computational solution
  • use simulations to learn about biological systems
  • compare the results of simulations to real-world data

Thinking

  • formulate and test hypotheses
  • understand a project as a collection of smaller parts
  • prepare a plan to solve a problem
  • think of ways to test if the computational approach is valid

Doing

Students will be able to

  • import large datasets
  • put them into appropriate computational structures
  • visualize a dataset in multiple ways
  • compute summary statistics

(we already did this)

Doing

  • use ideas of programming to solve problems
  • use trial and error to design a computational approach
  • write a program to automate data analysis
  • find and fix errors in a piece of code

Documenting

  • read and understand documentation
  • read and understand someone else’s code
  • document their code

Feeling

Students should

  • understand the value of computational approaches
  • feel confident about solving a computational problem
  • keep working when they find a problem difficult
  • recognize that successful coding can be fun as well as useful

Feeling

  • know when to ask for help and where to find support
  • be willing and ready to learn more
  • evaluate the quality of computational methods in science
  • help the work of others with examples of good practice

How?

How will we do it?

A lot of practice

  • Solving problems from Molecular Biology

    • genome analysis
    • experimental design
  • Quizzes

  • Forum

Forum

Remember that you can ask any question related to the course

  • On the Web:

    https://groups.google.com/d/forum/iu-cmb

  • by Email:

    iu-cmb@googlegroups.com

You get 1 point for each real question, and 2 points for each practical answer

Practical issues

  • Classes on Fridays
  • 12 classes in the semester
  • Attendance will be taken at the start and end of each class
  • Students must attend at least 70% of the classes

Homework is not optional

  • We will give homework every week
  • Homework is mandatory, especially if you are doing the course again
  • Homework delivered on time counts as attendance
  • Talk with your classmates, but hand in your own individual homework

Welcome Survey

  • There is a survey on the homepage
  • Answer it. It is not optional
    • we will use this data later
  • If you answered in previous years, answer again

Deadline: End of March

This is a new course

If you did this course before, you will not be bored

We will teach different tools this time

We will not teach Turtle Graphics

We will do more Systems Simulations

DO ALL THE HOMEWORK ON TIME

This course is good for you

It will make you

  • a better Scientist
  • a better Professional
  • a better Citizen

It will make you a better Scientist

[Diagram: the cycle of science. Nature → observation → pattern → question → explanation → prediction → test → peer-review → knowledge → observation]

A scientist's work is to understand Nature

We start by Observing Nature, usually measuring values.

These are exploratory experiments.

We study this in other courses.

The thing we study must be repeatable, and we need to see that repetition.

In CMB1 we learned how to find these patterns using plots, linear models, clustering, etc.

Asking the question is the most important part.

Good answers to bad questions are useless.

Good questions are good, even if we don’t have answers

In this course we will study how to…

…answer these questions using models and explanations, and…

In this course we will study how to…

…make predictions that we can test in the lab…

These are validation experiments.

If the results do not match the prediction, we know that the explanation is wrong. Two steps back.

Now we publish our data and model, so other scientists can validate or reject it.

In CMB1 we learned how to write well organized papers.

If the paper is accepted and published, our work becomes part of our shared human knowledge.

The goal of Science is to produce new Knowledge.

Now when we observe Nature we use our new Knowledge

We look for new Patterns that raise new Questions.

It will make you a better Professional

This course will teach you

How to solve hard problems

using computational thinking

You do not need a computer

You just need a brain

Key parts of computational thinking

  • Decomposition: breaking down a complex problem or system into smaller parts
  • Pattern Recognition: looking for similarities among and within problems
  • Abstraction: focusing on the important parts only, ignoring irrelevant detail
  • Algorithms: developing a step-by-step solution to the problem
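
As a small illustration of these four ideas (not part of the original slides), here is a minimal Python sketch that computes the GC content of a few DNA sequences. The function names and the example sequences are invented for this illustration.

```python
# Hypothetical example: all names and sequences are invented for illustration.

def count_bases(sequence, bases):
    """Decomposition: one small step, counting the letters that belong to `bases`."""
    return sum(1 for letter in sequence.upper() if letter in bases)


def gc_content(sequence):
    """Algorithm: a step-by-step recipe that combines the smaller parts."""
    if len(sequence) == 0:
        return 0.0  # avoid dividing by zero on an empty sequence
    return count_bases(sequence, "GC") / len(sequence)


# Abstraction: each gene is reduced to a string of letters; all other detail is ignored.
sequences = {
    "gene_a": "ATGCGC",
    "gene_b": "ATATAT",
    "gene_c": "GGCC",
}

# Pattern recognition: the same procedure solves the problem for every sequence.
gc_by_gene = {name: gc_content(seq) for name, seq in sequences.items()}

for name, value in gc_by_gene.items():
    print(name, round(value, 2))
```

Splitting the task into `count_bases` and `gc_content` is one possible decomposition. The same small counting step could be reused for other questions, such as AT content.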

It is not about computers

Computational thinking is about

problem solving

Almost any problem can be solved using computational thinking

For example: Sports, Projects, Science

You will also learn to use a computer

This will be an advantage in any professional environment

  • You will be careful in your work,
    • pay attention to details
  • You will communicate in clear and unambiguous terms
  • You will not do the boring work
  • You may even find work as a coder, programmer, or developer

It will make you a better Citizen

This course will help you to understand complex systems, like

  • Climate change (butterfly effect)
  • Epidemics, Pandemics, and Zombie invasions
  • Economy
  • Health
  • and so on

You will see why simple explanations are usually wrong

You will learn to understand systems
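
To give a taste of what simulating a system can look like, here is a minimal Python sketch of an epidemic. It assumes a discrete-time SIR model (Susceptible, Infected, Recovered). The choice of model, the parameter values, and all names are assumptions made for illustration and are not taken from the course material.

```python
# Hypothetical sketch: the SIR model and every parameter value are assumptions.

def simulate_sir(population, infected, infection_rate, recovery_rate, days):
    """Simulate day by day; return a list of (S, I, R) tuples, one per day."""
    s = float(population - infected)  # susceptible people
    i = float(infected)               # infected people
    r = 0.0                           # recovered people
    history = [(s, i, r)]
    for _ in range(days):
        # New infections depend on contacts between susceptible and infected people.
        new_infections = min(infection_rate * s * i / population, s)
        new_recoveries = recovery_rate * i
        s -= new_infections
        i += new_infections - new_recoveries
        r += new_recoveries
        history.append((s, i, r))
    return history


# Two scenarios that differ only in the infection rate.
slow = simulate_sir(1000, 1, infection_rate=0.2, recovery_rate=0.1, days=200)
fast = simulate_sir(1000, 1, infection_rate=0.4, recovery_rate=0.1, days=200)

peak_slow = max(i for _, i, _ in slow)
peak_fast = max(i for _, i, _ in fast)
print("peak infected:", round(peak_slow), "versus", round(peak_fast))
```

Running the sketch lets you compare the two peaks. A simple guess such as "double the rate, double the peak" can then be checked directly against the simulation.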

In summary, this course…

  • Will make you happier
  • Will make you proud of yourself
  • Will give you a super-power

Enjoy!