to MSR 🌽
Answer now with your voice
In this course we will speak about
We will learn to analyze gene expression, so we can design better experiments and achieve higher impact
Since there is no “authority”, nobody can make an “official” definition of Science.
There are two ways to “define” Science:
A scientist's work is to understand Nature
We start by Observing Nature, usually measuring values.
These are exploratory experiments.
We study this in other courses.
The thing we study must be repeatable, and we need to see that repetition.
We can find them using plots, linear models, clustering, etc.
This is the most important part.
Good answers to bad questions are useless.
Good questions are good, even if we don’t have answers
We answer these questions using models and explanations
Valid models should make predictions that we can test in the lab…
These are validation experiments.
If the results do not match the prediction, we know that the explanation is wrong. Two steps back.
Now we publish our data and model, so other scientists validate or reject it.
The final validation is to be published.
If the paper is accepted and published, our work becomes part of our shared human knowledge.
The goal of Science is to produce new Knowledge.
When we observe Nature we use our previous Knowledge
We look for new Patterns that raise new Questions.
“Noise becomes Signal”
In this framework, Technology is about Things Built by Humans
Using any recording device (paper, cell phone, etc.), take note of the questions that you can ask about what you see every day
Especially questions to which you don't know the answer
For example “Does Technology derive from Science?”
according to Microsoft Research
More precisely, mRNA concentration
We want to know
Measuring protein concentration is hard
We assume that protein concentration is proportional to mRNA concentration
If you have primers for each gene
Raw data: CT value for each gene/condition
and CT value for calibration reference
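As a rough illustration of how these two CT values are usually combined: the sketch below subtracts the reference CT from the gene CT and turns the difference into a relative amount. It assumes that the amount of product doubles exactly in every cycle (100% amplification efficiency), which is a simplification. The function names and the CT values are hypothetical, chosen only for the example.

```python
def relative_expression(ct_gene, ct_reference):
    """Amount of the gene relative to the reference in one condition.

    Assumes perfect doubling per cycle, so a difference of one cycle
    corresponds to a factor of two in starting material.
    """
    delta_ct = ct_gene - ct_reference
    return 2.0 ** (-delta_ct)


def fold_change(ct_gene_treated, ct_ref_treated, ct_gene_control, ct_ref_control):
    """Change of expression between two conditions (delta-delta CT)."""
    delta_treated = ct_gene_treated - ct_ref_treated
    delta_control = ct_gene_control - ct_ref_control
    return 2.0 ** (-(delta_treated - delta_control))


# Hypothetical CT values, for illustration only
print(relative_expression(24.0, 20.0))        # gene needs 4 more cycles than the reference
print(fold_change(22.0, 20.0, 24.0, 20.0))    # treated versus control
```

A lower CT means more starting material, which is why the exponent is negative.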
Southern/Northern/Western blot can detect, but not quantify
(I think so. I’m not a biologist)
Instead, we have macro- and microarrays
Raw data: Light intensity (luminescence) at one or more wavelengths
This is measured in arbitrary units, and is a number between 0 and 65535
(that is, a 16-bit value)
mRNA is reverse-transcribed and fragmented.
Fragments are sequenced. Reads are aligned to a reference genome
Raw data: SAM/BAM file with location of each read in the reference genome
Processed data: Number of reads per gene, normalized by gene length
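The step from raw to processed data can be sketched as follows. This is a minimal sketch, not a real pipeline: it reads SAM text lines directly, assigns a read to a gene when the read's start position falls inside the gene, and divides the count by the gene length in kilobases. The gene names, coordinates and reads are hypothetical, and real tools handle many cases ignored here (reads spanning several genes, strand, multi-mapping reads, sequencing depth).

```python
def count_reads_per_gene(sam_lines, genes):
    """Count mapped reads whose start position falls inside each gene.

    genes maps a gene name to (chromosome, start, end), 1-based inclusive.
    """
    counts = {name: 0 for name in genes}
    for line in sam_lines:
        if line.startswith("@"):          # header lines
            continue
        fields = line.rstrip("\n").split("\t")
        flag = int(fields[1])
        if flag & 4:                      # flag bit 4: read is unmapped
            continue
        chrom, pos = fields[2], int(fields[3])
        for name, (g_chrom, start, end) in genes.items():
            if chrom == g_chrom and start <= pos <= end:
                counts[name] += 1
    return counts


def reads_per_kilobase(counts, genes):
    """Normalize each count by the gene length in kilobases."""
    result = {}
    for name, (_, start, end) in genes.items():
        length_kb = (end - start + 1) / 1000.0
        result[name] = counts[name] / length_kb
    return result


# Hypothetical genes and reads, for illustration only
genes = {"geneA": ("chr1", 1, 1000), "geneB": ("chr1", 2001, 4000)}
sam_lines = [
    "@SQ\tSN:chr1\tLN:5000",
    "r1\t0\tchr1\t150\t60\t50M\t*\t0\t0\t*\t*",
    "r2\t0\tchr1\t2500\t60\t50M\t*\t0\t0\t*\t*",
    "r3\t16\tchr1\t3000\t60\t50M\t*\t0\t0\t*\t*",
    "r4\t4\t*\t0\t0\t*\t*\t0\t0\t*\t*",
]
counts = count_reads_per_gene(sam_lines, genes)
print(counts)
print(reads_per_kilobase(counts, genes))
```

Without the length normalization, a long gene would look more expressed than a short one only because it offers more places for a read to land.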
Gene Expression Omnibus
Write a document (in English) explaining your results