Class 17: Normalization in Arrays

Systems Biology

Andrés Aravena, PhD

December 21, 2021

Normalization

  • Background correction
    • Remove technical noise
  • Within arrays
    • Guarantee that observed differential expression is real
    • “Makes average equal to zero”
  • Between arrays
    • Enables comparison between conditions
    • “Makes variances equal”

Within arrays

  • Uses the logarithm of raw expression
  • Remove background signal
  • Assumes that some genes are not differentially expressed
    • Either housekeeping genes or most of the genes
    • they are forced to have zero differential expression
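
The idea above can be sketched in base R. This is only an illustration: the intensities are hypothetical, and centering on the median is an assumed, very simple way of forcing "most genes" to have zero differential expression. It is not the method used by limma.

```r
# Hypothetical background-corrected intensities for 6 spots on one array
R <- c(1200, 800, 150, 3000, 640, 95)   # red channel
G <- c(600, 700, 160, 1400, 300, 100)   # green channel

# Work with logarithms of the raw expression values
M <- log2(R) - log2(G)         # log-ratio: differential expression
A <- (log2(R) + log2(G)) / 2   # average log-intensity of the spot

# Assumption: most genes are not differentially expressed,
# so the typical M value is forced to be zero
M_centered <- M - median(M)
median(M_centered)
```

If a set of housekeeping genes is known, the median could be taken over those spots only.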

Let’s read some data

library(limma)
# Read the sample description and the GenePix files it lists
targets <- readTargets("targets.txt")
RG <- read.maimages(targets$Filename, source = "genepix")
## Read GSM3303967_Dcg2699_vs_WT_I.gpr.gz
## Read GSM3303968_Dcg2699_vs_WT_II.gpr.gz
## Read GSM3303969_Dcg2699_vs_WT_III_csw.gpr.gz
# Background correction
RG_corr <- backgroundCorrect(RG, method = "normexp")
## Array 1 corrected
## Array 2 corrected
## Array 3 corrected
## Array 1 corrected
## Array 2 corrected
## Array 3 corrected
# Normalization within each array, then between arrays
MA <- normalizeWithinArrays(RG_corr, method = "loess")
MA.q <- normalizeBetweenArrays(MA, method = "quantile")
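
To see what quantile normalization between arrays means, here is a minimal base R sketch on a plain matrix. The data and the function name `quantile_normalize` are hypothetical, the sketch ignores missing values and ties, and it is not limma's implementation.

```r
# Hypothetical log-expression matrix: rows are probes, columns are arrays
x <- cbind(a1 = c(2, 4, 6, 8),
           a2 = c(1, 5, 3, 11),
           a3 = c(3, 9, 4, 12))

quantile_normalize <- function(x) {
  # Rank of each value inside its own array
  ranks <- apply(x, 2, rank, ties.method = "first")
  # Reference distribution: mean of the sorted columns
  target <- rowMeans(apply(x, 2, sort))
  # Replace each value by the reference value of the same rank
  out <- apply(ranks, 2, function(r) target[r])
  dimnames(out) <- dimnames(x)
  out
}

xq <- quantile_normalize(x)
xq
```

After this step every column contains the same set of values, so all arrays share the same distribution and, in particular, the same variance.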

Box plots

Lowess: within array normalization
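
A rough sketch of the idea, using simulated data and the `loess()` function from base R. The bias, the noise level and the `span` value are arbitrary assumptions chosen for illustration. The trend of M as a function of A is estimated and then subtracted.

```r
set.seed(1)
n <- 500
A <- runif(n, 6, 14)                        # average log-intensity
# Hypothetical intensity-dependent bias plus noise
M <- 0.15 * (A - 10) + rnorm(n, sd = 0.3)

# Fit a smooth curve of M versus A, then remove it
fit <- loess(M ~ A, span = 0.4, degree = 1)
M_norm <- M - fitted(fit)

# Before: M depends on A. After: the trend is gone
cor(A, M)
cor(A, M_norm)
```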

Pending

  • NormExp model for background correction

  • Combining probes for a single gene

  • Clustering

  • Heatmap

Single color arrays

Affymetrix

  • Massive arrays

    • Hundreds of thousands of spots
  • In situ synthesis

  • Each oligo has a negative control

Negative control

  • Oligos are short. Let’s say 25 bp

  • There may be several oligos for each gene

  • Positive match oligos are called \(PM\)

  • For each \(PM\) there is a negative control called \(MM\)

  • \(MM\) oligos have a mismatched base at the central position

    • Position 13 if oligos are 25 bp long
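
As an illustration, an \(MM\) sequence can be built from a \(PM\) sequence in base R. The sequence and the function name `make_mm` are hypothetical, and replacing the central base by its complement is an assumption of this sketch.

```r
# Build the MM oligo by changing only the central base of the PM oligo
make_mm <- function(pm) {
  mid <- (nchar(pm) + 1) %/% 2             # position 13 for 25 bases
  central <- substr(pm, mid, mid)
  # Assumption: the mismatch is the complementary base
  substr(pm, mid, mid) <- chartr("ACGT", "TGCA", central)
  pm
}

pm <- "ATGCGTACCGTTAGCATCGGATCCA"          # hypothetical 25-base PM oligo
mm <- make_mm(pm)
pm
mm
```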

Normalization

The \(MM\) probe will not hybridize with the gene, but will hybridize with random fragments

The same random fragments will hybridize with the \(PM\)

The difference \(PM-MM\) is the signal of the gene
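
A small numeric sketch of \(PM-MM\) for one gene with four probe pairs. The intensities are hypothetical, and averaging the differences is just one assumed way of combining the probes of a gene.

```r
# Hypothetical intensities for the probe pairs of a single gene
pm <- c(820, 1010, 760, 940)   # perfect match probes
mm <- c(210, 260, 300, 190)    # their mismatch controls

# Signal of each probe pair: subtract the non-specific hybridization
signal <- pm - mm
signal

# Assumption: summarize the gene by the mean of its probe pairs
mean(signal)
```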