# Bioinformatics

## What is the question?

We have reads that are closely related to a reference genome, and we want to know where each read aligns

For example, messenger RNA (mRNA)

Or DNA from a new individual from the same species

Or DNA from a species in the same genus

Or reads that we want to map to an assembled genome

## Why do we want to do this?

The answer depends on the case

• With messenger RNA, we measure gene expression
• With DNA from a new individual, we identify polymorphisms, which may explain some genetic traits
• For a new species of the same genus, alignment to the related genome can be used instead of de novo assembly
• For a genome assembled with de Bruijn graphs, we can find which contig each read belongs to

## What tools can we use to align reads to a genome?

There are several tools for the same goal

Some tools that are popular today are:

• bwa: Burrows-Wheeler Alignment Tool
• bowtie and bowtie2
• hisat and hisat2

They share a similar philosophy: build an index of the genome once, then align the reads against that index

Can you find others?

## Using bwa

First, it makes an index of the genome

bwa index ref.fa
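
The name bwa refers to the Burrows-Wheeler transform, which is the basis of its index. The toy sketch below shows the idea on a very short string: build the transform, then count how often a pattern occurs without scanning the text itself. The function names `bwt` and `count_occurrences` are made up for this illustration, and the code is far simpler and slower than anything bwa actually does.

```python
def bwt(text):
    """Burrows-Wheeler transform of text, with '$' added as end marker."""
    text = text + "$"
    # Sort all rotations of the text and keep the last character of each
    rotations = sorted(text[i:] + text[:i] for i in range(len(text)))
    return "".join(rotation[-1] for rotation in rotations)


def count_occurrences(bwt_string, pattern):
    """Count exact matches of pattern using backward search on the transform."""
    first_column = sorted(bwt_string)
    # For each character, the row where it first appears in the sorted column
    first_row = {c: first_column.index(c) for c in set(bwt_string)}

    def occurrences_before(c, i):
        # Number of times c appears in the first i characters of the transform
        return bwt_string[:i].count(c)

    low, high = 0, len(bwt_string)
    # Extend the match one character at a time, from the end of the pattern
    for c in reversed(pattern):
        if c not in first_row:
            return 0
        low = first_row[c] + occurrences_before(c, low)
        high = first_row[c] + occurrences_before(c, high)
        if low >= high:
            return 0
    return high - low


index = bwt("banana")
print(index)
print(count_occurrences(index, "ana"))
```

A real index stores extra tables so that each step takes constant time, and the aligner also tolerates mismatches and gaps. This sketch only handles exact matches.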

Then it aligns the reads to the genome

bwa mem ref.fa reads.fq > aln-se.sam
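
When many samples have to be aligned, the two steps can be scripted. The following is a minimal sketch in Python, assuming bwa is installed and on the PATH. The helper names `bwa_commands` and `run_alignment` are hypothetical and were chosen for this example. The file names are the ones used above.

```python
import shutil
import subprocess


def bwa_commands(reference, reads):
    """Return the two bwa command lines as argument lists."""
    return [
        ["bwa", "index", reference],       # step 1: index the genome
        ["bwa", "mem", reference, reads],  # step 2: align the reads
    ]


def run_alignment(reference, reads, sam_path):
    """Run both steps and save the output of 'bwa mem' to sam_path."""
    if shutil.which("bwa") is None:
        raise RuntimeError("bwa was not found on PATH")
    index_command, mem_command = bwa_commands(reference, reads)
    subprocess.run(index_command, check=True)
    with open(sam_path, "w") as sam_file:
        # 'bwa mem' prints the alignments to standard output,
        # so we redirect it to a file, like '>' does in the shell
        subprocess.run(mem_command, stdout=sam_file, check=True)
```

Calling `run_alignment("ref.fa", "reads.fq", "aln-se.sam")` would then reproduce the two commands above. The index only needs to be built once per genome, so a longer script could skip the first step when the index files already exist.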

The output is in SAM (Sequence Alignment/Map) format, so we use the extension .sam. These are large text files

.sam files are often converted to the smaller binary BAM format, with extension .bam
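
Since a .sam file is plain text, it can be read line by line. The sketch below parses one alignment line into a few of its tab-separated fields. Header lines start with `@`, and each alignment line has 11 mandatory columns. The `SamRecord` type, the helper names, and the example line are made up for this illustration and are not output from a real run.

```python
from typing import NamedTuple


class SamRecord(NamedTuple):
    qname: str  # read name
    flag: int   # bitwise flag
    rname: str  # reference sequence name
    pos: int    # 1-based leftmost position of the alignment
    mapq: int   # mapping quality
    cigar: str  # how the read matches the reference
    seq: str    # read sequence


def parse_sam_line(line):
    """Parse one alignment line; return None for header lines."""
    if line.startswith("@"):
        return None
    fields = line.rstrip("\n").split("\t")
    return SamRecord(
        qname=fields[0],
        flag=int(fields[1]),
        rname=fields[2],
        pos=int(fields[3]),
        mapq=int(fields[4]),
        cigar=fields[5],
        seq=fields[9],
    )


def is_mapped(record):
    """Bit 0x4 of the flag is set when the read is unmapped."""
    return (record.flag & 0x4) == 0


# Hypothetical alignment line, written by hand for this example
EXAMPLE = "read1\t0\tchr1\t101\t60\t8M\t*\t0\t0\tACGTACGT\tIIIIIIII"

record = parse_sam_line(EXAMPLE)
print(record.rname, record.pos, is_mapped(record))
```

BAM files are binary and cannot be read this way. The usual conversion tool is samtools, which these notes do not cover, for example `samtools view -b aln-se.sam > aln-se.bam`.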