BWA

Software package for mapping low-divergent sequences

Overview

BWA (Burrows-Wheeler Aligner) is a program for mapping low-divergent sequence reads against large reference genomes. It provides three algorithms: BWA-backtrack, intended for Illumina sequence reads of up to 100 bp; BWA-MEM, meant for longer reads from 70 bp up to 1 Mbp, with support for long reads and split alignment; and BWA-SW, which shares those features with BWA-MEM. BWA-MEM is the latest of the three and is generally recommended for high-quality queries because it is faster and more accurate.

Example

Note that BWA does NOT require loading a module. First, log in to a Spear or HPC node, create a directory, and download the example files into it.

mkdir ~/bwa-test
cd ~/bwa-test
wget https://raw.github.com/dzerbino/velvet/master/data/test_reads.fa
wget https://raw.github.com/dzerbino/velvet/master/data/test_reference.fa
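
Before going further, it can help to confirm that both downloads are real FASTA files and not empty files or error pages. The following is a minimal sketch; the helper name count_fasta_records is hypothetical and not part of BWA. It counts header lines, which begin with ">" in FASTA.

```shell
# Hypothetical helper (not part of BWA): count FASTA records in a file
# by counting header lines, which start with '>'.
count_fasta_records() {
    grep -c '^>' "$1"
}

# Report on the two example files if they are present and non-empty.
for f in test_reads.fa test_reference.fa; do
    if [ -s "$f" ]; then
        printf '%s: %s records\n' "$f" "$(count_fasta_records "$f")"
    else
        printf '%s: missing or empty\n' "$f" >&2
    fi
done
```

A count of zero for either file suggests the download did not return FASTA data.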

To index the reference genome, use

bwa index test_reference.fa
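
Indexing writes several files next to the reference. As a rough check, assuming a recent BWA release (0.7.x) that produces files with the extensions amb, ann, bwt, pac and sa, the sketch below tests that all five exist. The helper name bwa_index_complete is hypothetical, and older BWA versions may write a different set of files.

```shell
# Hypothetical helper (not part of BWA): succeed only if all five index
# files written by `bwa index` (assumed: BWA 0.7.x) exist for a reference.
bwa_index_complete() {
    for ext in amb ann bwt pac sa; do
        [ -e "$1.$ext" ] || return 1
    done
    return 0
}

if bwa_index_complete test_reference.fa; then
    echo "index is complete"
else
    echo "index is missing or incomplete; run: bwa index test_reference.fa"
fi
```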

To align the reads with the BWA-backtrack algorithm, use

bwa aln test_reference.fa test_reads.fa > aln_test.sai

which writes the intermediate alignment results to aln_test.sai. Finally, to generate alignments in SAM format for these single-end reads, use

bwa samse test_reference.fa aln_test.sai test_reads.fa > aln_test.sam
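
The steps above use BWA-backtrack (bwa aln followed by bwa samse). Since the overview recommends BWA-MEM, the same mapping can also be sketched with bwa mem, which writes SAM output directly with no intermediate .sai file. This assumes the reference has already been indexed with bwa index; the wrapper name map_with_mem and the output name mem_test.sam are hypothetical, chosen only for illustration.

```shell
# Hypothetical wrapper (not part of BWA): map reads with BWA-MEM in one step.
# Assumes `bwa index` has already been run on the reference.
map_with_mem() {
    ref=$1; reads=$2; out=$3
    if [ ! -f "$ref" ] || [ ! -f "$reads" ]; then
        echo "map_with_mem: input file not found" >&2
        return 1
    fi
    if ! command -v bwa >/dev/null 2>&1; then
        echo "map_with_mem: bwa is not on PATH" >&2
        return 1
    fi
    # bwa mem prints SAM to standard output; redirect it to the output file.
    bwa mem "$ref" "$reads" > "$out"
}

# Run only when the example files downloaded earlier are present.
if [ -f test_reference.fa ] && [ -f test_reads.fa ]; then
    map_with_mem test_reference.fa test_reads.fa mem_test.sam
fi
```

The underlying command is simply bwa mem test_reference.fa test_reads.fa > mem_test.sam; the wrapper only adds checks for missing inputs and a missing bwa executable.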

For a complete set of commands, consult the BWA manual.