Mauve
Introduction
Mauve is one of the bioinformatics programs available on RCC systems. It aligns multiple genomic sequences in the presence of large-scale evolutionary events, such as rearrangement and inversion, that must be taken into account when computing the alignment.
Using Mauve on RCC Resources
Mauve has a GUI available for use on HPC systems. To use it, log in to the HPC login nodes with X11 forwarding enabled so that graphical interfaces can be displayed on your machine. This is done by passing the -Y option to ssh (ssh -Y). Once you've logged in, start an interactive Mauve session by running Mauve with no arguments. For detailed information on using Mauve, refer to the main Mauve website, which has a user's guide with a wealth of screenshots and examples.
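Before launching the GUI, it can help to confirm that X11 forwarding is actually active in the session. The sketch below assumes a Bash-like shell on the login node. The name check_display is a hypothetical helper, not part of Mauve or the RCC environment. It only tests whether the DISPLAY variable is set, which ssh -Y normally does when forwarding succeeds.

```shell
# Hypothetical helper: report whether X11 forwarding appears to be active.
# ssh -Y normally sets DISPLAY on the remote side when forwarding works.
check_display() {
    if [ -z "${DISPLAY:-}" ]; then
        echo "DISPLAY is not set; reconnect with ssh -Y" >&2
        return 1
    fi
    echo "X11 forwarding appears active (DISPLAY=${DISPLAY})"
}

# Typical use on the login node (commented out so the sketch has no side effects):
# check_display && Mauve
```

If the check fails, log out and reconnect with ssh -Y before starting Mauve.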
Mauve Command Line Tools in Serial
Mauve also has two command line tools, mauveAligner and progressiveMauve, both of which are sequence alignment tools. These can be run in serial, for example:
mauveAligner TEST.seq
In the above example, TEST.seq is a genomic sequence file in one of the three formats listed on the Mauve website (FASTA, Multi-FASTA, or GenBank). Note that more than one sequence file can be given in the command line arguments. See the mauveAligner documentation for more information.
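Since several sequence files can be given on one command line, a small wrapper can check the inputs before a long alignment starts. The sketch below uses progressiveMauve with its --output option as described in the Mauve documentation. The helper name build_mauve_cmd, the output name, and the input file names are all hypothetical. It prints the command instead of running it, so the command can be reviewed first.

```shell
# Hypothetical helper: verify each input is readable, then print the
# progressiveMauve command that would align them.
build_mauve_cmd() {
    if [ "$#" -lt 2 ]; then
        echo "usage: build_mauve_cmd OUTPUT SEQ_FILE [SEQ_FILE ...]" >&2
        return 2
    fi
    output="$1"
    shift
    for seq in "$@"; do
        if [ ! -r "$seq" ]; then
            echo "cannot read sequence file: $seq" >&2
            return 1
        fi
    done
    echo "progressiveMauve --output=${output} $*"
}

# Example (file names are placeholders):
# build_mauve_cmd alignment.xmfa genome1.fasta genome2.gbk
```

Printing rather than executing keeps the sketch free of side effects. To run the alignment, paste the printed command into the shell once it looks right.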
Mauve Command Line Tools in Parallel
To speed up alignments, Mauve can be run in parallel using the GNU OpenMPI module, which can be loaded by typing module load gnu-openmpi. You can then either run a simple MPI job directly or submit a Slurm script. A simple MPI run with 4 processors could be started with:
mpirun -np 4 mauveAligner TEST.seq
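If the module did not load, mpirun fails with a "command not found" error. A minimal guard, assuming a POSIX-style shell, is sketched below. The name require_cmd is a hypothetical helper, and the run itself is left commented out.

```shell
# Hypothetical helper: stop early if a required program is not on PATH.
require_cmd() {
    if ! command -v "$1" >/dev/null 2>&1; then
        echo "missing command: $1 (did the module load succeed?)" >&2
        return 1
    fi
}

# Intended use after 'module load gnu-openmpi':
# require_cmd mpirun && require_cmd mauveAligner && mpirun -np 4 mauveAligner TEST.seq
```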
Alternatively, the same run can be submitted as a Slurm job. For example, using 4 processors again, the script could look like:
#!/bin/bash
#SBATCH -J Mauve_Test
#SBATCH -n 4
#SBATCH -p genacc_q
#SBATCH -t 00:10:00
#SBATCH --mail-type=ALL
module load gnu-openmpi
mpirun -np 4 mauveAligner TEST.seq
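One way to keep the processor count in a single place is to let mpirun read it from SLURM_NTASKS, which Slurm sets inside a job when -n is given. The sketch below writes such a script to a file. The file name mauve_test.sh is a placeholder, and the partition and module names are copied from the example above. The submission commands are left commented out.

```shell
# Write the job script to a placeholder file name.
cat > mauve_test.sh <<'EOF'
#!/bin/bash
#SBATCH -J Mauve_Test
#SBATCH -n 4
#SBATCH -p genacc_q
#SBATCH -t 00:10:00
#SBATCH --mail-type=ALL
module load gnu-openmpi
# SLURM_NTASKS matches the -n request; fall back to 4 outside of Slurm.
mpirun -np "${SLURM_NTASKS:-4}" mauveAligner TEST.seq
EOF

# Submit the job and check its state in the queue:
# sbatch mauve_test.sh
# squeue -u "$USER"
```

With this form, changing the #SBATCH -n line is enough to change the number of MPI processes.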