Clustal W

A Program for Alignment of Multiple Genetic or Protein Sequences

Introduction

Clustal W takes nucleic acid (genetic) or protein sequence data as input and aligns the sequences. Clustal X is essentially the same program; the only difference is that Clustal X provides a graphical user interface (GUI) for Clustal W.

Using Clustal W on RCC Resources

Running Clustal W on HPC Login Nodes

Clustal W does not require a module to run on HPC login nodes. To start working with Clustal W on a login node, run the command clustalw. This opens an interactive command-line interface with prompts, from which you can run your Clustal W job.

You can also run Clustal W in non-interactive mode by specifying a list of options and files on the command line, in the form clustalw -[OPTIONS] FILES. For detailed documentation on the available options and inputs, type clustalw -help. For more information, please refer to the main website.

An example run is shown below. Note that if you do not specify an output format, Clustal W writes a default output file with the same name as the input file but with a different file extension (.aln).

clustalw TEST.fa
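
The same run can also be written with explicit options instead of relying on the defaults. The following is a minimal sketch, not a site-specific recipe: it assumes the installed Clustal W version accepts the -INFILE, -ALIGN, -OUTPUT, and -OUTFILE options (confirm with clustalw -help), and the output name TEST.phy is a hypothetical choice for illustration. The script calls clustalw only if the program and the input file are both present.

```shell
#!/bin/sh
# Input file from the example above.
input="TEST.fa"

# Default output name described above: same base name, .aln extension.
default_out="${input%.*}.aln"

# Hypothetical name for an explicitly requested PHYLIP-format output.
phylip_out="${input%.*}.phy"

if command -v clustalw >/dev/null 2>&1 && [ -f "$input" ]; then
    # Assumed options; check them against the output of: clustalw -help
    clustalw -INFILE="$input" -ALIGN -OUTPUT=PHYLIP -OUTFILE="$phylip_out"
    echo "Requested alignment output: $phylip_out"
else
    echo "clustalw or $input not found; nothing was run"
    echo "A default run would be expected to write: $default_out"
fi
```

Naming the output file explicitly keeps results from different runs on the same input from overwriting one another.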

Running Clustal W in Parallel

Clustal W can also be run in parallel with OpenMPI. This requires the GNU OpenMPI module, which must be loaded first. An example parallel run, again using the default output parameters, is as follows.

module load gnu-openmpi
mpirun -np 4 clustalw TEST.fa