ABySS

Last updated on Wednesday, June 12, 2019

Assembly By Short Sequences - a de novo, parallel, paired-end sequence assembler

Introduction
Running ABySS on RCC Resources
- Serially running ABySS on Spear
- Running ABySS in parallel on HPC

Introduction

ABySS is a bioinformatics program that assembles genomes de novo from short paired-end sequence reads. It can be run serially or in parallel; the parallel version can efficiently assemble larger genomes than the serial version can.

Running ABySS on RCC Resources

ABySS can be run on the HPC cluster either serially or in parallel. In both cases, load the abyss module before running ABySS.

Serially running ABySS on Spear

Download and assemble a small synthetic data set:

module load abyss
abyss-pe k=25 name=test se=https://raw.github.com/dzerbino/velvet/master/data/test_reads.fa

Calculate assembly contiguity statistics:

abyss-fac test-unitigs.fa
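
The contiguity statistics reported by abyss-fac include N50. As a rough illustration of what that number means, the sketch below computes N50 for a FASTA file using only awk and sort. The function name n50 is hypothetical and is not part of ABySS, and abyss-fac may apply a minimum-length cutoff or other conventions, so its figures can differ from this simplified version.

```shell
#!/bin/bash
# Illustrative helper (not part of ABySS): print the N50 of a FASTA file.
# N50 is the length L such that sequences of length >= L together contain
# at least half of all bases in the file.
# Step 1 (first awk): emit one length per sequence, joining wrapped lines.
# Step 2 (sort): order the lengths from longest to shortest.
# Step 3 (second awk): accumulate lengths until half the bases are covered.
n50() {
    awk '/^>/ { if (seen) print len; seen = 1; len = 0; next }
         { len += length($0) }
         END { if (seen) print len }' "$1" |
    sort -rn |
    awk '{ lens[NR] = $1; total += $1 }
         END {
             for (i = 1; i <= NR; i++) {
                 sum += lens[i]
                 if (2 * sum >= total) { print lens[i]; exit }
             }
         }'
}

# Report N50 for the unitigs only if the assembly output exists.
if [ -f test-unitigs.fa ]; then
    n50 test-unitigs.fa
fi
```

Comparing this value with the N50 column from abyss-fac is a quick sanity check, bearing in mind the differences noted above.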

To assemble paired reads in two files named test-1.fa and test-3.fa into contigs in a file named test-contigs.fa, run the command:

abyss-pe name=test k=64 in='test-1.fa test-3.fa'

Further details about the commands can be found in the ABySS documentation.
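
For the paired-end command above, a missing or empty read file is an easy mistake to make. The following is a minimal sketch of a pre-flight check that runs the assembly only when both input files are present and abyss-pe is on the PATH. The helper name check_inputs is hypothetical; the file names and abyss-pe arguments are the ones used in the example above.

```shell
#!/bin/bash
# Hypothetical pre-flight helper: succeed only if every named file
# exists and is non-empty; report the first offending file on stderr.
check_inputs() {
    local f
    for f in "$@"; do
        if [ ! -s "$f" ]; then
            echo "missing or empty input: $f" >&2
            return 1
        fi
    done
}

# Run the assembly only when both read files are usable and the
# abyss module has been loaded (so that abyss-pe is on the PATH).
if check_inputs test-1.fa test-3.fa && command -v abyss-pe >/dev/null 2>&1; then
    abyss-pe name=test k=64 in='test-1.fa test-3.fa'
fi
```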

Running ABySS in parallel on HPC

The following SLURM submit script can be used as a template for submitting a parallel ABySS job on the HPC cluster.

#!/bin/bash
#
# Name your job
#SBATCH -J abyss
#
#Change the queue
#SBATCH -p genacc_q
#
#Change the number of nodes and processes per node as necessary
#SBATCH -N 2
#SBATCH --ntasks-per-node=4
#
#Change the wall time
#SBATCH -t 00:30:00
#
module load abyss
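#
# Run your ABySS commands. np sets the number of MPI processes and should
# equal nodes x processes per node (here 2 x 4 = 8).
abyss-pe name=test k=48 np=8 in='test-1.fa test-3.fa'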
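
In abyss-pe, the np parameter sets the number of MPI processes, so it should agree with the resources requested from SLURM. The sketch below derives that count from the job environment instead of hard-coding it. It assumes the SLURM variables SLURM_JOB_NUM_NODES and SLURM_NTASKS_PER_NODE are set (the latter is only set when --ntasks-per-node is given, as in the template above), and it falls back to 1 outside a job. The helper name abyss_np is hypothetical.

```shell
#!/bin/bash
# Hypothetical helper: print the MPI process count for the current job
# as nodes x tasks per node, defaulting each factor to 1 when the
# corresponding SLURM variable is not set (for example, outside a job).
abyss_np() {
    local nodes="${SLURM_JOB_NUM_NODES:-1}"
    local per_node="${SLURM_NTASKS_PER_NODE:-1}"
    echo $(( nodes * per_node ))
}

# Print the command for inspection; remove the echo and the outer
# double quotes to run it inside the submit script.
echo "abyss-pe name=test k=48 np=$(abyss_np) in='test-1.fa test-3.fa'"
```

With this approach, changing -N or --ntasks-per-node in the submit script does not require editing the abyss-pe line as well.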

FSU | ITS Research Computing Center
