TopHat

A spliced read mapper for RNA Sequencing

Introduction

TopHat is a spliced read mapper for RNA-Seq data. It uses Bowtie and Samtools to align reads against genomes as large as a mammalian genome, then analyzes the alignments to find splice junctions.

Using TopHat on RCC Resources

TopHat depends on Bowtie and Samtools, both of which are installed on the HPC. All three tools are available as soon as you log in; no modules need to be loaded. To run the TopHat example (found here), first copy the example files into a new test directory:

mkdir ~/tophat-test
cd ~/tophat-test
cp /gpfs/research/software/userfiles/tophat/* .
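
Before running the example, it can help to confirm that the three tools really are on your PATH. The snippet below is a minimal sketch: check_tools is a hypothetical helper name, and the executable names tophat, bowtie, and samtools are assumptions (the installation may, for example, provide bowtie2 instead).

```shell
# Hypothetical helper: report where each tool is found on the PATH,
# or flag it as missing.
check_tools() {
    for tool in "$@"; do
        if command -v "$tool" >/dev/null 2>&1; then
            echo "$tool: $(command -v "$tool")"
        else
            echo "$tool: not found on PATH"
        fi
    done
}

# Executable names are assumptions; adjust them to match the installation.
check_tools tophat bowtie samtools
```

A tool reported as "not found on PATH" suggests that the login environment differs from the one described here.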

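To run TopHat on these example files, use the following command, in which -r 20 sets the expected inner distance between the paired reads, test_ref is the base name of the Bowtie index, and reads_1.fq and reads_2.fq are the paired-end read files: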

tophat -r 20 test_ref reads_1.fq reads_2.fq

This generates a folder, tophat_out, that contains the output. Documentation explaining this output can be found here. Note that jobs on the HPC must be submitted with a submit script. A sample script can be found in the "Submitting a Job to the HPC" section here.
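
As a rough illustration of what a submit script for this example might look like, the sketch below writes one to a file. It assumes the HPC uses the Slurm scheduler, which this page does not state. The file name tophat_test.sh, the job name, the time limit, and the output file pattern are all hypothetical placeholders, and the official sample script takes precedence over this sketch.

```shell
# Write a hypothetical Slurm submit script for the example run.
# Assumption: the cluster scheduler is Slurm; every #SBATCH value below
# is a placeholder, not a site recommendation.
cat > tophat_test.sh <<'EOF'
#!/bin/bash
# Hypothetical job name
#SBATCH --job-name=tophat-test
# A single task
#SBATCH --ntasks=1
# Assumed wall-time limit of ten minutes
#SBATCH --time=00:10:00
# %j expands to the Slurm job ID
#SBATCH --output=tophat-test.%j.out

# Run from the directory that holds the example files.
cd ~/tophat-test || exit 1
tophat -r 20 test_ref reads_1.fq reads_2.fq
EOF
```

Under these assumptions, the job would be submitted with sbatch tophat_test.sh, and the results would appear in ~/tophat-test/tophat_out once the job finishes.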