Biopython

Python library for computational molecular biology.

Introduction

Biopython is a set of tools written in Python for analyzing and processing bioinformatics datasets, including genomes, genetic sequence reads, and proteins. The library provides a wide range of analysis tools and algorithms for these types of biological data.

Using Biopython on HPC

Log in to the HPC and load the python3 module. Biopython is already installed in that Python environment, so once the module is loaded, using the library is no different from using a locally installed copy. The Biopython Tutorial and Cookbook provides information on getting started with the library.
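A quick way to confirm that Biopython is visible after the module is loaded is a check like the one below. This is a minimal sketch, not an official procedure: the helper name biopython_version is hypothetical, and the code assumes only that the package is importable as Bio and exposes a __version__ attribute.

```python
import importlib
import importlib.util


def biopython_version():
    """Return the Biopython version string, or None if the package is missing."""
    # find_spec looks the package up without importing it.
    if importlib.util.find_spec("Bio") is None:
        return None
    bio = importlib.import_module("Bio")
    # Fall back to "unknown" if the attribute is absent for any reason.
    return getattr(bio, "__version__", "unknown")


version = biopython_version()
if version is None:
    print("Biopython not found; check that the python3 module is loaded.")
else:
    print("Biopython version:", version)
```

The version is worth noting because some interfaces differ between Biopython releases.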

Example Biopython Library Call

After loading the python3 module, start the interpreter with the command python. Below are some example commands and results from section 2.2 of the tutorial, executed on the HPC.

>>> from Bio.Seq import Seq
>>> my_seq = Seq("AGTACACTGGT")
>>> my_seq
Seq('AGTACACTGGT', Alphabet())
>>> print(my_seq)
AGTACACTGGT
>>> my_seq.alphabet
Alphabet()
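The same Seq object supports the complement operations covered in that tutorial section. The sketch below is an illustration, not output captured on the HPC, and the function name summarize is hypothetical. Because the alphabet attribute was removed in Biopython 1.78, the code reads it with getattr so that it runs on both older and newer releases.

```python
def summarize(text):
    """Return basic facts about a DNA string, or None if Biopython is missing."""
    try:
        from Bio.Seq import Seq
    except ImportError:
        # Biopython is unavailable; the python3 module is probably not loaded.
        return None
    seq = Seq(text)
    return {
        "length": len(seq),
        "complement": str(seq.complement()),
        "reverse_complement": str(seq.reverse_complement()),
        # Present only on Biopython releases older than 1.78; None otherwise.
        "alphabet": getattr(seq, "alphabet", None),
    }


summary = summarize("AGTACACTGGT")
print(summary)
```

For the sequence above, the complement is TCATGTGACCA and the reverse complement is ACCAGTGTACT.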