Desmond

Simulation Software for Molecular Dynamics

Introduction

Desmond is a software package developed by D. E. Shaw Research to perform molecular dynamics simulations of biological systems on conventional commodity clusters. Desmond is installed on the FSU HPC cluster as a component of the Schrodinger software suite, which also includes Maestro, a visualization tool for molecular dynamics. The Schrodinger suite is installed on the parallel file system at

     /gpfs/research/software/desmond/schrodinger2013-3  

Molecular Dynamics Simulation Process Using the Schrodinger Suite

The procedure for running a molecular dynamics simulation using Desmond and Maestro can be summarized as follows.

The original structure file imported from the Protein Data Bank (PDB) must first be prepared/preprocessed by Maestro to produce the structure file (with force field information) that serves as the input to the Desmond simulation. The configuration file contains the simulation parameters, such as the global cell, the force field, constraints (if any), and the integrator.

Remark. The visualization tool Maestro is a GUI, and it is not advisable to run a GUI on the HPC login node. We suggest installing Maestro on your personal computer and preparing the structure file on your laptop/desktop, then running the Desmond simulation (the CPU-intensive part) on the HPC cluster.

Command Line Syntax of Desmond

The syntax for desmond is

       desmond [Job_Options] [Backend_Options] Backend_Arguments         

where the Job_Options can be

  -h                    : print help message.
  -v                    : print version information and exit.
  -WAIT                 : do not exit until the job completes.
  -p, -NPROC            : number of processors to be used (default is 1).
  -JOBNAME name         : the name of this job.
  -gpu                  : run the GPU version.
  -jin filename         : files or directories to be transferred to the compute node.
  -jout filename        : files to be copied back to the submit node.
  -dryrun backend_cfg   : generate the backend config file only.
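
For example, to run a named job on 4 processors and keep the command in the foreground until it finishes (a sketch combining the options above; x.cms and y.cfg are the input files described below):

      $SCHRODINGER/desmond -JOBNAME mysim -WAIT -p 4 -in x.cms -c y.cfg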

The Backend_Options can be:

  -comm plugin   : use communication plugin (serial or mpi).
  -c config_file : parameter (configuration) file for the simulation.
  -tpp n         : number of threads per processor.
  -dp            : run the double precision version (single precision by default).
  -noopt         : do not optimize parameters automatically.
  -overwrite     : overwrite the trajectory.
  -profile       : enable backend profiling.

and the Backend_Arguments can be:

  -in x.cms             : the structure file.
  -restore checkpoint   : a checkpoint file for resuming a simulation.

(run the desmond -h command for details).
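
For example, to run the double precision build with two threads per processor, or to resume an interrupted run from its checkpoint file (sketches based on the options above; the checkpoint file name checkpt is hypothetical):

      $SCHRODINGER/desmond -in x.cms -c y.cfg -tpp 2 -dp
      $SCHRODINGER/desmond -restore checkpt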

In the simplest case, to run a Desmond simulation with input structure file x.cms and configuration file y.cfg on a single machine:

      $SCHRODINGER/desmond -in  x.cms  -c y.cfg

where SCHRODINGER is the environment variable holding the path to the Schrodinger software installation.
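
On the HPC cluster, the desmond module (described below) sets this variable for you, so a quick interactive check looks like:

      $ module load desmond
      $ echo $SCHRODINGER
      /gpfs/research/software/desmond/schrodinger2013-3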

Command Line Options of Schrodinger Job Control Facility

Besides the command line options specific to desmond, there are important command line options recognized directly by the Schrodinger Job Control Facility. Here are a few important ones (refer to the Schrodinger Job Control Guide for more information):

   -HOST host
   -HOST host:n
   -HOST "host_1:n_1 host_2:n_2 ... host_k:n_k"

This option tells the Job Control Facility to run the job on a specified host, or to submit the job to a queue.
Here host is the value of a name entry (not the host entry) in the schrodinger.hosts file (see the discussion below), or the actual address of a host, and n (n_1, n_2, ..., n_k) is the number of cores to use on that host. When specifying more than one host, separate the hosts with spaces and enclose the whole list in quotes.
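
For example (the host names node1 and node2 below are hypothetical):

      # run on a single host using 8 cores
      $SCHRODINGER/desmond -in x.cms -c y.cfg -HOST node1:8

      # spread the job over two hosts, 8 cores each (note the quotes)
      $SCHRODINGER/desmond -in x.cms -c y.cfg -HOST "node1:8 node2:8"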

    -QARGS queue-args

This option passes arguments to the queue manager. These arguments are appended to those specified by the qargs settings in the hosts file schrodinger.hosts.
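
For example, to append a longer wall clock limit to the qargs already defined for the genacc_q entry (the 48-hour limit below is illustrative):

      $SCHRODINGER/desmond -in x.cms -c y.cfg -HOST genacc_q -QARGS "-t 48:00:00"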

    -TMPDIR directory

This option specifies the scratch directory for the job. The job directory is created as a subdirectory of the scratch directory. We suggest using $HOME/scratch or $HOME/_tmp.
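
For example:

      $ mkdir -p $HOME/scratch
      $ $SCHRODINGER/desmond -in x.cms -c y.cfg -TMPDIR $HOME/scratch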

There are also options that simply report information. For example,

      -ENTRY

This option shows the section of the schrodinger.hosts file that will be used for this job, provided the -HOST host option points to a section of the hosts file.
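
For example, the following sketch should display the genacc_q section of the hosts file that the job would use (see the Job Control Guide for details):

      $SCHRODINGER/desmond -in x.cms -c y.cfg -HOST genacc_q -ENTRY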

Remark. Command-line options always take precedence over the corresponding environment variable.

Running Desmond Simulations on the HPC Cluster

Molecular dynamics simulation is CPU-intensive. A Desmond simulation can run on the HPC cluster in two ways: (1) via a SLURM job submission script, or (2) through the Schrodinger Job Control Facility.

Submitting a Desmond Job Using a SLURM Script

Here is an example submission script:

  #!/bin/bash
  #SBATCH -J desmond_mpi
  #SBATCH --mail-type=ALL
  #SBATCH -N 2
  #SBATCH --ntasks-per-node=8
  #SBATCH -t 24:00:00
  #SBATCH --mem-per-cpu=2000
  #SBATCH -p genacc_q

  # the desmond module defines the environment variables needed
  # for running desmond, including
  # SCHRODINGER=/gpfs/research/software/desmond/schrodinger2013-3
  module load desmond

  mpirun $SCHRODINGER/desmond -in x.cms -c y.cfg -comm mpi

where x.cms and y.cfg are the structure file and the simulation parameter file, respectively.
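
Assuming the script above is saved as desmond_mpi.sh (the file name is arbitrary), submit and monitor it with the standard SLURM commands:

      $ sbatch desmond_mpi.sh
      $ squeue -u $USER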

Schrodinger's Job Control Facility

A parallel Desmond simulation can also run under Schrodinger's Job Control Facility. The Job Control Facility obtains information about the hosts on which it will run jobs (and other information needed by the queue) from the hosts file, schrodinger.hosts, and submits and monitors the job for you. The default hosts file is located at

        $SCHRODINGER/schrodinger.hosts

An example entry in the schrodinger.hosts file looks like:

   name:       genacc_q
   host:       hpc-login
   queue:      slurm
   qargs:      -p genacc_q -N 2 --ntasks-per-node=8 --mem-per-cpu=2000 -t 14:00:00:00
   processors: 608
   tmpdir:     /tmp

where genacc_q is the name of the entry (each entry is one job submission scenario), hpc-login is the name of the host (the login node of the HPC cluster), slurm is the queueing software, and qargs contains the command line options passed to the queue manager (sbatch for SLURM). For genacc_q, the maximum number of processors allowed is 608 and the wall clock limit is 14 days; tmpdir is the scratch space for the desmond application (you can use, for example, $HOME/scratch in your own schrodinger.hosts file). To submit a parallel job using the above configuration:

     $ module load desmond
     $ $SCHRODINGER/desmond -in x.cms -c y.cfg -HOST genacc_q

(The -HOST genacc_q flag above tells the Schrodinger Job Control Facility to use the entry named genacc_q in the schrodinger.hosts file to create the job submission script. Consequently, the simulation will be submitted to the queue genacc_q, asking for 2 nodes with 8 cores each.)

To use your own hosts file, create a directory .schrodinger under your home directory and copy the default hosts file to it:

     $ mkdir $HOME/.schrodinger
     $ cp $SCHRODINGER/schrodinger.hosts  $HOME/.schrodinger/schrodinger.hosts

and edit it. You also need to define the environment variable SCHRODINGER_HOSTS to point to this file:

     export SCHRODINGER_HOSTS=$HOME/.schrodinger/schrodinger.hosts 
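
For example, your personal copy might keep the genacc_q entry unchanged except for the scratch directory, following the suggestion above:

      name:       genacc_q
      host:       hpc-login
      queue:      slurm
      qargs:      -p genacc_q -N 2 --ntasks-per-node=8 --mem-per-cpu=2000 -t 14:00:00:00
      processors: 608
      tmpdir:     $HOME/scratch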

Monitoring Jobs Using the Schrodinger Job Control Facility

Besides the job control commands provided by the scheduler SLURM (such as squeue and scancel), desmond simulations can also be monitored by the Schrodinger Job Control Facility, which provides tools for monitoring and controlling the jobs that it runs. Information about each job is kept in the user's job database, stored in the directory $HOME/.schrodinger/.jobdb. A desmond job can be monitored with the Schrodinger utility jobcontrol:

       $SCHRODINGER/jobcontrol command query

In the above, command specifies the action you want to perform, and query defines the scope of the action performed by the command.
The command can be

    -cancel

which cancels a job that has been launched but not started, or

    -kill

which terminates a job immediately, or

    -list

which lists the JobID, job name, and status, or

    -resume

which continues running a paused job, or

    -dump

which shows the complete job record.

On the other hand, query can be

    all

which means all jobs in your job database, or

    active

which means all active (not yet finished) jobs, or

    finished

which means all finished jobs, or a specific JobID. The JobID is a unique identifier consisting of the name of the submission host, a sequence number, and a hexadecimal timestamp, e.g., submit-0-a1b2c3d4.

For example, to list all the jobs in your job database that have finished, enter:

       $SCHRODINGER/jobcontrol -list finished

To list just the job whose JobID is submit-0-a1b2c3d4, enter:

      $SCHRODINGER/jobcontrol -list submit-0-a1b2c3d4

To list the complete database record for a job, enter the command:

    $SCHRODINGER/jobcontrol -dump jobid
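
For example, a typical monitoring session might first list the active jobs and then terminate one of them (the JobID below is hypothetical):

      $ $SCHRODINGER/jobcontrol -list active
      $ $SCHRODINGER/jobcontrol -kill submit-0-a1b2c3d4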

References

A few useful resources for Desmond are:

(1) [Desmond Tutorial](http://www.deshawresearch.com/Desmond_Tutorial_0.6.1.pdf)
(2) [Desmond User's Guide](http://www.deshawresearch.com/downloads/download_desmond.cgi/Desmond_Users_Guide-0.5.3.pdf)
(3) [Schrodinger Job Control Guide](http://gohom.win/ManualHom/Schrodinger/Schrodinger_2012_docs/general/job_control.pdf)