Desmond
Introduction
Desmond is a software package developed by D. E. Shaw Research to perform molecular dynamics simulations of biological systems on conventional commodity clusters. Desmond is installed on the FSU HPC cluster as a component of the Schrodinger software suite. The Schrodinger suite also includes Maestro, a visualization tool for molecular dynamics. The suite is installed on the parallel file system at
/gpfs/research/software/desmond/schrodinger2013-3
Molecular Dynamics Simulation Process Using the Schrodinger Suite
The procedure for running a molecular dynamics simulation using Desmond and Maestro can be summarized in the following figure:
In particular, the original structure file imported from the Protein Data Bank (PDB) has to be prepared (preprocessed) by Maestro to produce the structure file (with force field), which is used as the input of the Desmond simulation. The configuration file contains the simulation parameters such as the global cell, the force field, constraints (if there are any), and the integrator.
Remark. The visualization tool Maestro is a GUI. It is not advisable to run a GUI on the HPC login node. We suggest that you install Maestro on your personal computer and prepare the structure file on your laptop or desktop before running the Desmond simulation (the CPU-intensive part) on the HPC cluster.
Command Line Syntax of Desmond
The syntax for desmond
is
desmond [Job_Options] [Backend_Options] Backend_Arguments
where the Job_Options can be
-h : print help message.
-v : print version information and exit.
-WAIT : don't exit until job completes.
-p,-NPROC : number of processors to be used (default is 1).
-JOBNAME name : the name of this job.
-gpu : run GPU version.
-jin filename : files or directories to be transferred to the compute node.
-jout filename : files to be copied back to the submit node.
-dryrun backend_cfg : generate backend config file only.
The Backend_Options can be:
-comm plugin : use communication plugin (serial or mpi)
-c config_file : parameter file for simulation.
-tpp n : number of threads per processor.
-dp : run double precision version (single precision by default).
-noopt : do not optimize parameters automatically.
-overwrite : overwrite trajectory.
-profile : enable backend profiling,
and the Backend_Arguments can be:
-in x.cms : the structure file
-restore checkpoint : a checkpoint file for resuming a simulation
(run the desmond -h command for details).
As an example, to run a Desmond simulation with input structure file x.cms and configuration file y.cfg on a single machine:
$SCHRODINGER/desmond -in x.cms -c y.cfg
where SCHRODINGER is the environment variable holding the path to the Schrodinger software installation.
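The syntax above can be sketched as a small helper that assembles a command line in the documented order (job options, then backend options, then backend arguments) and prints it instead of running it. This is only an illustration: the function name desmond_cmd, the job name test_run, and the fallback path /path/to/schrodinger are assumptions, not part of the Schrodinger suite.

```shell
#!/bin/bash
# Print (do not execute) a desmond command line.
# desmond_cmd is a hypothetical helper; only the flags -JOBNAME, -c and -in
# come from the option list above.
desmond_cmd() {
    local cms="$1" cfg="$2" jobname="${3:-}"
    # Fall back to a placeholder when SCHRODINGER is not set
    # (on the cluster, `module load desmond` is expected to set it).
    local cmd="${SCHRODINGER:-/path/to/schrodinger}/desmond"
    # Job options come first.
    if [ -n "$jobname" ]; then
        cmd="$cmd -JOBNAME $jobname"
    fi
    # Then backend options (-c) and backend arguments (-in).
    cmd="$cmd -c $cfg -in $cms"
    echo "$cmd"
}

desmond_cmd x.cms y.cfg test_run
```

Printing the command first makes it easy to check the option order before running it for real.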
Command Line Options of Schrodinger Job Control Facility
Besides the command line options specific to desmond, there are important command line options recognized directly by the Schrodinger Job Control Facility. Here are a few of them (refer to the Schrodinger Job Control Guide for more information):
-HOST host
-HOST host:n
-HOST "host_1:n_1 host_2:n_2 ... host_k:n_k"
This option tells the Job Control Facility to run the job on a specified host or to submit the job to a queue. Here host is the value of a name entry (not the host entry) in the schrodinger.hosts file (discussed below), or the actual address of a host, and n (n_1, n_2, ... n_k) is the number of cores to use on that host. When specifying more than one host, separate them with spaces and enclose the whole list in quotes.
-QARGS queue-args
This option passes arguments to the queue manager. These arguments are appended to those specified by the qargs setting in the hosts file schrodinger.hosts.
-TMPDIR directory
This option specifies the scratch directory for the job. The job directory is created as a subdirectory of the scratch directory. We suggest using $HOME/scratch or $HOME/_tmp.
Some options only print information. For example,
-ENTRY
This option shows the section of the schrodinger.hosts file that will be used for this job, provided the -HOST host option points to a section of the hosts file.
Remark. Command-line options always take precedence over the corresponding environment variable.
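The multi-host form of -HOST is easy to get wrong by hand, so here is a minimal sketch that joins host and core-count pairs into the space-separated host:n list. The function name host_spec and the host names node1 and node2 are hypothetical.

```shell
#!/bin/bash
# Join "host cores host cores ..." arguments into "host:cores host:cores ...".
# host_spec is an illustrative helper, not a Schrodinger tool.
host_spec() {
    local spec=""
    while [ "$#" -ge 2 ]; do
        # Append a space only between entries, never at the start.
        spec="${spec:+$spec }$1:$2"
        shift 2
    done
    echo "$spec"
}

# The quotes keep the whole list as a single argument to -HOST.
echo "-HOST \"$(host_spec node1 8 node2 4)\""
```

The result can be appended to a desmond command line in place of a single -HOST host:n argument.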
Running Desmond Simulation on the HPC cluster
Molecular dynamics simulation is CPU-intensive. A Desmond simulation can run on the HPC cluster in two ways: (1) via a SLURM job submit script, or (2) through Schrodinger's Job Control Facility.
Submit Desmond Job using a SLURM script
Here is an example submit script
#!/bin/bash
#SBATCH -J desmond_mpi
#SBATCH --mail-type=ALL
#SBATCH -N 2
#SBATCH --ntasks-per-node=8
#SBATCH -t 24:00:00
#SBATCH --mem-per-cpu=2000
#SBATCH -p genacc_q
# the desmond module defines the environment variables needed
# for running desmond
module load desmond
# export SCHRODINGER=/gpfs/research/software/desmond/schrodinger2013-3
mpirun $SCHRODINGER/desmond -in x.cms -c y.cfg -comm mpi
where x.cms and y.cfg are respectively the structure file and simulation parameter file.
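If you run the same simulation with different node counts, it may help to generate the submit script rather than edit it by hand. The following sketch prints a script equivalent to the example above with the node count, tasks per node, and input files as parameters. The function name make_submit_script is an assumption; the #SBATCH lines are copied from the example.

```shell
#!/bin/bash
# Print a SLURM submit script for a Desmond MPI run.
# make_submit_script is a hypothetical helper.
make_submit_script() {
    local nodes="$1" tasks="$2" cms="$3" cfg="$4"
    # \$SCHRODINGER is escaped so it is expanded when the job runs,
    # not when the script is generated.
    cat <<EOF
#!/bin/bash
#SBATCH -J desmond_mpi
#SBATCH --mail-type=ALL
#SBATCH -N $nodes
#SBATCH --ntasks-per-node=$tasks
#SBATCH -t 24:00:00
#SBATCH --mem-per-cpu=2000
#SBATCH -p genacc_q
module load desmond
mpirun \$SCHRODINGER/desmond -in $cms -c $cfg -comm mpi
EOF
}

make_submit_script 2 8 x.cms y.cfg
```

Redirect the output to a file (for example desmond_job.sh) and submit it with sbatch desmond_job.sh.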
Schrodinger's Job Control facility
Parallel Desmond simulations can also run under Schrodinger's Job Control Facility. The Job Control Facility obtains information about the hosts on which it will run jobs (and other information needed by the queue) from the hosts file, schrodinger.hosts, and submits and monitors the job for you. The default hosts file is located at
$SCHRODINGER/schrodinger.hosts
An example entry of the schrodinger.hosts file looks like
name: genacc_q
host: hpc-login
queue: slurm
qargs: -p genacc_q -N 2 --ntasks-per-node=8 --mem-per-cpu=2000 -t 14-00:00:00
processors: 608
tmpdir: /tmp
where genacc_q is the name of the entry (each entry is one job submission scenario), hpc-login is the name of the host (the login node of the HPC cluster), slurm is the queue software, and qargs contains the command line options of qsub (or sbatch for SLURM). For genacc_q, the maximum number of processors allowed is 608 and the wall clock limit is 14 days. tmpdir is the scratch space for the Desmond application (you can use, for example, $HOME/scratch in your own schrodinger.hosts file). To submit a parallel job using the above configuration:
$ module load desmond
$ $SCHRODINGER/desmond -in x.cms -c y.cfg -HOST genacc_q
(The -HOST genacc_q flag above tells the Schrodinger Job Control Facility to use the entry named genacc_q in the schrodinger.hosts file to create the job submit script. Consequently, the simulation is submitted to the queue genacc_q, asking for 2 nodes with 8 cores each.)
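To check what a given -HOST name will resolve to, you can read the matching entry out of the hosts file yourself. The sketch below assumes the simple "key: value" layout shown above, with each entry starting at a name: line. The helper hosts_field and the sample file contents (a trimmed copy of the entry above plus an illustrative localhost entry) are assumptions made for the example.

```shell
#!/bin/bash
# Print the value of one field from the named entry of a hosts file.
# hosts_field is a hypothetical helper, not part of the Schrodinger suite.
hosts_field() {
    local file="$1" entry="$2" field="$3"
    awk -v entry="$entry" -v field="$field" '
        /^[ \t]*name:/ {
            # A name: line starts a new entry; remember its value.
            sub(/^[ \t]*name:[ \t]*/, "")
            sub(/[ \t]+$/, "")
            current = $0
            next
        }
        current == entry && $0 ~ ("^[ \t]*" field ":") {
            # Strip the key and print the rest of the line.
            sub(/^[^:]*:[ \t]*/, "")
            print
            exit
        }
    ' "$file"
}

# Sample hosts file, written to a temporary location for the demonstration.
hosts_file="$(mktemp)"
cat > "$hosts_file" <<'EOF'
name: localhost
tmpdir: /tmp

name: genacc_q
host: hpc-login
queue: slurm
qargs: -p genacc_q -N 2 --ntasks-per-node=8 --mem-per-cpu=2000
processors: 608
tmpdir: /tmp
EOF

hosts_field "$hosts_file" genacc_q qargs
rm -f "$hosts_file"
```

On the cluster, the same lookup can be pointed at $SCHRODINGER/schrodinger.hosts or at your own copy of the file.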
To use your own hosts file, create a directory .schrodinger under your home directory and copy the above file to it:
$ mkdir $HOME/.schrodinger
$ cp $SCHRODINGER/schrodinger.hosts $HOME/.schrodinger/schrodinger.hosts
and edit it. You also need to define the environment variable SCHRODINGER_HOSTS to point to this file:
export SCHRODINGER_HOSTS=$HOME/.schrodinger/schrodinger.hosts
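The steps above can be wrapped so that running them twice does not overwrite a hosts file you have already edited. This is a minimal sketch; the function name install_hosts_file is hypothetical, and it takes the source file and destination directory as arguments so it can be tried outside the cluster.

```shell
#!/bin/bash
# Copy a hosts file into a directory unless one is already there,
# then print the path of the personal copy.
# install_hosts_file is an illustrative helper.
install_hosts_file() {
    local src="$1" dest_dir="$2"
    mkdir -p "$dest_dir" || return 1
    # Keep an existing (possibly edited) copy untouched.
    if [ ! -e "$dest_dir/schrodinger.hosts" ]; then
        cp "$src" "$dest_dir/schrodinger.hosts" || return 1
    fi
    echo "$dest_dir/schrodinger.hosts"
}

# Usage on the cluster (after `module load desmond`):
#   export SCHRODINGER_HOSTS="$(install_hosts_file "$SCHRODINGER/schrodinger.hosts" "$HOME/.schrodinger")"
```

Putting the export line in your shell startup file keeps SCHRODINGER_HOSTS defined in later sessions.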
Monitoring Jobs Using the Schrodinger Job Control Facility
Besides the job control commands provided by the job scheduler (for SLURM, squeue and scontrol show job), Desmond simulations can be monitored by the Schrodinger Job Control Facility. The Job Control Facility provides tools for monitoring and controlling the jobs that it runs. Information about each job is kept in the user's job database, which is stored in the directory $HOME/.schrodinger/.jobdb. A Desmond job can be monitored with the Schrodinger utility jobcontrol:
$SCHRODINGER/jobcontrol command query
In the above, command is the action you want to perform, and query defines the scope of the action performed by the command.
The command can be
-cancel
which cancels a job that has been launched but not started, or
-kill
which terminates a job immediately, or
-list
which lists the JobID, job name and status, or
-resume
which continues running a paused job, or
-dump
which shows the complete job record.
The query can be
all
which means all jobs in your job database, or
active
which means all active jobs (not finished), or
finished
which means all finished jobs, or a JobID. The JobID is a unique identifier consisting of the name of the submission host, a sequence number, and a hexadecimal timestamp, e.g., submit-0-a1b2c3d4.
For example, to list all the jobs in your job database that finished successfully, enter:
$SCHRODINGER/jobcontrol -list finished
To list just the job whose JobID is submit-0-a1b2c3d4, enter:
$SCHRODINGER/jobcontrol -list submit-0-a1b2c3d4
To list the complete database record for a job, enter the command:
$SCHRODINGER/jobcontrol -dump jobid
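Since a JobID is made of a host name, a sequence number, and a hexadecimal timestamp joined by hyphens, scripts that post-process jobcontrol output may need to split it. The sketch below does so with shell parameter expansion, working from the right because a host name such as hpc-login can itself contain hyphens. The function name parse_jobid is hypothetical.

```shell
#!/bin/bash
# Split a JobID of the form <host>-<sequence>-<hex timestamp>.
# parse_jobid is an illustrative helper, not part of jobcontrol.
parse_jobid() {
    local id="$1"
    local stamp="${id##*-}"   # text after the last hyphen
    local rest="${id%-*}"     # everything before the last hyphen
    local seq="${rest##*-}"   # last hyphen-separated piece of the remainder
    local host="${rest%-*}"   # what is left is the host name
    echo "host=$host seq=$seq stamp=$stamp"
}

parse_jobid submit-0-a1b2c3d4   # prints: host=submit seq=0 stamp=a1b2c3d4
```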
References
A few useful resources for Desmond are:
(1) [Desmond Tutorial](http://www.deshawresearch.com/Desmond_Tutorial_0.6.1.pdf)
(2) [Desmond User's Guide](http://www.deshawresearch.com/downloads/download_desmond.cgi/Desmond_Users_Guide-0.5.3.pdf)
(3) [Schrodinger Job Control Guide](http://gohom.win/ManualHom/Schrodinger/Schrodinger_2012_docs/general/job_control.pdf)