efg's Research Notes
Embarrassingly Parallel Computations Using the Sun Grid Engine
The following are practical suggestions on how to submit jobs to the Sun Grid Engine (SGE) to speed up computations using an embarrassingly parallel approach on a Linux cluster. These notes assume you have a cluster administrator to set up and configure your cluster and the Sun Grid Engine. Your installation may be configured differently from what is described here.
Background. A cluster job is a Linux script (I will only use bash scripts here) that performs a computation, reading most of its inputs from files and writing most of its outputs to files. A cluster job "runs headless", i.e., without any display monitor on which to see the results. You'll need to design your cluster job to run in this environment, which is similar to the normal Linux command-line environment but may differ from it in some details. Examples of cluster jobs will be discussed later.
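As a rough illustration of the file-in, file-out design described above, here is a minimal sketch of a headless job. It is not one of the examples discussed later. The function name run_job, the default log file name job.log, and the line-counting "computation" are all placeholders I made up for this sketch.

```bash
#!/bin/bash
# Sketch of a headless cluster job (all names here are hypothetical).
# Usage: job.bash INPUT OUTPUT [LOG]
set -euo pipefail

run_job() {
  local input="$1"           # file the job reads
  local output="$2"          # file the job writes
  local log="${3:-job.log}"  # progress goes to a file, since no display exists

  echo "$(date) started on ${HOSTNAME:-unknown host}" >> "$log"

  if [[ ! -r "$input" ]]; then
    echo "$(date) cannot read input file: $input" >> "$log"
    return 1
  fi

  # Placeholder computation: count the lines in the input file.
  wc -l < "$input" > "$output"

  echo "$(date) finished" >> "$log"
}

# Run only when input and output file names are supplied.
if [[ $# -ge 2 ]]; then
  run_job "$@"
fi
```

Because nothing is printed to a screen, the log file is the only place to look when a job fails, so it is worth writing a line at the start and the end of every job.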
I usually use a submit.bash script to submit cluster jobs in a repeatable and documented way using the Sun Grid Engine qsub command. The submit.bash script uses qsub to schedule another script, e.g., job.bash in Fig. 1, to execute on one of the cluster nodes as soon as possible.
Fig. 1. Script submit.bash calls the SGE qsub command
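A minimal sketch of what such a submit.bash might look like follows. This is an illustration under my own assumptions, not the exact script from Fig. 1: the job name demo, the logs directory, and the submit.log record file are hypothetical, and your site may require additional qsub options such as a queue name. The options used (-N, -cwd, -S, -o, -e) are standard qsub options.

```bash
#!/bin/bash
# submit.bash: sketch of a repeatable, documented qsub submission.
# JOB_NAME, LOG_DIR and submit.log are hypothetical names for illustration.
set -euo pipefail

JOB_SCRIPT="job.bash"   # script to run on a cluster node
JOB_NAME="demo"         # name shown by the scheduler
LOG_DIR="logs"          # where the job's stdout and stderr are written

mkdir -p "$LOG_DIR"

# -N    job name
# -cwd  run the job in the current working directory
# -S    shell used to interpret the job script
# -o/-e files that receive the job's stdout and stderr
QSUB_CMD=(qsub -N "$JOB_NAME" -cwd -S /bin/bash
          -o "$LOG_DIR/$JOB_NAME.out" -e "$LOG_DIR/$JOB_NAME.err"
          "$JOB_SCRIPT")

# Keep a record of exactly what was submitted, and when.
echo "$(date '+%F %T') ${QSUB_CMD[*]}" >> submit.log

if command -v qsub >/dev/null 2>&1; then
  "${QSUB_CMD[@]}"
else
  # Not on a submit host: show the command instead of failing.
  echo "qsub not found; would run: ${QSUB_CMD[*]}"
fi
```

Building the command in an array and logging it before running it is what makes the submission repeatable: the same line can be read back from submit.log and rerun later.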
Earl F. Glynn
efg@stowers-institute.org
17 Dec 2007