06 June 2012

176. Weaning people onto SGE, one script at a time

On a five node (1 front + 4 exec; each node has 8 cores and 8 GB RAM) cluster that I know and hang out with, people have been submitting jobs one by one. As in, doing it manually, without a queue manager.

I got one of the users to start using my Very Simple Python Queue Manager to prevent too much idle time, but not everyone is using it yet.

Another downside when people aren't using queue managers is that they use top and kill to manage jobs, and that has a way of screwing things up for everyone. SGE is a much better solution in every possible sense.

To make it easier for the users to switch to using qsub i.e. make the change as undisruptive as possible, I wrote a little bash function and set up some standard qsub files.

The user navigates to the  directory where their .in file is (e.g. test.in) and runs
presub test
which open test.in and creates test.qsub

The user then submits by doing
qsub test.qsub


It's easy enough to customize the function and the output files (.e.g using .com, .g03in etc.). This script obviously only does g09, but I'll post a more general script later.




The .bashrc function:
presub () {

    paste -s -d "\n" ~/.qsub/qsub.head $1.in ~/.qsub/qsub.tail > $1.qsub
    return 0
}



The files:
I put the following files in ~/.qsub/

qsub.head:

#$ -S /bin/sh
#$ -cwd
#$ -l h_rt=99:30:00
#$ -l h_vmem=8G
#$ -j y
#$ -pe orte 8



export GAUSS_SCRDIR=/tmp
export GAUSS_EXEDIR=/share/apps/gaussian/g09/bsd:/share/apps/gaussian/g09/local:/share/apps/gaussian/g09/extras:/share/apps/gaussian/g09
/share/apps/gaussian/g09/g09 << END >> g09.log

qsub.tail:





END

The empty lines above are on purpose since gaussian can be annoying in that sense.


No comments:

Post a Comment