Sun GridEngine
From Rocks Clusters
Contents
• 1 General Information
• 2 Impressing Your Friends With SGE
• 3 Tight MPICH integration with SGE
• 4 Submitting Jobs From One Cluster To Another
• 5 Prologue/Epilogue
General Information
SGE home page
Sample SGE Submission Scripts
Impressing Your Friends With SGE
This is a place for tricks that make SGE easier to deal with.
(Title is a nod to the great old redhat page IYFW RPM.)
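As a small example of the sort of thing that belongs here, a few plain SGE commands that give a quick overview of the cluster (nothing Rocks-specific; <jobid> is just a placeholder):
qstat -f         # every queue and the jobs in it
qstat -u '*'     # pending and running jobs for all users
qstat -j <jobid> # details (and scheduling reasons) for one job
qhost            # per-host load and memory summary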
Tight MPICH integration with SGE
Rocks comes with a default parallel environment for SGE that facilitates tight integration. Unfortunately it's not quite complete. For the ch_p4 MPICH device, the environment variable MPICH_PROCESS_GROUP must be set to no on both the frontend and compute nodes in order for SGE to maintain itself as process group leader. These are the steps I took to get it working in Rocks 4.1 (I opted for the 2nd solution described here). Nmichaud@jhu.edu 14:11, 15 February 2006 (EST)
1. Edit /opt/gridengine/default/common/sge_request and add the following line at the end:
-v MPICH_PROCESS_GROUP=no
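To confirm that the new default request is actually picked up, submitting a trivial test job and checking its output should show the variable set to no (the script name here is just illustrative):
#!/bin/bash
#$ -cwd
echo "MPICH_PROCESS_GROUP=$MPICH_PROCESS_GROUP"
Save this as, say, testenv.sh, run qsub testenv.sh, and look for MPICH_PROCESS_GROUP=no in the job's output file.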
2. For a default Rocks setup, SGE calls /opt/gridengine/mpi/startmpi.sh when starting an MPI job. This in turn calls /opt/gridengine/mpi/rsh. Both of these files must be changed. However, each compute node has its own copy of them. Instead of editing them on the frontend and copying them out to all the compute nodes, I found it easier to place my own copies in a subdirectory of /share/apps called mpi and then change the mpich parallel environment to call my copies of startmpi.sh and stopmpi.sh (and, by extension, rsh). This way the single copy is exported to all the nodes and I don't have to worry about keeping them in sync; the copy step is sketched just below.
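For reference, getting the copies in place is roughly the following (paths as above; adjust if your layout differs). The edits to startmpi.sh and rsh are then described in the next two steps.
mkdir -p /share/apps/mpi
cp -a /opt/gridengine/mpi/* /share/apps/mpi/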
3. Edit /share/apps/mpi/startmpi.sh. Change the line
rsh_wrapper=$SGE_ROOT/mpi/rsh
to
rsh_wrapper=/share/apps/mpi/rsh
4. Edit /share/apps/mpi/rsh, adding -V so that qrsh passes the full environment (including MPICH_PROCESS_GROUP) through to the remote tasks. Change the following lines:
echo $SGE_ROOT/bin/$ARC/qrsh -inherit -nostdin $rhost $cmd
exec $SGE_ROOT/bin/$ARC/qrsh -inherit -nostdin $rhost $cmd
else
echo $SGE_ROOT/bin/$ARC/qrsh -inherit $rhost $cmd
exec $SGE_ROOT/bin/$ARC/qrsh -inherit $rhost $cmd
to:
echo $SGE_ROOT/bin/$ARC/qrsh -V -inherit -nostdin $rhost $cmd
exec $SGE_ROOT/bin/$ARC/qrsh -V -inherit -nostdin $rhost $cmd
else
echo $SGE_ROOT/bin/$ARC/qrsh -V -inherit $rhost $cmd
exec $SGE_ROOT/bin/$ARC/qrsh -V -inherit $rhost $cmd
5. Finally, run qconf -mp mpich. Change it from:
pe_name mpich
slots 9999
user_lists NONE
xuser_lists NONE
start_proc_args /opt/gridengine/mpi/startmpi.sh -catch_rsh $pe_hostfile
stop_proc_args /opt/gridengine/mpi/stopmpi.sh
allocation_rule $fill_up
control_slaves TRUE
job_is_first_task FALSE
urgency_slots min
to
pe_name mpich
slots 9999
user_lists NONE
xuser_lists NONE
start_proc_args /share/apps/mpi/startmpi.sh -catch_rsh $pe_hostfile
stop_proc_args /share/apps/mpi/stopmpi.sh
allocation_rule $fill_up
control_slaves TRUE
job_is_first_task FALSE
urgency_slots min
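Afterwards, qconf -sp mpich prints the current definition of the parallel environment so the change can be double-checked. A minimal job script using it might look like the following; the mpirun path is the usual Rocks location for the GNU-compiled MPICH, so adjust it to your install, and my_mpi_program is just a placeholder:
#!/bin/bash
#$ -cwd
#$ -pe mpich 8
# $NSLOTS and $TMPDIR/machines are provided by SGE and startmpi.sh
/opt/mpich/gnu/bin/mpirun -np $NSLOTS -machinefile $TMPDIR/machines ./my_mpi_program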
Submitting Jobs From One Cluster To Another
It is currently unknown how to do this in SGE 6. Use Globus (instructions here.)
Prologue/Epilogue
This is a place for information about the prologue & epilogue scripts that run before and after a job, respectively.
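For reference, the scripts are attached per queue: qconf -mq all.q (or whichever queue you use) opens the queue configuration, where the relevant attributes are prolog and epilog (both NONE by default). The paths below are only examples:
prolog /share/apps/sge/prolog.sh
epilog /share/apps/sge/epilog.sh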
I have found that SGE is not particularly good at cleaning up after MPI jobs; it does not keep track of which other nodes a job is using, nor does it kill leftover user processes on those nodes. If anyone has a good solution for this, I'd love to see it.
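One possible direction, offered only as an untested sketch rather than a vetted solution: an epilogue that walks the PE host file and kills the job owner's stray processes. This assumes $PE_HOSTFILE is still readable when the epilogue runs (worth verifying on your SGE version) and that passwordless ssh to the compute nodes works; killing by owner is crude and would also hit the same user's other jobs on those nodes.
#!/bin/sh
# untested sketch: sweep the nodes listed in the PE host file
[ -r "$PE_HOSTFILE" ] || exit 0
for node in `awk '{print $1}' $PE_HOSTFILE | sort -u`; do
    # crude: kills everything owned by the job user on that node
    ssh $node "pkill -u $USER" < /dev/null
done
exit 0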
Retrieved from “https://wiki.rocksclusters.org/wiki/index.php/Sun_GridEngine”