Of critical interest to us is how long it takes to start up and shut down large jobs. I have been experimenting on a 64-node SPARC cluster with 2 CPUs per node. I ran a variety of timing tests and found that starting the jobs on the remote nodes via rsh is much faster than via ssh. However, rsh is limited to the privileged ports 512-1023, so if one runs several large jobs at the same time, or runs them in quick succession, one can quickly run out of available ports. Therefore, we suggest using ssh instead (or a resource manager, but that is another story).
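A launch-time comparison of this kind can be approximated with a small loop that starts a trivial command on each node through a given launcher. This is only a sketch, not the script behind the timings discussed here: the function name launch_all, the node names in the usage note, and the choice of `true` as the remote command are all assumptions for illustration.

```shell
# launch_all LAUNCHER HOST...
# Runs "LAUNCHER HOST true" once per host, stopping at the first failure.
# LAUNCHER is whatever remote-start command is being compared (rsh, ssh, ...).
launch_all() {
    launcher=$1
    shift
    for host in "$@"; do
        # stdin comes from /dev/null so the launcher cannot consume the script's input
        $launcher "$host" true < /dev/null || return 1
    done
}
```

Wrapping the call in the shell's time keyword, for example `time launch_all rsh node1 node2` followed by `time launch_all ssh node1 node2`, gives a rough side-by-side number. It measures sequential launches only, so it says nothing about how a parallel launcher would behave.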
I have therefore looked into how to speed up ssh. I am not interested in the security features of ssh because I am running in a safe environment, so if I can give up security for speed, I will. I discovered that one way to speed things up is to configure ssh and sshd to use sshv1 instead of sshv2. The instructions for doing this follow.
Generate the key needed by sshv1. This is done on the node that you want to ssh to. Just leave the passphrase blank.
# cd /etc/ssh
# ssh-keygen -t rsa1 -f ssh_host_key
Make the following changes to the /etc/ssh/sshd_config file. The first change tells sshd to use Protocol 1. The second change generates the key needed by Protocol 1 as it is not created by default.
Protocol 1
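Before restarting the daemon, it can be worth confirming that the edit actually took. The helper below is a sketch under assumptions: the function name sshd_protocol is made up for illustration, and it simply reports the first uncommented Protocol line, which is how I understand sshd to resolve a keyword that appears more than once.

```shell
# sshd_protocol FILE
# Prints the value of the first uncommented "Protocol" line in FILE,
# or nothing if no such line exists. Keyword matching is case-insensitive.
sshd_protocol() {
    awk 'tolower($1) == "protocol" { print $2; exit }' "$1"
}
```

Running `sshd_protocol /etc/ssh/sshd_config` should print 1 after the change above. A commented-out line such as `#Protocol 2` is ignored.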
Restart the sshd.
# svcadm restart network/ssh
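Then, as the user, regenerate the sshv1 identity key in your .ssh directory and add it to authorized_keys.
> cd /home/rolfv/.ssh
> /bin/rm -rf identity identity.pub
> ssh-keygen -t rsa1 -f identity
> cat identity.pub >> authorized_keys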
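Repeating the last command appends the same key again each time. If you expect to rerun these steps, a guarded append avoids the duplicates. This is a minimal sketch: the function name authorize_key is hypothetical, and it assumes a grep that supports the -q, -x and -F options.

```shell
# authorize_key PUBKEY AUTHFILE
# Appends the contents of PUBKEY to AUTHFILE unless that exact line is already there.
authorize_key() {
    pubkey=$1
    authfile=$2
    touch "$authfile"                 # make sure the file exists before searching it
    key=$(cat "$pubkey")
    if ! grep -qxF -- "$key" "$authfile"; then
        cat "$pubkey" >> "$authfile"
    fi
}
```

In the directory used above this would be called as `authorize_key identity.pub authorized_keys`.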
To verify that you are really using sshv1, run ssh with the -1 switch. (That is the number one, not the letter ell.)
> ssh -1 allegany
> ssh -2 allegany
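The same check can be scripted so that it does not stop at a password prompt. The sketch below is an illustration under assumptions: probe_protocols and the SSH_CMD override are hypothetical names, and it relies on the BatchMode option to make ssh fail rather than prompt.

```shell
# probe_protocols HOST
# Tries a non-interactive login with -1 and then -2 and prints one result line each.
# SSH_CMD can be set to substitute another client binary; it defaults to ssh.
probe_protocols() {
    host=$1
    ssh_cmd=${SSH_CMD:-ssh}
    for version in 1 2; do
        if $ssh_cmd "-$version" -o BatchMode=yes "$host" true < /dev/null > /dev/null 2>&1; then
            echo "protocol $version: ok"
        else
            echo "protocol $version: failed"
        fi
    done
}
```

For example, `probe_protocols allegany`. A "failed" line only means the login did not succeed; the cause may be authentication rather than the protocol version, so it is a quick indicator rather than proof.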
UPDATE – Here are the results of comparing rsh, ssh, and sshv1 run times of a simple MPI job.