[Cluster] Configuring an SGE Shadow Master
To set up a shadow master in Grid Engine, follow the steps below.
1) Create the shadow_masters file
2) Check the file permissions
3) Start the shadowd daemon(s)
1) Create the shadow_masters file
Create this file under $SGE_ROOT/default/common. Put the name of the primary master host on the first line,
then list the hosts that will act as shadow master hosts below it, one per line, in takeover order.
Example:
% cat shadow_masters
host1
host2
host3
Here host1 is the primary master host. If host1 fails, host2 takes over as the master server after about 10 minutes,
and if host2 then goes down as well, host3 takes over that role.
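The file can also be written from a shell. The following is a minimal sketch, not part of Grid Engine itself: write_shadow_masters is a hypothetical helper name, and the host names are the placeholders from the example above.

```shell
# Hypothetical helper (not shipped with Grid Engine): write a shadow_masters
# file with one host per line. The first argument is the target file; the
# remaining arguments are host names, primary master first, then the shadow
# masters in takeover order.
write_shadow_masters() {
    target=$1
    shift
    printf '%s\n' "$@" > "$target"
}

# Example call, assuming the default cell layout used in this article:
# write_shadow_masters "$SGE_ROOT/default/common/shadow_masters" host1 host2 host3
```

Because the order of the lines determines the takeover order, the helper writes the hosts exactly in the order they are passed.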
2) Check the file permissions
Every shadow master host must have read/write permission on the qmaster spool directory.
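One way to verify this on each shadow master host is a small shell check. This is only a sketch: check_spool_rw is a hypothetical helper, the spool path is assumed to be the default $SGE_ROOT/default/spool/qmaster, and the check should be run as the account that will run the shadow daemon.

```shell
# Hypothetical helper: succeed only if the current user can read and write
# the given directory. Run it on every shadow master host.
check_spool_rw() {
    dir=$1
    if [ -d "$dir" ] && [ -r "$dir" ] && [ -w "$dir" ]; then
        echo "OK: $dir is readable and writable"
    else
        echo "FAIL: $dir is missing or not readable/writable" >&2
        return 1
    fi
}

# Example call, assuming the default spool location:
# check_spool_rw "$SGE_ROOT/default/spool/qmaster"
```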
3) Start the shadow daemons
The shadow daemon must be running on every shadow master host.
It is started through a startup script: rcsge in Version 5.3, sgemaster in Version 6 or later.
On each host, run the following as root:
$SGE_ROOT/default/common/rcsge -shadowd [Version 5.3 and its patches]
$SGE_ROOT/default/common/sgemaster -shadowd [Version 6 or later]
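If the same procedure has to work on hosts running either version, the choice of script can be automated. The sketch below is an assumption-laden convenience, not an official tool: pick_shadowd_script is a hypothetical helper that simply returns whichever of the two script names from this article is present, preferring sgemaster.

```shell
# Hypothetical helper: print the path of the startup script found in the
# given "common" directory. sgemaster (Version 6 or later) is tried first,
# then rcsge (Version 5.3). Returns non-zero if neither file is present.
pick_shadowd_script() {
    common_dir=$1
    for name in sgemaster rcsge; do
        if [ -f "$common_dir/$name" ]; then
            echo "$common_dir/$name"
            return 0
        fi
    done
    return 1
}

# Example call (run as root), assuming the default cell layout:
# script=$(pick_shadowd_script "$SGE_ROOT/default/common") && "$script" -shadowd
```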
Once these three steps have been completed successfully, master shadowing works as intended. See issue #497 for more information about the shadowd failover delay and check interval.
NOTES:
When the shadow master feature is used on master hosts with multiple network interfaces, the following points must be addressed.
Version 6 installations must apply the Update 1 patch for the shadow master to work.
Version 5.3 and its patch releases need a symbolic link for each shadow master in the $SGE_ROOT/default/spool/qmaster directory, as shown below. This is because the shadow daemon still looks for the file name associated with the old hostname, while the rcsge script looks for the file name associated with the new hostname assigned to Grid Engine traffic. An example is given below.
% ls -l $SGE_ROOT/default/spool/qmaster
…
lrwxrwxrwx 1 sge sge 17 Sep 28 09:01 shadowd_host1-ge.pid -> shadowd_host1.pid
-rw-r--r-- 1 sge sge 17 Sep 28 09:00 shadowd_host1.pid
lrwxrwxrwx 1 sge sge 17 Sep 28 09:02 shadowd_host2-ge.pid -> shadowd_host2.pid
-rw-r--r-- 1 sge sge 17 Sep 28 09:00 shadowd_host2.pid
In this example, host1 and host2 are the hostnames of the two shadow masters, and host1-ge and host2-ge are the names assigned to their Grid Engine network interfaces.
% cat /etc/hosts
#
# hostnames
#
192.168.8.10 host1
192.168.8.11 host2
#
# Grid Engine Network
#
192.168.9.10 host1-ge
192.168.9.11 host2-ge
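The symbolic links in the listing above can be created with ln -s. The following is a minimal sketch for Version 5.3 setups only: link_shadowd_pid is a hypothetical helper, and the file naming pattern shadowd_<hostname>.pid is taken from the directory listing in this article.

```shell
# Hypothetical helper: inside the qmaster spool directory, create a symlink
# named after the Grid Engine network hostname that points at the pid file
# named after the original hostname.
link_shadowd_pid() {
    spool_dir=$1
    host=$2      # original hostname, e.g. host1
    ge_host=$3   # hostname on the Grid Engine network, e.g. host1-ge
    # The link target is relative, so it resolves within spool_dir itself.
    ln -s "shadowd_${host}.pid" "$spool_dir/shadowd_${ge_host}.pid"
}

# Example calls, assuming the spool path and host names from this article:
# link_shadowd_pid "$SGE_ROOT/default/spool/qmaster" host1 host1-ge
# link_shadowd_pid "$SGE_ROOT/default/spool/qmaster" host2 host2-ge
```

The helper uses plain ln -s, so it fails rather than overwrites if a link with that name already exists.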
For the same reason, Version 5.3 and its patch releases must list both the old and the new hostnames in the shadow_masters file. An example is given below.
% cat shadow_masters
host1
host2
host1-ge
host2-ge
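A file in this layout can be generated rather than typed by hand. The sketch below assumes that every Grid Engine hostname is the original hostname plus a fixed suffix such as -ge, as in the example; shadow_masters_with_ge_names is a hypothetical helper name.

```shell
# Hypothetical helper: print the original host names first, then the same
# names with the given suffix appended, one per line.
shadow_masters_with_ge_names() {
    suffix=$1
    shift
    printf '%s\n' "$@"
    for h in "$@"; do
        printf '%s%s\n' "$h" "$suffix"
    done
}

# Example call, assuming the default cell layout and the "-ge" suffix:
# shadow_masters_with_ge_names -ge host1 host2 > "$SGE_ROOT/default/common/shadow_masters"
```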