SGE Installation (Translation)

by 서진우 · Published May 29, 2015 · Updated December 18, 2024

SGE Installation
________________________________________
Chapter 1>
Before You Install the Software

Plan the Installation
Whether Grid Engine has been installed before or this is a fresh installation, a few things must be prepared in advance.

Decisions That You Must Make
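Decide whether to install the system as a single cluster unit, called a cell. Installations can be separated by cell while still sharing the same binary files.
Each machine can be installed as a master host, shadow master host, administration host, submit host, execution host, or any combination of these roles. Every user of the grid engine system must exist identically on the submit hosts and the execution hosts.
Decide on the directory layout of the grid engine software in advance: either set up a complete directory tree on each workstation or cross-mount the directories. In particular, fix the location of sge-root, the installation directory of the grid engine software.
Do not forget to decide on the queue structure.
Decide whether the network services are defined through NIS or through the /etc/services file.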

Gather the Necessary Information
Plan ahead and collect the following information before you start the installation.
sge-root directory
cell name
administrative user
sge_qmaster port number
sge_execd port number
Master host
Shadow master hosts
Execution host
administration hosts
submit hosts
Group ID range for jobs
Spooling mechanism (Berkeley DB or classic spooling)
Berkeley DB server host (the master host or another dedicated host)
Berkeley DB spooling directory on the database server
Scheduler tuning profile(Normal, High, Max)
Installation Method(interactive, secure, automated or upgrade)

Disk Space Requirements
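40 MB for the installation itself
10 to 15 MB for the binary files
10 to 200 MB for the master host spool directory
10 to 200 MB for the Berkeley DB spool directory
(Do not place the spool directories of the master host and the execution hosts under sge-root.)
sge-root Installation Directory
The directory that contains the installation data.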
Directory Organization

Cells
Cell = a collection of sub-clusters
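The cluster refers to the cell through the SGE_CELL environment variable. A default installation uses the value default.
Each cell is installed as a separate structure, but the binary files are shared between cells.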

User Names
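User names matter because Grid Engine uses them when a user sends a job to a particular execution host. You may have to change user names on some machines (which is why the accounts need to be synchronized across hosts).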

Installation Accounts
You can install as root or as a specific user. If you install as an unprivileged user, only that user can run jobs; no other user can use the system. Installing as root removes this restriction. An installation under an unprivileged account also cannot use commands such as qrsh, qtcsh, and qmake, and cannot run parallel jobs properly.

File Access Permissions
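Installing as root can lead to file permission problems. To avoid them, you can create a dedicated account such as sgeadmin. This account needs read and write permission on the sge-root directory (a shared file system).
(The software is installed under sge-root, which must be shared.)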

Network Services
Check whether the network services are registered in NIS or in /etc/services.
If you use an NIS server, add the entries to the NIS "services" map.
Grid Engine system services: sge_execd, sge_qmaster
The entries you add must use ports that are not already in use.
Example:
sge_qmaster 536/tcp
sge_execd 537/tcp
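
The check described above can be sketched in shell. This is only an illustration: has_sge_service is a hypothetical helper name (not part of Grid Engine), and it assumes the entries live in a plain services file rather than in an NIS map.

```shell
# Sketch: check that a services file defines a TCP port for a service.
# has_sge_service is a hypothetical helper, not a Grid Engine command.
has_sge_service() {
    service=$1
    file=${2:-/etc/services}
    # Match lines such as "sge_qmaster 536/tcp": name, whitespace, port/tcp.
    grep -Eq "^${service}[[:space:]]+[0-9]+/tcp" "$file"
}
```

For example, `has_sge_service sge_qmaster || echo "sge_qmaster is missing"` reports a missing entry in /etc/services.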

Master Host
The master host runs sge_qmaster and sge_schedd, so it must be a stable node. It should have at least 1 GB of memory, and two CPUs are recommended. It must be installed before the shadow master, execution, administration, and submit hosts.

Shadow Master Hosts
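A shadow master host provides failover and runs sge_shadowd. It shares the state of sge_qmaster, including job state and queue state.
The host names of the shadow masters are listed in /sge-root/cell/common/shadow_masters.
A shadow master must be able to access the sge_qmaster spool directory and the /cell/common directory.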

Spool Directories under the root directory
Configure the directories used for spooling. The master host spools into its qmaster-spool-dir, which is separate from the local paths on the execution hosts. By default the master host uses /sge-root/cell/spool/qmaster.
Each execution host uses /sge-root/cell/spool/exec-host by default.
A local spool directory on the master host can perform better than an NFS-mounted one, because NFS makes the network a bottleneck.

Choosing Between Classic Spooling and Database Spooling
During installation you choose between classic spooling and Berkeley DB spooling. If you choose Berkeley DB spooling, you get a further choice: spool to a local directory, or spool to a separate host (a Berkeley DB spooling server).
Berkeley DB spooling gives the more reliable performance of the two.

Database server and spooling host
The master host can store its configuration and state in Berkeley DB spooling. The database can live on the master host itself or on a separate server; keeping it local to the master host gives better performance. If you configure a shadow master, you must use a separate Berkeley DB spooling server. In that case you must set up an RPC service, because the master host accesses Berkeley DB through RPC.
When you install Berkeley DB on a separate server, be sure to consider security.
If you use Berkeley DB without a shadow master, you do not need a separate server. Conversely, if you do not use Berkeley DB, a shadow master setup does not require a separate server either.

Execution Hosts
Execution hosts run the jobs that users submit to the Grid Engine system. The first execution host is also set up as an administration host. Run the installation script on each execution host. (An NFS mount would probably take care of this.)

Group IDs
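You must provide a range of group IDs to be assigned to jobs. A larger range is better; it must be at least as large as the maximum number of grid engine system jobs that can run concurrently on a single host.
For example, a range of 20000-20100 allows 100 jobs to run at the same time on one host. You can change the group ID range at any time.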
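
As a quick sanity check on a planned range, the arithmetic can be sketched in shell. gid_range_size is a hypothetical helper name used only for illustration. Counting both ends, the range 20000-20100 holds 101 IDs, which covers the 100 concurrent jobs mentioned above.

```shell
# Sketch: count the group IDs in a "low-high" range, both ends included.
# gid_range_size is a hypothetical helper, not a Grid Engine command.
gid_range_size() {
    low=${1%-*}     # text before the dash
    high=${1#*-}    # text after the dash
    echo $(( $high - $low + 1 ))
}

gid_range_size 20000-20100   # prints 101
```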

Administration Hosts
Administrators and users of the grid engine system perform administrative tasks from administration hosts.
The master host installation script automatically makes the master host an administration host. You can add more administration hosts during installation, and also afterwards.

Submit Hosts
Submit hosts are where jobs are submitted and controlled. The master host installation script automatically makes the master host a submit host.

Cluster Queues
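A default cluster queue structure is created during installation. You can remove it afterwards; it is mainly useful for getting familiar with the system.
Any structure created during installation can be modified by the administrator afterwards, even while the system is in use.
Several factors matter when you decide on the queue structure:
whether jobs are sequential, interactive, parallel, or of some other type; which queues are assigned to which execution hosts; and how many job slots each queue needs.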

Scheduler Profiles
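During installation you choose one of three scheduler profiles: normal, high, or max.
Start tuning the grid engine system from one of these predefined profiles.
The profiles are meant to optimize the following:
1. The amount of information collected about each scheduling run
2. Load adjustment during scheduling
3. Interval scheduling (the default) versus immediate scheduling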

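– normal: uses load adaptation and interval scheduling. This profile is the starting point for most grids. Use it when collecting and reporting information matters most.
– high: suited to larger clusters, where throughput matters more than the information collected. It also uses interval scheduling. Performance takes priority over information.
– max: disables information collection and reporting entirely. Load adaptation is turned off and immediate scheduling is used. The advantage of immediate scheduling shrinks as the number of jobs grows. Use this profile when throughput is the top priority.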
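
For the automated installation, the configuration template in Chapter 3 expresses the same choice as a number in SCHEDD_CONF (1=normal, 2=high, 3=max). The mapping can be sketched as below; schedd_conf_for is a hypothetical helper name used only for illustration.

```shell
# Sketch: map a scheduler profile name to the SCHEDD_CONF number
# used by the configuration template (1=normal, 2=high, 3=max).
# schedd_conf_for is a hypothetical helper, not a Grid Engine command.
schedd_conf_for() {
    case $1 in
        normal) echo 1 ;;
        high)   echo 2 ;;
        max)    echo 3 ;;
        *)      echo "unknown profile: $1" >&2; return 1 ;;
    esac
}
```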

Installation Method
The software can be installed in several ways.
– Interactive: described in Chapter 2.
– Interactive, with increased security: described in Chapter 4.
– Automated, using the inst_sge script and a configuration file: described in "Using the inst_sge Utility and a Configuration Template".
– Upgrade: described in Chapter 5.
Choose among these methods according to your situation.

Loading the Distribution Files On a Workstation

How to Load the Distribution Files On a Workstation
1. Obtain the installation files.
2. Log in to the system.
3. Create the installation directory. It will be used as the sge-root directory.
# mkdir -p /engrid/ensge/
4. Install the binary files for the master, execution, and submit hosts.
The pkgadd method is used on Solaris OS.

tar.gz files are also provided. n1ge-6_0-common.tar.gz contains the files that are common to all platforms.
A separate binary package is provided for each platform.
Example:
For Linux x86: n1ge-6_0-bin-linux24-i586.tar.gz
For Linux AMD64: n1ge-6_0-bin-linux24-amd64.tar.gz
# cd /engrid/ensge/
# gzip -dc basedir/Common/tar/n1ge-6_0-common.tar.gz | tar xvpf -
# gzip -dc basedir/Docs/tar/n1ge-6_0-doc.tar.gz | tar xvpf -
# gzip -dc basedir/Solaris_sparc/tar/n1ge-6_0-bin-solsparc32.tar.gz | tar xvpf -
# gzip -dc basedir/Solaris_sparc/tar/n1ge-6_0-bin-solsparc64.tar.gz | tar xvpf -
# SGE_ROOT=/engrid/ensge/ ; export SGE_ROOT
# util/setfileperm.sh $SGE_ROOT
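
Picking the binary tarball for a Linux host could be sketched as follows. The tarball names come from the examples above; the mapping from `uname -m` values to packages (in particular treating i686 like i586) is an assumption, and tarball_for_arch is a hypothetical helper name.

```shell
# Sketch: choose the Linux binary tarball from a machine type string.
# Assumption: i586/i686 hosts use the i586 package, x86_64 hosts the amd64 one.
# tarball_for_arch is a hypothetical helper, not part of the distribution.
tarball_for_arch() {
    case $1 in
        i586|i686) echo "n1ge-6_0-bin-linux24-i586.tar.gz" ;;
        x86_64)    echo "n1ge-6_0-bin-linux24-amd64.tar.gz" ;;
        *)         echo "no known package for: $1" >&2; return 1 ;;
    esac
}
```

A typical call would be `tarball_for_arch "$(uname -m)"`, with the result passed to the gzip and tar pipeline shown above.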

________________________________________
Chapter 2>
Installing the Grid Engine Software Interactively

How to Install the Master Host
The installation creates the directory structure that sge_qmaster and sge_schedd require.
By default the master host also acts as an administration host and a submit host (and as the first execution host). If something goes wrong during installation, abort and start over from the beginning.
1. Log in to the master host as root.
2. Check that the $SGE_ROOT environment variable is set: # echo $SGE_ROOT
If it is not set, run: # SGE_ROOT=/engrid/ensge/ ; export SGE_ROOT
3. Change to the sge-root directory.
4. Copy the packages into sge-root (/engrid/ensge/).
5. Run install_qmaster.
# ./install_qmaster
enter
y
clunix (the user name to use)
enter

6. Confirm the installation directory.
7. Configure the TCP/IP services.
You can use either /etc/services or NIS.
If you use /etc/services, open a new shell window and add the entries shown below.
# vi /etc/services
———————————–
…
sge_qmaster 536/tcp
sge_execd 537/tcp
———————————–
Go back to the previous shell and press Enter; the values you just configured take effect.

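8. Enter the cell name.
If you have chosen a cell name, enter it; otherwise just press Enter.

9. Set the spool directory.
The default is /default/spool/qmaster. Answer 'y' to choose a different directory.

10. Set the file permissions.
The first question asks whether you installed with pkgadd; answer 'n'.
The second question asks whether to set the file permissions; answer 'y'.

11. Specify whether all grid engine system hosts belong to a single DNS domain.
- If they are all in one DNS domain, answer 'y'.
- If not, answer 'n', and answer 'y' when asked whether to set a default domain.
Then enter the default domain, for example clunix.com.

12. A message reports that the directories will be created. Press Enter.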

13. Choose between classic spooling and Berkeley DB spooling. This determines where the spooling data is stored.
To use Berkeley DB, press Enter; to use a Berkeley DB spooling server, answer 'y'.
A message warns you not to continue yet. Do not press Enter until the Berkeley DB installation is finished.

– Open a new terminal.
– Log in to the host that will act as the spooling server.
– Install it by following "How to Install the Berkeley DB Spooling Server".
– When the installation finishes, return to the original terminal and press Enter.
– When asked for the name of the Berkeley DB server, enter its host name.
– Enter the path of the spooling directory. The default in the manual is /opt/n1ge6/default/spool/spooldb/; in this example the path is /engrid/ensge/default/spool/spooldb/. Then press Enter.
If you do not use a Berkeley DB server, answer 'n', press Enter when prompted, and enter the path of the spooling directory.
The next question asks for the spooling method. Since Berkeley DB is not used in this case, enter classic and press Enter.

14. Set the group ID range.
For example, enter 20000-20100.

15. Set the spooling directory for the execution daemon.
Default in the manual: /opt/n1ge6/default/spool/
In this example: /engrid/ensge/default/spool/

16. Enter the e-mail address that should receive problem reports. The default is none.

17. Verify the configuration parameters.
The execd_spool_dir and administrator_mail entries are shown. If they match what you set above, answer 'n' to leave them unchanged.
execd_spool_dir is the spooling directory set above, and administrator_mail is the address for problem reports set above.

18. Choose whether the daemons start at system boot. Answer 'y'.

19. Specify the host names of the execution hosts to be installed later.
These hosts will be installed as execution hosts later and will also serve as submit hosts. You can type the host names directly or load them from a file.

The first question asks whether to use a file of host names. If you answer 'n', you enter the host names directly, separated by spaces. If the line gets long, press Enter and keep adding names; input continues until you press Enter on an empty line.

The all.q queue and the allhosts host group are now created.

20. Select the scheduler profile.
Choose Normal, High, or Max as described above, then continue to the next step.

21. Set up the environment variables used with the Grid Engine software.
** If you did not specify a cell name, the cell name is default.
– If you use the C shell:
# source /cell/common/settings.csh
– If you use the Bourne shell:
# . /cell/common/settings.sh
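
Step 2 of the procedure above checks $SGE_ROOT by hand. A small guard to run before the installer could be sketched like this; check_sge_root is a hypothetical helper name, not something the installer provides.

```shell
# Sketch: refuse to continue unless SGE_ROOT is set and is a directory.
# check_sge_root is a hypothetical helper, not part of Grid Engine.
check_sge_root() {
    if [ -z "${SGE_ROOT:-}" ]; then
        echo "SGE_ROOT is not set" >&2
        return 1
    fi
    if [ ! -d "$SGE_ROOT" ]; then
        echo "SGE_ROOT is not a directory: $SGE_ROOT" >&2
        return 1
    fi
    return 0
}
```

It could be used as `check_sge_root && cd "$SGE_ROOT" && ./install_qmaster`.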

How to Install Execution Hosts

The installation creates the directory structure that sge_execd requires and starts the sge_execd daemon on the execution host.
Do this only after the master host installation is complete.

1. Log in to a terminal as root.
2. As in the master host installation, place the installation files in the sge-root directory.
3. Set the $SGE_ROOT environment variable.
# SGE_ROOT= ; export SGE_ROOT
# echo $SGE_ROOT
4. Change to the sge-root directory.
5. Check that the execution host is defined as an administration host.
# qconf -sh

** If the execution host does not appear in the qconf output, declare it from the master host:
– Open a new terminal.
– Log in to the master host.
– Declare the execution host as an administration host. For a host named quark:
# qconf -ah quark
quark added to administrative host list
– Log out and return to the execution host installation.
6. Run the installation command.
# ./install_execd
If you use the Certificate Security Protocol, append the -csp option.
Press Enter. A message says that the installation takes about five minutes.
7. Set the sge-root directory.
The default Grid Engine root directory is displayed.
$SGE_ROOT = //
If it is wrong, correct it and press Enter.
8. Enter the cell name.
The default cell name is default. If you chose a different name earlier, enter it.

9. Press Enter to continue. The script checks that the execution host is declared as an administration host.
10. Set the spool directory.
If you do not want a local spool directory, answer 'n'.
If you want a local spool directory, answer 'y' and enter its path.
11. Choose whether the execd daemon starts at system boot. Answer 'y'.
12. Confirm the queue for the execution host being installed.
The host is added to the allhosts group and is served by all.q.
If you answer 'y' when asked whether to add the host to the default queue, the installation finishes. A few more screens follow before the script ends; they list some commands that are also documented in the manual.
13. Set up the environment variables used with the Grid Engine software.
** If you did not specify a cell name, the cell name is default.
– If you use the C shell:
# source /cell/common/settings.csh
– If you use the Bourne shell:
# . /cell/common/settings.sh
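
The check in step 5 can be scripted. The sketch below assumes that `qconf -sh` prints one host name per line; host_in_list is a hypothetical helper name, and quark is the example host used above.

```shell
# Sketch: succeed if the given host appears as a whole line on stdin.
# Assumption: the list (for example from "qconf -sh") has one host per line.
# host_in_list is a hypothetical helper, not a Grid Engine command.
host_in_list() {
    grep -Fxq "$1"
}
```

On the master host, `qconf -sh | host_in_list quark || qconf -ah quark` would add quark only when it is not yet an administration host.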

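Registering Administration Hosts

The master host can now run, submit, and remove administrative tasks, and it needs no further installation. Any newly installed administration host, however, must be registered.

------------------------------------------------------------------
** Execution hosts can also be installed through the QMON GUI.
------------------------------------------------------------------

On the master host, using the grid engine system administrative account (not root), run the following command, followed by the name of the host to register.
% qconf -ah

Registering Submit Hosts

On the master host, using the grid engine system administrative account, run the following command, followed by the name of the host to register.
% qconf -as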

How to Install the Berkeley DB Spooling Server

1. Log in to the spooling server host as root.
2. Set the $SGE_ROOT environment variable.
# SGE_ROOT= ; export SGE_ROOT
# echo $SGE_ROOT
3. Change to the sge-root directory.
4. Run inst_sge with the -db option.
# /inst_sge -db
This command starts the spooling server installation and asks several questions. If a problem occurs, abort and start over from the beginning.
5. Choose the owner of the administrative account.
When asked whether Grid Engine should be installed under an account other than root, answer 'y'.
Then enter the Grid Engine admin user name, for example sgeadmin.
6. Set the sge-root directory.
The default Grid Engine root directory is displayed.
$SGE_ROOT = //
If it is wrong, correct it and press Enter.
7. Enter the cell name.
The default cell name is default. If you chose a different name earlier, enter it.
8. Select Berkeley DB spooling.
You choose between Berkeley DB and classic; Berkeley DB is the default.
9. Set the host name.
If the host you are installing on is host2, host2 is shown. Change it if you want a different name.
This lets qmaster communicate with the RPC server on the other host.
10. Enter the path of the spooling directory.
/default/spooldb
11. Start the RPC server.
Run the rc script on the RPC server host:
# /default/common/sgebdb
If you have already set up a Berkeley DB spooling server, restart the database with the rc script and then answer "no" to continue the installation.
The script asks whether it may start the RPC server. If you answer 'y', the RPC server is started on host2 (the Berkeley DB spooling server).
12. Choose whether the Berkeley DB service starts at system boot.
Answer 'y'.
13. Set up the environment variables used with the Grid Engine software.
** If you did not specify a cell name, the cell name is default.
– If you use the C shell:
# source /cell/common/settings.csh
– If you use the Bourne shell:
# . /cell/common/settings.sh

________________________________________
Chapter 3>

Automating the Installation Process

This chapter explains how to install automatically. The automated installation cannot be run as the root user; it must be run as the administrative user.

Using the inst_sge Utility and Configuration Template

The inst_sge utility is a general-purpose installer. It can install any host as a master, administration, execution, submit, or shadow master host, and it can be used in place of the individual installation commands. With a configuration file, inst_sge can also run the installation unattended. In that mode you get no feedback from the system during installation; instead, a log file is written under /cell/spool. If the installation fails, the log file is left in the /tmp directory.

How to use the inst_sge Utility to Automate the Installation Process
Passwordless rsh or ssh access must be configured first; otherwise this method cannot be used.

1. Change the ownership of the sge-root directory to the grid engine system administrative user.
The installation cannot proceed if the directory is owned by root. In this example the administrative user is sgeadmin.
# chown -R sgeadmin //
2. Copy the configuration template.
# cd /util/install_modules
# cp inst_template.conf my_configuration.conf
3. Edit the copied template, using the information gathered in "Plan the Installation".
4. Start the installation.
Run the installer with the -auto option on the machine chosen as master host and execution host; the remote hosts listed in the configuration file are installed as well.
# cd
# ./inst_sge -m -x -auto ./util/install_modules/my_configuration.conf
When the installation finishes, a log file is left in the //default/spool/qmaster directory.
The file is named install_hostname_date_time.log.

————————————————————————————-
Example Configuration File
#————————————————-
# SGE default configuration file
#————————————————-
# Always use fully qualified pathnames, please
# SGE_ROOT Path, this is basic information
#(mandatory for qmaster and execd installation)
SGE_ROOT="/opt/n1ge6"
# SGE_QMASTER_PORT is used by qmaster for communication
# Please enter the port in this way: 1300
# Please do not enter it like this: 1300/tcp
#(mandatory for qmaster installation)
SGE_QMASTER_PORT="536"
# SGE_EXECD_PORT is used by execd for communication
# Please enter the port in this way: 1300
# Please do not enter it like this: 1300/tcp
#(mandatory for qmaster installation)
SGE_EXECD_PORT="537"
# CELL_NAME, will be a dir in SGE_ROOT, contains the common dir
# Please enter only the name of the cell. No path, please
#(mandatory for qmaster and execd installation)
CELL_NAME="default"
# The dir where qmaster spools the parts which are not spooled by DB
#(mandatory for qmaster installation)
QMASTER_SPOOL_DIR="/opt/n1ge6/default/spool/qmaster"
# The dir where the execd spools (active jobs)
# This entry is needed even if you are going to use
# berkeley db spooling. Only cluster configuration and jobs will
# be spooled in the database. The execution daemon still needs a spool
# directory
#(mandatory for qmaster installation)
EXECD_SPOOL_DIR="/opt/n1ge6/default/spool"
# For monitoring and accounting of jobs, every job will get
# a unique GID. So you have to enter a free GID range, which
# is assigned to each job running on a machine.
# If you want to run 100 jobs at the same time on one host you
# have to enter a GID range like this: 16000-16100
#(mandatory for qmaster installation)
GID_RANGE="20000-20100"
# If SGE is compiled with -spool-dynamic, you have to enter here which
# spooling method should be used. (classic or berkeleydb)
#(mandatory for qmaster installation)
SPOOLING_METHOD="berkeleydb"
# Name of the server where the spooling DB is running.
# If the spooling method is berkeleydb, it must be "none" when
# using no spooling server, and it must contain the server name
# if a server should be used. In case of "classic" spooling, it
# can be left out
DB_SPOOLING_SERVER="none"
# The dir where the DB spools
# If berkeley db spooling is used, it must contain the path to
# the spooling db. Please enter the full path. (e.g. /tmp/data/spooldb)
# Remember, this directory must be local on the qmaster host or on the
# Berkeley DB Server host. No NFS mount, please
DB_SPOOLING_DIR="/opt/n1ge6/default/spooldb"
# A list of hosts which should become admin hosts
# If you do not enter any host here, you have to add all of your hosts
# by hand after the installation. The autoinstallation works without
# any entry
ADMIN_HOST_LIST="host1"
# A list of hosts which should become submit hosts
# If you do not enter any host here, you have to add all of your hosts
# by hand after the installation. The autoinstallation works without
# any entry
SUBMIT_HOST_LIST="host1"
# A list of hosts which should become exec hosts
# If you do not enter any host here, you have to add all of your hosts
# by hand after the installation. The autoinstallation works without
# any entry
# (mandatory for execution host installation)
EXEC_HOST_LIST="host1"
# The dir where the execd spools (local configuration)
# If you want to configure your execution daemons to spool in
# a local directory, you have to enter this directory here.
# If you do not want to configure a local execution host spool directory
# please leave this empty
EXECD_SPOOL_DIR_LOCAL=""
# If true, the domain names will be ignored during hostname resolving
# if false, the fully qualified domain name will be used for name resolving
HOSTNAME_RESOLVING="true"
# Shell which should be used for remote installation (rsh/ssh)
# This is only supported if your hosts and rshd/sshd are configured
# not to ask for a password or prompt with any message.
SHELL_NAME="rsh"
# Enter your default domain, if you are using /etc/hosts or NIS configuration
DEFAULT_DOMAIN="none"
# If a job stops, fails, or finishes, you can send a mail to this address
ADMIN_MAIL="my.name@sun.com"
# If true, the rc scripts (sgemaster, sgeexecd, sgebdb) will be added
# to start automatically during boot time
ADD_TO_RC="true"
# If this is "true" the file permissions of executables will be set to 755
# and of ordinary files to 644.
SET_FILE_PERMS="true"
# This option is not implemented, yet.
# When an exechost should be uninstalled, the running jobs will be rescheduled
RESCHEDULE_JOBS="wait"
# Enter one of the three distributed scheduler tuning configuration sets
# (1=normal, 2=high, 3=max)
SCHEDD_CONF="1"
# The name of the shadow host. This host must have read/write permission
# to the qmaster spool directory
# If you want to set up a shadow host, you must enter the server name
# (mandatory for shadowhost installation)
SHADOW_HOST="hostname"
# Remove these execution hosts in automatic mode
# (mandatory for uninstallation of execution hosts)
EXEC_HOST_LIST_RM="host2 host3 host4"
———————————————————————————–
If you do not want the execution hosts to use a local spool directory, leave EXECD_SPOOL_DIR_LOCAL="" empty, without even a space between the quotation marks.
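
The template asks for port values as bare numbers (1300, not 1300/tcp). A pre-flight check for that rule could be sketched as follows; valid_port_value is a hypothetical helper name, and the check covers only the format, not whether the port is free.

```shell
# Sketch: accept a port value only if it is a non-empty string of digits,
# so "536" passes and "536/tcp" is rejected.
# valid_port_value is a hypothetical helper, not part of inst_sge.
valid_port_value() {
    case $1 in
        ''|*[!0-9]*) return 1 ;;
        *)           return 0 ;;
    esac
}
```

For example, `valid_port_value "$SGE_QMASTER_PORT" || echo "bad SGE_QMASTER_PORT"` after sourcing the edited configuration file.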

________________________________________
Chapter4>

Installing the Increased Security Features
When security is required, set up the system with Certificate Security Protocol (CSP)-based encryption.

Why Install the Increased Security Features
The secret key is exchanged using a public/private key protocol.

Additional Setup Required
Installing the Certificate Security Protocol-enhanced version of the grid engine system is almost identical to the basic setup. Follow "Plan the Installation", "How to Load the Distribution Files On a Workstation", "How to Install the Master Host", "How to Install Execution Hosts", and "Registering Administration Hosts".

The following additional steps are required.
1. Generate the Certificate Authority (CA) system keys and certificates on the master host by running the installation script with the -csp option.
2. Distribute the system keys and certificates to the execution and submit hosts using ssh.
3. Generate the user keys and certificates automatically after the master installation.
4. Add new users.
How to Install a CSP-Secured System

Perform the steps in "Performing an Installation", adding the -csp option whenever you invoke an installation script.
1. Run the following commands.
# cd
# ./install_qmaster -csp
2. Enter the information needed to generate the CSP certificates and keys.
• Two-letter country code (for example, US for the United States)
• State
• Location (city)
• Organization
• Organizational unit
• CA e-mail address
The CA is created during installation; the CA for the grid engine system is created on the master host. Its data is stored in the following directories.
• Publicly accessible CA and daemon certificates: //cell/common/sgeCA
• The corresponding private keys: /var/sgeCA/{sge_service | portSGE_QMASTER_PORT}/cell/private
• User keys and certificates: /var/sgeCA/{sge_service | portSGE_QMASTER_PORT}/cell/userkeys/$USER
3. The script asks for the site information.
4. Confirm that the information you entered is correct.
5. When the security setup of sge_qmaster is finished, the script asks you to continue with the rest of the installation.
SGE startup script
——————–
Your system wide SGEEE startup script is installed as:
“/scratch2/eddy/sge_sec/default/common/sgemaster”
Hit Return to continue >>
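6. Decide whether the shared file system is secure enough to hold the CSP security information.
– If it is secure enough, install the execution hosts as described in "How to Install Execution Hosts", running ./install_execd -csp.
If the root user does not have write permission in the sge-root directory on all machines, the script asks whether to install the software as the user who owns that directory. If you answer "yes", the security files are installed in that user's $HOME/.sge directory.
For example, if the user name is sgeadmin:

% su - sgeadmin
% source /default/common/settings.csh
% /util/sgeCA/sge_ca -copy
% logout

After completing all remaining installation steps, see "How to Generate Certificates and Private Keys for Users".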

– If the shared file system is not secure enough, you must transfer the directory that contains the daemon's private key and random file to the execution host.
1) On the master host, log in as root and run the following commands.
# umask 077
# cd /
# tar cvpf /var/sgeCA/port536.tar /var/sgeCA/port536/default
2) On the execution host, log in as root and run the following commands.
# umask 077
# cd /
# scp masterhost:/var/sgeCA/port536.tar .
# umask 022
# tar xvpf port536.tar
# rm port536.tar
# ls -lR /var/sgeCA/port536/
(The last command lets you check the permissions of the subdirectories and files.)
3) Install the security files in the admin user's $HOME/.sge directory.
If the root user does not have write permission in the sge-root directory on all machines, the script asks whether to install the software as the user who owns that directory. If you answer "yes", the security files are installed in that user's $HOME/.sge directory.
For example, if the user name is sgeadmin:
% su - sgeadmin
% source /default/common/settings.csh
% /util/sgeCA/sge_ca -copy
% logout
4) On the execution host, continue the installation with the following commands.
# cd
# ./install_execd -csp
5) The installation creates the directory structure that sge_execd requires and starts sge_execd on the execution host.
6) Set up the environment variables used with the Grid Engine software.
** If you did not specify a cell name, the cell name is default.
– If you use the C shell:
# source /cell/common/settings.csh
– If you use the Bourne shell:
# . /cell/common/settings.sh

7. Continue with "How to Generate Certificates and Private Keys for Users".

How to Generate Certificates and Private Keys for Users
To use a CSP-secured system, each user needs their own certificate and private key. The most convenient approach is to prepare a text file that describes the users.
1. On the master host, create and save a text file with the user information.
In this example the file is named myusers.txt.
Each line has the form Unix_username:Gecos_field:email_address. For example:
—————————————
eddy:Eddy Smith:eddy@my.org
sarah:Sarah Miller:sarah@my.org
leo:Leo Lion:leo@my.org
—————————————
2. On the master host, log in as root and run the following command.
# /util/sgeCA/sge_ca -usercert myusers.txt
3. Check the generated keys.
# ls -l /var/sgeCA/port536/default/userkeys
dr-x------ 2 eddy staff 512 Mar 5 16:13 eddy
dr-x------ 2 sarah staff 512 Mar 5 16:13 sarah
dr-x------ 2 leo staff 512 Mar 5 16:13 leo
4. Have each of the users listed above install the security files in their own $HOME/.sge directory by running the following commands.
% source /default/common/settings.csh
% /util/sgeCA/sge_ca -copy
Certificate and private key for user
eddy have been installed
% ls -lR $HOME/.sge
The listing shows the subdirectories with the generated keys.
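
Before passing the user file from step 1 to sge_ca, its format can be checked. The sketch below assumes only the three-field, colon-separated layout described above; check_user_file is a hypothetical helper name.

```shell
# Sketch: report lines that do not have exactly three colon-separated
# fields (Unix_username:Gecos_field:email_address); fail if any exist.
# check_user_file is a hypothetical helper, not part of sge_ca.
check_user_file() {
    awk -F: 'NF != 3 { print "bad line " NR ": " $0; bad = 1 }
             END { exit bad }' "$1"
}
```

For example, `check_user_file myusers.txt && echo "format looks fine"`.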

Checking Certificates

Display a Certificate
% /utilbin/arch/openssl x509 -in ~/.sge/port536/default/certs/cert.pem -text
Check Issuer
% /utilbin/arch/openssl x509 -issuer -in ~/.sge/port536/default/certs/cert.pem -noout
Check Subject
% /utilbin/arch/openssl x509 -subject -in ~/.sge/port536/default/certs/cert.pem -noout
Show Email of Certificate
% /utilbin/arch/openssl x509 -email -in ~/.sge/port536/default/certs/cert.pem -noout
Show Validity
% /utilbin/arch/openssl x509 -dates -in ~/.sge/port536/default/certs/cert.pem -noout
Show Fingerprint
% /utilbin/arch/openssl x509 -fingerprint -in ~/.sge/port536/default/certs/cert.pem -noout

________________________________________

Chapter5>
Upgrading from a Previous Release of Grid Engine Software

This chapter covers upgrading from Sun™ ONE Grid Engine 5.3 or Sun ONE Grid Engine, Enterprise Edition, 5.3.

(Omitted.)

________________________________________
Chapter6>
Verifying the Installation

Check that the daemons are running correctly on each host, then test job submission.
To verify the grid engine system daemons, check that sge_qmaster and sge_schedd are running on the master host and that sge_execd is running on the execution hosts. If they are all running, test job submission.

How to verify That the Daemons are Running on the Master Host
1. Log in to the master host.
2. Check that the daemons are running.
% ps -ax | grep sge (BSD-based Unix System)
14676 p1 S < 4:47 /gridware/sge/bin/solaris/sge_qmaster
14678 p1 S < 9:22 /gridware/sge/bin/solaris/sge_schedd
% ps -ef | grep sge (Solaris: Unix System 5-based System)
root 439 1 0 Jun 2 ? 3:37 /gridware/sge/bin/solaris/sge_qmaster
root 446 1 0 Jun 2 ? 3:37 /gridware/sge/bin/solaris/sge_schedd
3. If you do not see output like that in step 2, one or more daemons are not running on the master host.
Confirm that the current node really is the master host by checking the file
/cell/common/act_qmaster
Then restart the daemons with the sgemaster script.
# /cell/common/sgemaster start

How to Verify That the Daemons Are Running on the Execution Hosts
1. Log in to the execution host.
2. Check that the daemon is running.
% ps -ax | grep sge (BSD-based Unix System)
14688 p1 S < 4:27 /gridware/sge/bin/solaris/sge_execd
% ps -ef | grep sge (Solaris: Unix System 5-based System)
root 171 1 0 Jun 22 ? 7:11 /gridware/sge/bin/solaris/sge_execd
3. If you do not see output like that in step 2, the required daemon is not running. Start it:
# /cell/common/sgeexecd start
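
The ps checks in this chapter can be wrapped in a small filter. This is a sketch under the assumption that the daemon appears in the ps listing with its full path, as in the sample output above; daemon_in_ps is a hypothetical helper name.

```shell
# Sketch: succeed if ps output on stdin contains a process whose command
# path ends in the given daemon name (for example .../sge_execd).
# daemon_in_ps is a hypothetical helper, not a Grid Engine command.
daemon_in_ps() {
    grep -Eq "/${1}( |\$)"
}
```

For example, `ps -ef | daemon_in_ps sge_qmaster || echo "sge_qmaster is not running"`. Matching on the leading slash keeps the `grep sge` process itself from counting as a hit.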

How to Run Simple Commands
If the sge_qmaster, sge_schedd, and sge_execd daemons are all running, try a few trial commands.
1. Log in to the master host or another administration host.
Check that the standard search path (the PATH environment variable) includes /bin. (Translator's note: the exact intent of this step was unclear to me.)
2. Run the following command.
% qconf -sconf
The global cluster configuration is displayed.

If the command fails, check the environment settings.
Verify that SGE_EXECD_PORT and SGE_QMASTER_PORT are set correctly in /cell/common/settings.csh or /cell/common/settings.sh.
Then run the command again.

3. Submit a test job.

How to Submit Test Jobs
Before submitting a batch script, check whether the standard shell resource files (.cshrc, .profile, or .kshrc) or your personal shell resource files contain stty commands. Batch jobs have no terminal connection by default, so calling stty causes an error.
1. Log in to the master host.
2. Run the following command.
% rsh exec-host-name date
exec-host-name is the host name of one of the execution hosts. If login or home directories differ between hosts, run the test on every execution host. The output should look much like the output of date run directly on the master host. If any additional error messages appear, resolve them before submitting batch jobs.

To test for a terminal connection before calling stty, you can use a Bourne shell script like this:
——————–
tty -s
if [ $? = 0 ]; then
stty erase ^H
fi
——————–
The same test written for the C shell:
————————-
tty -s
if ( $status == 0 ) then
stty erase ^H
endif
————————-

3. Submit a simple script (one is provided in //examples/jobs/).
% qsub /examples/jobs/simple.sh
4. Use the qstat command to check the status of the job.
5. When the job has finished, check the output in your home directory.
The stdout and stderr files are named script-name.ojob-id and script-name.ejob-id,
where job-id is the integer assigned to the job.
If errors occur, see Chapter 8, "Fine Tuning, Error Messages, and Troubleshooting", in the Administration Guide.
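
The naming rule from step 5 can be sketched as a helper that prints the expected file names. job_output_files is a hypothetical name, and the job ID 42 in the example is made up; the real ID is the one reported when the job is submitted.

```shell
# Sketch: print the stdout and stderr file names for a finished job,
# following the script-name.ojob-id / script-name.ejob-id pattern.
# job_output_files is a hypothetical helper, not a Grid Engine command.
job_output_files() {
    script_name=$1
    job_id=$2
    printf '%s\n' "$script_name.o$job_id" "$script_name.e$job_id"
}

job_output_files simple.sh 42
# prints:
# simple.sh.o42
# simple.sh.e42
```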

________________________________________

Chapter 7>
Removing the Grid Engine Software

Removing The Software Interactively
Remove the software from the execution hosts before removing it from the master host. If you remove the master host first, the software can no longer be removed from the execution hosts automatically.

How to Remove the Software Interactively
1. Source the settings file.
# source <sge-root>/cell/common/settings.csh
or
# . <sge-root>/cell/common/settings.sh
2. On the master host, run sge_root/inst_sge.
The following example removes the execution hosts host1, host2, and host3.
# /inst_sge -ux -host "host1 host2 host3"
3. (Optional) If there is a shadow master host, remove it as well.
# /inst_sge -usm -host "host4"
4. Finally, remove the master host.
# /inst_sge -um
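
Because the order matters (execution hosts, then the shadow master, then the master host), it can help to print the plan before running anything. The sketch below only echoes commands built from the options shown above; print_removal_plan is a hypothetical helper name, and writing the installer as $SGE_ROOT/inst_sge is an assumption.

```shell
# Sketch: print the removal commands in the required order without
# running them. $1: execution hosts, $2: shadow master host (may be empty).
# print_removal_plan is a hypothetical helper, not part of inst_sge.
print_removal_plan() {
    exec_hosts=$1
    shadow_host=$2
    printf '%s\n' "\$SGE_ROOT/inst_sge -ux -host \"$exec_hosts\""
    if [ -n "$shadow_host" ]; then
        printf '%s\n' "\$SGE_ROOT/inst_sge -usm -host \"$shadow_host\""
    fi
    printf '%s\n' "\$SGE_ROOT/inst_sge -um"
}
```

For example, `print_removal_plan "host1 host2 host3" host4` lists three commands, ending with the master host removal.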

Removing The Software Using the inst_sge Utility and a Configuration Template
Unlike the interactive method, no output is displayed during the uninstallation, and a configuration file is required.

How to Remove the Software Using the inst_sge Utility and a Configuration Template
1. Source the settings file.
# source <sge-root>/cell/common/settings.csh
or
# . <sge-root>/cell/common/settings.sh
2. Copy the configuration template.
# cd /util/inst_sge_modules/
# cp inst_sge_template.conf my_configuration.conf
3. Edit the configuration file and enter the host names of the execution hosts to remove.
# vi my_configuration.conf
---------------------------------------------------------
EXEC_HOST_LIST_RM="host1 host2 host3 host4"
---------------------------------------------------------
4. On the master host, run the following command.
# /inst_sge -ux -auto /util/inst_sge_modules/my_configuration.conf
5. (Optional) If there is a shadow master host, remove it as well.
# /inst_sge -usm -auto /util/inst_sge_modules/my_configuration.conf
6. Finally, remove the master host.
# /inst_sge -um -auto /util/inst_sge_modules/my_configuration.conf

________________________________________

Chapter 8>
Installing the Accounting and Reporting Console

Setting up the Database Software
Before you can install and use the Accounting and Reporting Console, you must install and configure the database software. This chapter covers the setup for PostgreSQL and Oracle databases.

Setup the PostgreSQL Database Software
Details are available at the following URL.
http://www.postgresql.org/docs/7.4/static/index.html

How to Start the Database Server
Once PostgreSQL is installed, you can start the database server.
First download, compile, and install the software, and create a user account to administer the database processes; this user is usually named postgres. Add the PostgreSQL bin directory and the LD_LIBRARY_PATH value to your environment.

1. Create the home directory for the postgres user.
% mkdir -p /space/postgres/data
% useradd -d /space/postgres postgres
% chown postgres /space/postgres/data
% su - postgres
2. Run the following command (based on the PostgreSQL documentation).
> initdb -D /space/postgres/data
creating directory /space/postgres/data... ok
creating directory /space/postgres/data/base... ok
creating directory /space/postgres/data/global... ok
creating directory /space/postgres/data/pg_xlog... ok
creating directory /space/postgres/data/pg_clog... ok
creating template1 database in /space/postgres/data/base/1... ok
creating configuration files... ok
initializing pg_shadow... ok
enabling unlimited row size for system tables... ok
initializing pg_depend... ok
creating system views... ok
loading pg_description... ok
creating conversions... ok
setting privileges on built-in objects... ok
vacuuming database template1... ok
copying template1 to template0... ok
Success. You can now start the database server using:
postmaster -D /space/postgres/data
or
pg_ctl -D /space/postgres/data -l logfile start
3. Edit the pg_hba.conf file.
Give the database superuser postgres unrestricted, password-free access, and require md5-encrypted passwords for all other database users. In the entries below, replace nnn.nnn.nnn with your subnet, and add an access rule for each host as needed.
# TYPE DATABASE USER IP-ADDRESS IP-MASK METHOD
local all postgres trust
local all all md5
# IPv4-style local connections:
#host all all nnn.nnn.nnn.0 255.255.255.0 md5
4. Edit the postgresql.conf file to allow TCP/IP connections from other hosts.
The shared_buffers value must be at least twice the max_connections value.
———————————————-
tcpip_socket = true
max_connections = 40 (increase if necessary)
———————————————-
5. Start the database.
> postmaster -S -i
(-S: silent mode, -i: enable TCP/IP connections)
6. Verify that the installation works.
% su - postgres
> createuser -P test_user
Enter password for new user:
Enter it again:
Shall the new user be allowed to create databases? (y/n) y
Shall the new user be allowed to create more new users? (y/n) n
CREATE USER
> createdb -O test_user -E UNICODE test
CREATE DATABASE
7. Run the following commands as the database superuser.
> psql test
Welcome to psql 7.3, the PostgreSQL interactive terminal.
Type: \copyright for distribution terms
\h for help with SQL commands
\? for help on internal slash commands
\g or terminate with semicolon to execute query
\q to quit

test=# create table test (x int, y text);
CREATE TABLE
test=# insert into test values (1, 'one');
INSERT 16982 1
test=# insert into test values (2, 'two');
INSERT 16983 1
test=# select * from test;
 x |  y
---+-----
 1 | one
 2 | two
(2 rows)

test=# \q

> psql -U test_user test
Password:
Welcome to psql 7.4.1, the PostgreSQL interactive terminal.
Type: \copyright for distribution terms
\h for help with SQL commands
\? for help on internal slash commands
\g or terminate with semicolon to execute query
\q to quit
test=>
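
Step 4 above requires shared_buffers to be at least twice max_connections. The following is a minimal shell sketch of that check. The function name check_shared_buffers is hypothetical, and the two values are passed in by hand rather than parsed from postgresql.conf.

```shell
#!/bin/sh
# Hypothetical helper: checks the rule from step 4
# (shared_buffers >= 2 * max_connections). Not part of PostgreSQL or Grid Engine.
check_shared_buffers() {
    max_connections=$1
    shared_buffers=$2
    minimum=$((max_connections * 2))
    if [ "$shared_buffers" -ge "$minimum" ]; then
        echo "ok: shared_buffers=$shared_buffers (minimum $minimum)"
    else
        echo "too small: shared_buffers=$shared_buffers (minimum $minimum)"
        return 1
    fi
}

# max_connections = 40 as in the example above, so at least 80 buffers are needed.
check_shared_buffers 40 80
check_shared_buffers 40 64 || true
```

If max_connections is raised later, rerun the check with the new value before restarting the server.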

How to Set Up a PostgreSQL Database
1. Log in as the database superuser.
# su - postgres
2. Create the account that will own the database.
> createuser -P arco_write
Enter password for new user:
Enter it again:
Shall the new user be allowed to create databases? (y/n) y
Shall the new user be allowed to create more new users? (y/n) n
CREATE USER
3. Create the database owned by that account.
> createdb -O arco_write arco
CREATE DATABASE
4. Create the tables and views for the ARCo data as the arco_write user.
> psql -f /dbwriter/database/postgres/setup.sql arco arco_write
> psql -f /dbwriter/database/postgres/view.sql arco arco_write
5. Create a database user for reading the database.
(If you do not want to use the name arco_read, you must adjust the privileges applied in the next step.)
> createuser -P arco_read
Enter password for new user:
Enter it again:
Shall the new user be allowed to create databases? (y/n) n
Shall the new user be allowed to create more new users? (y/n) n
CREATE USER
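6. Grant privileges to the read-only database user.
> psql -f /dbwriter/database/postgres/privileges.sql arco arco_write
(Translator's note: the script is run as arco_write rather than as the read-only user, presumably because the owner of the tables has to issue the grants.)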
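
The order of steps 2 through 6 can be summarized as a dry run. The sketch below only prints the commands and executes nothing. It assumes that the SQL scripts live under $SGE_ROOT/dbwriter/database/postgres, which is an assumption because the commands above do not show the leading directory. The function name print_arco_pg_setup and the fallback value /opt/n1ge6 are illustrative only.

```shell
#!/bin/sh
# Dry run: prints the commands from steps 2-6 in order without executing them.
# Assumption: the SQL scripts are under $SGE_ROOT/dbwriter/database/postgres.
# /opt/n1ge6 is only an example value for SGE_ROOT.
print_arco_pg_setup() {
    sge_root=$1
    sql_dir="$sge_root/dbwriter/database/postgres"
    echo "createuser -P arco_write"
    echo "createdb -O arco_write arco"
    for script in setup.sql view.sql; do
        echo "psql -f $sql_dir/$script arco arco_write"
    done
    echo "createuser -P arco_read"
    echo "psql -f $sql_dir/privileges.sql arco arco_write"
}

print_arco_pg_setup "${SGE_ROOT:-/opt/n1ge6}"
```

Printing the commands first makes it easy to review the paths before running each one by hand as the postgres user.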

How to Set Up an Oracle Database
1. Ask your database administrator for an Oracle database instance.
Two database users are required: arco_write and arco_read. arco_write must be able to create and modify tables, views, and indexes. The privileges for arco_read are set later by an SQL script.
2. Ask your database administrator to confirm that you can connect to the database.
The sqlplus utility must be in your PATH.
3. Set the environment variables.
If you use the Bourne shell or Korn shell, run the following command.
> . /usr/local/bin/oraenv
ORACLE_SID = [] ? SID of your database
ORACLE_HOME = [/space/oracle] ? path to your oracle installation
If you use the C shell, run the following command.
> source /usr/local/bin/coraenv
ORACLE_SID = [] ? SID of your database
ORACLE_HOME = [/space/oracle] ? path to your oracle installation
4. Create the database tables.
> $ORACLE_HOME/bin/sqlplus arco_write@my-host.my-domain @ /dbwriter/database/oracle/setup.sql

SQL*Plus: Release 9.0.1.0.0 – Production on Thu May 13 11:14:06 2004
(c) Copyright 2001 Oracle Corporation. All rights reserved.
Enter password:
Connected to:
Oracle9i Enterprise Edition Release 9.0.1.0.0 – Production
With the Partitioning option
JServer Release 9.0.1.0.0 – Production
Table created.
…
Disconnected from Oracle9i Enterprise Edition Release 9.0.1.0.0 – Production
With the Partitioning option
JServer Release 9.0.1.0.0 – Production
5. Create the views.
> $ORACLE_HOME/bin/sqlplus arco_write@my-host.my-domain @sge-root/dbwriter/database/oracle/view.sql

SQL*Plus: Release 9.0.1.0.0 – Production on Thu May 13 11:20:05 2004
(c) Copyright 2001 Oracle Corporation. All rights reserved.
Enter password:
Connected to:
Oracle9i Enterprise Edition Release 9.0.1.0.0 – Production
With the Partitioning option
JServer Release 9.0.1.0.0 – Production
View created.
Commit complete.
Disconnected from Oracle9i Enterprise Edition Release 9.0.1.0.0 – Production
With the Partitioning option
JServer Release 9.0.1.0.0 – Production
6. Grant privileges to the read-only database user.
> $ORACLE_HOME/bin/sqlplus arco_write@my-host.my-domain @ /dbwriter/database/oracle/privileges.sql
SQL*Plus: Release 9.0.1.0.0 – Production on Thu May 13 11:20:05 2004
(c) Copyright 2001 Oracle Corporation. All rights reserved.
Enter password:
Connected to:
Oracle9i Enterprise Edition Release 9.0.1.0.0 – Production
With the Partitioning option
JServer Release 9.0.1.0.0 – Production
Grant succeeded.
…
Synonym created.
…
Commit complete.
Disconnected from Oracle9i Enterprise Edition Release 9.0.1.0.0 – Production
With the Partitioning option
JServer Release 9.0.1.0.0 – Production

Install the Accounting and Reporting Software

How to Set up dbWriter
1. Extract the accounting and reporting software.
# cd
# gunzip -dc cdrom_mount_point/N1_Grid_Engine_6/ARCo/tar/n1ge-6_0-arco-1_0.tar.gz | tar xvpf -
2. As the administrative user, set the environment variables.
For the Bourne shell or Korn shell, run the following command.
$ . /default/common/settings.sh
For the C shell, run the following command.
$ source /default/common/settings.csh
3. Change the global configuration to enable reporting.
For details, see "Report Statistics (ARCo)" in the Administration Guide.
% qconf -mconf

reporting_params accounting=true \
reporting=true flush_time=00:00:15 joblog=true \
sharelog=00:00:00

By default, no report variables are enabled. Use qconf to turn statistics gathering on.
% qconf -me global
hostname global

report_variables cpu,np_load_avg,mem_free,virtual_free

4. Install the dbWriter software.
# cd /dbwriter
# ./inst_dbwriter
Welcome to the Grid Engine installation
—————————————
Grid Engine dbWriter installation
———————————
The dbWriter installation will take approximately 5 minutes
Hit to continue >>
5. Specify the location of the sge-root directory and the name of the cell.
Generic Parameters
——————
Please enter your SGE_ROOT [] >> /opt/n1ge6
Now the name of the cell is needed:
Please enter your SGE_CELL [default] >> default
6. Specify the installation path of the Java™ Software Development Kit.
If the JAVA_HOME environment variable is set, the script uses its value as the default.
Please enter the path to your java 1.4 installation [/usr/java] >>/opt/j2sdk1.4.1_01
7. Specify the connection parameters for the reporting database.
Setup your database connection parameters
—————————————–
Enter your database type ( o = Oracle, p = PostgreSQL ) [p] >> p
Please enter the name of your postgres db host []>> my-host.my-domain
Please enter the port of your postgres db [5432] >>
Please enter the name of your postgres database [arco] >>
8. Enter the user name and password of the database user.
Please enter the name of the database user [arco_write] >>
Please enter the password of the database user >>
Please retype the password >>
9. Enter the name of the database schema.
If the JDBC driver is found, the script asks for the schema name. For PostgreSQL the value is usually public. For Oracle it is the user name of the database owner (arco_write).
Please enter the name of the database schema [public] >>
Search for the jdbc driver org.postgresql.Driver
in directory /opt/n1ge6/dbwriter/lib ……….
found in /opt/n1ge6/dbwriter/lib/pg73jdbc2.jar

If the JDBC driver is not found, the script asks you to copy the JAR file (the database driver) into the //dbwriter/lib directory. After you copy the file and press Enter, the script searches again.
Please copy your driver jar into the
directory /opt/n1ge6//dbwriter/lib
Press enter to continue >>
10. Verify the installation.
If the setup succeeded, the script offers to test the database connection.
Should the connection to the database be tested? (y/n) [y] >>
Press Enter to run the test. On success, the result looks like this:
Test db connection to 'jdbc:postgresql://my-host.my-domain:5432/arco' … OK
On failure, a different result is shown and you can repeat the database connection setup.
Test db connection to 'jdbc:postgresql://my-host.my-domain:5432/arco' … Failed
Do you want to repeat database connection setup? (y/n) [y] >>
11. Specify how often the dbWriter program checks the Grid Engine system log file.
Please enter the interval between two dbwriter runs in seconds [60] >>
12. Specify the location of the file that contains the rules for derived values.
The dbWriter program calculates derived values from the reporting and accounting data. Derived values are reporting information computed from the raw data. dbWriter can also delete data from the reporting database when that data is no longer needed. The rules that control both the derived-value calculation and the data deletion are stored in a single file. dbWriter provides one example file each for PostgreSQL and Oracle, at the following locations.
$SGE_ROOT/dbwriter/database/postgres/dbwriter.xml
$SGE_ROOT/dbwriter/database/oracle/dbwriter.xml
You can use one of the paths above as-is or specify a different path.
Please enter the file with the derived value rules
[$SGE_ROOT/dbwriter/database/postgres/dbwriter.xml] >>
13. Set the logging level that the dbWriter program uses. The default is INFO.
The dbWriter can run with different debug levels
Possible values: WARNING INFO CONFIG FINE FINER FINEST
Please enter the debug level of the dbwriter [INFO] >>
14. Confirm the settings.
If you enter 'n' and press Enter, the setup starts over.
All parameters are now collected
——————————–
SGE_ROOT=/opt/n1ge6
SGE_CELL=default
JAVA_HOME=/usr/java (java version “1.4.1”)
DB_URL=jdbc:postgresql://my-host.my-domain:5432/arco
DB_USER=arco_write
INTERVAL=60
REPORTING_FILE=/opt/n1ge6/default/common/reporting
DERIVED_FILE=/opt/n1ge6/dbwriter/database/postgres/dbwriter.xml
DEBUG_LEVEL=INFO
Are this settings correct? (y/n) [y] >> y
The installation script creates a startup script: /dbwriter/bin/sgedbwriter
The configuration file for this startup script is /cell/common/dbwriter.conf. If any value needs to change, run the installation script again.

Create configuration file for dbWriter in
/opt/n1ge6/default/common
Installation of dbWriter completed
Start the dbWriter with
/opt/n1ge6/dbwriter/bin/sgedbwriter start
>

15. To have the dbWriter program start at system boot, copy the script into the /etc/init.d directory.
# cp /cell/dbwriter/bin/sgedbwriter /etc/init.d
# ln -s /etc/init.d/sgedbwriter /etc/rc2.d/S98sgedbwriter
16. Start the dbWriter program.
# /etc/init.d/sgedbwriter start
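
The DB_URL value shown in the summary of step 14 is assembled from the host, port, and database name entered in step 7. A minimal sketch of that assembly for the PostgreSQL case follows. The function name pg_jdbc_url is hypothetical, and the defaults 5432 and arco are the ones offered by the installer prompts above.

```shell
#!/bin/sh
# Hypothetical helper: builds a PostgreSQL JDBC URL in the same form
# the installer prints as DB_URL. Port and database default to 5432 and arco,
# the defaults offered by the prompts in step 7.
pg_jdbc_url() {
    host=$1
    port=${2:-5432}
    db=${3:-arco}
    echo "jdbc:postgresql://$host:$port/$db"
}

pg_jdbc_url my-host.my-domain
```

Comparing this string with the DB_URL line in the installer summary is a quick way to spot a mistyped host or port before the connection test.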

How to Install Sun Web Console
1. Extract the Web Console package under the /tmp directory.
# cd /tmp
# umask 022
# mkdir swc
# cd swc
# tar xvf cdrom_mount_point/N1_Grid_Engine_6/SunWebConsole/tar/swc_sparc_2.0.3.tar
2. If the noaccess user and the noaccess group do not exist in the password file or the NIS passwd map, run the following commands.
# groupadd -g 60002 noaccess
# useradd -u 60002 -g 60002 -d /tmp -s /bin/csh -c "No Access User" noaccess
3. If you are running SuSE 9.0, create symbolic links for the /etc/rc#.d directories.
# ln -s /etc/rc.d/rc0.d /etc/rc0.d
# ln -s /etc/rc.d/rc1.d /etc/rc1.d
# ln -s /etc/rc.d/rc2.d /etc/rc2.d
4. Run the Sun Web Console setup script.
# ./setup

Installation complete.
Starting Sun(TM) Web Console Version 2.0.3…
See /var/log/webconsole/console_debug_log for server logging information
After running the setup script, the Sun Web Console is started. You can stop, start,
or restart the console at any time, using the following commands:
# /usr/sadm/bin/smcwebserver start
# /usr/sadm/bin/smcwebserver stop
# /usr/sadm/bin/smcwebserver restart
5. Connect to the Web Console from a web browser.
https://hostname:6789
6. Log in with a UNIX account. If the login works, continue with the next procedure, "How to Install the Accounting and Reporting Console".

How to Install the Accounting and Reporting Console
1. Change to the reporting directory.
# cd /reporting
2. Run the inst_reporting script to start the installation.
# ./inst_reporting
Welcome to the N1 SGE reporting module installation
—————————————————-
The installation will take approximately 5 minutes
Hit to continue >>
3. Specify the installation path of the Java™ Software Development Kit.
If the JAVA_HOME environment variable is set, the script uses its value as the default.
Please enter the path to your java 1.4 installation [/usr/java] >>/opt/j2sdk1.4.1_01
4. Specify the directory where the accounting and reporting software stores its results.
If the path does not exist, it is created automatically.
Spool directory
—————
In the spool directory the N1 SGE reporting module will
store all queries and results
Please enter the path to the spool directory [/var/spool/arco] >>
5. Specify the connection parameters for the database.
Database Setup
————–
Enter your database type ( o = Oracle, p = PostgreSQL ) [p] >> o
Please enter the name of your oracle db host [] >> my-host
Please enter the port of your oracle db [1521] >>
Please enter the name of your oracle database [arco] >>
6. Specify the Accounting and Reporting database account.
Do not use the account that was used earlier for the dbWriter program. For security reasons, the database account for accounting and reporting should have only read access to the database tables.
Please enter the name of the database user [arco_read] >>
Please enter the password of the database user >>
Please retype the password >>
Please enter the name of the database schema [arco_write] >>
Search for the jdbc driver oracle.jdbc.driver.OracleDriver
in directory /opt/n1ge/reporting/WEB-INF/lib …
found in /opt/n1ge/reporting/WEB-INF/lib/classes12.jar
Should the connection to the database be tested? (y/n) [y] >> y
Test db connection to ’jdbc:oracle:thin:@my-system:1521:arco’ … OK
7. Enter the names of the users who are allowed to save queries and results.
Configure users with write access
———————————
Enter a login name of a user (Press enter to finish) >> user1
Users: user1
Enter a login name of a user (Press enter to finish) >> user2
Users: user1 user2
Enter a login name of a user (Press enter to finish) >>
8. Verify that the settings are correct.
All parameters are now collected
——————————–
SPOOL_DIR=/var/spool/arco
DB_URL=jdbc:oracle:thin://my-system:1521/arco
DB_USER=arco_read
ARCO_WRITE_USERS=user1 user2
Are this settings correct? (y/n) [y] >> y
Shutting down Sun(TM) Web Console Version 2.0.3…
9. Create the query directory.
If the directory already exists, the predefined queries are not installed. To install them, copy all of them from //reporting/database/example_queries/queries.
Directory /var/spool/arco does not exist, create it? (y/n) [y] >> y
Create query directory /var/spool/arco
Create query directory /var/spool/arco/queries
Copy predefined queries into /var/spool/arco/queries
Create query directory /var/spool/arco/results
Create query directory /opt/n1ge6/reporting/charts
Register the N1 SGE reporting module in the webconsole
Registering com.sun.grid.arco_6.
Starting Sun(TM) Web Console Version 2.0.3…
See /var/log/webconsole/console_debug_log for server logging information
10. Check the log file for errors and warning messages.
# more /var/log/webconsole/console_debug_log
(The accounting and reporting log is stored in this file.)
The log level is INFO by default. To change it later, run the following command.
# smreg add -p -e arco_logging_level=FINE
The available levels include WARNING, INFO, FINE, FINER, and FINEST.
11. Connect to the Web Console from a web browser.
https://hostname:6789
12. Log in with a UNIX account.
13. Select N1 Grid Engine 6 Accounting and Reporting Console.
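
Step 10 changes the console log level with smreg, and only the levels listed there are meaningful. The sketch below prints the smreg command for a given level and rejects anything else. The function name arco_log_level_cmd is hypothetical, and the sketch never runs smreg itself.

```shell
#!/bin/sh
# Hypothetical helper: prints the smreg command from step 10 for a given level,
# rejecting any level that is not listed in this guide. It does not run smreg.
arco_log_level_cmd() {
    case $1 in
        WARNING|INFO|FINE|FINER|FINEST)
            echo "smreg add -p -e arco_logging_level=$1"
            ;;
        *)
            echo "unknown level: $1" >&2
            return 1
            ;;
    esac
}

arco_log_level_cmd FINE
```

Run the printed command as root on the Web Console host once the level looks right.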
