[튜닝] Tuning and Optimizing Linux for Oracle 9i Database

Tuning and Optimizing Red Hat Linux Advanced Server 2.1 for Oracle9i Database

This article has been published in the September 2003 Australian Unix Users Group Newsletter (AUUGN) Journal.

>>> Werner’s Oracle – Linux Page <<<

The following procedure is a step-by-step guide with tips and information for tuning and optimizing Red Hat Linux Advanced Server 2.1 for Oracle9i. This summary (HOWTO) shows how I tuned and optimized Red Hat AS 2.1 for Oracle 9iR2 (9.2.0).

A procedure for installing Oracle9iR2 on Red Hat AS 2.1 can be found at Werner’s Oracle – Linux Page.

This article covers the following subjects and steps:

* Introduction

* Oracle Limits on Linux

* Why Using Red Hat Advanced Server 2.1

* Upgrading the Linux Kernel

* Sizing Swap Space

     Swap Size Recommendations

     Checking Physical Memory

     Checking Swap Space Size and Usage

* Setting Shared Memory

     Setting SHMMAX Parameter

     Setting SHMMNI Parameter

     Setting SHMALL Parameter

* Setting Semaphores

     The SEMMSL Parameter

     The SEMMNI Parameter

     The SEMMNS Parameter

     The SEMOPM Parameter

     Setting the Semaphore Kernel Parameters

* Setting File Handles

* Setting Shell Limits for the Oracle User

     Setting Limits for the Maximum Number of Open File Descriptors for the Oracle User

     Setting Limits for the Maximum Number of Processes for the Oracle User

* Setting Asynchronous I/O

     Relinking Oracle to Enable Asynchronous I/O for Oracle9iR2

     Enabling Asynchronous I/O in init.ora for Raw Devices

     Enabling Asynchronous I/O in init.ora for Filesystem Files

     Increasing I/O Throughput at the Linux OS Level

* Increasing Space for Larger SGA (2.7 GB) to Fit Into Memory

     Address Mappings on Linux – Shared Memory and Shared Library Mapping on Linux

     Changing the Base Address “mapped base” for Shared Libraries at the Linux OS Level

     Changing the Base Address for Shared Memory at the Oracle Level

     Giving Oracle Users the Privilege to Change the Base Address for Oracle’s Shared Libraries Without Giving them root Access

     Changing the Base Address for Oracle’s Shared Libraries Automatically During an Oracle Login

     Important Notes

* Using Large Memory Pages (Bigpages)

     Sizing Bigpages

     Configuring Bigpages

* Making Other Performance Related Changes

     Disabling Unneeded Background Processes

* Oracle Errors and Problems

* Useful Linux Performance Utilities

     top Utility

     sar Utility

     vmstat Utility

* Oracle Linux Management

     Determining Which Semaphore Sets and Shared Memory Segments Belong to Each Oracle Database or Instance

* Hardware Recommendation

* References

Introduction

Please point out every error you can find. I welcome email from any readers with comments, suggestions, or corrections. My address is webmaster_at_puschitz.com. I will continue to update and add new information for this article. So make sure to come back. 🙂

Before you begin making any changes to the Linux systems, make sure that the Oracle database is down!

Oracle Limits on Linux

Some limits apply to Red Hat Advanced Server only.

Linux supports 64-bit file I/O on 32-bit Intel platforms.

According to the white paper Oracle9iR2 on Linux: Performance, Reliability and Manageability Enhancements on Red Hat Linux Advanced Server 2.1″, the limits are as follows:

– Number of files per database: 64K

– Number of blocks per file: 4 million

– Maximum block size: 16 KB

– Maximum size for a database file is 64 GB

– Maximum database size is 4 petabytes with 16 KB blocks

On a 4 GB RAM machine, the size of the SGA (SGA utilizes shared memory) can be increased up to is 2.7 GB. This requires changes in Linux and Oracle. By default, the maximum size is 1.7 GB.

On a 8 GB RAM machine, the size of the SGA can be increased up to 7 GB by using the shared memory filesystem “shmfs”. A maximum size of 5.4 GB of SGA can be created using the “bigpages” feature for System V shared memory where the page size is 4 MB vs. the regular 4 KB.

On a machine that supports Physical Address Extension (PAE), the SGA can theoretically have a size of 62 GB. The PAE mechanism allows addressing using 36 bits on IA-32 systems. But current hardware limitations and practical consideration limit the actual size of the SGA on such systems.

The number of local concurrent users on a 4 GB server in non-MTS mode can range from 600 through 1200 without becoming unacceptable slow. For more information on the tpcc run that measured the number of concurrent users, see Linux Virtual Memory in Red Hat Advanced Server 2.1 and Oracle’s Memory Usage Characteristics.

Why Using Red Hat Advanced Server

Red Hat Linux Advanced Server has several features and enhancements that don’t exist in other Red Hat versions. Among other things, Red Hat AS provides:

– Asynchronous I/O

– Process scheduler with CPU affinity, cache affinity, and per CPU runqueues and locks that provide better performance

– “mapped base” (base address for shared libaries) can be changed dynamically allowing larger sizes for the SGA

– Page frame of size 4 MB as opposed to 4 KB can be used for the SGA which improves performance for large SGAs

– The kernel can also use the “high memory” pool (physical memory above 1 GB) for allocating page table entries (PTE) which allow a higher number of Oracle connections

– Elimination of copy to bounce buffer improves I/O performance

Upgrading the Linux Kernel

The recommended kernel for Red Hat Enterprise Linux 2.1 is 2.4.9-e.25 or higher. This kernel has several fixes that are relevant to Oracle including fixes for memory problems and kswapd problems.

If the Linux server has <= 4 GB RAM, the kernel “kernel-smp” should be used for SMP machines, or the kernel “kernel” should be used for UP machines. If the Linux server has > 4 GB RAM, the enterprise kernel “kernel-enterprise” should be used for UP and SMP machines.

To check if these kernels are installed, execute e.g. the following command:

rpm -q kernel-smp kernel-enterprise

To check which kernel is currently running, execute the following command:

uname -a

To install e.g. the enterprise kernel, download the “kernel-enterprise” RPM and execute the following command:

rpm -ivh kernel-enterprise-2.4.9-e.25.i686.rpm

To make sure that the right kernel is booted, check the /etc/grub.conf file if you use GRUB, and change the “default” attribute if necessary. Here is an example:

default=1

timeout=10

splashimage=(hd0,1)/boot/grub/splash.xpm.gz

title Red Hat Linux (2.4.9-e.25enterprise)

        root (hd0,1)

        kernel /boot/vmlinuz-2.4.9-e.25enterprise ro root=/dev/hda2 hdc=ide-scsi

        initrd /boot/initrd-2.4.9-e.25enterprise.img

title Red Hat Linux Advanced Server (2.4.9-e.25smp)

        root (hd0,1)

        kernel /boot/vmlinuz-2.4.9-e.25smp ro root=/dev/hda2 hdc=ide-scsi

        initrd /boot/initrd-2.4.9-e.25smp.img

title Red Hat Linux Advanced Server-up (2.4.9-e.25)

        root (hd0,1)

        kernel /boot/vmlinuz-2.4.9-e.25 ro root=/dev/hda2 hdc=ide-scsi

        initrd /boot/initrd-2.4.9-e.25.img

In this example, the “default” attribute is set to “1” which means that the 2.4.9-e.25smp kernel will be booted. If the “default” attribute would be set to “0”, then the 2.4.9-e.25enterprise kernel would be booted.

After you installed the new kernel and/or made changes to the /etc/grub.conf file, reboot the server.

Once you are sure you don’t need the old kernel anymore, you can remove the old kernel by running:

su – root

rpm -e <OldKernelVersion>

When you remove the kernel, you don’t need to make any changes to the /etc/grub.conf file.

NOTE: Be very careful when removing a kernel! Making a mistake could render the server unbootable.

Sizing Swap Space

In order to perform a typical Oracle 9i installation and to create a simple prototype database, Oracle says that you need a minimum of 512MB of RAM for the Oracle9i Server, and the amount of swap space should be equal to twice the amount of RAM or at least 400 MB, whichever is greater. Oracle also says that the minimum swap space should be at least the same as physical memory size.

Swap Size Recommendations

To summarize Oracle’s recommendation for the database and to take system configurations into account that were used for workload testings, here is what I came up with:

0.5 GB RAM   1 GB – 2 GB Swap Space

  1 GB RAM   2 GB – 3 GB Swap Space

  2 GB RAM   2 GB – 3 GB Swap Space

  3 GB RAM   3 GB Swap Space

  4 GB RAM   4 GB Swap Space

  8 GB RAM   4 GB Swap Space

16 GB RAM   8 GB Swap Space

The swap space will not be utilized until the system runs out of physical memory. So don’t configure too much swap space. Keep in mind that if the system starts using swap space, it has a negative impact to the performance of the database. So make sure that the system has always enough physical RAM and that it doesn’t use swap space continuously.

Checking Physical Memory

You can check the size of physical memory by running the following command:

grep MemTotal /proc/meminfo

You can find a detailed description of the entries in /proc/meminfo at http://www.redhat.com/advice/tips/meminfo.html.

Alternatively, you can use free(1) to check the memory:

# free

             total       used       free     shared    buffers     cached

Mem:       1031004     734656     296348          0     262404     287388

-/+ buffers/cache:     184864     846140

Swap:      2097144      40184    2056960

#

In this example the total amount of available memory is 1031004 KB. 184864 KB are used by programs and 846140 KB are available for more programs.

Don’t get confused with the first line that shows that 296348 KB are free! If you look at the usage figures you can see that most of the increase of memory is for buffers and cache. Linux tries to use all the memory for disk buffers and cache. It helps the system to run faster because disk information is already in memory and Linux doesn’t have to read it from disk again. If space is needed by a program or application like Oracle, Linux will make the space available immediately. So if your system runs for a while, you will usually see a small number for “free” in the first line, and there is nothing to be worried about.

Note: If you create a large SGA (shared memory) and start the database, free won’t show all the memory that has been allocated for SGA as “used” right away. That’s because Linux does not assign page frames to a memory mapping right after it has been created due to reasons of efficiency.

Checking Swap Space Size and Usage

You can check the size and current usage of swap space by running the following command:

cat /proc/swaps

If your swap partition is not large enough, you can add another swap partitions to your system. See “Adding Swap Space” for more information. Adding a permanent swap file to the system is not recommended due to the performance impact of the filesystem layer.

Setting Shared Memory

Shared memory allows processes to access common structures and data by placing them in shared memory segments. It’s the fastest form of IPC (Interprocess Communication) available since no kernel involvement occurs when data is passed between the processes. In fact, data does not need to be copied between the processes.

Oracle uses shared memory segments for the SGA (Shared Global Area) which is an area of memory that is shared by all Oracle background and foreground processes. The size of the SGA has a major impact to Oracle’s performance since it holds database buffer cache and much more.

To see all shared memory settings, run:

ipcs -lm

Setting SHMMAX Parameter

This parameter defines the maximum size in bytes for a shared memory segment. Since the SGA is comprised of shared memory, SHMMAX can potentially limit the size of the SGA. Ideally, SHMMAX should be large enough so that SGA can fit into one segment.

The default size on RH 2.1 AS is 33554432. With this value, the Oracle Database Configuration Assistant failed on my server with the following error message:

ORA-27123: unable to attach to shared memory segment

Setting SHMMAX to 1 GB always worked for me when I setup a medium sized database. However, it is suggested that it should be set to 2 GB; the default maximum size of the SGA is 1.7 GB which requires a larger SHMMAX. And if the available size of the SGA is set to 2.7 GB by changing “mapped base” at the Linux OS level, then SHMMAX should be set to 3 GB. The maximum value of SHMMAX can be set to 4GB-1. (A typical 32-bit Linux system without Physical Address Extension (PAE) is divided into 3 GB user space and 1 GB kernel space.)

The default shared memory limit for SHMMAX can be changed in the proc file system without reboot:

su – root

echo “2147483648” > /proc/sys/kernel/shmmax

Alternatively, you can use sysctl(8) to change it:

sysctl -w kernel.shmmax=2147483648

To make the change permanent, add the following line to the file /etc/sysctl.conf. This file is used during the boot process.

echo “kernel.shmmax=2147483648” >> /etc/sysctl.conf

Setting SHMMNI Parameter

This parameter sets the maximum number of shared memory segments system wide. The default number on RH 2.1 AS is 4096. To my knowledge this value should be sufficient.

# cat /proc/sys/kernel/shmmni

4096

Setting SHMALL Parameter

This parameter sets the total amount of shared memory in pages that can be used at one time on the system. So shmall should always be at least ceil(SHMMAX/PAGE_SIZE).

The default size for shmall on RH 2.1 AS is 2097152. This should be sufficient since it means that the total amount of shared memory available on the system is 2097152*4096 bytes (shmall*PAGE_SIZE). On i386 architectures, the PAGE_SIZE in RHAS 2.1 (UP and SMP kernel) is 4096 bytes unless you use bigpages which supports the configuration of larger memory pages.

# cat /proc/sys/kernel/shmall

2097152

Setting Semaphores

Semaphores can best be described as counters which are used to provide synchronization between processes or between threads within a process for shared resources like shared memories. System V semaphores support semaphore sets where each one is a counting semaphore. So when an application requests semaphores, the kernel releases them in “sets”. The number of semaphores per set can be defined through the kernel parameter SEMMSL.

To see all semaphore settings, run:

ipcs -ls

The SEMMSL Parameter

This parameter defines the maximum number of semaphores per semaphore set.

Oracle recommends to set SEMMSL to the largest PROCESSES init.ora parameter of any database on the Linux system plus 10.

Oracle also recommends to set SEMMSL to a minimum value of 100.

The init.ora parameter PROCESSES specifies the maximum number of operating system processes that can be started by the Oracle instance. In a non MTS environment, Oracle spawns a system user process for each connection. This means that in such an environment the PROCESSES parameter defines the maximum number of simultaneous Oracle connections minus sum of all Oracle background processes.

It can also be said that the PROCESSES value should never be greater than SEMMSL.

The SEMMNI Parameter

This parameter defines the maximum number of semaphore sets in the entire Linux system.

Oracle recommends to set SEMMNI to a minimum value of 100.

The SEMMNS Parameter

This parameter defines the total number of semaphores (not semaphore set) in the entire Linux system. A semaphore set can have more than one semaphore, and according to the semget(2) man page, values greater than SEMMSL * SEMMNI makes it irrelevant.

Setting it to a minimum value of 256 is for initial Oracle installation only.

Oracle recommends to set SEMMNS to the sum of the PROCESSES parameter for each database on the system, adding the largest PROCESSES twice, and then adding 10 for each DB.

The maximum number of semaphores that can be allocated on a Linux system will be the lesser of:

   SEMMNS or (SEMMSL * SEMMNI)

Setting SEMMSL and SEMMNI to 100 makes sure that SEMMNS semaphores can be allocated as determined by the above calculation.

The SEMOPM Parameter

This parameter defines the maximum number of semaphore operations that can be performed per semop(2) system call.

The semop(2) function provides the ability to do operations for multiple semaphores with one semop(2) system call. Since a semaphore set can have the maximum number of SEMMSL semaphores per semaphore set, it is often recommended to set SEMOPM equal to SEMMSL.

Oracle recommends to set SEMOPM to a minimum value of 100.

Setting the Semaphore Kernel Parameters

To determine the values of the four described semaphore parameters, run:

# cat /proc/sys/kernel/sem

250     32000   32      128

Alternatively, you can run:

ipcs -ls

All four described semaphore parameters can be changed in the proc file system without reboot:

su – root

# echo SEMMSL_value SEMMNS_value SEMOPM_value SEMMNI_value > /proc/sys/kernel/sem

# These are the values I’m using since I don’t want to lower Red Hat’s default

# values. The only value I raise is SEMOPM to comply with Oracle’s minimum

# requirement for SEMOPM.

echo 250 32000 100 128 > /proc/sys/kernel/sem

Alternatively, you can use sysctl(8) to change it:

sysctl -w kernel.sem=”250 32000 100 128″

To make the change permanent, add or change the following line in the file /etc/sysctl.conf. This file is used during the boot process.

echo “kernel.sem=250 32000 100 128” >> /etc/sysctl.conf

To see the new updated semaphore settings, run:

ipcs -ls

Setting File Handles

The maximum number of file handles denotes the maximum number of open files that you can have on the Linux system.

Setting System Wide Limit for File Handles

The value in /proc/sys/fs/file-max sets the maximum number of file handles or open files that the Linux kernel will allocate. When you get error messages about running out of file handles, then you might want to raise this limit. The default value on RH 2.1AS is 8192.

For an Oracle server it is recommended that the file handles for the entire system is set to at least 65536.

To determine the maximum number of file handles for the entire system, run:

cat /proc/sys/fs/file-max

To determine the current usage of file handles, run:

$ cat /proc/sys/fs/file-nr

1154    133     8192

The file-nr file displays three parameters:

  – Total allocated file handles

  – Currently used file handles

  – Maximum file handles that can be allocted (see also file-max)

The kernel dynamically allocates file handles whenever a file handle is requested by an application, but the kernel does not free these file handles when they are released by the application. The kernel recycles these file handles instead. This means that over time the total number of allocated file handles will increase even though the number of currently used file handles may be low.

The maximum number of file handles can be changed in the proc file system without reboot:

su – root

echo “65536” > /proc/sys/fs/file-max

Alternatively, you can use sysctl(8) to change it:

sysctl -w fs.file-max=65536

To make the change permanent, add or change the following line in the file /etc/sysctl.conf. This file is used during the boot process.

echo “fs.file-max=65536” >> /etc/sysctl.conf

Setting Shell Limits for the Oracle User

Most shells like Bash provide control over various resources like the maximum allowable number of open file descriptors or the maximum number of processes available to a user.

To see all shell limits, run:

ulimit -a

For more information on ulimit for the Bash shell, see man bash and search for ulimit.

Setting Limits for the Maximum Number of Open File Descriptors for the Oracle User

After you changed and increased /proc/sys/fs/file-max at Setting File Handles, there is still a per user limit of open file descriptors which is set to 1024 by default:

$ su – oracle

$ ulimit -n

1024

$

To change this, you have to edit the file /etc/security/limits.conf as root and make the following changes or add the following lines, respectively:

oracle           soft    nofile          4096

oracle           hard    nofile          63536

The “soft limit” in the first line defines the number of file handles or open files that the Oracle user will have after login. If the Oracle user gets error messages about running out of file handles, then the Oracle user can increase the number of file handles like in this example up to 63536 (“hard limit”) by running the following command:

ulimit -n 63536

You can set the “soft” and “hard” limits higher if necessary. Note that I do not recommend to set the “hard” limit for nofile for the oracle user equal to /proc/sys/fs/file-max. If you do that and the user uses up all the file handles, then the system would run out of file handles. This could mean that you won’t be able to initiate new remote logins any more since the system won’t be able to open any PAM modules which are required for performing a login. That’s why I set the hard limit to 63536 and not to 65536.

You also need to make sure that pam_limits is configured in the file /etc/pam.d/system-auth. This is the PAM module that will read the /etc/security/limits.conf file. The entry should read like:

session     required      /lib/security/pam_limits.so

Here are the two “session” entries I have in my /etc/pam.d/system-auth file:

session     required      /lib/security/pam_limits.so

session     required      /lib/security/pam_unix.so

Now login to the oracle account again since the changes will become effective for new login sessions only.

$ su – oracle

$ ulimit -n

4096

$

Note that the ulimit options are different for other shells.

The default limit for oracle is now 4096 and the oracle user can increase the number of file handles up to 63536:

$ su – oracle

$ ulimit -n

4096

$ ulimit -n 63536

$ ulimit -n

63536

$

To make this change permanent, add “ulimit -n 63536” (for Bash) to the ~oracle/.bash_profile file which is the user startup file for the Bash shell on Red Hat Linux (to verify your shell run: echo $SHELL). To do this you could simply copy/paste the following commands for the oracle’s Bash shell:

su – oracle

cat >> ~oracle/.bash_profile << EOF

ulimit -n 63536

EOF

Setting Limits for the Maximum Number of Processes for the Oracle User

After reading the procedure at Setting Limits for the Maximum Number of Open File Descriptors for the Oracle User, you should now understand what “soft” and “hard” limits are, how to configure pam_limits.so, and how to change the limits.

To see the current limit of the maximum number of processes for the oracle user, run:

su – oracle

ulimit -u

Note that the ulimit options are different for other shells.

To change the “soft” and “hard” limits for the maximum number of processes for the oracle user, add the following lines to the /etc/security/limits.conf file:

oracle           soft    nproc          2047

oracle           hard    nproc          16384

To make this change permanent, add “ulimit -u 16384” (for Bash) to the ~oracle/.bash_profile file which is the user startup file for the Bash shell on Red Hat Linux (to verify your shell run: echo $SHELL). To do this you could simply copy/paste the following commands for the oracle’s Bash shell:

su – oracle

cat >> ~oracle/.bash_profile << EOF

ulimit -u 16384

EOF

Setting Asynchronous I/O

Red Hat Advanced Server supports asynchronous I/O in the kernel. Asynchronous I/O permits Oracle to continue processing after issuing I/Os requests which leads to much higher I/O throughputs. This enhancement also allows Oracle to issue thousands of simultaneous I/O requests with a single system call. It also reduces context switch overhead.

According to a Red Hat webcast I attended, only 2 Oracle dbwriter processes are needed when asynchronous I/O is being used.

To enable Oracle to use asynchronous I/O, it is necessary to relink Oracle. Oracle ships Oracle9iR2 with asynchronous I/O support disabled. According to Oracle, this is necessary to accommodate other Linux distributions that do not support asynchronous I/O.

Relinking Oracle to Enable Asynchronous I/O for Oracle9iR2

# shutdown Oracle

SQL> shutdown

su – oracle

cd $ORACLE_HOME/rdbms/lib

make -f ins_rdbms.mk async_on

make -f ins_rdbms.mk ioracle

# The last step creates a new “oracle” executable “$ORACLE_HOME/bin/oracle”.

# It backs up the old oracle executable to $ORACLE_HOME/bin/oracleO,

# it sets the correct privileges for the new Oracle executable “oracle”,

# and moves the new executable “oracle” into the $ORACLE_HOME/bin directory.

If asynchronous I/O needs to be disabled for any reason, run the following commands:

# shutdown Oracle

SQL> shutdown

su – oracle

cd $ORACLE_HOME/rdbms/lib

make -f ins_rdbms.mk async_off

make -f ins_rdbms.mk ioracle

Enabling Asynchronous I/O in init.ora for Raw Devices

The disk_asynch_io init.ora parameter needs to be set to true:

disk_asynch_io=true

Note that this init.ora parameter is already set to true by default:

SQL> select value, isdefault from v$parameter where name = ‘disk_asynch_io’;

VALUE                          ISDEFAULT

—————————— ———

TRUE                           TRUE

Enabling Asynchronous I/O in init.ora for Filesystem Files

Make sure that all Oracle datafiles reside on filesystems that support asynchronous I/O (e.g. “ext2″). According to Oracle’s white paper Oracle9iR2 on Linux: Performance, Reliability and Manageability Enhancements on Red Hat Linux Advanced Server 2.1”, Oracle9iR2 has been certified with the standard Linux filesystem “ext2” on RH AS 2.1. In addition, Oracle has also been certified for raw devices.

The disk_asynch_io init.ora parameter needs to be set to true (same as for raw devices):

disk_asynch_io=true

Note that this init.ora parameter is already set to true by default:

SQL> select value, isdefault from v$parameter where name = ‘filesystemio_options’;

VALUE                          ISDEFAULT

—————————— ———

none                           TRUE

SQL>

The filesystemio_options init.ora parameter needs to be set to asynch:

filesystemio_options=asynch

This init.ora parameter is platform-specific. By default, this parameter is set to none for Linux and thus needs to be changed.

SQL> select value, isdefault from v$parameter where name = ‘filesystemio_options’;

VALUE                          ISDEFAULT

—————————— ———

none                           TRUE

SQL>

The filesystemio_options can have the following values with Oracle9iR2:

   asynch: This value enables asynchronous I/O on file system files.

   directio: This value enables direct I/O on file system files.

   setall: This value enables both asynchronous and direct I/O on file system files.

   none: This value disables both asynchronous and direct I/O on file system files.

Increasing I/O Throughput at the Linux OS Level

The /proc/sys/fs/aio-max-size parameter can be changed if asynchronous I/O is used for Oracle datafiles residing on filesystems (e.g. “ext2″). To my knowledge, this parameter does not have any effect to raw devices. According to the Oracle9iR2 on Linux: Performance, Reliability and Manageability Enhancements on Red Hat Linux Advanced Server 2.1” document, Oracle9iR2 has been certified with the standard Linux filesystem “ext2” on RH AS 2.1.

To get better I/O throughput for Decision Support Systems (DSS) workloads, the /proc/sys/fs/aio-max-size parameter should be increased to > 1 MB. A typical DSS system queries large amount of data and makes heavy use of full table scans. Parallel Query is particularly designed for DSS.

For Online Transaction Processing (OLTP) workloads, the default size of 131072 would suffice. A typical OLTP system has high throughputs, are insert- and update-intensive, have concurrent access by many users, and have large, continuously growing data volume.

To determine the number of bytes, run:

su – root

# cat /proc/sys/fs/aio-max-size

131072

The maximum number of bytes can be changed for e.g. DSS systems in the proc file system without reboot:

echo “2147483648” > /proc/sys/fs/aio-max-size

Alternatively, you can use sysctl(8) to change it:

sysctl -w fs.aio-max-size=2147483648

To make the change permanent, add or change the following line in the file /etc/sysctl.conf. This file is used during the boot process.

echo “fs.aio-max-size=2147483648” >> /etc/sysctl.conf

Increasing Space for larger SGA (2.7 GB) to Fit Into Memory

If the size of SGA does not need to be increased from 1.7 GB to 2.7 GB, then the following steps can be skipped.

By default, the maximum size for SGA is 1.7 GB on a 32-bit system without Physical Address Extension (PAE). You will also be able to allocate 1.7 GB SGA if you have less than 4 GB RAM. In this case you have to make sure you have enough swap space, however, this will have an impact to the performance of the database. I was even able to bring up a database with a SGA size of 2.64 GB on a test PC that had 256 MB RAM.

Theoretically, the SGA can have a size of up to 62 GB on a system that supports Physical Address Extension (PAE). The PAE mechanism allows addressing using 36 bits on IA-32 systems. But current hardware limitations and practical consideration limit the actual size of the SGA on such a system. Since I do not have such a system, I will not cover the steps for creating SGAs larger than 2.7 GB via the tmpfs filesystem.

To increase the size of the SGA to 2.7 GB without using a shared memory filesystem (tmpfs), the following needs to be done:

  – The base address “mapped base” for Oracle’s shared libraries has to be lowered at the Linux OS level.

  – Oracle needs to be relinked with a lower base address for SGA which uses shared memory segments.

Address Mappings on Linux – Shared Memory and Shared Library Mapping on Linux

Normally, the 4 GB linear address space (also known as virtual address space) for a 32-bit Linux system is split into 4 equal sized sections for different purposes:

0GB-1GB  User space   – Used for executable and brk/sbrk allocations (malloc uses brk for small chunks).

1GB-2GB  User space   – Used for mmaps (shared memory), shared libraries and malloc uses mmap (malloc uses mmap for large chunks).

2GB-3GB  User space   – Used for stack.

3GB-4GB  Kernel Space – Used for the kernel itself.

– The mmaps grow bottom up and the stack grows top down. The unused space used by the one can be used by the other.

– The split between userspace and kernelspace can be changed by setting the kernel parameter PAGE_OFFSET and recompiling the kernel. By default, the PAGE_OFFSET macro yields the value 0xc0000000.

– The split between brk(2) and mmap(2) can be changed by setting the kernel parameter TASK_UNMAPPED_BASE and recompiling the kernel. However, on Red Hat AS this parameter can be changed for individual processes dynamically without reboot or kernel recompilation.

Usually, the portion of address space available for mapping shared libraries and shared memory segments consists of virtual addresses in the range of 0x40000000 (1 GB) – 0xc0000000 (3 GB). On Red Hat AS, 0x40000000 is the default base address for shared libraries and shared memory segments. The default base address for mapping shared memory segments can be changed and overwritten for programs and applications by non-root users. The default base address “mapped base” for loading shared libraries for programs and applications can be changed by the user root only.

The default base address that Oracle uses for SGA (shared memory segment) is 0x50000000 and not 0x40000000. Oracle uses or keeps the space from 0x40000000-0x50000000 for loading Oracle shared libraries. As I mentioned before, 0x40000000 is the default base address on RH AS for loading shared libraries which can only be changed by the user root. Oracle increased the base address for SGA to prevent address range conflicts between the segments (shared memory segment and shared libraries).

If the base address for shared memory segments would be 0x15000000 and if the base address for shared libraries would be 0x40000000, then Oracle cannot create the SGA larger than 0x2b000000 bytes or 688 MB, even though there is address space available above the shared libraries portion. (According to Oracle, Oracle binaries will no longer work if the base address for shared memory segments is lower than the base address shared libraries like in this example. Even though I didn’t experience any problems, I would not recommend it).

If the base address for shared memory segments is 0x50000000 and if the base address for shared libraries is 0x40000000, then Oracle can create a SGA that starts at 0x50000000 and ends almost at 0xc0000000; 0xc0000000 is the address where the kernel address space begins. This means that the SGA can have a size of almost 0x70000000 bytes or 1.792 GB – actually it’s about 100 MB less due to stack space and other use of memory.

Once again, Oracle increased the default base address for SGA to 0x50000000 so that all shared libraries can be loaded below 0x50000000, and the rest of the space up to almost 0xc0000000 can be used for shared memory.

You can verify the address mappings of Oracle processes by viewing the proc file /proc/<pid>/maps where <pid> stands for the Oracle process ID. The default mapping of an Oracle process might look like this:

08048000-0ab11000 r-xp 00000000 08:09 273078     /ora/product/9.2.0/bin/oracle

0ab11000-0ab99000 rw-p 02ac8000 08:09 273078     /ora/product/9.2.0/bin/oracle

0ab99000-0ad39000 rwxp 00000000 00:00 0

40000000-40016000 r-xp 00000000 08:01 16         /lib/ld-2.2.4.so

40016000-40017000 rw-p 00015000 08:01 16         /lib/ld-2.2.4.so

40017000-40018000 rw-p 00000000 00:00 0

40018000-40019000 r-xp 00000000 08:09 17935      /ora/product/9.2.0/lib/libodmd9.so

40019000-4001a000 rw-p 00000000 08:09 17935      /ora/product/9.2.0/lib/libodmd9.so

4001a000-4001c000 r-xp 00000000 08:09 16066      /ora/product/9.2.0/lib/libskgxp9.so



42606000-42607000 rw-p 00009000 08:01 50         /lib/libnss_files-2.2.4.so

50000000-50400000 rw-s 00000000 00:04 163842     /SYSV00000000 (deleted)

51000000-53000000 rw-s 00000000 00:04 196611     /SYSV00000000 (deleted)

53000000-55000000 rw-s 00000000 00:04 229380     /SYSV00000000 (deleted)



bfffb000-c0000000 rwxp ffffc000 00:00 0

As this address mapping shows, shared libraries start at base address 0x40000000. The address mapping also shows that Oracle uses the base address 0x50000000 for SGA (in this example System V shared memory for SGA). Here is a summary of all the entries:

The text (code) section is mapped at 0x08048000:

  08048000-0ab11000 r-xp 00000000 08:09 273078     /ora/product/9.2.0/bin/oracle

The data section is mapped at 0x0ab11000:

  0ab11000-0ab99000 rw-p 02ac8000 08:09 273078     /ora/product/9.2.0/bin/oracle

The uninitialized data segment .bss is allocated at 0x0ab99000:

  0ab99000-0ad39000 rwxp 00000000 00:00 0

The base address for shared libraries is 0x40000000:

  40000000-40016000 r-xp 00000000 08:01 16         /lib/ld-2.2.4.so

The base address for SGA (System V shared memory) is 0x50000000:

  50000000-50400000 rw-s 00000000 00:04 163842     /SYSV00000000 (deleted)

The stack is allocated at 0xbfffb000:

  bfffb000-c0000000 rwxp ffffc000 00:00 0

Now it should become clear what needs to be done to provide more space for SGA. To increase the space for SGA, two base addresses need to be changed. The base address “mapped base” for shared libraries needs to be lowered at the Linux OS level, and the base address for SGA (shared memory) needs to be lowered at the Oracle level (application level).

Note: Once the base addresses have been changed at the Linux OS level and at the Oracle level, all Oracle commands need to be executed with a lower “mapped base”! This means that every new shell must run with a lowered “mapped base”. Further down I will show you how you can automate this so that every Oracle user gets automatically a shell with a lowered “mapped base”.

Changing the Base Address “mapped base” for Shared Libraries at the Linux OS Level

The default base address “mapped base” on RH 2.1AS is TASK_UNMAPPED_BASE = 0x40000000 (decimal 1073741824 or 1 GB). This is the address that splits the section between brk(2) and mmap(2), which defines available space for shared libraries (if it hasn’t been changed and overwritten at the application level) and for shared memory (e.g. SGA).

To change “mapped base” for a Linux process, the file /proc/<pid>/mapped_base needs to be changed where <pid> stands for the process ID. Note that this is not a system wide parameter! So in order to change “mapped base” for the Oracle database (i.e. Oracle processes), the parent shell that starts the database needs to be modified at the Linux OS level to allow it’s child processes to inherit the change. The following procedure shows how this can be done.

Execute the following command to identify the process ID “pid” of the shell process used by the Oracle user that will start the database:

echo $$

As root in another shell, change “mapped base” to 0x10000000 (decimal 268435456 bytes or 256 MB) for the Oracle shell with the pid we identified above:

su – root

echo 268435456 > /proc/<pid>/mapped_base

This will tell the kernel to load shared libraries at the virtual address portion starting at 0x10000000. Now if Oracle is started with sqlplus in the shell used by the Oracle user for which we changed “mapped base”, the Oracle processes will inherit the new base address.

Once the base address for shared memory has been changed at the Oracle level as well, more space will become available for the SGA. To accommodate the increased space for shared memory allocations by the Oracle processes, the maximum value of SHMMAX needs to be raised. This value defines the largest shared memory segment size allowed by the kernel. Since the SGA can be increased up to 2.7 GB with this method, the maximum size for SHMMAX can be rounded to 3000000000. This will allow Oracle to allocate one large shared memory segment for the SGA. This is also what Oracle recommends.

The maximum size SHMMAX for a shared memory segment can be changed in the proc file system without reboot:

su – root

echo “3000000000” > /proc/sys/kernel/shmmax

Alternatively, you can use sysctl(8) to change it:

sysctl -w kernel.shmmax=3000000000

To make the change permanent, add or change the following line in the file /etc/sysctl.conf. This file is used during the boot process.

kernel.shmmax=3000000000

Changing the Base Address for Shared Memory at the Oracle Level

The previous steps showed how to lower the base address “mapped base” for Oracle’s shared libraries to 0x10000000 (256 MB). The following steps show how to lower the base address for shared memory (SGA) for Oracle to 0x15000000 (336 MB).

The base address for SGA (shared memory) should not be lowered to 0x10000000 at the Oracle level. As I explained in the section ” Address Mappings on Linux – Shared Memory and Shared Library Mapping on Linux”, to prevent address range conflicts between the segments (Oracle shared libraries and Oracle shared memory), the address at which the SGA should be attached is 0x15000000. It can be lowered to 0x12000000, but this would require thorough testing. So I would not recommend it.

The following calculation shows how large the SGA can be created:

   0xc0000000  (base address of the kernel space -> 3 GB)

– 0x15000000  (base address of SGA -> 336 MB)

————-

   0xab000000  (decimal 2868903936 or 2.736 GB)

– stack space

– other memory allocations

————

~ 2.65 to 2.70 GB

To lower the base address at which the SGA (shared memory) should be attached, Oracle needs to be relinked. Changing the base address for SGA can be done on Linux with genksms, which is an Oracle utility:

  # shutdown Oracle

  SQL> shutdown

  su – oracle

  cd $ORACLE_HOME/rdbms/lib

  # Make a backup of the ksms.s file if it exists

  [[ -f ksms.s ]] && cp ksms.s ksms.s_orig

  # Modify the attach address in the ksms.s file before relinking Oracle

  genksms -s 0x15000000 > ksms.s

Rebuild the Oracle executable in the $ORACLE_HOME/rdbms/lib directory by entering the following commands:

  # Create a new ksms object file

  make -f ins_rdbms.mk ksms.o

  # Create a new “oracle” executable ($ORACLE_HOME/bin/oracle):

  make -f ins_rdbms.mk ioracle

  # The last step will create a new Oracle kernel that loads the SGA at

  # the address specified by sgabeg in ksms.s:

  # .set   sgabeg,0X15000000

  # It also backs up the old oracle executable to $ORACLE_HOME/bin/oracleO,

  # it sets the correct privileges for the new Oracle executable “oracle”, and

  # moves the new executable “oracle” into the $ORACLE_HOME/bin directory.

Now when Oracle is started, the lowered base addresses for Oracle’s shared library and shared memory (SGA) can be seen with the following commands:

  # Get the pid of e.g. the Oracle checkpoint process

  su – oracle

  $ pgrep -f -x ora_dbw0_$ORACLE_SID -l

  13519 ora_dbw0_test

  # You can also use /sbin/pidof to get the process ID

  $ /sbin/pidof ora_dbw0_$ORACLE_SID

  13519

  $ DBW0_PID=`pgrep -f -x ora_dbw0_$ORACLE_SID`

  $ echo $DBW0_PID

  13519

  # Check the base addresses for shared libraries and shared memory for the

  # process ID 1049:

  $ grep ‘.so’ /proc/$DBW0_PID/maps |head -1

  10000000-10016000 r-xp 00000000 03:02 750738     /lib/ld-2.2.4.so

  $ grep ‘SYS’ /proc/$DBW0_PID/maps |head -1

  15000000-24000000 rw-s 00000000 00:04 262150     /SYSV3ecee0b0 (deleted)

  $

Now you can increase the init.ora parameters db_cache_size or db_block_buffer to create a larger database buffer cache. If the size of the SGA is larger than 2.65 GB, then I would test the database very thoroughly to make sure no other memory allocation problems arise.

For fun I tried to test these settings on a little test PC with 256 MB RAM and 4 GB swap space. I wanted to see if I was able to bring up a database on such a little PC. I set db_block_buffer to 315000 and db_block_size to 8192 (2580480000 bytes), and I was able to bring up a database with 2.654 GB (2850033824 bytes) SGA on this PC:

Total System Global Area 2850033824 bytes

Fixed Size                   450720 bytes

Variable Size             268435456 bytes

Database Buffers         2580480000 bytes

Redo Buffers                 667648 bytes

Giving Oracle Users the Privilege to Change the Base Address for Oracle’s Shared Libraries Without Giving them root Access

As shown above, only root can change the base address “mapped base” for shared libraries. Using sudo we can give Oracle users the privilege to change “mapped base” for their own shells without giving them full root access. Here is the procedure:

su – root

# E.g. create a script called “/usr/local/bin/ChangeMappedBase”

# which changes the “mapped base” for the parent process,

# the shell used by the Oracle user where the “sudo” program

# is executed (forked). Here is an example:

#/bin/sh

# Lowering “mapped base” to 0x10000000

echo 268435456 > /proc/$PPID/mapped_base

# Make sure that owernship and permissions are correct

chown root.root /usr/local/bin/ChangeMappedBase

chmod 755 /usr/local/bin/ChangeMappedBase

# Allow the Oracle user to execute /usr/local/bin/ChangeMappedBase via sudo

echo “oracle   ALL=/usr/local/bin/ChangeMappedBase” >> /etc/sudoers

Now the Oracle user can run /usr/local/bin/ChangeMappedBase to change “mapped base” for it’s own shell:

$ su – oracle

$ cat /proc/$$/mapped_base; echo

1073741824

$ sudo /usr/local/bin/ChangeMappedBase

Password:   # type in the password for the Oracle user account

$ cat /proc/$$/mapped_base; echo

268435456

$

When /usr/local/bin/ChangeMappedBase is executed the first time after an Oracle login, sudo will ask for a password. The password that needs to be entered is the password of the Oracle user account.

Changing the Base Address for Oracle’s Shared Libraries Automatically During an Oracle Login

The procedure in the previous section asks for a password each time /usr/local/bin/ChangeMappedBase is executed the first time after an Oracle login. To have “mapped base” changed automatically during an Oracle login without a password, the following can be done:

Edit the /etc/sudoers file with visudo:

su – root

visudo

Change the entry in /etc/sudoers from:

oracle   ALL=/usr/local/bin/ChangeMappedBase

to read:

oracle   ALL=NOPASSWD: /usr/local/bin/ChangeMappedBase

Make sure bash executes /usr/local/bin/ChangeMappedBase during the login process. You can use e.g. ~oracle/.bash_profile:

su – oracle

echo “sudo /usr/local/bin/ChangeMappedBase” >> ~/.bash_profile

The next time you login to Oracle, the base address for shared libraries will bet set automatically.

$ ssh oracle@localhost

oracle@localhost’s password:

Last login: Sun Apr  6 13:59:22 2003 from localhost

$ cat /proc/$$/mapped_base; echo

268435456

$

Important Notes

When the base address “mapped base” for Oracle’s processes has changed, then every Linux shell that spawns Oracle processes (e.g. listener) must have the same “mapped base” as well. This means that even shells that are used to connect locally to the database need to have the same “mapped base”. For example, if you run sqlplus to connect to the local database, then you will get the following error message if “mapped base” of this shell is not the same as for the Oracle processes:

SQL> connect scott/tiger

ERROR:

ORA-01034: ORACLE not available

ORA-27102: out of memory

Linux Error: 12: Cannot allocate memory

Additional information: 1

Additional information: 491524

SQL>

Using Large Memory Pages (Bigpages)

This feature is very useful for large SGA sizes. In the following example I will show how to use and configure Linux bigpage memory area for System V shared memory segments. System V shared memory segments are allocated for SGA if “shmfs” is not used or configured for SGA.

A separate Linux memory area can be allocated to use 4 MB memory pages rather than the normal 4 kB pages. Large memory pages “bigpages” are locked in memory and do not get swapped out. This means that a whole separate pigpage memory area can be allocated for the entire SGA not to get swapped out of memory. This means that it is very important that the bigpage memory area is only as large as needed for SGA because unused memory in the bigpage pool won’t be available for other use than for shared memory allocations, even if the Linux system starts swapping. It is also important to be aware that if bigpages is set to a high value, then the available memory for user connection will be low. Using bipages also increases TLB (Translation Lookaside Buffers) cache hits which makes the CPUs to run more efficiently in particular with large memory configurations.

Sizing Bigpages

Oracle says that the maximum value of Bigpages should be:

Maximum value of Bigpages = HighTotal / 1024 * 0.8 MB

The bigpage memory area is only available for shared memory. So if bigpages is set to a high value, then the available memory for user connection will be low. If the memory consumption for the maximum number of user connections is known, then Oracle says that bigpages can be calculated as follows:

Maximum value of Bigpages = (HighTotal – Memory required by maximum user connections in KB) / 1024 * 0.8 MB

According to Oracle’s white paper Linux Virtual Memory in Red Hat Advanced Server 2.1 and Oracle’s Memory Usage Characteristics, the assumption is that 20% of memory is reserved for kernel bookkeeping.

The value for “HighTotal” can be obtained with the following command:

grep HighTotal /proc/meminfo

Note that highmem is all memory above (approx) 860MB of physical RAM. This means that “HighTotal” is the the total amount of memory in the high memory region. It should now be clear that large memory pages should only be configured if enough physical RAM is available. For instance, if the server has only 512 MB RAM, then “HighTotal” will be 0 kB. And on my 1 GB RAM desktop PC, “HighTotal” shows 130992 kB.

Here are a few examples for bigpage sizes taken from Tips and Techniques: Install and Configure Oracle9i on Red Hat Linux Advanced Server:

2 GB SGA    2100 MB bigpages

4 GB SGA    4100 MB bigpages

The bigpages feature allows a maximum size of 5.4 GB SGA on a machine with 8 GB RAM.

Configuring Bigpages

The kernel needs to be told to use the bigpages pool for shared memory allocations. The bigpages feature can be enabled for System V shared memory in the proc file system without reboot with the following command:

su – root

echo “1” > /proc/sys/kernel/shm-use-bigpages

Alternatively, you can use sysctl(8) to change it:

sysctl -w kernel.shm-use-bigpages=1

To make the change permanent, add the following line to the file /etc/sysctl.conf. This file is used during the boot process.

echo “kernel.shm-use-bigpages=1” >> /etc/sysctl.conf

Setting kernel.shm-use-bigpages=2 will enable bigpages for “shmfs” which I’m not covering in this article. Setting kernel.shm-use-bigpages=0 will disable the bigpages feature.

The kernel needs to be told how large the bigpage pool should be. If you use GRUB, add the “bigpages” parameter in the etc/grub.conf file and set the maximum value of bigpages as follows. In this example I will set bigpages to 2100 MB for the SMP kernel 2.4.9-e.25 that is started on my database server:

default=1

timeout=10

splashimage=(hd0,1)/boot/grub/splash.xpm.gz

title Red Hat Linux (2.4.9-e.25enterprise)

        root (hd0,1)

        kernel /boot/vmlinuz-2.4.9-e.25enterprise ro root=/dev/hda2 hdc=ide-scsi

        initrd /boot/initrd-2.4.9-e.25enterprise.img

title Red Hat Linux Advanced Server (2.4.9-e.25smp)

        root (hd0,1)

        kernel /boot/vmlinuz-2.4.9-e.25smp ro root=/dev/hda2 hdc=ide-scsi bigpages=2100MB

        initrd /boot/initrd-2.4.9-e.25smp.img

title Red Hat Linux Advanced Server-up (2.4.9-e.25)

        root (hd0,1)

        kernel /boot/vmlinuz-2.4.9-e.25 ro root=/dev/hda2 hdc=ide-scsi

        initrd /boot/initrd-2.4.9-e.25.img

After this change the system needs to be rebooted:

su – root

shutdown -r now

After a system reboot, the “MemFree” value (free system memory) in the /proc/meminfo is subtracted by 2100 MB in this example. The 2100 MB show now up in the “BigPagesFree” which means that 2100 MB are now in a separate allocation area:

grep MemTotal /proc/meminfo

grep BigPagesFree /proc/meminfo

Note that if you configure “bigpages” in the etc/grub.conf file and reboot the system, “BigPagesFree” in /proc/meminfo will be 0 KB if “HighTotal” in /proc/meminfo is 0 KB and if /proc/sys/kernel/shm-use-bigpages is set to “1”.

Making Other Performance Related Changes

Disabling Unneeded Background Processes

Disable or remove slocate from your system. The nightly slocate cron job can become a real performance killer for your database!

Every night slocate updates a database for files on your system which match a given pattern. It helps you to find files on your system very quickly. However, when the cron job is run at night, it will flush the buffers, it can fragment your memory, and it could cause your system to do heavy paging.

The easiest way is to disable updatedb in /etc/cron.daily/slocate.cron or to remove slocate from your system completely:

su – root

rpm -e slocate

X should not run unless you need to. You can stop X by switching to runlevel 3 with the following command:

init 3

To switch back to runlevel 5 that X comes up again, run:

init 5

To set the default runlevel permanently to 3 so that X doesn’t come up with the next reboot, change the following line in /etc/inittab:

id:5:initdefault:

so that it reads:

id:3:initdefault:

You can check for other unneeded background processes by running the command:

/sbin/chkconfig –list

To temporarely disable e.g. ypbind, run:

su – root

service ypbind stop

To permanently disable ypbind, run:

chkconfig ypbind off

Oracle Errors and Problems

The intention of this section is to describe errors and problems that can occur in connection with the changes covered in this article.

For errors regarding the installation of Oracle software and regarding the creation of a database, see Oracle Installation Errors.

ORA-3133 errors and attach errors

Cause(s):

– Running an Oracle binary that has a lower SGA base, but /proc/proc/<pid>/maps has not been adjusted as well.

– SHMMAX value has not been increased large enough.

SQL> startup

ORA-03113: end-of-file on communication channel

SQL>

Cause(s):

– A too large SGA has been configured

– SHMMAX value has not been increased large enough.

– Oracle has been relinked with a lower SGA base address but “mapped base” has not been lowered for the shell at the Linux OS level.

SQL> startup

ORA-27102: out of memory

Linux Error: 12: Cannot allocate memory

Additional information: 1

Additional information: 262148

SQL>

Cause(s):

– This error message comes up if the SGA size if too large.

ORA-01041: internal error. hostdef extension doesn’t exist

Cause(s):

– If this error comes up and the database is not up, then remove all shared memory segments from the Linux OS.

Useful Linux Performance Utilities

top Utility

This utility shows CPU consumption, memory consumption, and “top” sessions on the Linux server:

top

Load Averages:

The first line of the top output shows you a series of three “load average” numbers. These numbers describe the load on the system. The load average is the average number of processes that are waiting in the queue for CPU time (including processes that are waiting for I/O) for the past 1, 5 and 15 minutes.

For example, if a process is sleeping in an uninterruptible state, it will count as a load of 1. Or if you run 3 non-interactive processes that are not waiting for input, then you can expect the average load to be 3. To illustrate that, run the following command in 3 different shells on a server that is not being used:

  while [ 1 ]; do str=”x”; done

This loop will use up all the CPU time that it can get. Now wait for about 2-3 minutes and you will see that the average load for the last 1 minute will increase to be 3 and higher. It will be a little bit higher than 3 since there are other processes running on the system.

In general, a number less than 1 is ideal. A load average value of 3 is high. And a value of 10 is definitely a heavily loaded system where you can expect delays.

You can also use the tload command to display real-time text mode graph on the “load average”.

CPU States:

It shows the load on each processor – the percentage of CPU time in user mode, system mode, niced tasks, and idle.

The “user” percentage shows how much processing time the CPU is spending on user processes, and the “system” percentage shows how much processing time the CPU is spending in the system (kernel). Niced tasks are only those processes whose nice value is negative. And note that the processing time for niced processes will also be counted in system and user time, so the total will be more than 100%.

However, the best indicators of a stressed CPU is the load average which I described above.

Sessions:

This section shows the top sessions (Linux processes) in terms of CPU utilization.

sar Utility

“sar” stands for System Activity Reporter.

CPU Usage:

To check CPU usage over time, run:

sar -u

This command is useful if you want to see overall CPU consumption over time.

%user shows the percentage of CPU utilization at the user level (application).

%system shows the percentage of CPU utilization at the system level (kernel).

To check CPU usage 10 times with a time interval of 3 seconds, run:

sar -u 3 10

Swap Activity:

To check swap activity over time, run:

sar -W

This command is useful if you suspect memory shortages.

pswpin/s shows the total number of swap pages the system brought in per second.

pswpout/s shows the total number of swap pages the system brought out per second.

These numbers should be low. If not, you need more RAM.

To check swap activity 10 times with a time interval of 3 seconds, run:

sar -W 3 10

I/O Activity:

To check physical disk I/O activity over time, run:

sar -b

This command is useful if you suspect that the database is I/O bound.

See manual pages for more information.

To check I/O activity 10 times with a time interval of 3 seconds, run:

sar -b 3 10

vmstat Utility

This utility provides a report that covers process activity, paging, memory usage, disk I/O, and CPU usage.

To create 5 reports with a time interval of 3 seconds, run:

$  vmstat 3 5

   procs                      memory    swap          io     system         cpu

r  b  w   swpd   free   buff  cache  si  so    bi    bo   in    cs  us  sy  id

0  0  0 186460   7416   9424  45272   1   4    25    35  126    33   3   0  96

0  0  0 186460   7416   9432  45272   0   0     0    17  103    18   0   0 100

0  0  0 186460   7288   9440  45272   0   0     0    73  104    23   4   1  95

0  0  1 186460   7288   9440  45272   0   0     0     5  102    12   0   0 100

0  0  0 186460   7288   9440  45272   0   0     0     8  102    14   0   0 100

See man pages for more information.

Oracle Linux Management

Determining Which Semaphore Sets and Shared Memory Segments Belong to Each Oracle Database or Instance

When Oracle hangs or crashed or when Oracle was killed, then sometimes you will see that shared memory segments and/or semaphore sets have not been released or removed by the Oracle background processes. It is important to make sure that the semaphore sets and shared memory segments are released at the Linux OS level before the database or instance is restarted.

Running ipcs will only show you which semaphore sets and which memory segments are owned by the Oracle user account. If you have only one database runnning on your server, then you can simply use the IDs of all shared memory segments and semaphore sets that belong to the Oracle user account and release them via ipcrm:

$ su – oracle

$ ipcs

—— Shared Memory Segments ——–

key shmid owner perms bytes nattch status

0x00000000 0 root 600 196608 2

0x00000001 32769 root 600 655360 2

0x00000000 458755 oracle 660 4194304 0

0x00000000 491524 oracle 660 33554432 0

0x00000000 524293 oracle 660 33554432 0

0x00000000 557062 oracle 660 33554432 0

0x00000000 589831 oracle 660 33554432 0

0x00000000 622600 oracle 660 33554432 0

0x00000000 655369 oracle 660 33554432 0

0x00000000 688138 oracle 660 33554432 0

0x3ecee0b0 720907 oracle 660 4194304 0

—— Semaphore Arrays ——–

key semid owner perms nsems status

—— Message Queues ——–

key msqid owner perms used-bytes messages

$

To release all shared memory segments that are owned by the Oracle user as listed above, run:

$ ipcrm shm 458755 491524 524293 557062 589831 622600 655369 688138 720907

The command for releasing semaphore sets is:

$ ipcrm sem <semid>…

But if you have more than one database or instance running on the Linux servers, then ipcs will NOT show you the semaphore sets and shared memory segments that are owned by each database or instance. The following steps can be used to find the right IDs for each database or instance:

$ su – oracle

$ sqlplus /nolog

SQL> oradebug setmypid

Statement processed.

SQL> oradebug ipc

Information written to trace file.

SQL> select value from v$parameter where name = ‘user_dump_dest’;

VALUE

——————————————————————————–

/opt/oracle/admin/test/udump

SQL>

On my test server, the oradebug ipc command created a file called test_ora_6626.trc in the USER_DUMP_DEST directory /opt/oracle/admin/test/udump. The name of the created trace file is $ORACLE_SID_ora_<pid>.trc where <pid> stands for the process ID of the Oracle foreground process in a non-MTS environment that’s talking to sqlplus here. If you are not sure about the name of the file that was created, run ls -lrt to see the timestamp of the latest trace file created in the USER_DUMP_DEST directory.

When you open the trace file (in my example test_ora_6626.trc), you can find the semaphore ID for this database after the line “Semaphore List=”. Here are the semaphore sets on my test box for the Oracle database:

/opt/oracle/admin/test/udump/test_ora_6626.trc:

[SKIP]

Maximum processes:               = 150

Number of semaphores per set:    = 154

Semaphores key overhead per set: = 4

User Semaphores per set:         = 150

Number of semaphore sets:        = 1

Semaphore identifiers:           = 1

Semaphore List=

98304

————– system semaphore information ————-

—— Shared Memory Segments ——–

[SKIP]

To release all semaphore sets that are owned by the database as listed above, run:

$ ipcrm sem 98304

And here are the shared memory IDs on my test box for the Oracle database:

/opt/oracle/admin/test/udump/test_ora_6626.trc:

[SKIP]

Area #0 `Fixed Size’ containing Subareas 0-0

  Total size 000000000006e078 Minimum Subarea size 00000000

   Area  Subarea    Shmid      Stable Addr      Actual Addr

      0        0  1671186 0x00000050000000 0x00000050000000

                              Subarea size     Segment size

                          000000000006f000 0000000000400000

Area #1 `Variable Size’ containing Subareas 1-7

  Total size 000000000e000000 Minimum Subarea size 01000000

   Area  Subarea    Shmid      Stable Addr      Actual Addr

      1        1  1703955 0x00000051000000 0x00000051000000

                              Subarea size     Segment size

                          0000000002000000 0000000002000000

   Area  Subarea    Shmid      Stable Addr      Actual Addr

      1        2  1736724 0x00000053000000 0x00000053000000

                              Subarea size     Segment size

                          0000000002000000 0000000002000000

   Area  Subarea    Shmid      Stable Addr      Actual Addr

      1        3  1769493 0x00000055000000 0x00000055000000

                              Subarea size     Segment size

                          0000000002000000 0000000002000000

   Area  Subarea    Shmid      Stable Addr      Actual Addr

      1        4  1802262 0x00000057000000 0x00000057000000

                              Subarea size     Segment size

                          0000000002000000 0000000002000000

   Area  Subarea    Shmid      Stable Addr      Actual Addr

      1        5  1835031 0x00000059000000 0x00000059000000

                              Subarea size     Segment size

                          0000000002000000 0000000002000000

   Area  Subarea    Shmid      Stable Addr      Actual Addr

      1        6  1867800 0x0000005b000000 0x0000005b000000

                              Subarea size     Segment size

                          0000000002000000 0000000002000000

   Area  Subarea    Shmid      Stable Addr      Actual Addr

      1        7  1900569 0x0000005d000000 0x0000005d000000

                              Subarea size     Segment size

                          0000000002000000 0000000002000000

Area #2 `Redo Buffers’ containing Subareas 8-8

  Total size 00000000000a3000 Minimum Subarea size 00000000

   Area  Subarea    Shmid      Stable Addr      Actual Addr

      2        8  1933338 0x0000005f000000 0x0000005f000000

                              Subarea size     Segment size

                          00000000000a3000 0000000000400000

Area #3 `skgm overhead’ containing Subareas 9-9

  Total size 0000000000001000 Minimum Subarea size 00000000

   Area  Subarea    Shmid      Stable Addr      Actual Addr

      3        9  1933338 0x0000005f0a3000 0x0000005f0a3000

                              Subarea size     Segment size

                          0000000000001000 0000000000400000

[SKIP]

To release all shared memory segments that are owned by the database as listed above, run:

$ ipcrm shm 1671186 1703955 1736724 1769493 1802262 1835031 1867800 1900569 1933338

To verify if the shared memory segments and semaphore sets have been released, run:

$ ipcs

Hardware Recommendation

It really depends on what kind of database you want to setup and run, how large the database is etc.

But people keep asking me what I would recommend. If you want to get a feeling how well Oracle9i (non-RAC system) runs on Linux/Intel systems, and if you don’t want to spend “too much money”, here is what I would buy:

  – 2-way server, 2.4GHz Xeon

  – 4 GB RAM; RAM is cheap and gives you usually the biggest “bang for the buck”.

  – Large Internal Ultra SCSI disks with a hardware RAID controller card

References

Oracle’s Linux Center

An Overview of Red Hat Advanced Server V2.1 Reliability, Availability, Scalability, and Manageability (RASM) Features

Linux Virtual Memory in Red Hat Advanced Server 2.1 and Oracle’s Memory Usage Characteristics

Tips and Techniques: Install and Configure Oracle9i on Red Hat Linux Advanced Server

Oracle9iR2 on Linux: Performance, Reliability and Manageability Enhancements on Red Hat Linux Advanced Server 2.1

Delivering Leading TPC-C Figures with Red Hat Linux Advanced Server (Red Hat Webcast Tuesday, 22nd October, 2002)

Understanding the Linux Kernel, 2nd edition

Copyright Notice

This article may not be published, sold, reproduced or copied in whole or in part without obtaining permission first. But you are welcome to put links from your site to the article.

The information provided in this article shows how I tuned Linux/Oracle on my server(s) and is distributed AS IS. Every effort has been made to provide the information as accurate as possible, but no warranty or fitness is implied. The use of this information described herein is your responsibility, and to use it in your own environments do so at your own risk.

Comments?  webmaster_at_puschitz.com

서진우

슈퍼컴퓨팅 전문 기업 클루닉스/ 상무(기술이사)/ 정보시스템감리사/ 시스존 블로그 운영자

You may also like...

페이스북/트위트/구글 계정으로 댓글 가능합니다.