Server side load balancing in Oracle RAC 11g

How does server side load balancing happen in Oracle RAC 11g?

Based on the load information fed by PMON, the listener decides which instance each incoming connection should be directed to (that is, how to distribute incoming connections among instances). Usually the LOCAL_LISTENER parameter should be set to the name of the listener defined on the same node as the database instance, and REMOTE_LISTENER should be set to the name of the listener running on another node that hosts another database instance of the same cluster.

We will take the example of a two node RAC cluster, Node A and Node B, running instance DB_A and instance DB_B. We will assume that the listeners are defined as LISTENER_A and LISTENER_B on nodes A and B respectively. To configure server side load balancing we simply set the parameters as follows.

On Node A:

LOCAL_LISTENER=LISTENER_A

REMOTE_LISTENER=LISTENER_B

On Node B:

LOCAL_LISTENER=LISTENER_B

REMOTE_LISTENER=LISTENER_A

When we start the two instances, the corresponding PMON processes register dynamically with both listeners and start feeding them load profile information. PMON gathers load information using the kstat() Unix system call (this information is more or less like the load average reported by the OS).

When a new incoming connection hits either listener process, the listener redirects it to the least loaded node. Based on the load reported by PMON, this can be the local node or a remote node registered with this listener.

The minimum interval at which PMON updates load is 1 minute, and within that minute the listener redirects new incoming connections based on the old values provided by PMON. This can sometimes be inaccurate, particularly on heavily loaded systems. To overcome this, from 10.2 onwards the listener determines the least loaded node by calculating an lbscore for each RAC database instance. PMON sends two values to the listener process: goodness and delta. The idea of delta is primarily to account for changes between successive PMON updates.

The calculation of lbscore at each listener is as follows:

After instance restart,

Listener lbscore = goodness value updated by PMON

Listener Delta = delta value updated by PMON

After every incoming connection Listener lbscore becomes,

New Listener lbscore = Current Listener lbscore + Listener Delta

After every PMON update, Listener’s lbscore and delta are reset to values supplied by PMON.
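The rules above can be sketched as a small simulation. This is only an illustration, not the listener's actual code: the class and function names are made up, the goodness and delta numbers are arbitrary, and it assumes that the listener picks the instance with the lowest lbscore.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class InstanceScore:
    """Listener-side view of one registered instance (illustrative names)."""
    name: str
    lbscore: int = 0
    delta: int = 0

    def pmon_update(self, goodness: int, delta: int) -> None:
        # Every PMON update resets lbscore and delta to the supplied values.
        self.lbscore = goodness
        self.delta = delta


def route_connection(instances: List[InstanceScore]) -> str:
    # Assumption: a lower lbscore means a less loaded instance.
    target = min(instances, key=lambda inst: inst.lbscore)
    # After every incoming connection: new lbscore = current lbscore + delta.
    target.lbscore += target.delta
    return target.name


db_a = InstanceScore("DB_A")
db_b = InstanceScore("DB_B")
db_a.pmon_update(goodness=100, delta=10)  # arbitrary example values
db_b.pmon_update(goodness=115, delta=10)

# Route four connections that arrive between two PMON updates.
routed = [route_connection([db_a, db_b]) for _ in range(4)]
print(routed)
```

With these numbers DB_A takes the first two connections, after which its lbscore has grown past that of DB_B, so the third connection goes to DB_B. This is how delta lets the listener account for connections it has already handed out before the next PMON update arrives.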

Advent of SCAN

11gR2 RAC introduced a feature called SCAN (Single Client Access Name). It works by associating a single host name with multiple IP addresses, three in practice. The association of the three IP addresses with one host name is made at the DNS level, or we can let Oracle manage it by opting for Grid Naming Service at cluster software installation time.

After SCAN is configured during cluster software installation, each of these three IP addresses gets a SCAN VIP resource and a corresponding SCAN listener process. The three SCAN IP addresses are distributed among the cluster nodes. Whenever a node running a SCAN VIP fails, the SCAN VIP and its SCAN listener are failed over to a surviving node.

From 11.2 onwards, the REMOTE_LISTENER parameter should be set to the SCAN value, after which the instance registers with the SCAN listener processes. The LOCAL_LISTENER parameter should be set to the VIP defined for the node running the database instance.

The instance periodically updates the SCAN listeners with the following information.

1.  The services being offered on the database instance.
2.  The current instance load.
3.  A recommendation on how many incoming connections can be redirected to the instance.

On the client side, the 11.2 client gets the three IP addresses that the SCAN host name resolves to and tries to connect to a listener. If the connection to one IP fails, the next one is tried, in the order supplied by DNS. This is somewhat similar to client side failover based on the address list in a tns entry in previous Oracle versions. Once a SCAN listener receives an incoming request, it redirects the connection to the least loaded node that is providing the requested "SERVICE".
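The client side behaviour described above, trying each resolved address in order until one answers, can be sketched roughly like this. It is a simplified illustration and not the Oracle client's code: the function name, the IP addresses and the try_connect callback are all hypothetical stand-ins.

```python
from typing import Callable, Iterable, Optional


def connect_via_scan(addresses: Iterable[str],
                     try_connect: Callable[[str], bool]) -> Optional[str]:
    """Return the first address that accepts a connection, or None."""
    for address in addresses:
        # Addresses are tried in the order in which they were supplied.
        if try_connect(address):
            return address
    return None


# Hypothetical SCAN addresses, in the order returned by DNS.
scan_ips = ["192.0.2.11", "192.0.2.12", "192.0.2.13"]
unreachable = {"192.0.2.11"}  # pretend the first SCAN listener is down

chosen = connect_via_scan(scan_ips, lambda ip: ip not in unreachable)
print(chosen)
```

Here the first address is treated as unreachable, so the sketch falls through to the second one. In a real client the callback would be an actual connection attempt to the SCAN listener on that address.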

Note: It's recommended to use an 11.2 client to get the full functionality of SCAN.

Monitoring Load Balancing effectiveness:

The effectiveness of load balancing can be monitored by tracking updates in the listener log files; a detailed explanation is provided in metalink note 263599.1. The note has an interesting discussion on how to calculate the effectiveness of client side load balancing (a.k.a. TNS load balancing) based on the listener log file contents.
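As a rough illustration of what such a calculation could look like, the sketch below computes the share of connections each instance received. It is not the method from the note: it assumes the target of each connection has already been extracted from the listener log by some other means, and the sample data is invented.

```python
from collections import Counter
from typing import Dict, Iterable


def connection_share(targets: Iterable[str]) -> Dict[str, float]:
    """Fraction of connections that went to each instance."""
    counts = Counter(targets)
    total = sum(counts.values())
    return {name: count / total for name, count in counts.items()}


# Invented sample: one entry per established connection.
observed = ["DB_A", "DB_B", "DB_A", "DB_A"]
share = connection_share(observed)
print(share)
```

A share close to 1/N for each of N instances suggests that connections are being spread evenly, although an even spread of connections does not necessarily mean an even spread of load.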

We can also use PMON tracing to monitor load balancing. PMON instance tracing can be enabled by restarting the database instances after setting event="10257 trace name context forever, level 16" in the init.ora/spfile. Check note 787055.1 if you are interested in knowing more about PMON tracing. (Caution: the note is for 9.2; I can't guarantee it works for 10g onwards.)

To monitor lbscore and delta values, set TRACE_LEVEL_LISTENER = 16 in the listener.ora file; the values will be captured in the listener trace files.

Nagulu Polagani

"We are all apprentices in a craft where no one ever becomes a master."