Oracle DBA RAC Interview Question Answers
What is RAC? What is the benefit of RAC over single instance database?
In Real Application Clusters environments, all nodes concurrently execute transactions against the same database. Real Application Clusters coordinates each node’s access to the shared data to provide consistency and integrity.
Benefits:
Improve response time
Improve throughput
High availability
Transparency
What is Oracle RAC One Node?
Oracle RAC one Node is a single instance running on one node of the cluster while the 2nd node is in cold standby mode. If the instance fails for some reason then RAC one node detect it and restart the instance on the same node or the instance is relocate to the 2nd node in case there is failure or fault in 1st node. The benefit of this feature is that it provides a cold failover solution and it automates the instance relocation without any downtime and does not need a manual intervention. Oracle introduced this feature with the release of 11gR2 (available with Enterprise Edition).
Real Application Clusters:
Oracle RAC is a cluster database with a shared cache architecture that overcomes the limitations of traditional shared-nothing and shared-disk approaches to provide a highly scalable and available database solution for all your business applications. Oracle RAC provides the foundation for enterprise grid computing.
Oracle’s Real Application Clusters (RAC) option supports the transparent deployment of a single database across a cluster of servers, providing fault tolerance from hardware failures or planned outages. Oracle RAC running on clusters provides Oracle’s highest level of capability in terms of availability, scalability, and low-cost computing.
One DB opened by multipe instances so the the db ll be Highly Available if an instance crashes.
Cluster Software. Oracles Clusterware or products like Veritas Volume Manager are required to provide the cluster support and allow each node to know which nodes belong to the cluster and are available and with Oracle Cluterware to know which nodes have failed and to eject then from the cluster, so that errors on that node can be cleared.
Oracle Clusterware has two key components Cluster Registry OCR and Voting Disk.
The cluster registry holds all information about nodes, instances, services and ASM storage if used, it also contains state information ie they are available and up or similar.The voting disk is used to determine if a node has failed, i.e. become separated from the majority. If a node is deemed to no longer belong to the majority then it is forcibly rebooted and will after the reboot add itself again the the surviving cluster nodes.
Advantages of RAC (Real Application Clusters)
Reliability – if one node fails, the database won’t fail
Availability – nodes can be added or replaced without having to shutdown the database
Scalability – more nodes can be added to the cluster as the workload increases
What is a virtual IP address or VIP?
A virtual IP address or VIP is an alternate IP address that the client connections use instead of the standard public IP address. To configure VIP address, we need to reserve a spare IP address for each node, and the IP addresses must use the same subnet as the public network.
What is the use of VIP?
If a node fails, then the node’s VIP address fails over to another node on which the VIP address can accept TCP connections but it cannot accept Oracle connections.
Give situations under which VIP address failover happens:-
VIP addresses failover happens when the node on which the VIP address runs fails, all interfaces for the VIP address fails, all interfaces for the VIP address are disconnected from the network.
Using virtual IP we can save our TCP/IP timeout problem because Oracle notification service maintains communication between each nodes and listeners.
What is the significance of VIP address failover?
When a VIP address failover happens, Clients that attempt to connect to the VIP address receive a rapid connection refused error .They don’t have to wait for TCP connection timeout messages.
What is voting disk?
Voting Disk is a file that sits in the shared storage area and must be accessible by all nodes in the cluster. All nodes in the cluster registers their heart-beat information in the voting disk, so as to confirm that they are all operational. If heart-beat information of any node in the voting disk is not available that node will be evicted from the cluster. The CSS (Cluster Synchronization Service) daemon in the clusterware maintains the heartbeat of all nodes to the voting disk. When any node is not able to send heartbeat to voting disk, then it will reboot itself, thus help avoiding the split-brain syndrome.
For high availability, Oracle recommends that you have a minimum of three or odd number (3 or greater) of voting disks.
Voting Disk – is file that resides on shared storage and Manages cluster members. Voting disk reassigns cluster ownership between the nodes in case of failure.
The Voting Disk Files are used by Oracle Clusterware to determine which nodes are currently members of the cluster. The voting disk files are also used in concert with other Cluster components such as CRS to maintain the clusters integrity.
Oracle Database 11g Release 2 provides the ability to store the voting disks in ASM along with the OCR. Oracle Clusterware can access the OCR and the voting disks present in ASM even if the ASM instance is down. As a result CSS can continue to maintain the Oracle cluster even if the ASM instance has failed.
How many voting disks are you maintaining ?
By default Oracle will create 3 voting disk files in ASM. Oracle expects that you will configure at least 3 voting disks for redundancy purposes. You should always configure an odd number of voting disks >= 3. This is because loss of more than half your voting disks will cause the entire cluster to fail.
You should plan on allocating 280MB for each voting disk file. For example, if you are using ASM and external redundancy then you will need to allocate 280MB of disk for the voting disk. If you are using ASM and normal redundancy you will need 560MB.
Why we need to keep odd number of voting disks ?
Oracle expects that you will configure at least 3 voting disks for redundancy purposes. You should always configure an odd number of voting disks >= 3. This is because loss of more than half your voting disks will cause the entire cluster to fail.
What are Oracle RAC software components?
Oracle RAC is composed of two or more database instances. They are composed of Memory structures and background processes same as the single instance database. Oracle RAC instances use two processes GES(Global Enqueue Service), GCS(Global Cache Service) that enable cache fusion. Oracle RAC instances are composed of following background processes:
ACMS—Atomic Controlfile to Memory Service (ACMS)
GTX0-j—Global Transaction Process
LMON—Global Enqueue Service Monitor
LMD—Global Enqueue Service Daemon
LMS—Global Cache Service Process
LCK0—Instance Enqueue Process
RMSn—Oracle RAC Management Processes (RMSn)
RSMN—Remote Slave Monitor
What are Oracle Clusterware processes for 10g ?
Cluster Synchronization Services (ocssd) — Manages cluster node membership and runs as the oracle user; failure of this process results in cluster restart.
Cluster Ready Services (crsd) — The crs process manages cluster resources (which could be a database, an instance, a service, a Listener, a virtual IP (VIP) address, an application process, and so on) based on the resource’s configuration information that is stored in the OCR. This includes start, stop, monitor and failover operations. This process runs as the root user
Event manager daemon (evmd) —A background process that publishes events that crs creates.
Process Monitor Daemon (OPROCD) —This process monitors the cluster and provide I/O fencing.
OPROCD performs its check, stops running, and if the wake up is beyond the expected time, then OPROCD resets the processor and reboots the node. An OPROCD failure results in Oracle Clusterware restarting the node. OPROCD uses the hang check timer on Linux platforms.
RACG (racgmain, racgimon) —Extends clusterware to support Oracle-specific requirements and complex resources. Runs server callout scripts when FAN events occur.
What are Oracle database background processes specific to RAC?
LMS—Global Cache Service Process
LMD—Global Enqueue Service Daemon
LMON—Global Enqueue Service Monitor
LCK0—Instance Enqueue Process
Oracle RAC instances use two processes, the Global Cache Service (GCS) and the Global Enqueue Service (GES). The GCS and GES maintain records of the statuses of each data file and each cached block using a Global Resource Directory (GRD). The GRD contents are distributed across all of the active instances.
What is Cache Fusion?
Transfer of data across instances through private interconnect is called cachefusion. Oracle RAC is composed of two or more instances. When a block of data is read from datafile by an instance within the cluster and another instance is in need of the same block,it is easy to get the block image from the instance which has the block in its SGA rather than reading from the disk. To enable inter instance communication Oracle RAC makes use of interconnects. The Global Enqueue Service(GES) monitors and Instance enqueue process manages the cahce fusion
What is SCAN? (11gR2 feature)
Single Client Access Name (SCAN) is s a new Oracle Real Application Clusters (RAC) 11g Release 2 feature that provides a single name for clients to access an Oracle Database running in a cluster. The benefit is clients using SCAN do not need to change if you add or remove nodes in the cluster.
SCAN provides a single domain name via (DNS), allowing and-users to address a RAC cluster as-if it were a single IP address. SCAN works by replacing a hostname or IP list with virtual IP addresses (VIP).
Single client access name (SCAN) is meant to facilitate single name for all Oracle clients to connect to the cluster database, irrespective of number of nodes and node location. Until now, we have to keep adding multiple address records in all clients tnsnames.ora, when a new node gets added to or deleted from the cluster.
Single Client Access Name (SCAN) eliminates the need to change TNSNAMES entry when nodes are added to or removed from the Cluster. RAC instances register to SCAN listeners as remote listeners. Oracle recommends assigning 3 addresses to SCAN, which will create 3 SCAN listeners, though the cluster has got dozens of nodes.. SCAN is a domain name registered to at least one and up to three IP addresses, either in DNS (Domain Name Service) or GNS (Grid Naming Service). The SCAN must resolve to at least one address on the public network. For high availability and scalability, Oracle recommends configuring the SCAN to resolve to three addresses.
What are SCAN components in a cluster?
1.SCAN Name
2.SCAN IPs (3)
3.SCAN Listeners (3)
What is FAN?
Fast application Notification as it abbreviates to FAN relates to the events related to instances,services and nodes.This is a notification mechanism that Oracle RAc uses to notify other processes about the configuration and service level information that includes service status changes such as,UP or DOWN events.Applications can respond to FAN events and take immediate action.
What is TAF?
TAF (Transparent Application Failover) is a configuration that allows session fail-over between different nodes of a RAC database cluster.
Transparent Application Failover (TAF). If a communication link failure occurs after a connection is established, the connection fails over to another active node. Any disrupted transactions are rolled back, and session properties and server-side program variables are lost. In some cases, if the statement executing at the time of the failover is a Select statement, that statement may be automatically re-executed on the new connection with the cursor positioned on the row on which it was positioned prior to the failover.
After an Oracle RAC node crashes—usually from a hardware failure—all new application transactions are automatically rerouted to a specified backup node. The challenge in rerouting is to not lose transactions that were “in flight” at the exact moment of the crash. One of the requirements of continuous availability is the ability to restart in-flight application transactions, allowing a failed node to resume processing on another server without interruption. Oracle’s answer to application failover is a new Oracle Net mechanism dubbed Transparent Application Failover. TAF allows the DBA to configure the type and method of failover for each Oracle Net client.
TAF architecture offers the ability to restart transactions at either the transaction (SELECT) or session level.
What are the requirements for Oracle Clusterware?
1. External Shared Disk to store Oracle Cluster ware file (Voting Disk and Oracle Cluster Registry – OCR)
2. Two netwrok cards on each cluster ware node (and three set of IP address) –
Network Card 1 (with IP address set 1) for public network
Network Card 2 (with IP address set 2) for private network (for inter node communication between rac nodes used by clusterware and rac database)
IP address set 3 for Virtual IP (VIP) (used as Virtual IP address for client connection and for connection failover)
3. Storage Option for OCR and Voting Disk – RAW, OCFS2 (Oracle Cluster File System), NFS, …..
Which enable the load balancing of applications in RAC?
Oracle Net Services enable the load balancing of application connections across all of the instances in an Oracle RAC database.
How to find location of OCR file when CRS is down?
If you need to find the location of OCR (Oracle Cluster Registry) but your CRS is down.
When the CRS is down:
Look into “ocr.loc” file, location of this file changes depending on the OS:
On Linux: /etc/oracle/ocr.loc
On Solaris: /var/opt/oracle/ocr.loc
When CRS is UP:
Set ASM environment or CRS environment then run the below command:
ocrcheck
In 2 node RAC, how many NIC’s are r using ?
2 network cards on each clusterware node
Network Card 1 (with IP address set 1) for public network
Network Card 2 (with IP address set 2) for private network (for inter node communication between rac nodes used by clusterware and rac database)
In 2 node RAC, how many IP’s are r using ?
6 – 3 set of IP address
## eth1-Public: 2
## eth0-Private: 2
## VIP: 2
How to find IP’s information in RAC ?
Edit the /etc/hosts file as shown below:
# Do not remove the following line, or various programs
# that requires network functionality will fail.
127.0.0.1 localhost.localdomain localhost
## Public Node names
192.168.10.11 node1-pub.hingu.net node1-pub
192.168.10.22 node2-pub.hingu.net node2-pub
## Private Network (Interconnect)
192.168.0.11 node1-prv node1-prv
192.168.0.22 node2-prv node2-prv
## Private Network (Network Area storage)
192.168.1.11 node1-nas node1-nas
192.168.1.22 node2-nas node2-nas
192.168.1.33 nas-server nas-server
## Virtual IPs
192.168.10.111 node1-vip.hingu.net node1-vip
192.168.10.222 node2-vip.hingu.net node2-vip
What is difference between RAC IP addresses ?
Public IP adress is the normal IP address typically used by DBA and SA to manage storage, system and database. Public IP addresses are reserved for the Internet.
Private IP address is used only for internal clustering processing (Cache Fusion) (aka as interconnect). Private IP addresses are reserved for private networks.
VIP is used by database applications to enable fail over when one cluster node fails. The purpose for having VIP is so client connection can be fail-over to surviving nodes in case there is failure.
Can application developer access the private ip ?
No. private IP address is used only for internal clustering processing (Cache Fusion) ( as interconnect)