
ORACLE 10g RAC OVERVIEW


AGENDA

What is Real Application Clusters (RAC)?

Why use RAC?

Single instance vs RAC

RAC Architecture

Sharing of Resources

Background processes

Internal structures and services

RAC Administration

ORACLE Clusterware

RAC Mechanism

Failover

Load Balancing
WHAT IS RAC?
Real Application Clusters (RAC), introduced in Oracle 9i, is a technology that enables a single
database to be served by multiple instances running simultaneously on different nodes.
A RAC database requires three components:

cluster nodes

Shared storage

Oracle Clusterware
To determine whether an instance is part of a RAC database, the following can be used:

Use the DBMS_UTILITY.IS_CLUSTER_DATABASE function

Show parameter CLUSTER_DATABASE
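
For example, a quick check from SQL*Plus (a minimal sketch; the output text is illustrative):

SET SERVEROUTPUT ON
BEGIN
  IF DBMS_UTILITY.IS_CLUSTER_DATABASE THEN
    DBMS_OUTPUT.PUT_LINE('Running in a RAC (cluster) database');
  ELSE
    DBMS_OUTPUT.PUT_LINE('Running in a single-instance database');
  END IF;
END;
/

SQL>show parameter cluster_database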


WHY USE RAC ?
High Availability
Failover
Reliability
Scalability
Manageability
Recoverability
Transparency
Row Locking
Error Detection
Buffer Cache Management
Continuous Operations
Load Balancing/Sharing
Reduction in total cost of ownership
SINGLE INSTANCE ~ RAC

Difference between Oracle single instance and Oracle RAC:

Single Instance                                   Oracle RAC
-----------------------------------------------   -----------------------------------------------
Single instance accessing the database            Multiple instances accessing the same database
Database can be local or shared                   Database must be on shared storage
Minimum one node required                         Minimum two nodes required
Clusterware software is not required              Clusterware software is required
One SGA and one set of background processes       Each instance has its own SGA and set of
                                                  background processes
One set of redo logs                              Multiple sets of redo logs, one per instance
No Cache Fusion used in single instance           Cache Fusion is used to transfer or access
                                                  blocks from a remote instance
V$ views to monitor and manage the instance       GV$ views for monitoring at the cluster level
Meeting some business requirements                Modern requirements of high availability and
(availability, scalability) is limited to a       linear scalability are provided by multiple
single-instance configuration                     instances sharing a common physical database

RAC ARCHITECTURE:

Public Network:
It is the public IP on which the listeners listen; clients contact the listener on this public IP.
Private Network (Interconnect):
It is a network path that is exclusively used for inter-instance communication and is dedicated
to the server nodes of a cluster. It is used for the synchronization of resources and, in some
cases, for the transfer of data. It has high bandwidth and low latency.
Storage Network:
It is the network which connects the instances to the database storage in RAC.
WHAT'S SHARED, WHAT'S NOT SHARED
SHARED
Disk access
Resources that manage data
All instances have common data & control files
NOT SHARED
1) Each node has its own dedicated:
System memory
Operating system
Database instance
Application software
2) Each instance has individual
Log files and
Rollback segments
BACKGROUND PROCESSES:

1. Global Cache Service Processes (LMSn)
LMSn handles block transfers between the holding instance's buffer cache and the requesting
foreground process on the requesting instance.
LMS maintains read consistency by rolling back any uncommitted transactions for blocks
that are being requested by any remote instance.
The value of n (0-9) varies depending on the amount of messaging traffic among nodes
in the cluster; by default, there is one LMS process per pair of CPUs.
2. Global Enqueue Service Monitor (LMON)
It constantly handles reconfiguration of locks and global resources when a node joins or
leaves the cluster. Its services are also known as Cluster Group Services (CGS).
3. Global Enqueue Service Daemon (LMD)
It manages lock manager service requests for GCS resources and sends them to a service
queue to be handled by the LMSn process. The LMD process also handles global deadlock
detection and remote resource requests (remote resource requests are requests originating
from another instance).
4. Lock Process (LCK)
LCK manages non-cache fusion resource requests such as library and row cache requests
and lock requests that are local to the server. Because the LMS process handles the primary
function of lock management, only a single LCK process exists in each instance.
5. Diagnosability Daemon (DIAG)
This background process monitors the health of the instance and captures diagnostic data
about process failures within instances. The operation of this daemon is automated and
updates an alert log file to record the activity that it performs.
NOTE: In a RAC environment, each instance has its own separate alert log.
6. Global Service Daemon (GSD)
This is a component in RAC that receives requests from the SRVCTL control utility to
execute administrative tasks like startup or shutdown. The command is executed locally on
each node and the results are returned to SRVCTL. The GSD is installed on the nodes by
default.
INTERNAL STRUCTURES AND SERVICES
Global Resource Directory (GRD)
Records current state and owner of each resource
Contains convert and write queues
Distributed across all instances in cluster
Maintained by GCS and GES
Global Cache Services (GCS)
Implements cache coherency for the database
Coordinates access to database blocks for instances
Global Enqueue Services (GES)
Controls access to other resources (locks) including library cache and dictionary cache
Performs deadlock detection
RAC ADMINISTRATION:
Parameter File

Using a shared SPFILE is recommended, although each instance can also have its own
dedicated parameter file.

The SPFILE should be placed on a shareable disk subsystem such as a raw device, a clustered
file system, or Automatic Storage Management (ASM).

Oracle instance parameters for the RAC environment can be grouped into three
major categories: unique, identical, and instance-specific parameters.

If a parameter appears more than once in a parameter file, the last specified value is
the effective value, unless the values are on consecutive lines, in which case the values from
consecutive lines are concatenated.

To set the value for a parameter, the following syntax is used. An asterisk (*) or no
value in place of instance_name means that the parameter value is valid for all
instances.

<instance_name>.<parameter_name>=<parameter_value>

For example: *.undo_management=auto

To set a parameter in the SPFILE, the following command can be used:

alter system set <parameter>=<value> scope=<memory|spfile|both>
  comment='<comments>' deferred sid='<sid|*>'
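
For example, to size the buffer cache differently for a single instance (the parameter and value
here are illustrative; prod1 reuses the instance name from the undo examples below):

SQL>alter system set db_cache_size=512m scope=spfile sid='prod1';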

When DBCA is used to create a RAC database, it creates an SPFILE on the shared
disk subsystem by default. Otherwise, the SPFILE can be created manually.
Undo Management

Oracle stores original values of the data, called the before image, in undo segments to
provide read consistency and to roll back uncommitted transactions.

In the Oracle RAC environment, each instance stores transaction undo data in its
dedicated undo tablespace. For this, undo_tablespace must be set for individual instances as
follows:

prod1.undo_tablespace= undo_tbs1

prod2.undo_tablespace=undo_tbs2

Instances in RAC can use either automatic or manual undo management, but the
parameter undo_management has to be the same across all the instances.
To increase the size of an undo tablespace, either of the following can be used:
1. Add another datafile to the undo tablespace.
2. Increase the size of the existing datafile(s) belonging to the undo tablespace.
While using manual undo management, the following considerations are made:
1. Use manual undo management only if you have a very good reason for not using automatic
undo management.
2. Do not create other objects such as tables, indexes, and so on in the tablespace used for
rollback segments.
3. Create one rollback segment for every four concurrent transactions.
Temporary Tablespace

In a RAC environment, all instances share the same temporary tablespace, with each
instance creating a temporary segment in the tablespace it is using. The size should be at
least equal to the maximum concurrent requirement of all the instances.

The default temporary tablespace cannot be dropped or taken offline; however it can
be changed followed by dropping or taking offline the original default temporary tablespace.

To determine the temporary tablespace usage of each instance, the
GV$SORT_SEGMENT and GV$TEMPSEG_USAGE views can be queried on the INST_ID column.
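
For example, a minimal sketch of such a query:

SQL>select inst_id, tablespace_name, total_blocks, used_blocks
    from gv$sort_segment;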

In a temporary tablespace group, a user will always use the same assigned
temporary tablespace irrespective of the instance being used. An instance can reclaim the
space used by other instances' temporary segments in that tablespace if required for large
sorts.
Online Redo Logs

In a RAC environment, each instance has its own set of online redo log files; an
instance's set of redo logs is called a thread.

The size of an online redo log is independent of other instances' redo log sizes and is
determined by the local instance's workload and backup and recovery considerations.

Each instance has exclusive write access to its own online redo log files; however, it
can read another instance's current online redo log file to perform instance recovery if
needed. Thus, online redo logs need to be located on a shared storage device and cannot
be on a local disk.

Similar to single instance database, views V$LOG and V$LOGFILE can be used in RAC
for getting information about redo logs.
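
For example, to list each instance's redo log thread and status (a minimal sketch):

SQL>select thread#, group#, members, status from v$log order by thread#, group#;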


Archive Logs
Steps to enable or disable archive log mode:
1. Set cluster_database=false for the instance:
SQL>alter system set cluster_database=false scope=spfile sid='prod1';
2. Shut down all the instances accessing the database:
$srvctl stop database -d prod
3. Mount the database using the local instance:
SQL>startup mount
4. Enable (or disable) archiving:
SQL>alter database archivelog;   (use noarchivelog to disable)
5. Change the parameter back to cluster_database=true for the instance prod1:
SQL>alter system set cluster_database=true scope=spfile sid='prod1';
6. Shut down the local instance:
SQL>shutdown;
7. Bring up all the instances:
$srvctl start database -d prod
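
Once the instances are back up, the resulting log mode can be verified from any instance:

SQL>archive log list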
Flashback Area (Enable / Disable)
1. Set cluster_database=false for the instance to perform this operation:
SQL>alter system set cluster_database=false scope=spfile sid='prod1';
2. Set DB_RECOVERY_FILE_DEST_SIZE and DB_RECOVERY_FILE_DEST.
The DB_RECOVERY_FILE_DEST parameter should point to a shareable disk subsystem:
SQL>alter system set db_recovery_file_dest_size=200m scope=spfile;
SQL>alter system set db_recovery_file_dest='/ocfs2/flashback' scope=spfile;
3. Shut down all instances accessing the database:
$srvctl stop database -d prod
4. Mount the database using the local instance:
SQL>startup mount
5. Enable (or disable) flashback by issuing the following command:
SQL>alter database flashback on;   (use off to disable)
6. Change the parameter back to cluster_database=true for the instance prod1:
SQL>alter system set cluster_database=true scope=spfile sid='prod1';
7. Shut down the instance:
SQL>shutdown;
8. Start all the instances:
$srvctl start database -d prod
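
Once the instances are back up, the flashback status can be verified from any instance:

SQL>select flashback_on from v$database;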
SRVCTL Utility

To start all instances associated with a database, the following command can be
executed from any of the nodes. The command also starts the listeners on each node if they
are not already running.
$srvctl start database -d <DB Name>

Similarly, to shut down all instances associated with the database, the stop command
can be used; it does not stop listeners, as they might be serving other database instances
running on the same machine.
$srvctl stop database -d <DB Name>

Options specified by -o are passed directly to SQL*Plus as command-line options
for the start/stop commands.
$srvctl stop database -d <DB Name> -o immediate
$srvctl start database -d <DB Name> -o force

To perform an operation at the individual instance level, the -i option can be used.
$srvctl stop instance -d <DB Name> -i <instance_name>
ORACLE CLUSTERWARE

Oracle Clusterware is a cluster manager integrated into the database to handle the
cluster, including node membership, group services, global resource management, and high
availability functions. It can also be used with non-cluster databases.

Names as the product has evolved:
1. Oracle Cluster Management Services (OCMS) - 9.0.1 and 9.2
2. Cluster Ready Services (CRS), a generic, portable cluster manager - 10.1
3. Oracle Clusterware (CRS renamed) - 10.2

Background processes:
1. Cluster Synchronization Service (CSS)
2. Cluster Ready Services (CRS)
3. Event Manager (EVM)


To administer Clusterware, the CRSCTL utility can be used; it is located in
$ORA_CRS_HOME/bin.

Oracle Clusterware must be installed prior to installing the Oracle database. The root
user is needed during the installation process to perform various tasks requiring superuser
privileges.

Being the first Oracle software to be installed on the system, it is susceptible to
configuration errors, so it is recommended to use the Cluster Verification Utility (CLUVFY)
on all nodes; CLUVFY was introduced in Oracle 10.2 but is backward compatible with 10.1.
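
For example, a pre-installation check across the intended cluster nodes (node names here are
hypothetical):

$cluvfy stage -pre crsinst -n node1,node2 -verbose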

Oracle Clusterware requires two files that must be located on shared storage for its
operation.
1. Oracle Cluster Registry (OCR)
2. Voting Disk
Oracle Cluster Registry (OCR)
1) Located on shared storage; in Oracle 10.2 and above it can be mirrored to a maximum of
two copies.
2) Defines cluster resources, including:
- Databases and instances (RDBMS and ASM)
- Services and node applications (VIP, ONS, GSD)
- Listener processes
Voting Disk (Quorum Disk / File in Oracle 9i)
1) Used to determine RAC instance membership and is located on shared storage accessible
to all instances.
2) Used to determine which instance takes control of the cluster in case of node failure, to
avoid split brain.
3) In Oracle 10.2 and above it can be mirrored, but only to an odd number of copies (1, 3, 5, etc.)
VIRTUAL IP (VIP)

To make applications highly available and to eliminate single points of failure (SPOF),
Oracle 10g introduced a new feature called cluster VIPs: a virtual IP address, distinct from the
set of in-cluster IP addresses, that is used by the outside world to connect to the database.

A VIP name and address must be registered in the DNS along with the standard static IP
information. Listeners are configured to listen on VIPs instead of the public IP.

When a node is down, the VIP is automatically failed over to one of the other nodes.
The node that gets the VIP re-ARPs to the world, advertising the new MAC address of the
VIP. Clients are sent an error message immediately rather than waiting for the TCP timeout
value.
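
To see where the node applications, including the VIP, are currently running, srvctl can be
used (the node name and output below are illustrative):

$srvctl status nodeapps -n node1
VIP is running on node: node1
GSD is running on node: node1
Listener is running on node: node1
ONS daemon is running on node: node1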


CACHE FUSION:

1) The underlying technology that enables RAC (starting with 9i, with improved performance
in 10g).
2) A protocol that allows instances to combine their data caches into a shared global cache.
3) Allows any node to get the most up-to-date data from the cache of any other node in the
cluster without having to access the disk drives again.
4) Needed when a dirty block of data is created:
- Data from disk is read into memory on a node
- Data is updated on that node
- Data hasn't been written to disk yet
- Another node requests the data

WHAT IS FAILOVER?
1) If a node in the shared disk cluster fails, the system dynamically redistributes the
workload among the surviving cluster nodes.
2) RAC checks to detect node and network failures. A disk-based heartbeat mechanism
uses the control file to monitor node membership, and the cluster interconnect is regularly
checked to determine correct operation.

3) Enhanced failover reliability in 10g with the use of Virtual IP addresses (VIPs)
4) If one node or instance fails, the node detecting the failure does the following:
- Reads the redo log of the failed instance from the last checkpoint
- Applies redo to the datafiles, including undo segments (roll forward)
- Rolls back uncommitted transactions
The cluster is frozen during part of this process.
FAST APPLICATION NOTIFICATION (FAN)
1) FAN is a method, introduced in Oracle 10.1, by which applications can be informed of
changes in cluster status, enabling:
- Fast node failure detection
- Workload balancing
2) Advantageous by preventing applications from:
- Waiting for TCP/IP timeouts when a node fails
- Trying to connect to a database service that is currently down
- Processing data received from a failed node
3) Applications can be notified using:
- Server-side callouts
- Fast Connection Failover (FCF)
- The ONS API
ORACLE NOTIFICATION SERVICE (ONS)
1) ONS, introduced in Oracle 10.1, is a publish/subscribe service used by Oracle Clusterware
to propagate messages to:
- Nodes in the cluster
- Middle-tier application servers
- Clients
2) ONS is automatically launched for RAC on each node as part of the Oracle Clusterware
installation process. However, it can also be configured to run on nodes hosting client or
mid-tier applications.
3) It is the underlying mechanism for Fast Application Notification (FAN).

4) ONS runs independently of Transparent Application Failover (TAF).
TRANSPARENT APPLICATION FAILOVER (TAF)
1) TAF is a client-side feature that allows clients to reconnect to surviving instances in
the event of a failure of a database instance.
2) Masks failures from end users; they don't need to log back into the system.
3) Applications and users are transparently reconnected to another node.
4) Applications and queries continue uninterrupted.
5) Transactions can fail over and replay.
6) Login context is maintained.
7) In-flight DML transactions are rolled back.
8) Requires configuration in TNSNAMES.ORA:
RAC_FAILOVER =
(DESCRIPTION =
(ADDRESS_LIST =
(FAILOVER = ON)
(ADDRESS = (PROTOCOL = TCP)(HOST = node1)(PORT = 1521))
(ADDRESS = (PROTOCOL = TCP)(HOST = node2)(PORT = 1521))
)
(CONNECT_DATA =
(SERVICE_NAME = RAC)
(SERVER = DEDICATED) (FAILOVER_MODE =(TYPE=SELECT)
(METHOD=BASIC)(RETRIES=30)(DELAY=5))
) )
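
Whether connected sessions have actually failed over can be checked from the surviving
instances (a minimal sketch):

SQL>select inst_id, username, failover_type, failover_method, failed_over
    from gv$session where username is not null;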

What is FAN?
Fast Application Notification (FAN) relates to events concerning instances, services,
and nodes. It is a notification mechanism that Oracle RAC uses to notify other processes
about configuration and service-level information, including service status changes such as
UP or DOWN events. Applications can respond to FAN events and take immediate action.
Where can we apply FAN UP and DOWN events?
FAN UP and FAN DOWN events can be applied to instances, services and nodes.
State the use of FAN events in case of a cluster configuration change?

During cluster configuration changes, the Oracle RAC high availability framework
publishes a FAN event immediately when a state change occurs in the cluster, so
applications can receive FAN events and react immediately. This prevents applications from
polling the database and detecting a problem only after such a state change.
Why should we have separate homes for the ASM instance?
It is good practice to keep the ASM home separate from the database home
(ORACLE_HOME). This helps in upgrading and patching ASM and the Oracle database
software independently of each other. Also, the Oracle database software can be deinstalled
independently of the ASM instance.
What is the advantage of using ASM?
ASM is the Oracle-recommended storage option for RAC databases, as ASM maximizes
performance by managing the storage configuration across the disks. ASM does this by
distributing the database files across all of the available storage within the cluster database
environment.
What is rolling upgrade?
It is a new ASM feature in Database 11g. ASM instances in Oracle Database 11g
(from release 11.1) can be upgraded or patched using the rolling upgrade feature.
This enables us to patch or upgrade ASM nodes in a clustered environment without
affecting database availability. During a rolling upgrade, a functional cluster can be
maintained while one or more of the nodes in the cluster are running different software
versions.
Can rolling upgrade be used to upgrade from a 10g to an 11g database?
No, it can be used only for Oracle Database 11g releases (from 11.1).
State the initialization parameters that must have the same value for every instance
in an Oracle RAC database:
Some initialization parameters are critical at database creation time and must have the
same values. Their value must be specified in the SPFILE or PFILE for every instance.
The parameters that must be identical on every instance are listed below:
ACTIVE_INSTANCE_COUNT
ARCHIVE_LAG_TARGET
COMPATIBLE
CLUSTER_DATABASE
CLUSTER_DATABASE_INSTANCES
CONTROL_FILES
DB_BLOCK_SIZE
DB_DOMAIN
DB_FILES
DB_NAME
DB_RECOVERY_FILE_DEST
DB_RECOVERY_FILE_DEST_SIZE
DB_UNIQUE_NAME
INSTANCE_TYPE (RDBMS or ASM)
PARALLEL_MAX_SERVERS
REMOTE_LOGIN_PASSWORD_FILE
UNDO_MANAGEMENT
Can DML_LOCKS and RESULT_CACHE_MAX_SIZE be identical on all instances?
These parameters need to be identical on all instances only if their values are set to
zero.
What two parameters must be set at the time of starting up an ASM instance in a
RAC environment?
The parameters CLUSTER_DATABASE and INSTANCE_TYPE must be set.
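
A minimal sketch of how these might appear in an ASM instance parameter file (illustrative,
not a complete pfile):

*.instance_type=asm
*.cluster_database=true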
Mention the components of Oracle Clusterware:
Oracle Clusterware is made up of components such as the voting disk and the Oracle
Cluster Registry (OCR).
What is a CRS resource?
Oracle Clusterware is used to manage high-availability operations in a cluster. Anything that
Oracle Clusterware manages is known as a CRS resource. Some examples of CRS resources
are a database, an instance, a service, a listener, a VIP address, an application process, etc.
What is the use of OCR?
Oracle Clusterware manages CRS resources based on the configuration information of CRS
resources stored in the OCR (Oracle Cluster Registry).
How does Oracle Clusterware manage CRS resources?
Oracle Clusterware manages CRS resources based on the configuration information stored
in the OCR (Oracle Cluster Registry).
Name some Oracle Clusterware tools and their uses:
OIFCFG - allocates and deallocates network interfaces (and can identify the interconnect
being used)
OCRCONFIG - command-line tool for managing the Oracle Cluster Registry
OCRDUMP - dumps the contents of the OCR to a text file
CVU - Cluster Verification Utility, used to verify the cluster setup and the status of CRS
components
What are the modes of deleting instances from Oracle Real Application
Clusters databases?
We can delete instances using silent mode or interactive mode with
DBCA (Database Configuration Assistant).
How do we remove ASM from an Oracle RAC environment?
We need to stop and delete the instance on the node first, in interactive or silent mode.
After that, ASM can be removed using the srvctl tool as follows:
srvctl stop asm -n node_name
srvctl remove asm -n node_name
We can verify if ASM has been removed by issuing the following command:
srvctl config asm -n node_name
How do we verify that an instance has been removed from OCR after deleting
an instance?
Issue the following srvctl command:
srvctl config database -d database_name
cd CRS_HOME/bin
./crs_stat
How do we verify an existing current backup of OCR?
We can verify the current backup of OCR using the following command:
ocrconfig -showbackup
What are the performance views in an Oracle RAC environment?
We have V$ views that are instance specific. In addition, we have GV$ views, called global
views, that have an INST_ID column of numeric data type. GV$ views obtain information
from the individual V$ views.
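
For example, to see how sessions are spread across instances (a minimal sketch):

SQL>select inst_id, count(*) from gv$session group by inst_id order by inst_id;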
What are the types of connection load balancing?
There are two types of connection load balancing: server-side load balancing and
client-side load balancing.
What is the difference between server-side and client-side connection
load balancing?
With client-side load balancing, the client selects randomly from the addresses in the
address list of the connect descriptor. With server-side load balancing, the listener uses a
load-balancing advisory to redirect connections to the instance providing the best service.
Give the usage of srvctl:
srvctl start instance -d db_name -i inst_name_list [-o start_options]
srvctl stop instance -d db_name -i inst_name_list [-o stop_options]
srvctl stop instance -d orcl -i orcl3,orcl4 -o immediate
srvctl start database -d db_name [-o start_options]
srvctl stop database -d db_name [-o stop_options]
srvctl start database -d orcl -o mount

SINGLE CLIENT ACCESS NAME (SCAN) ARCHITECTURE


Single Client Access Name (SCAN) is a new Oracle Real Application Clusters (RAC) 11g
Release 2 feature that provides a single name for clients to access Oracle databases running
in a cluster. The benefit is that the client's connect information does not need to change if
you add or remove nodes in the cluster. Having a single name to access the cluster allows
clients to use the EZConnect client and the simple JDBC thin URL to access any database
running in the cluster, independent of which server(s) in the cluster the database is active
on. SCAN provides load balancing and failover for client connections to the database. The
SCAN works as a cluster alias for databases in the cluster.
NETWORK REQUIREMENTS FOR USING SCAN
The SCAN is configured during the installation of Oracle Grid Infrastructure, which is
distributed with Oracle Database 11g Release 2. Oracle Grid Infrastructure is a single Oracle
home that contains Oracle Clusterware and Oracle Automatic Storage Management. You
must install Oracle Grid Infrastructure first in order to use Oracle RAC 11g Release 2. During
the interview phase of the Oracle Grid Infrastructure installation, you will be prompted to
provide a SCAN name. There are two options for defining the SCAN:
1. Define the SCAN in your corporate DNS (Domain Name Service)
2. Use the Grid Naming Service (GNS)
USING OPTION 1 DEFINE THE SCAN IN YOUR CORPORATE DNS
If you choose Option 1, you must ask your network administrator to create a single name
that resolves to 3 IP addresses using a round-robin algorithm. Three IP addresses are
recommended considering load balancing and high availability requirements, regardless of
the number of servers in the cluster. The IP addresses must be on the same subnet as your
public network in the cluster. The name must be 15 characters or less in length, not
including the domain, and must be resolvable without the domain suffix (for example:
sales1-scan must be resolvable, as opposed to sales1-scan.example.com). The IPs must
not be assigned to a network interface (on the cluster), since Oracle Clusterware will take
care of them.
You can check the SCAN configuration in DNS using nslookup. If your DNS is set up to
provide round-robin access to the IPs resolved by the SCAN entry, then run the nslookup
command at least twice to see the round-robin algorithm work. The result should be that
each time, the nslookup would return a set of 3 IPs in a different order.
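
For example (using the SCAN name from this paper; the returned addresses are illustrative):

$nslookup sales1-scan.example.com
Name:    sales1-scan.example.com
Address: 133.22.67.192
Address: 133.22.67.193
Address: 133.22.67.194

Running the same command again should return the three addresses in a different order.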
Note: If your DNS server does not return a set of 3 IPs or does not round-robin, ask your
network administrator to enable such a setup. DNS using a round-robin algorithm on its own
does not ensure failover of connections. However, the Oracle client typically handles this. It
is therefore recommended that the minimum version of the client used is the Oracle
Database 11g Release 2 client.
USING OPTION 2 THE GRID NAMING SERVICE (GNS)
If you choose Option 2, you only need to enter the SCAN during the interview. During the
cluster configuration, three IP addresses will be acquired from a DHCP service (using GNS
assumes you have a DHCP service available on your public network) to create the SCAN,
and name resolution for the SCAN will be provided by the GNS.
IF YOU DO NOT HAVE A DNS SERVER AVAILABLE AT INSTALLATION TIME
Oracle Universal Installer (OUI) enforces providing a SCAN resolution during the Oracle Grid
Infrastructure installation, since the SCAN concept is an essential part of the creation of
Oracle RAC 11g Release 2 databases in the cluster. All Oracle Database 11g Release 2 tools
used to create a database (e.g. the Database Configuration Assistant (DBCA), or the
Network Configuration Assistant (NetCA)) assume its presence. Hence, OUI will not let you
continue with the installation until you have provided a suitable SCAN resolution.
However, in order to overcome the installation requirement without setting up a DNS-based
SCAN resolution, you can use a hosts-file based workaround. In this case, you would use a
typical hosts-file entry to resolve the SCAN to only one IP address, and one IP address only.
It is not possible to simulate the round-robin resolution that the DNS server provides using a
local hosts file; the hosts-file lookup the OS performs will only return the first IP address
that matches the name. Nor can this be done in a single entry (one line in the hosts file).
Thus, you will create only one SCAN for the cluster. (Note that you will have to change the
hosts file on all nodes in the cluster for this purpose.) This workaround might also be used
when performing an upgrade from former (pre-Oracle Database 11g Release 2) releases.
However, it is strongly recommended to enable the SCAN configuration as described under
Option 1 or Option 2 above shortly after the upgrade or the initial installation. In order
to make the cluster aware of the modified SCAN configuration, delete the entry in the hosts
file and then issue: srvctl modify scan -n <scan_name> as the root user on one node
in the cluster. The scan_name provided can be the existing fully qualified name (or a new
name), but should be resolved through DNS, having 3 IPs associated with it, as discussed.
The remaining reconfiguration is then performed automatically.
SCAN CONFIGURATION IN THE CLUSTER
During cluster configuration, several resources are created in the cluster for SCAN. For each
of the 3 IP addresses that the SCAN resolves to, a SCAN VIP resource is created and a SCAN
Listener is created. The SCAN Listener is dependent on the SCAN VIP and the 3 SCAN VIPs
(along with their associated listeners) will be dispersed across the cluster. This means, each
pair of resources (SCAN VIP + Listener) will be started on a different server in the cluster,
assuming the cluster consists of three or more nodes.
If a two-node cluster is used (for which 3 IPs are still recommended for simplification
reasons), one server in the cluster will host two sets of SCAN resources under normal
operations. If the node where a SCAN VIP is running fails, the SCAN VIP and its associated
listener will fail over to another node in the cluster. If, by means of such a failure, the number
of available servers in the cluster becomes less than three, one server will again host two
sets of SCAN resources. If a node becomes available in the cluster again, the formerly
mentioned dispersion will take effect and relocate one set accordingly.
DATABASE CONFIGURATION USING SCAN
For Oracle Database 11g Release 2, SCAN is an essential part of the configuration and
therefore the REMOTE_LISTENER parameter is set to the SCAN per default, assuming that
the database is created using standard Oracle tools (e.g. the formerly mentioned DBCA).
This allows the instances to register with the SCAN Listeners as remote listeners to provide
information on what services are being provided by the instance, the current load, and a
recommendation on how many incoming connections should be directed to the instance. In
this context, the LOCAL_LISTENER parameter must be considered. The LOCAL_LISTENER
parameter should be set to the node-VIP. If you need fully qualified domain names, ensure
that LOCAL_LISTENER is set to the fully qualified domain name (e.g.
node-VIP.example.com). By default, a node listener is created on each node in the cluster
during cluster configuration. With Oracle Grid Infrastructure 11g Release 2, the node
listeners run out of the Oracle Grid Infrastructure home and listen on the node-VIP using the
specified port (the default port is 1521). Unlike in former database versions, it is not
recommended to set your REMOTE_LISTENER parameter to a server-side TNSNAMES alias
that resolves the host to the SCAN (HOST=sales1-scan for example) in the address list
entry; instead, use the simplified SCAN:port syntax as shown below.
[oracle@mynode] srvctl config scan_listener
SCAN Listener LISTENER_SCAN1 exists. Port: TCP:1521
SCAN Listener LISTENER_SCAN2 exists. Port: TCP:1521
SCAN Listener LISTENER_SCAN3 exists. Port: TCP:1521
[oracle@mynode] srvctl config scan

SCAN name: sales1-scan, Network: 1/133.22.67.0/255.255.255.0/
SCAN VIP name: scan1, IP: /sales1-scan.example.com/133.22.67.192
SCAN VIP name: scan2, IP: /sales1-scan.example.com/133.22.67.193
SCAN VIP name: scan3, IP: /sales1-scan.example.com/133.22.67.194
SQL> show parameters listener

NAME              TYPE     VALUE
----------------  -------  ----------------------------------------------------------
local_listener    string   (DESCRIPTION=(ADDRESS_LIST=
                           (ADDRESS=(PROTOCOL=TCP)(HOST=133.22.67.111)(PORT=1521))))
remote_listener   string   sales1-scan.example.com:1521

Note: if you are using the easy connect naming method, you may need to modify your
SQLNET.ORA to ensure that EZCONNECT is in the list when specifying the order of the
naming methods used for the client name resolution lookups (the Oracle 11g Release 2
default is NAMES.DIRECTORY_PATH=(tnsnames, ldap, ezconnect))
HOW CONNECTION LOAD BALANCING WORKS USING SCAN
For clients connecting using Oracle SQL*Net 11g Release 2, three IP addresses will be
received by the client by resolving the SCAN name through DNS as discussed. The client will
then go through the list it receives from the DNS and try connecting through one of the IPs
received. If the client receives an error, it will try the other addresses before returning an
error to the user or application. This is similar to how client connection failover works in
previous releases when an address list is provided in the client connection string.
When a SCAN Listener receives a connection request, the SCAN Listener will check for the
least loaded instance providing the requested service. It will then re-direct the connection
request to the local listener on the node where the least loaded instance is running.
Subsequently, the client will be given the address of the local listener. The local listener will
finally create the connection to the database instance.
Note: This example assumes an Oracle 11g R2 client using a default TNSNAMES.ORA:
ORCLservice =
(DESCRIPTION =
(ADDRESS = (PROTOCOL = TCP)(HOST = sales1-scan.example.com)(PORT = 1521))
(CONNECT_DATA =
(SERVER = DEDICATED)
(SERVICE_NAME = MyORCLservice)
))
VERSION AND BACKWARD COMPATIBILITY
The successful use of SCAN to connect to an Oracle RAC database in the cluster depends on
the ability of the client to understand and use the SCAN as well as on the correct
configuration of the REMOTE_LISTENER parameter setting in the database. If the version of
the Oracle Client connecting to the database as well as the Oracle Database version used
are both Oracle Database 11g Release 2 and the default configuration is used as described
in this paper, no changes to the system are typically required.
The same holds true if the Oracle client version and the version of the Oracle database that
the client is connecting to are both pre-11g Release 2 (e.g. Oracle Database 11g
Release 1 or Oracle Database 10g Release 2, or older). In this case, the pre-11g Release 2
client would use a TNS connect descriptor that resolves to the node-VIPs of the cluster,
while the pre-11g Release 2 database would still use a REMOTE_LISTENER entry
pointing to the node-VIPs. The disadvantage of this configuration is that SCAN would not be
used, and hence the clients are still exposed to changes every time the cluster changes in
the backend. The same applies if an Oracle Database 11g Release 2 is used but the clients
remain on a former version. The solution is to change the Oracle client and/or Oracle
database REMOTE_LISTENER settings accordingly. The following case needs to be
considered:
Note: If using a pre-11g Release 2 client (Oracle Database 11g Release 1 or Oracle Database
10g Release 2, or older) you will not fully benefit from the advantages of SCAN. Reason:
the Oracle client will not be able to handle a set of three IPs returned by the DNS for SCAN.
Hence, it will try to connect only to the first address returned in the list and will more or less
ignore the others. If the SCAN listener listening on this specific IP is not available, or the IP
itself is not available, the connection will fail. In order to ensure load balancing and
connection failover with pre-11g Release 2 clients, you will need to change the
TNSNAMES.ora of the client so that it uses 3 address lines, where each address line
resolves to one of the SCAN VIPs.
Sample TNSNAMES.ora for Oracle Database pre- 11g Release 2 Clients
sales.example.com =(DESCRIPTION=
(ADDRESS_LIST= (LOAD_BALANCE=on)(FAILOVER=ON)
(ADDRESS=(PROTOCOL=tcp)(HOST=133.22.67.192)(PORT=1521))
(ADDRESS=(PROTOCOL=tcp)(HOST=133.22.67.193)(PORT=1521))
(ADDRESS=(PROTOCOL=tcp)(HOST=133.22.67.194)(PORT=1521)))
(CONNECT_DATA=(SERVICE_NAME= salesservice.example.com)))
USING SCAN IN A MAXIMUM AVAILABILITY ARCHITECTURE ENVIRONMENT
If you have implemented a Maximum Availability Architecture (MAA) environment, in which
you use Oracle RAC for both your primary and standby database (in both, your primary and
standby site), which are synchronized using Oracle Data Guard, using SCAN provides a
simplified TNSNAMES configuration that a client can use to connect to the database
independently of whether the primary or standby database is the currently active (primary)
database. In order to use this simplified configuration, Oracle Database 11g Release 2
introduces two new SQL*Net parameters that can be used in the connection strings of
individual clients. The first parameter is CONNECT_TIMEOUT. It specifies the timeout
duration (in seconds) for a client to establish an Oracle Net connection to an Oracle
database. This parameter overrides SQLNET.OUTBOUND_CONNECT_TIMEOUT in the
SQLNET.ORA. The second parameter is RETRY_COUNT, and it specifies the number of times
an ADDRESS_LIST is traversed before the connection attempt is terminated. Using these
two parameters, both the SCAN on the primary site and the SCAN on the standby site can
be used in the client connection strings. Even if the randomly selected address points to the
site that is not currently active, the timeout will allow the connection request to fail over
before the client has waited unreasonably long (the default timeout, depending on the
operating system, can be as long as 10 minutes).
sales.example.com =(DESCRIPTION= (CONNECT_TIMEOUT=10)(RETRY_COUNT=3)
(ADDRESS_LIST= (LOAD_BALANCE=on)(FAILOVER=ON)
(ADDRESS=(PROTOCOL=tcp)(HOST=sales1-scan)(PORT=1521))
(ADDRESS=(PROTOCOL=tcp)(HOST=sales2-scan)(PORT=1521)))
(CONNECT_DATA=(SERVICE_NAME= salesservice.example.com)))
USING SCAN WITH ORACLE CONNECTION MANAGER
If you use Oracle Connection Manager (CMAN) with your Oracle RAC database, the
REMOTE_LISTENER parameter for the Oracle RAC instances should include the CMAN
server, so that the CMAN server receives load-balancing-related information and can
therefore load balance connections across the available instances. The easiest way to
achieve this is to add the CMAN server as an entry to the REMOTE_LISTENER of the
databases that clients want to connect to via CMAN, as shown below. Note also that you
will have to remove the SCAN from the TNSNAMES connect descriptor of the clients, and
further configuration will be required for the CMAN server. See the CMAN documentation for
more details.
SQL> show parameters listener

NAME               TYPE     VALUE
-----------------  -------  ------------------------------------------------
listener_networks  string
local_listener     string   (DESCRIPTION=(ADDRESS_LIST=
                            (ADDRESS=(PROTOCOL=TCP)
                            (HOST=148.87.58.109)(PORT=1521))))
remote_listener    string   stscan3.oracle.com:1521,(DESCRIPTION=
                            (ADDRESS_LIST=(ADDRESS=(PROTOCOL=TCP)
                            (HOST=CMANserver)(PORT=1521))))

Virtual IP (VIP) in RAC


How is a new connection established in Oracle RAC?
For a failover configuration, the physical IP of the host name must be configured in the
listener configuration. The listener process accepts new connection requests and hands the
user process over to a server process or dispatcher process in Oracle.
This means a new connection is established by Oracle using the listener. Once the connection
is established, there is no further need for the listener process. If a new connection tries to
get a session in the database while the listener is down, the user process gets an error
message and the connection fails, whether the listener is down on the same host or there is
some other problem. But in an Oracle RAC environment the database is shared: the Oracle
RAC database is shared by all connected nodes, which means more than one listener is
running on the various nodes.
In an Oracle RAC database, if a user process tries to get a connection through some listener
and finds that the listener or node is down, Oracle RAC automatically transfers this request
to another listener on another node. Up to Oracle 9i, physical IP addresses were used in the
listener configuration, meaning that if the requested connection failed, it was diverted to
another surviving node using the physical IP address of that node. But during this automatic
transfer, the connection has to wait for the node-down or listener-down error message,
which is only raised after the TCP/IP connection timeout expires. Only once that error
message is received does Oracle RAC divert the new connection request to another
surviving node.
Using physical IP addresses there is thus a large gap, waiting for the TCP/IP timeout, before
failover can be initiated. The session must wait out this timeout, so the high availability of
Oracle RAC is undermined by this time-wasting error detection.
Why is a VIP (virtual IP) needed in Oracle RAC?
From Oracle 10g, a virtual IP is used to configure the listener. Using virtual IPs we avoid the
TCP/IP timeout problem, because the Oracle Notification Service maintains communication
between the nodes and listeners. Once ONS finds that a listener or node is down, it notifies
the other nodes and listeners of the situation. While a new connection is trying to reach the
failed node or listener, the virtual IP of the failed node is automatically diverted to a
surviving node, and the session is established on that surviving node. This process does not
wait for a TCP/IP timeout event, so a new connection gets a faster session establishment on
the surviving nodes/listeners.
Characteristics of the virtual IP in Oracle RAC:
A virtual IP (VIP) provides fast connection establishment during failover. We can still use
physical IP addresses for the listener in Oracle 10g if failover timing is not a concern, and
the default TCP/IP timeout can be reduced using operating system utilities or commands.
But taking advantage of VIPs in an Oracle 10g RAC database is advisable. A utility called
VIPCA is provided to configure virtual IPs in a RAC environment; its default path is
$ORA_CRS_HOME/bin, and it is executed during the installation of Oracle RAC.
Advantage of virtual IP deployment in Oracle RAC:
Using the VIP configuration, clients can get a connection quickly even when a connection
request fails over to another node, because the VIP is reassigned to a surviving node
quickly, rather than waiting for the old-fashioned TNS timeout.
Disadvantage of virtual IP deployment in Oracle RAC:
Some additional configuration is needed in the system to assign virtual IP addresses to the
nodes, for example in /etc/hosts and elsewhere. Some misunderstanding or confusion may
occur due to multiple IPs being assigned to the same node.
Important for VIP configuration:
The VIPs should be registered in the DNS. The VIP addresses must be on the same subnet
as the public host network addresses. Each virtual IP (VIP) configured requires an unused
and resolvable IP address.
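
As an illustration, a typical /etc/hosts layout for a two-node cluster might look like this (all
host names and addresses below are hypothetical):

# Public IPs
192.168.1.101  node1      node1.example.com
192.168.1.102  node2      node2.example.com
# Virtual IPs (same subnet as the public network)
192.168.1.111  node1-vip  node1-vip.example.com
192.168.1.112  node2-vip  node2-vip.example.com
# Private interconnect
10.0.0.1       node1-priv
10.0.0.2       node2-priv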

Split Brain Syndrome and I/O Fencing (RAC).


Split brain syndrome occurs when the instances in a RAC fail to connect or ping each other
via the private interconnect, although the servers are physically up and running and the
database instances on these servers are also running. The individual nodes are running fine
and can accept user connections and work independently.
So, in a two-node situation, both instances will think that the other instance is down
because of the lack of connection.
The problem which can arise out of this situation is that the same block might get read and
updated in these individual instances, causing data integrity issues, because a block
changed in one instance will not be locked and could be overwritten by another instance.
So, when a node fails, the failed node is prevented from accessing all the shared disk
devices and groups. This methodology is called I/O fencing, disk fencing, or failure fencing.
In a RAC environment, the node which first detects the other node as unreachable will evict
it from the cluster to avoid data corruption.

Oracle RAC Q&A -I


1. Why does node eviction happen in Oracle RAC?
Oracle Clusterware evicts a node when any of the following conditions occur:
- The node is not pinging via the network heartbeat
- The node is not pinging the voting disk
- The node is hung or busy and is unable to perform the above two tasks
In most cases the cause of the error is written to disk. If no error is written, follow Metalink
note ID 559365.1 to use the diagwait option, which gives the node 10 extra seconds to write
its logs to the error log file:
#crsctl set css diagwait 13 -force
#crsctl get css diagwait
#crsctl check crs
#crsctl unset css diagwait -f

1a. What is misscount (MC) in Oracle RAC?

The Cluster Synchronization Service (CSS) in RAC has a misscount parameter. This value
represents the maximum time, in seconds, that a network heartbeat can be missed before
the cluster enters a reconfiguration to evict the node. The default value is 30 seconds (on
Linux: 60 seconds in 10g, 30 seconds in 11g).
2. What is the use of the CSS heartbeat mechanism in Oracle RAC?
The CSS of Oracle Clusterware maintains two heartbeat mechanisms:
1. The disk heartbeat to the voting device, and
2. The network heartbeat across the interconnect (these establish and confirm valid node
membership in the cluster).
Both of these heartbeat mechanisms have an associated timeout value. The disk heartbeat
has an internal I/O timeout interval (DTO, Disk TimeOut), in seconds, within which an I/O to
the voting disk must complete. The misscount parameter (MC), as stated above, is the
maximum time, in seconds, that a network heartbeat can be missed. The disk heartbeat I/O
timeout interval is directly related to the misscount setting: Disk TimeOut (DTO) =
Misscount (MC) - 15 seconds (some versions differ).
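For example, with the default misscount of 30 seconds, DTO = 30 - 15 = 15 seconds, so a
voting disk I/O taking longer than about 15 seconds can trigger an eviction.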

3. What happens if latencies to voting disks are longer?
If I/O latencies to the voting disk are greater than the default Disk TimeOut (DTO), the
cluster may experience CSS node evictions.
4. What is CSS misscount?
The CSS misscount represents the maximum number of seconds the network heartbeat can
be missed before the cluster enters reconfiguration and evicts the node. The default CSS
misscount is 30 seconds (only for 10g on Linux it is 60 seconds).
4a. How to change the CSS misscount default value?
1) Shut down CRS on all but one node. For exact steps use Note 309542.1.
2) Execute crsctl as root to modify the misscount:
$ORA_CRS_HOME/bin/crsctl set css misscount <n>
where <n> is the maximum I/O latency to the voting disk + 1 second
3) Reboot the node where the adjustment was made.
4) Start all other nodes shut down in step 1.
5. How to start and stop CRS?
Note: Typically, Oracle Clusterware starts up automatically during system startup.
10gR1 and R2:
cd /etc/init.d
init.crs stop
init.crs start
To enable or disable CRS startup during the next reboot (this does not bring down a running
CRS):
init.crs enable
init.crs disable
10gR2 and higher versions only:
Start Oracle Clusterware:
crsctl start crs
Stop Oracle Clusterware:
crsctl stop crs
6. How to move regular DB to an ASM disk group ?
The following are the steps involved in moving regular db files to ASM disk
group.

Assume:
1. Oracle RAC instance is up already
2. DB name to be moved PROD
3. RAC db and normal DB both are in same instance.
1. Install and bring up Oracle RAC instance and ASM disk group.
2. Comment out the control file location in the DB you want to move and add the ASM
disk group name for control_files.
ex. control_files='+DATA_GRP'
3. SQL> startup nomount
SQL> Show parameter => Control_files will show new disk grp
4. Use RMAN to move control file from regular disk to ASM using restore
command.
rman
rman> connect target
rman> restore controlfile from '/u01/oracle/PROD/cntrl01.ctl';
5. Verify using asmcmd
asmcmd> cd DSK_GRP/DATA_GRP/PROD
asmcmd> ls => you can see new controlfile under PROD directory.
6. Now mount the DB
sqlplus "/as sysdba"
sql> alter database mount;
7. Now use RMAN to move the data files.
rman
connect target (connected to PROD)
rman> backup as copy database format '+DATA_GRP';
Note: you can use asmcmd to monitor the data file movements to ASM.
8. rman> switch database to copy;
9. sqlplus "/as sysdba" ; alter database open;
10. select * from v$datafile;
select * from v$tempfile;
select * from v$controlfile;
select * from v$logfile;
11. sql> alter database drop logfile '/u01/.../redo01.log';
alter database add logfile '+DATA_GRP';

Note: Repeat same step for all log files except current used logfile.
select * from v$log to find which one is current
12. alter system switch logfile;
drop the first one which was being used.
13. Now edit init.ora and put the full path of the control file for the DB to start
properly.
*.control_files='+DATA_GRP/PROD/controlfile/current.333.433.3333'
14. Edit init.ora to change the location of the archive logs to ASM:
*.log_archive_dest_1='LOCATION=+DATA_GRP/PROD'  (if you omit PROD it will not work
properly)
15. alter system switch logfile;  => now the new archive logs will go to ASM.
16. END
3. What is a NIC card and an HBA card?
Oracle RAC requires a NIC or HBA card, which enables the computer to talk to the network
or to a storage subsystem.
There are different speeds of HBA card: 1 Gbit/s, 2 Gbit/s, 4, 8, 10, 20 Gbit/s.
An HBA has a unique World Wide Name (WWN),
which is similar to an Ethernet MAC address in that it uses an Organizationally
Unique Identifier (OUI) assigned by the IEEE.
4. What is TPS (transactions per second)?
See: http://www.dba-oracle.com/m_transactions_per_second.htm
==
5. What is the use of crs_getperm command ?
Used to get permission information.
crs_getperm
Usage: crs_getperm resource_name [-u user|-g group] [-q]
crs_getperm ora.dudb.dudb1.inst
Name: ora.dudb.dudb1.inst
owner:oracle:rwx,pgrp:oinstall:rwx,other::r--,

==
6. What is the use of crs_profile?
Used to create, validate, delete, and update resource profiles for RAC.

==
7. Where will you check for RAC log files?
See question 21 below for the log file locations.
==
8. What is OCFS?
OCFS (Oracle Cluster File System) is Oracle's shared-disk cluster file system, which allows
all nodes in the cluster to access the same files concurrently; it is one of the storage
options for the shared database files.
===
8a. What is OCR?
- A binary file used to store configuration information and status
information; it is like the Windows registry.
- Maintained by the CRS daemon.
- Can be mirrored in 10gR2.
- Includes configuration information of the DB, ASM, services, VIPs, listeners, etc.

8b. What is Voting Disk?


- Stores node membership information.
- Used by CSS during split-brain scenarios (two nodes trying to do the same task).
- Used to determine RAC instance membership.
8c. What is VIP?
- All applications connect using the VIP.
==
9. What is Oracle ClusterWare ?
a. It is a framework which contains application modeling logic.
It invokes application-aware agents.
It performs resource recovery: when a node goes down, the Clusterware framework
recovers the application by relocating the resources to a live node.
This can be done for non-Oracle applications as well, for example xclock.

b. Clusterware also hosts OCR cache.
The Oracle Clusterware requires two clusterware components:
a voting disk to record node membership information and the
Oracle Cluster Registry/Repository (OCR) to record cluster configuration information.
The voting disk and the OCR must reside on shared storage.

==
10. What is a resource ?
A resource is an Oracle Clusterware-managed application.
'Profile attributes' for a resource are stored in the Oracle Cluster Registry.
11. What is OCR?
Oracle Cluster Registry (OCR) is a component of the Oracle Clusterware framework.
It stores profile attribute information.
Oracle RAC consists of a series of resources.
Other applications can also be treated as resources.
The OCR contains information pertaining to instance-to-node mapping.
You can't have more than two OCRs (a primary and a mirror).

11. How to register a resource ?


a. Use crs_profile to create a .CAP file with configuration details.
b. Use crs_register to read the .CAP file and update the OCR.
c. Resources can have dependencies. They will start in order and fail over as a single unit.
12. What do crs_start / crs_stop do?
They read the configuration information from the OCR and call the control agent with the
'start' or 'stop' command. The agents (which can be user written) actually start or stop the
resource.
crs_start => read OCR config info => calls 'Control Agent' with command 'start' => control
agent starts the resource.
crs_stop => read OCR config info => calls 'Control Agent' with command 'stop' => control
agent stops the application.
==
13. Question: Using the crs_start command to start/stop services.
As per Oracle documentation.....
2) Oracle Database Oracle Clusterware and Oracle Real Application
Clusters Administration and Deployment Guide

10g Release 2 (10.2)
Part Number B14197-03
Page 260 says
"Note: Do not use the Oracle Clusterware commands crs_register, crs_profile,
crs_start or crs_stop on resources with names beginning with the prefix "ora"
unless either Oracle Support asks you to, or unless Oracle has certified you as
described in http://metalink.oracle.com. Server Control (SRVCTL) is the correct
utility to use on Oracle resources. You can create resources that depend on
resources that Oracle has defined. You can also use the Oracle Clusterware commands to
inspect the configuration and status."
==

14. What is the difference between Oracle Clusterware and CRS ?


Oracle Clusterware is formerly known as Cluster Ready Services (CRS). It is an integrated cluster
management solution that enables you to link multiple servers so that they function as a single system or
cluster. The Oracle Clusterware simplifies the infrastructure required for RAC because it is integrated with
the Oracle Database. In addition, Oracle Clusterware is also available for use with single-instance
databases and applications that you deploy on clusters
Note: The commands starting with crs_ are still valid and the same.
==
15. What is 'Split brain Syndrome' ?
The Oracle Clusterware manages node membership and 'prevents' 'split brain syndrome' in which two or
more instances attempt to control the database. This can occur in cases where there is a break in
communication between nodes through the interconnect.
16. What is the Oracle recommendation for the interconnect?
Oracle recommends that you configure a redundant interconnect to prevent the interconnect from being a
single point of failure.
Oracle also recommends that you use User Datagram Protocol (UDP) on a Gigabit Ethernet for your
cluster interconnect.
Crossover cables are not supported for use with Oracle Clusterware or RAC databases.
17. List the commands used to manage RAC ?
crs_profile
crs_register

crs_relocate
crs_getperm
crs_setperm
crs_stat
srvctl
-crsctl
$crsctl check crs
CSS appears healthy
CRS appears healthy
EVM appears healthy
$crsctl check cssd
CSS appears healthy
$crsctl check evmd
EVM appears healthy
crsctl add css votedisk - adds a new voting disk
crsctl delete css votedisk - removes a voting disk
crsctl enable crs - enables startup for all CRS daemons
crsctl disable crs - disables startup for all CRS daemons
crsctl start crs - starts all CRS daemons.
crsctl stop crs - stops all CRS daemons. Stops CRS resources
$ crsctl query crs activeversion
CRS active version on the cluster is [10.2.0.1.0]
-ocrdump

-ocrconfig
dsudsbs1:oracle$ ocrconfig -showbackup
dsudsbs1 2009/11/25 19:42:50 /opt/crs/oracle/product/10.2/app/cdata/crs
dsudsbs1 2009/11/25 15:42:49 /opt/crs/oracle/product/10.2/app/cdata/crs
dsudsbs1 2009/11/25 11:42:49 /opt/crs/oracle/product/10.2/app/cdata/crs
dsudsbs1 2009/11/24 19:42:47 /opt/crs/oracle/product/10.2/app/cdata/crs
dsudsbs1 2009/11/12 19:42:12 /opt/crs/oracle/product/10.2/app/cdata/crs

ocrconfig -repair ocr
ocrconfig -replace
ocrconfig -export/-import
ocrconfig -upgrade
-ocrcheck (no parameters needed):
$ocrcheck
Status of Oracle Cluster Registry is as follows :
Version : 2
Total space (kbytes) : 0
Used space (kbytes) : 4588
Available space (kbytes) : 4294962708
ID : 1014742862
Device/File Name : /dev/rdsk/c5t0d3
Device/File integrity check succeeded
Device/File Name : /dev/rdsk/c5t0d4
Device/File integrity check succeeded
Cluster registry integrity check succeeded
--

18. How to take a backup of the voting disk?

Use the dd command to back it up:
dd if=voting_disk_file of=backup_vt_file
On Windows, use ocopy.
To add and remove voting disks use crsctl:
crsctl add css voting_disk_path
crsctl delete css voting_disk_path
if your cluster is down use force option
crsctl add css voting_disk_path -force
==
19. How to find location of voting disk ?
option 1:

crsctl query css votedisk
0. 0 /dev/rdsk/c5t0d5
1. 0 /dev/rdsk/c5t0d6
2. 0 /dev/rdsk/c5t0d7
located 3 votedisk(s).
option 2:
take an ocrdump:
ocrdump -stdout -keyname SYSTEM.css.diskfile

20. What is CRS?
CRS (Cluster Ready Services) is the former name of Oracle Clusterware; see question 14
above.


21. What are the log file locations for RAC?
cd $ORACLE_HOME/log/<node_name>/client
-- when you execute commands like oifcfg, ocrconfig, etc.,
-- a log file will be created here.

cd $ORACLE_HOME/log/<node_name>/crsd
cd $ORACLE_HOME/log/<node_name>/racg

==
22. How to backup OCR ?
This is about backing up the Oracle Cluster Registry (OCR) and recovering it. Oracle
Clusterware automatically creates OCR backups every four hours, and it always retains the
last three backup copies of the OCR. The CRSD process that creates the backups also
creates and retains an OCR backup for each full day, and at the end of a week a complete
backup for the week. So there is a robust backup taking place in the background. And you
guessed it right: you cannot alter the backup frequencies. This is meant to protect you, the
DBA, so that you can copy these generated backup files at least once daily to a device
different from where the primary OCR resides. These files are located at
$CRS_HOME/cdata/my_cluster.

==
23. How to find location of OCR?
ocrcheck
Status of Oracle Cluster Registry is as follows :
Version : 2
Total space (kbytes) : 0
Used space (kbytes) : 4588
Available space (kbytes) : 4294962708

ID : 1014742862
Device/File Name : /dev/rdsk/c5t0d3
Device/File integrity check succeeded
Device/File Name : /dev/rdsk/c5t0d4
Device/File integrity check succeeded
Cluster registry integrity check succeeded

24. How to restore OCR file if currupted ?


Do the following to restore the OCR on Unix/Linux systems.
1. To show the backups, type: ocrconfig -showbackup
2. Check the contents with: ocrdump -backupfile my_file
3. Stop CRS on all nodes: crsctl stop crs
4. Perform the restore: ocrconfig -restore my_file
5. Restart CRS on the nodes: crsctl start crs
6. We have seen the CVU (Cluster Verification Utility) play a crucial role during installation in our
RAC on VMware series. Check the OCR's integrity and get a verbose output for all of the nodes with:
cluvfy comp ocr -n all -verbose
==
25. How to compare all nodes with cluvfy?
cluvfy comp ocr -n all [-verbose]

oracle$ cluvfy comp ocr -n all


Verifying OCR integrity
Checking OCR integrity...
Checking the absence of a non-clustered configuration...
All nodes free of non-clustered, local-only configurations.
Uniqueness check for OCR device passed.
Checking the version of OCR...
OCR of correct Version "2" exists.
Checking data integrity of OCR...
Data integrity check for OCR passed.
OCR integrity check passed.

Verification of OCR integrity was successful.
dsudsbs1:oracle$
==
26. How to manage ASM?
Administering ASM Instances with SRVCTL in RAC
Use the following command to add configuration information to an existing ASM instance:
srvctl add asm -n mynode_name -i myasm_instance_name -o myoracle_home
If, however, you choose not to add the -i option, then the changes are propagated throughout the entire
ASM instance pool.
To remove an ASM instance, use the following syntax:
srvctl remove asm -n mynode_name [-i myasm_instance_name]
In order to enable an ASM instance, use the following syntax:
srvctl enable asm -n mynode_name [-i myasm_instance_name]
In order to disable an ASM instance use the following syntax:
srvctl disable asm -n mynode_name [-i myasm_instance_name]
Note that you can also use the SRVCTL utility to start, stop, and get the status of an ASM instance. See
the examples below.
To start an ASM instance, do the following:
srvctl start asm -n mynode_name [-i myasm_instance_name] [-o start_options] [-c | -q]
To stop an ASM instance, type the following syntax:
srvctl stop asm -n mynode_name [-i myasm_instance_name] [-o stop_options] [-c | -q]
To list the configuration of an ASM instance do the following:
srvctl config asm -n mynode_name
To get the status of an ASM instance, see the following syntax:
srvctl status asm -n mynode_name
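For example, on a hypothetical node rac1 with ASM instance +ASM1 (names matching the examples later in this document):
srvctl start asm -n rac1 -i +ASM1
srvctl status asm -n rac1
ASM instance +ASM1 is running on node rac1.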
==
27. How to start and stop RAC ?
Starting Up and Shutting Down with SRVCTL
We have covered SRVCTL before, so we'll do a quick syntax check here, to start an instance:
srvctl start instance -d mydb -i "myinstance_list" [-o start_options] [-c connect_str | -q]

To stop, do the following:
srvctl stop instance -d mydb -i " myinstance_list" [-o stop_options] [-c connect_str | -q]
To start and stop the entire RAC cluster database, meaning all of the instances, you will do the following
from your SRVCTL in the command line:
srvctl start database -d mydb [-o start_options] [-c connect_str | -q]
srvctl stop database -d mydb [-o stop_options] [-c connect_str | -q]
There are several options, and we will look at all of them in upcoming articles on RAC administration.
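For example, using the database and instance names devdb/devdb2 that appear in later examples in this document:
srvctl start database -d devdb
srvctl stop instance -d devdb -i devdb2 -o immediate
srvctl stop database -d devdb -o immediate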
==
28 . How to take ocrdump?
Log in as root.
Type ocrdump; it creates a file named OCRDUMPFILE.
Use vi to view the dump.
If you run it again, you will get an error:
# ocrdump
PROT-303: Dump file already exists [OCRDUMPFILE]
==

RAC Architecture
A cluster is a set of 2 or more machines (nodes) that share or coordinate resources to perform the same
task.
A RAC database is 2 or more instances running on a set of clustered nodes, with all instances accessing
a shared set of database files.
Depending on the O/S platform, a RAC database may be deployed on a cluster that uses
vendor clusterware plus Oracle's own clusterware (Cluster Ready Services), or on a cluster
that solely uses Oracle's own clusterware.
Thus, every RAC sits on a cluster that is running Cluster Ready Services. srvctl is the
primary tool DBAs use to configure CRS for their RAC database and processes.
Cluster Ready Services and the OCR
Cluster Ready Services, or CRS, is a new feature for 10g RAC. Essentially, it is Oracle's own
clusterware. On most platforms, Oracle supports vendor clusterware; in these cases, CRS
interoperates with the vendor clusterware, providing high availability support and service
and workload management. On Linux and Windows clusters, CRS serves as the sole

clusterware. In all cases, CRS provides a standard cluster interface that is consistent across
all platforms.
CRS consists of four processes (crsd, ocssd, evmd, and evmlogger) and two disks: the
Oracle Cluster Registry (OCR), and the voting disk.
CRS manages the following resources:
The ASM instances on each node
Databases
The instances on each node
Oracle Services on each node
The cluster nodes themselves, including the following processes, or "nodeapps":
VIP
GSD
The listener
The ONS daemon
CRS stores information about these resources in the OCR. If the information in the OCR for one of
these resources becomes damaged or inconsistent, then CRS is no longer able to manage that resource.
Fortunately, the OCR automatically backs itself up regularly and frequently.
Interacting with CRS and the OCR: srvctl
srvctl is the tool Oracle recommends that DBAs use to interact with CRS and the cluster registry.
Oracle does provide several tools to interface with the cluster registry and CRS more directly, at a
lower level, but these tools are deliberately undocumented and intended only for use by Oracle
Support. srvctl, in contrast, is well documented and easy to use. Using other tools to modify the OCR
or manage CRS without the assistance of Oracle Support runs the risk of damaging the OCR.
Oracle 10g RAC Service Architecture
Overview of Real Application Cluster Ready Services, Nodeapps, and User Defined Services
Overview
Service Architecture
Cluster Ready Services (CRS)
Nodeapps
ASM Services
User defined Services
Internally Managed Services
Monitoring Services
RAC Service Architecture
Oracle 10g RAC features a service based architecture
This is an improvement over 9i RAC in several ways
Increased flexibility
Increased manageability
Improvements in High Availability

Enables 10g Grid Deployment
RAC Services and High Availability
Oracle Services facilitate high availability of databases and related applications
If key database resources become unavailable (network, storage, etc.):
Instances and Services will be relocated to another node
The failed node will be rebooted
By default, after any server boot-up, Oracle attempts to restart all services on the node
Cluster Ready Services
Manage the RAC Cluster
Several Different Services
OracleCRSService
OracleCSService
OracleEVMService
OraFenceService
Required for RAC installation
Installed in its own CRS_HOME
CRS Basics
Used to manage RAC
Only one set of CRS Daemons per system
Multiple instances share the same CRS
CRS runs as both root and Oracle users
CRS must be running before RAC can start
CRS Management
Started automatically
Can stop and start manually
Start the OracleCRSService
Stop the OracleCRSService
Uses the voting disk and OCR (Oracle Cluster Registry)
Requires 3 network addresses
Public
Private
Virtual Public
CRS Services
OracleCRSService
Cluster Ready Services Daemon
OracleCSService
Oracle Cluster Synchronization Service Daemon
OracleEVMService
Event Manager Daemon
OraFenceService
Process Monitor
Cluster Ready Services Daemon

OracleCRSService
Runs as Administrator user
Automatically restarted
Manages Application Resources
Starts, stops and fails-over application resources
Maintains the OCR (Oracle Cluster Registry)
Keeps state information in the OCR
Oracle Cluster Synchronization Service Daemon
OracleCSService
Runs as Administrator user
Maintains the heartbeat (failure causes system reboot)
Provides Node Membership
Group Access
Basic Cluster Locking
Can integrate with 3rd party clustering products or run standalone
OracleCSService also works with non-RAC systems
Event Manager Daemon
OracleEVMService
Runs as Administrator user
Restarts on failure
Generates Events
Starts the racgevt thread to invoke Server Callouts
Process Monitor
OraFenceService
Runs as Administrator user
Locked in memory to monitor the cluster
Provides I/O fencing
OraFenceService periodically monitors cluster status and can reboot the node if a
problem is detected. An OraFenceService failure results in
Oracle Clusterware restarting the node
RACG
RACG is a behind-the-scenes process (or thread) that extends clusterware to
support Oracle-specific requirements and complex resources.
Runs server callout scripts when FAN events occur.
Runs as processes (or threads), not as a service (racgmain.exe, racgimon.exe)
Cluster Ready Services Management
Log Files
OracleCRSService
%ORA_CRS_HOME%\log\hostname\crsd
Oracle Cluster Registry (OCR)
%ORA_CRS_HOME%\log\hostname\ocr
OracleEVMService

%ORA_CRS_HOME%\log\hostname\evmd
OracleCSService
%ORA_CRS_HOME%\log\hostname\cssd
RACG
%ORA_CRS_HOME%\log\hostname\racg
Nodeapp Services
Nodeapps are a standard set of Oracle application services that are automatically
launched for RAC
Virtual IP (VIP)
Oracle Net Listener
Global Services Daemon (GSD)
Oracle Notification Service (ONS)
Nodeapp services run on each node
Can be relocated to other nodes through the virtual IP

VIP (Virtual IP)


Creates a virtual IP address used by the Listener
The virtual IP address fails over between nodes
Multiple virtual IP addresses can exist on the same system (during failover)
Independent of the Oracle Instance
Potential Problem if more than one database per node
Global Services Daemon (GSD)
The daemon which executes SRVCTL commands
GSD receives requests from SRVCTL to execute administrative tasks, such as startup or
shutdown
The command is executed locally on each node, and the results are sent back to SRVCTL.
The daemon is installed on the nodes by default. It is important that you do not kill this
process and it should not be deleted.
Listener
Server-side component of Oracle Net
Listens for incoming client connection requests
Manages the traffic to the server; when a client requests a network session with a server,
the listener actually receives the request and brokers the client request
If the client's information matches the listener's information, then the listener grants
a connection to the server.
Oracle Notification Service (ONS)
The Oracle Notification Service is installed automatically on each RAC node as a
Node Application
ONS starts automatically with each boot
ONS uses a simple push/subscribe method to publish event messages to all RAC
nodes with active ONS daemons

ONS and Fast Application Notification
ONS can be configured to run on nodes hosting client or mid-tier applications
ONS is the key component of Fast Application Notification (FAN)
Can be utilized to extend RAC high availability and load balancing to mid-tier
applications
Independent of Transparent Application Failover (TAF)
Less reliance on network configuration
User Defined Services
User defined, named services may be created to manage database resources that are
associated with application workloads
One or more database instances may be mapped to a single service
A database instance may be assigned to one or more services
The Automated Workload Repository may be used to monitor Service metrics
User Defined Services and Failover
Services can be defined with preferred and alternate instances
A service may be assigned to start on preferred instances
The same service may have alternate instances assigned for failover
If multiple services are assigned for the same database, the preferred and alternate
instance assignments may be different for each service
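For example, a minimal sketch of creating such a service with srvctl (the service name oltp is hypothetical; -r lists the preferred instances and -a the available/alternate instances; database and instance names match the devdb examples later in this document):
srvctl add service -d devdb -s oltp -r devdb1 -a devdb2
srvctl start service -d devdb -s oltp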
Automatic Storage Management Services
Automatic Storage Management (ASM) is a storage option for creating and managing
databases
ASM operates like a Logical Volume Manager between the physical storage and the
database.
A small, automatically managed Oracle database instance is created on each node (if ASM is chosen as
a storage option)
ASM instances start automatically as Oracle services
Internally Managed Services
When the Global Services Daemon is started as a part of the Node Applications,
it in turn launches key internally managed services
The Global Cache Service manages Cache Fusion and in-memory data buffers
The Global Enqueue Service manages inter-instance locking and RAC recovery
GCS and GES show up as OS processes or threads,
but GSD is the only service that can be externally controlled
GCS and GES together manage a set of virtual tables in memory,
called the Global Resource Directory
Global Cache Service (GCS)
The controlling process that implements Cache Fusion.
Manages the status and transfer of data blocks across the buffer caches of all instances.
Tightly integrated with the buffer cache manager to enable fast lookup of resource
information in the Global Resource Directory.
Maintains the block mode for blocks in the global role.

Employs various background processes (or threads) such as the
Global Cache Service Processes (LMSn) and Global Enqueue Service Daemon (LMD).
Global Enqueue Service Monitor (LMON)
Background process that monitors the entire cluster to manage global resources.
Manages instance and process expirations and recovery for GCS.
Handles the part of recovery associated with global resources
Global Resource Directory
The data structures associated with global resources. It is distributed across all
instances in a cluster.
Global Cache Service and Global Enqueue Service maintain the Global Resource
Directory to record information about resources and enqueues held globally.
The Global Resource Directory resides in memory and is distributed throughout the
cluster to all nodes. In this distributed architecture, each node participates in
managing global resources and manages a portion of the Global Resource Directory.
Monitoring RAC Services
%ORA_CRS_HOME%\bin\crs_stat
NAME=ora.rac1.gsd
TYPE=application
TARGET=ONLINE
STATE=ONLINE
NAME=ora.rac1.oem
TYPE=application
TARGET=ONLINE
STATE=ONLINE
NAME=ora.rac1.ons
TYPE=application
TARGET=ONLINE
STATE=ONLINE
Monitoring RAC Services
Creating a tabular report:
%ORA_CRS_HOME%\bin\crs_stat -t
Name Type Target State Host
-------------------------------------------------------------
ora.rac1.gsd application ONLINE ONLINE rac1
ora.rac1.oem application ONLINE ONLINE rac1
ora.rac1.ons application ONLINE ONLINE rac1
ora.rac1.vip application ONLINE ONLINE rac1
ora.rac2.gsd application ONLINE ONLINE rac2

ora.rac2.oem application ONLINE ONLINE rac2
ora.rac2.ons application ONLINE ONLINE rac2
ora.rac2.vip application ONLINE ONLINE rac2
Review
What advantages does a service based architecture offer?
What four services comprise Cluster Ready Services?
Nodeapps consists of which four applications?
True or False: a database instance may be assigned to multiple services
Summary
Service Architecture
Cluster Ready Services (CRS)
OracleCRSService
OracleCSService
OracleEVMService
OraFenceService
Nodeapps
VIP
Listener
GSD
ONS
User defined Services
ASM Services
Internally managed services
Global Cache Service
Global Enqueue Service
Global Resource Directory
Monitoring Services

Oracle RAC Interview Questions With Answers


What is RAC?
RAC stands for Real Application cluster. It is a clustering solution from Oracle Corporation
that ensures high availability of databases by providing instance failover, media failover
features.
What is RAC and how is it different from non RAC databases?
RAC stands for Real Application Cluster, you have n number of instances running in their
own separate nodes, all based on shared storage. A cluster is the key component and is a
collection of servers operating as one unit. RAC is the best solution for high performance
and high availability. Non-RAC databases have a single point of failure in case of hardware
failure or server crash.

Give the usage of srvctl :
srvctl start instance -d db_name -i "inst_name_list" [-o start_options]
srvctl stop instance -d name -i "inst_name_list" [-o stop_options]
srvctl stop instance -d orcl -i "orcl3,orcl4" -o immediate
srvctl start database -d name [-o start_options]
srvctl stop database -d name [-o stop_options]
srvctl start database -d orcl -o mount
Mention the Oracle RAC software components :
Oracle RAC is composed of two or more database instances. They are composed of memory
structures and background processes, the same as a single-instance database. Oracle RAC
instances use two services, GES (Global Enqueue Service) and GCS (Global Cache Service), that
enable Cache Fusion. Oracle RAC instances include the following background processes:
ACMS - Atomic Controlfile to Memory Service
GTX0-j - Global Transaction Process
LMON - Global Enqueue Service Monitor
LMD - Global Enqueue Service Daemon
LMS - Global Cache Service Process
LCK0 - Instance Enqueue Process
RMSn - Oracle RAC Management Processes
RSMN - Remote Slave Monitor
What is GRD?
GRD stands for Global Resource Directory. The GES and GCS maintain records of the
statuses of each datafile and each cached block using the Global Resource Directory. This
process is referred to as cache fusion and helps in data integrity.
What are the different network components are in 10g RAC?
Public, private, and VIP components.
The private interface is for inter-node communication. The VIP is all about availability of the
application. When a node fails, the VIP component fails over to some other node; this is the reason
that all applications should be based on the VIP component, meaning TNS entries should have the
VIP entry in the host list.
Give Details on ACMS:
ACMS stands for Atomic Controlfile to Memory Service. In an Oracle RAC environment ACMS is
an agent that ensures a distributed SGA memory update, i.e. SGA updates are globally
committed on success or globally aborted in the event of a failure.
What is Cache Fusion?
Cache fusion is the mechanism to transfer a data block from the memory of one
node to the memory of another. If two nodes require the same block for query or update, the block
must be transferred from the cache of one node to the other. A RAC system must be equipped with a
low-latency, high-speed interconnect to make this happen.
Give Details on Cache Fusion:
Oracle RAC is composed of two or more instances. When a block of data is read from a
datafile by an instance within the cluster and another instance is in need of the same
block, it is easy to get the block image from the instance which has the block in its SGA
rather than reading it from disk. To enable inter-instance communication Oracle RAC
makes use of interconnects. The Global Enqueue Service (GES) monitors, and the Instance
Enqueue Process manages, the cache fusion.
Cache Fusion is essentially a memory-to-memory transfer of data between the nodes in the
RAC environment. Before Cache Fusion, a node was required to write some of the data to
disk before it could be transferred to the next node in the cluster. Cache Fusion does a
straight memory-to-memory transfer. In addition, each node's SGA has a map of what data
is contained in the other node's data caches.
The performance improvement is phenomenal. Oracle leverages the vendor's high speed
interconnects between the nodes to achieve the cache-to-cache data transfers. Before
Cache Fusion, when you added a node to the cluster to increase performance of the
application, it didn't always provide you with the performance improvement that you hoped
for. With Cache Fusion, you can easily cost justify the addition of another node into a RAC
cluster to increase the performance of the application running on it. Oracle sales pitches
describe it as 'near linear horizontal scalability'.
What are the major RAC wait events?
In a RAC environment the buffer cache is global across all instances in the cluster and hence
the processing differs. The most common wait events related to this are gc cr request and gc
buffer busy.
GC CR request: the time it takes to retrieve the data from the remote cache.
Reason: RAC Traffic Using Slow Connection or Inefficient queries (poorly tuned queries will
increase the amount of data blocks requested by an Oracle session. The more blocks
requested typically means the more often a block will need to be read from a remote
instance via the interconnect.)
GC BUFFER BUSY: It is the time the remote instance locally spends accessing the requested
data block.
Give details on GTX0-j :
This process provides transparent support for XA global transactions in a RAC
environment. The database autotunes the number of these processes based on the workload
of XA global transactions.
Give details on LMON:
This process monitors global enqueues and resources across the cluster and performs global
enqueue recovery operations. It is called the Global Enqueue Service Monitor.

Give details on LMD:
This process is called the Global Enqueue Service Daemon. It manages incoming
remote resource requests within each instance.
Give details on LMS:
This process is called the Global Cache Service Process. It maintains statuses of
datafiles and of each cached block by recording information in the Global Resource
Directory (GRD). It also controls the flow of messages to remote instances and
manages global data block access, transmitting block images between the buffer caches of
different instances. This processing is part of the Cache Fusion feature.
Give details on LCK0:
This process is called the Instance Enqueue Process. It manages non-Cache Fusion
resource requests such as library and row cache requests.
Give details on RMSn:
This process is called the Oracle RAC Management Process. These processes perform
manageability tasks for Oracle RAC, including creation of resources related to Oracle RAC
when new instances are added to the cluster.
Give details on RSMN:
This process is called the Remote Slave Monitor. It manages background slave
process creation and communication on remote instances. It is itself a background slave
process and performs tasks on behalf of a coordinating process running in another
instance.
What components in RAC must reside in shared storage?
All datafiles, controlfiles, SPFILEs, and redo log files must reside on cluster-aware shared storage.
What is the significance of using cluster-aware shared storage in an Oracle RAC
environment?
All instances of an Oracle RAC database can access all the datafiles, control files, SPFILEs, and redo
log files when these files are hosted on cluster-aware shared storage, which is a group of
shared disks.
Give few examples for solutions that support cluster storage:
ASM (Automatic Storage Management), raw disk devices, network file system (NFS), OCFS2,
and OCFS (Oracle Cluster File System).
What is an interconnect network?
An interconnect network is a private network that connects all of the servers in a cluster.
The interconnect network uses a switch/multiple switches that only the nodes in the cluster
can access.

How can we configure the cluster interconnect?
Configure User Datagram Protocol (UDP) on Gigabit Ethernet for the cluster interconnect. On UNIX
and Linux systems, UDP and RDS (Reliable Datagram Sockets) are the protocols used by Oracle
Clusterware. Windows clusters use the TCP protocol.
Can we use crossover cables with Oracle Clusterware interconnects?
No, crossover cables are not supported with Oracle Clusterware interconnects.
What is the use of cluster interconnect?
Cluster interconnect is used by the Cache fusion for inter instance communication.
How do users connect to database in an Oracle RAC environment?
Users can access a RAC database using a client/server configuration or through one or more
middle tiers, with or without connection pooling. Users can use the Oracle Services feature to
connect to the database.
What is the use of a service in Oracle RAC environment?
Applications should use the Services feature to connect to the Oracle database. Services
enable us to define rules and characteristics to control how users and applications connect
to database instances.
What are the characteristics controlled by Oracle services feature?
The characteristics include a unique name, workload balancing and failover options, and high
availability characteristics.
What enables the load balancing of applications in RAC?
Oracle Net Services enable the load balancing of application connections across all of the
instances in an Oracle RAC database.
What is a virtual IP address or VIP?
A virtual IP address or VIP is an alternate IP address that client connections use instead of
the standard public IP address. To configure a VIP address, we need to reserve a spare IP
address for each node, and the IP addresses must use the same subnet as the public
network.
What is the use of VIP?
If a node fails, then the node's VIP address fails over to another node on which the VIP
address can accept TCP connections but it cannot accept Oracle connections.
Give situations under which VIP address failover happens:
VIP address failover happens when the node on which the VIP address runs fails, when all
interfaces for the VIP address fail, or when all interfaces for the VIP address are disconnected
from the network.

What is the significance of VIP address failover?
When a VIP address failover happens, clients that attempt to connect to the VIP address
receive a rapid connection refused error. They don't have to wait for TCP connection timeout
messages.
What are the administrative tools used for Oracle RAC environments?
An Oracle RAC cluster can be administered as a single image using OEM (Enterprise
Manager), SQL*Plus, Server Control (SRVCTL), the Cluster Verification Utility (CVU), DBCA, and NETCA.
How do we verify that RAC instances are running?
Issue the following query from any one node connecting through SQL*PLUS.
$connect sys/sys as sysdba
SQL>select * from V$ACTIVE_INSTANCES;
The query gives the instance number under the INST_NUMBER column and host_name:instance_name
under the INST_NAME column.
What is FAN?
Fast Application Notification (FAN) relates to events concerning
instances, services, and nodes. It is a notification mechanism that Oracle RAC uses to notify
other processes about configuration and service level information, including service
status changes such as UP or DOWN events. Applications can respond to FAN events and
take immediate action.
Where can we apply FAN UP and DOWN events?
FAN UP and FAN DOWN events can be applied to instances, services, and nodes.
State the use of FAN events in case of a cluster configuration change?
During times of cluster configuration changes, the Oracle RAC high availability framework
publishes a FAN event immediately when a state change occurs in the cluster. So applications
can receive FAN events and react immediately. This prevents applications from polling the
database and detecting a problem only after such a state change.
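As an illustration, server-side FAN callouts are simply executable scripts; a minimal sketch, assuming the standard 10g callout directory $ORA_CRS_HOME/racg/usrco/ (the log file path is hypothetical):
#!/bin/sh
# Minimal FAN callout sketch: append every FAN event received to a log file.
echo "`date` FAN event: $*" >> /tmp/fan_events.log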
Why should we have seperate homes for ASM instance?
It is good practice to keep the ASM home separate from the database
home (ORACLE_HOME). This helps in upgrading and patching ASM and the Oracle database
software independently of each other. Also, we can deinstall the Oracle database software
independently of the ASM instance.
What is the advantage of using ASM?
ASM is the Oracle-recommended storage option for RAC databases, as ASM
maximizes performance by managing the storage configuration across the disks. ASM does
this by distributing the database files across all of the available storage within our cluster
database environment.
What is rolling upgrade?
It is a new ASM feature from Database 11g. ASM instances in Oracle Database 11g

(release 11.1 onward) can be upgraded or patched using the rolling upgrade feature. This enables
us to patch or upgrade ASM nodes in a clustered environment without affecting database
availability. During a rolling upgrade we can maintain a functional cluster while one or more
of the nodes in the cluster are running different software versions.
Can rolling upgrade be used to upgrade from 10g to 11g database?
No, it can be used only for Oracle Database 11g releases (from 11.1).
State the initialization parameters that must have same value for every instance in an
Oracle RAC database:
Some initialization parameters are critical at database creation time and must have the
same values. Their value must be specified in the SPFILE or PFILE for every instance. The list of
parameters that must be identical on every instance is given below:
ACTIVE_INSTANCE_COUNT
ARCHIVE_LAG_TARGET
COMPATIBLE
CLUSTER_DATABASE
CLUSTER_DATABASE_INSTANCES
CONTROL_FILES
DB_BLOCK_SIZE
DB_DOMAIN
DB_FILES
DB_NAME
DB_RECOVERY_FILE_DEST
DB_RECOVERY_FILE_DEST_SIZE
DB_UNIQUE_NAME
INSTANCE_TYPE (RDBMS or ASM)
PARALLEL_MAX_SERVERS
REMOTE_LOGIN_PASSWORDFILE
UNDO_MANAGEMENT
What causes ORA-00603: ORACLE server session terminated by fatal error, or ORA-29702: error occurred in Cluster Group Service operation?
The RAC node name was listed in the loopback address...
Can the DML_LOCKS and RESULT_CACHE_MAX_SIZE be identical on all instances?
These parameters can be identical on all instances only if these parameter values are set to
zero.
What two parameters must be set at the time of starting up an ASM instance in a RAC
environment?
The parameters CLUSTER_DATABASE and INSTANCE_TYPE must be set.
Mention the components of Oracle clusterware:
Oracle Clusterware is made up of components such as the voting disk and the Oracle Cluster
Registry (OCR).
What is a CRS resource?
Oracle Clusterware is used to manage high-availability operations in a cluster. Anything that
Oracle Clusterware manages is known as a CRS resource. Some examples of CRS resources
are a database, an instance, a service, a listener, a VIP address, an application process, etc.


What is the use of OCR?


Oracle Clusterware manages CRS resources based on the configuration information of CRS
resources stored in the OCR (Oracle Cluster Registry).
How does Oracle Clusterware manage CRS resources?
Oracle Clusterware manages CRS resources based on the configuration information stored for
them in the OCR (Oracle Cluster Registry).
Name some Oracle clusterware tools and their uses?
OIFCFG - allocates and deallocates network interfaces; can also be used to identify the interconnect being used
OCRCONFIG - command-line tool for managing the Oracle Cluster Registry
OCRDUMP - dumps the contents of the OCR for inspection
CVU - Cluster Verification Utility, verifies the cluster setup and the status of CRS resources
What are the modes of deleting instances from Oracle Real Application Clusters
databases?
We can delete instances in silent mode or interactive mode using DBCA (Database
Configuration Assistant).
How do we remove ASM from a Oracle RAC environment?
We need to stop and delete the instance on the node first, in interactive or silent mode. After
that, ASM can be removed using the srvctl tool as follows:
srvctl stop asm -n node_name
srvctl remove asm -n node_name
We can verify if ASM has been removed by issuing the following command:
srvctl config asm -n node_name
How do we verify that an instance has been removed from OCR after deleting an
instance?
Issue the following srvctl command:
srvctl config database -d database_name
cd $CRS_HOME/bin
./crs_stat
How do we verify an existing current backup of OCR?
We can verify the current backup of the OCR using the following command: ocrconfig
-showbackup
What are the performance views in an Oracle RAC environment?
We have V$ views that are instance specific. In addition we have GV$ views, called global
views, that have an INST_ID column of numeric data type. GV$ views obtain information from
the individual V$ views.
What are the types of connection load-balancing?
There are two types of connection load-balancing: server-side load balancing and client-side
load balancing.
What is the difference between server-side and client-side connection load
balancing?
Client-side load balancing happens at the client side, where connections are balanced across the
listeners. In the case of server-side load balancing, the listener uses a load-balancing advisory to
redirect connections to the instance providing the best service.
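For illustration, client-side load balancing is driven by the client's tnsnames.ora entry; a minimal sketch (the entry name and the second VIP hostname rac2-vip are hypothetical; rac1-vip matches the examples elsewhere in this document):
devdb =
  (DESCRIPTION =
    (ADDRESS_LIST =
      (LOAD_BALANCE = on)
      (ADDRESS = (PROTOCOL = TCP)(HOST = rac1-vip)(PORT = 1521))
      (ADDRESS = (PROTOCOL = TCP)(HOST = rac2-vip)(PORT = 1521))
    )
    (CONNECT_DATA = (SERVICE_NAME = devdb))
  )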
What are the three greatest benefits that RAC provides?
The three main benefits are availability, scalability, and the ability to use low cost
commodity hardware. RAC allows an application to scale vertically, by adding CPU, disk and
memory resources to an individual server. But RAC also provides horizontal scalability, which
is achieved by adding new nodes into the cluster. RAC also allows an organization to bring
these resources online as they are needed. This can save a small or midsize organization a
lot of money in the early stages of a project.
In a RAC environment, if a node in the cluster fails, the application continues to run on the
surviving nodes contained in the cluster. If your application is configured correctly, most
users won't even know that the node they were running on became unavailable.


Convert Single Instance ( Non-RAC ) Database to RAC


Here we are converting a non-RAC single instance, on a node where Cluster Services are
running, to a RAC instance.
Perform the following procedures to convert this type of single-instance database

to a RAC database:
=======================================
1. Change the directory to the lib subdirectory in the rdbms directory under the Oracle
home.
2. Relink the oracle binary by executing the following commands , to convert the Oracle
Home to RAC-Enabled:
(The database should be down to relink the 'oracle' executable with rac_on.)
make -f ins_rdbms.mk rac_on
make -f ins_rdbms.mk ioracle
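Putting steps 1 and 2 together, the relink sequence is:
cd $ORACLE_HOME/rdbms/lib
make -f ins_rdbms.mk rac_on
make -f ins_rdbms.mk ioracle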
Manual Database Conversion Procedure
=======================================
1. Create the OFA directory structure on all of the nodes that you have added.
2. Copy the database datafiles, control files, redo logs, and server parameter file to their
ASM disk groups (when using ASM) / corresponding raw devices (when using raw devices) /
cluster file systems (OCFS2/third-party cluster file systems) using the respective
command.
For ASM, the steps are outlined in: Migrate-Convert-Database-from-Non-ASM-to-ASM-Using-RMAN
3. Re-create the control files by executing the CREATE CONTROLFILE SQL statement with
the REUSE keyword and specify MAXINSTANCES and MAXLOGFILES, and so on, as needed
for your RAC configuration. The MAXINSTANCES recommended default is 32.
4. Shut down the database instance.
5. If your single-instance database was using an SPFILE parameter file, then create a
temporary PFILE from the SPFILE using the following SQL statement:
CREATE PFILE='pfile_name' from spfile='spfile_name'
6. Set the CLUSTER_DATABASE parameter to TRUE, set the
INSTANCE_NUMBER parameter to a unique value for each instance, using a
sid.parameter=value syntax.
7. Start up the database instance using the PFILE created in step 5.
8. If your single-instance database was using automatic undo management, then create an
undo tablespace for each additional instance using the CREATE UNDO TABLESPACE SQL
statement.
9. Create redo threads that have at least two redo logs for each additional instance. If you
are using raw devices, then ensure that the redo log files are on raw devices. Enable the
new redo threads by using an ALTER DATABASE SQL statement. Then shutdown the
database instance.
10. Copy the Oracle password file from the initial node, or from the node from which you
are working, to the corresponding location on the additional nodes on which the cluster
database will have an instance. Make sure that you replace the ORACLE_SID name in each

password file appropriately for each additional instance.
11. Add REMOTE_LISTENER=LISTENERS_DB_NAME and
sid.LOCAL_LISTENER=LISTENER_SID parameters to the PFILE.
12. Configure the net service entries for the database and instances and address entries for
the LOCAL_LISTENER for each instance and REMOTE_LISTENER in the tnsnames.ora file and
copy it to all nodes.
13. Create the SPFILE from the PFILE. If you are not using a cluster file system, then ensure
that the SPFILE is on a raw device.
14. Add the configuration for the RAC database and its instance-to-node mapping using
SRVCTL.
15. Start the RAC database using SRVCTL.
After starting the database with SRVCTL, your conversion process is complete and, for
example, you can execute the following SQL statement to see the statuses of all the
instances in your RAC database:
select * from v$active_instances
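As an illustration of steps 5, 6, and 8, the cluster-related PFILE entries might look like the following minimal sketch (the instance names devdb1/devdb2 and the undo tablespace names are hypothetical examples, matching the naming used elsewhere in this document):
*.cluster_database=true
devdb1.instance_number=1
devdb2.instance_number=2
devdb1.undo_tablespace='UNDOTBS1'
devdb2.undo_tablespace='UNDOTBS2'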

RAC - How to Modify the VIP Address or VIP Hostname


Oracle 10g or 11g uses Virtual IP address (VIP) in clustered environment for clients to
connect to the database. During the installation of Oracle Clusterware users are prompted to
enter VIP and VIP hostname for each node in the cluster.
The VIP information is stored in the OCR (Oracle Cluster Registry) and also in different HA
frameworks. Changing the VIP address or VIP hostname involves modification of the
nodeapps, which include the VIP, GSD, Listener, and ONS (Oracle Notification Services). The
VIP can be changed while the nodeapps are running, but the changes will take effect only
when the nodeapps are restarted.
Note that stopping nodeapps may cause other resources to be stopped, for example ASM,
an instance, or a database, so the change should be made during a scheduled outage.
Follow the steps to change the VIP address or VIP hostname.
Step 1:- Check the original configuration before change
$ srvctl config nodeapps -n <node_name> -a
Using '-a' will give you the current VIP hostname, VIP address, and interface
Example:
# srvctl config nodeapps -n rac1 -a
VIP exists.: /rac1-vip/192.168.2.31/255.255.255.0/eth0
The VIP Hostname is 'rac1-vip'
The VIP IP address is '192.168.2.31'
The VIP subnet mask is '255.255.255.0'
The Interface Name used by the VIP is called 'eth0'

Step 2:- Stop Instance,ASM,Nodeapps resources

$srvctl stop instance -d devdb -i devdb1
$srvctl stop asm -n rac1
$srvctl stop nodeapps -n rac1
Step 3:- Verify the VIP address is no longer running by using the below command
$ifconfig -a
You can also check the resource status using the crs_stat command.
Step 4:- Update the /etc/hosts file with the new VIP address or VIP hostname on node1, and also
update DNS to associate the new IP address with the VIP hostname as per the /etc/hosts file.
Step 5:- Modify VIP Address or VIP hostname on nodeapps by using srvctl command (Run
as root)
#srvctl modify nodeapps -n <node_name> [-o <oracle_home>] [-A <new_vip_address>]
Where
-n <node_name> - Node name.
-o <oracle_home> - Oracle Home for the cluster software (CRS home).
-A <new_vip_address> - The node-level VIP address (<ip>/netmask[/if1[|if2|...]]).
Example:- Modify the VIP Address to 192.168.2.41
#srvctl modify nodeapps -n rac1 -A 192.168.2.41/255.255.255.0/eth0
Use the below command to change the VIP address using the VIP hostname. The srvctl
command will resolve the IP to a hostname or the hostname to an IP address. You can use the
same command to change the VIP hostname from rac1-vip to rac01-vip:
#srvctl modify nodeapps -n rac1 -A rac01-vip/255.255.255.0/eth0
Step 6:- Verify the change by running below command
$srvctl config nodeapps -n rac1 -a
Step 7:- Start all resources
$srvctl start nodeapps -n rac1
$srvctl start asm -n rac1
$srvctl start instance -d devdb -i devdb1
Step 8:- Repeat the same steps on all remaining nodes in the cluster.

Voting Disk Backup and Recovery


The voting disk manages node membership information and is used by the Cluster Synchronization
Services daemon (CSSD).
Backing up Voting Disks: Run the below command to back up the voting disk.
$ dd if=voting_disk_name of=backup_file_name
or
$ dd if=voting_disk_name of=backup_file_name bs=4k
Recovering Voting Disks: Run the below command to recover a voting disk
$ dd if=backup_file_name of=voting_disk_name

You can change the Voting Disk Configuration dynamically after the installation.
Please note that you need to run the command as root.
Run the below command to add a voting disk:
# crsctl add css votedisk_path
You can have up to 32 voting disks.
Run the following command to remove a voting disk:
# crsctl delete css votedisk_path

Restore and Recover OCR from Backup in Oracle RAC


Backup of OCR has been covered in How to Backup OCR
Step 1: Locate the physical OCR backups using the showbackup command.
#ocrconfig -showbackup
Step 2: Review the contents
#ocrdump -backupfile backup_file_name
Step 3: Stop the Oracle clusterware on all the nodes.
#crsctl stop crs
Step 4: Restore the OCR backup
# ocrconfig -restore $CRS_HOME/cdata/crs/day.ocr
OR
Restore the OCR from an export/logical backup.
# ocrconfig -import export_file_name
For example: # ocrconfig -import /backup/oracle/exp_ocrbackup.dmp
Step 5: Restart the Clusterware on all nodes.
#crsctl start crs
Step 6: Check the OCR integrity
# cluvfy comp ocr -n all

Backup of Oracle Cluster Registry (OCR) in Oracle RAC


There are two methods for OCR Backup (Oracle Cluster Registry)
1. Automatically generated OCR files under $CRS_HOME/cdata/crs
2. OCR export/logical backup
The Oracle Clusterware automatically creates OCR backups
-Every four hours: last three copies
-At the End of the Day: last two copies
-At the end of the week: last two copies.
To back up the OCR file, copy the generated file from $CRS_HOME/cdata/crs to your backup
directory (/backup/oracle).

You must run the backup as root.
Run the below command to take an OCR export backup.
# ocrconfig -export export_file_name
With Oracle RAC 11g Release 1, you can take a manual backup of the OCR with the
command:
# ocrconfig -manualbackup
Restoration of OCR from Backup : Restore-and-Recover-OCR-from-Backup

Set Misscount,Disktimeout and reboottime for CSS in Oracle RAC


1) Disktimeout:
====================
Disk latency in seconds from node to votedisk. Default value is 200. (Disk I/O)
crsctl get css disktimeout
2) Misscount:
====================
Network latency in seconds from node to node (interconnect). Default value is 60 sec on
Linux and 30 sec on Unix platforms. (Network I/O)
Misscount < Disktimeout
crsctl get css misscount
3) RebootTime :
(default 3 seconds) - the amount of time allowed for a node to complete a reboot after the
CSS daemon has been evicted.
crsctl get css reboottime
4) To Edit these values :
==========================================
1. Shut down CRS on all nodes but one; as root, run crsctl on that remaining node.
2. $CRS_HOME/bin/crsctl set css misscount <n> [-force] (<n> is seconds)
3. $CRS_HOME/bin/crsctl set css reboottime <r> [-force] (<r> is seconds)
4. $CRS_HOME/bin/crsctl set css disktimeout <d> [-force] (<d> is seconds)
5. Reboot the remaining node (the node where you just made the change).
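For illustration, checking the current values might look like this (a sketch; the values shown are the defaults described above, and the exact output format varies by version):
$ crsctl get css misscount
60
$ crsctl get css disktimeout
200
$ crsctl get css reboottime
3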

Srvctl Major Options and Usage in Oracle RAC


Database Related Commands
============================================
srvctl start instance -d <db_name> -i <inst_name> = Starts an instance
srvctl start database -d <db_name> = Starts all instances
srvctl stop database -d <db_name> = Stops all instances, closes database

srvctl stop instance -d <db_name> -i <inst_name> = Stops an instance
srvctl start service -d <db_name> -s <service_name> = Starts a service
srvctl stop service -d <db_name> -s <service_name> = Stops a service
srvctl status service -d <db_name> = Checks status of a service
srvctl status instance -d <db_name> -i <inst_name> = Checks an individual instance
srvctl status database -d <db_name> = Checks status of all instances
srvctl start nodeapps -n <node_name> = Starts gsd, vip, listener, and ons
srvctl stop nodeapps -n <node_name> = Stops gsd, vip and listener

srvctl start nodeapps -n (node)


This will bring up the gsd, ons, listener, and vip. The same command can shut down the
nodeapps by replacing start with stop.
srvctl start asm -n (node)
This will bring up our ASM instances on nodes
srvctl start instance -d (database) -i (instance)
This will bring up our Database Instance
srvctl start service -d (database) -s (service)
Will start a load balanced/TAF service.
Status of all instances and services
srvctl status database -d devdb
Instance devdb1 is running on node rac1
Instance devdb2 is running on node rac2
Status of a single instance
srvctl status instance -d devdb -i devdb2
Instance devdb2 is running on node rac2
Status of node applications on a particular node
srvctl status nodeapps -n rac1
VIP is running on node: rac1
GSD is running on node: rac1
Listener is running on node: rac1
ONS daemon is running on node: rac1
srvctl status nodeapps -n rac2
VIP is running on node: rac2
GSD is running on node: rac2
Listener is running on node: rac2
ONS daemon is running on node: rac2
Status of an ASM instance
srvctl status asm -n rac1
ASM instance +ASM1 is running on node rac1.
srvctl status asm -n rac2
ASM instance +ASM2 is running on node rac2.

List all configured databases
srvctl config database
devdb
Display configuration for our RAC database
srvctl config database -d devdb
rac1 devdb1 /u01/app/oracle/product/10.2.0/db_1
rac2 devdb2 /u01/app/oracle/product/10.2.0/db_1
Display the configuration for node applications - (VIP, GSD, ONS, Listener)
srvctl config nodeapps -n rac1 -a -g -s -l
VIP exists.: /rac1-vip/192.168.2.31/255.255.255.0/eth0
GSD exists.
ONS daemon exists.
Listener exists.
Display the configuration for the ASM instance(s)
srvctl config asm -n rac1
+ASM1 /u01/app/oracle/product/10.2.0/db_1
Sequence to Start and Stop Services : Sequence-to-Start-and-Stop-RAC-Services

Sequence to Start and Stop RAC Services


Prior to starting these services, the RAC Clusterware daemons should be up and running.
For Details Refer to : Starting-Stopping-Enable-Disable-Clusterware-Processes-Daemons
Sequence to Start and stop RAC Services
==================================
Follow the steps below to start the individual application resources.
srvctl start nodeapps -n <node1 hostname>
srvctl start nodeapps -n <node2 hostname>
srvctl start asm -n <node1 hostname>
srvctl start asm -n <node2 hostname>
srvctl start database -d <database name>
srvctl start service -d <database name> -s <service name>
crs_stat -t

Follow the steps below to stop the individual application resources.
srvctl stop service -d <database name> -s <service name>
srvctl stop database -d <database name>
srvctl stop asm -n <node1 hostname>
srvctl stop asm -n <node2 hostname>
srvctl stop nodeapps -n <node1 hostname>
srvctl stop nodeapps -n <node2 hostname>
crs_stat -t

For complete details on Usage of srvctl , Use : Srvctl-Major-Options-and-Usage

Start ,Stop,Enable,Disable Clusterware Processes / Daemons in Oracle RAC


Starting and stopping the Clusterware processes needs to be done as the root user.
Here ohasd refers to the Oracle High Availability Services Daemon (see: Oracle Clusterware
components).
components
crsctl start/stop crs - Starts/stops the entire Oracle Clusterware stack on a node,
including the OHASD process. This command is to be used only on the local node.
crsctl start/stop cluster - Starts/stops the Oracle Clusterware stack on the local node if
you do not specify either -all or -n, and on remote nodes if the -n or -all option is specified,
NOT including the OHASD process. You cannot start/stop the Clusterware stack without the
OHASD process running.
crsctl check crs
This command verifies that the above background daemons are functioning.
To check the processes: there are three main background processes you can see when doing
ps -ef | grep d.bin

crsctl disable crs


This command will prevent CRS from starting on a reboot. Note that there is no output
returned by the command.
crsctl enable crs
Enable CRS on reboot

Instance Recovery in RAC


The CRS daemons first detect node and instance failure and communicate the failure
status to the GCS by way of the LMON process. At this stage, any surviving instance in the
cluster initiates the recovery process.
1) During the first phase of recovery, which is the GES reconfiguration, Oracle first
reconfigures the GES enqueues. Then Oracle reconfigures the GCS resources. During this
time, all GCS resource requests and write requests are temporarily suspended. However,
processes and transactions can continue to modify data blocks as long as these processes
and transactions have already acquired the necessary enqueues.
2) After the reconfiguration of enqueues that the GES controlled, a log read and the
remastering of GCS resources occur in parallel. LMON does the remastering while SMON
determines the recovery set using the redo logs of the failed instances. At the end of this
step the block resources that need to be recovered have been identified.

If multiple instances have failed, then a redo merge of the redo logs of all the failed instances
is used to determine the recovery set.
3) Buffer space for recovery is allocated and the resources that were identified in the
previous reading of the log are claimed as recovery resources. (Therefore size the buffer
cache for RAC at least 25% larger than for single instances.)
Then, assuming that there are PIs of blocks to be recovered in other caches in the cluster
database, resource buffers are requested from other instances. The resource buffers are the
starting point of recovery for a particular block.
4) All resources and enqueues required for subsequent processing have been acquired and
the Global Resource Directory is now unfrozen. Any data blocks that are not in recovery set
can now be accessed. Note that the system is already partially available.
5) The cache layer recovers and writes each block identified in step 2, releasing the
recovery resources immediately after block recovery so that more blocks become available
as cache recovery proceeds.
6) After all blocks have been recovered and the recovery resources have been released, the
system is again fully available. Recovered blocks are available after recovery completes.

Time Difference between RAC Nodes


If the time difference between the RAC nodes is out of sync (time difference > 30 sec), then
it will result in one of the following issues:
1. CRS installation failure on the remote node
2. RAC node reboots periodically
3. CRS application status UNKNOWN or OFFLINE
To avoid these issues, configure NTP (Network Time Protocol) on both nodes using any one
of the following methods
1. system-config-time or system-config-date or dateconfig
Type the command system-config-time, system-config-date, or dateconfig at a terminal --> click
Network Time Protocol --> check Enable Network Time Protocol and select an NTP server -->
click OK
2. date MMDDhhmm[[CC]YY]
Type the date command with the current date and time
3. /etc/ntp.conf
Update the /etc/ntp.conf file with the timeservers' IP addresses and start or restart the NTP daemon
$ /etc/init.d/ntp start
or
$ /etc/rc.d/init.d/ntp start
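A minimal /etc/ntp.conf sketch (the timeserver address 192.168.2.1 is hypothetical):
# /etc/ntp.conf
server 192.168.2.1
After editing the file, restart the daemon as shown above so the new server takes effect.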
Once the RAC nodes are time-synced, you might need to stop and restart the CRS:

62

RAC
$ crs_stop -all
$ crs_start -all

Cache Fusion in Oracle RAC


Cache Fusion is a shared cache architecture that uses high speed low latency interconnects
available today on clustered systems to maintain database cache coherency. Database
blocks are shipped across the interconnect to the node where access to the data is needed.
This is accomplished transparently to the application and users of the system.
Overview of Cache Fusion Processing
By default, a resource is allocated for each data block that resides in the cache of an
instance. Due to Cache Fusion and the elimination of disk writes that occur when other
instances request blocks for modifications, the performance overhead to manage shared
data between instances is greatly diminished. Not only do Cache Fusion's concurrency
controls greatly improve performance, but they also reduce the administrative effort for Real
Application Clusters environments.
Cache Fusion addresses several types of concurrency as described under the
following headings:
Concurrent Reads on Multiple Nodes
Concurrent Reads and Writes on Different Nodes
Concurrent Writes on Different Nodes

Concurrent Reads on Multiple Nodes

Concurrent reads on multiple nodes occur when two instances need to read the same
data block. Real Application Clusters resolves this situation without synchronization because
multiple instances can share data blocks for read access without cache coherency conflicts.
Concurrent Reads and Writes on Different Nodes
A read request from an instance for a
block that was modified by another instance and not yet written to disk can be a request for
either the current version of the block or for a read-consistent version. In either case, the
Global Cache Service Processes (LMSn) transfer the block from the holding instance's cache
to the requesting instance's cache over the interconnect.
Concurrent Writes on Different Nodes
Concurrent writes on different nodes occur when the same data block is modified frequently
by different instances. In such cases, the holding instance completes its work on the data
block after receiving a request for the block. The GCS then converts the resources on the
block to be globally managed and the LMSn processes transfer a copy of the block to the
cache of the requesting instance. The main features of this processing are:
The Global Cache Service (GCS) tracks each version of a data block, and each version is
referred to as a past image (PI). In the event of a failure, Oracle can reconstruct the
current version of a block by using the information in a PI.
The cache-to-cache data transfer is done through the high speed IPC interconnect, thus
eliminating disk I/O.
Cache Fusion limits the number of context switches because of the reduced sequence of

round trip messages. Reducing the number of context switches enables greater cache
coherency protocol efficiency. The database writer (DBWn) processes are not involved in
Cache Fusion block transfers.
Write Protocol and Past Image Tracking
When an instance requests a block for modification, the Global Cache Service Processes
(LMSn) send the block from the instance that last modified it to the requesting instance. In
addition, the LMSn process retains a PI of the block in the instance that originally held it.
Writes to disks are only triggered by cache replacements and during checkpoints. For
example, consider a situation where an instance initiates a write of a data block and the
block's resource has a global role. However, the instance only has the PI of the block and
not the most current buffer. Under these circumstances, the instance informs the GCS and
the GCS forwards the write request to the instance where the most recent version of the
block is held. The holder then sends a completion message to the GCS. Finally, all other
instances with PIs of the block delete them.
Resource Control, Cache-to-Cache Transfer, and Cache Coherency
The GCS assigns and opens resources for each data block read into an instance's buffer
cache. Oracle closes resources when the resources do not manage any more buffers or
when buffered blocks are written to disk due to cache replacement. When Oracle closes a
resource, it returns the resource to a list from which Oracle can assign new resources.
Block Access Modes and Buffer States
An additional concurrency control concept is the buffer state which is the state of a buffer in
the local cache of an instance. The buffer state of a block relates to the access mode of the
block. For example, if a buffer state is exclusive current (XCUR), an instance owns the
resource in exclusive mode.
To see a buffer's state, query the STATUS column of the V$BH dynamic performance view.
The V$BH view provides information about the block access mode and their buffer state
names as follows:
- With a block access mode of NULL, the buffer state name is CR -- An instance can perform a consistent read of the block (that is, if the instance holds an older version of the data).
- With a block access mode of S, the buffer state name is SCUR -- An instance has shared access to the block and can only perform reads.
- With a block access mode of X, the buffer state name is XCUR -- An instance has exclusive access to the block and can modify it.
- With a block access mode of NULL, the buffer state name is PI -- An instance has made changes to the block but retains copies of it as past images to record its state before changes.
Only the SCUR and PI buffer states are Real Application Clusters-specific. There can be only
one copy of any one block buffered in the XCUR state in the cluster database at any time. To
perform modifications on a block, a process must assign an XCUR buffer state to the buffer
containing the data block.
For example, if another instance requests read access to the most current version of the
same block, then Oracle changes the access mode from exclusive to shared, sends a current
read version of the block to the requesting instance, and keeps a PI buffer if the buffer
contained a dirty block.
At this point, the first instance has the current block and the requesting instance also has
the current block in shared mode. Therefore, the role of the resource becomes global. There
can be multiple shared current (SCUR) versions of this block cached throughout the cluster
database at any time.
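
The buffer states described above can be observed directly from V$BH. A minimal sketch (the exact mix of states you see depends entirely on the workload at that moment):

SQL> SELECT status, COUNT(*) FROM v$bh GROUP BY status;

Running this on each instance, or against GV$BH, shows how many buffers are currently held in the xcur, scur, cr, and pi states across the cluster.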

Strategies to Apply Patch to RAC Instances

1) Patching Oracle RAC Nodes Individually - High Availability (Rolling Patching)
In Rolling Patching, each node is shut down, the patch is applied, then each node is brought
back up again. This is done node by node separately until all nodes in Oracle RAC are
patched. This is the most efficient mode of applying an interim patch to an Oracle RAC setup
because this results in no downtime. Only some patches can be applied in this mode. The
type is generally specified in the patch metadata.
The main advantage of this type of patching is that there is absolutely no downtime while
applying patches because only one system is brought down at any given time.
This is best suited when we are applying one-off patches that support this methodology, and
want to maintain high availability of your targets, so when one node is being patched, the
other nodes are available for service.
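
A minimal sketch of one rolling cycle on the first node (the patch directory is a placeholder; the database and instance names devdb/devdb1 follow the naming used elsewhere in this document):

$ srvctl stop instance -d devdb -i devdb1
$ cd <patch directory>
$ opatch apply -local
$ srvctl start instance -d devdb -i devdb1

The same cycle is then repeated on each remaining node, so the service remains available throughout.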
2) Patching Oracle RAC in Offline Mode
a) Patching Oracle RAC Nodes Individually when each node has its own
ORACLE_HOME (individual)
Option 1 : In All Node Patching, all Oracle RAC nodes are initially brought down and the
patch is applied on all the nodes, then all the nodes are brought back up. This mode is
normally used for very critical patches and it leads to maximum downtime. OPatch uses this
mode as the default for patch applications unless specified otherwise.
Option 2 : In Minimum Downtime Patching, the nodes are divided into sets. Initially, the
first set is shut down and the patch is applied to it. After this, the second set is shut down.
The first set is brought up and patch is applied to the second set. The second set is also
brought up now. All the nodes in the Real Application Clusters are now patched. This mode
leads to less downtime for the Real Application Clusters when both the sets are brought
down.
b) Patching Oracle RAC Nodes Collectively - This needs to be done when we need
to patch a shared ORACLE_HOME
All Oracle RAC nodes are brought down and the patch is applied on one of the nodes, then
all the nodes are brought back up.
For details on how to use opatch, refer to: Applying-database-patch-using-opatch-and-opatch-options

Global Resource Directory, GCS and GES

GRD stands for Global Resource Directory. The GES and GCS maintain records of the
statuses of each datafile and each cached block using the Global Resource Directory. This
process is referred to as cache fusion and helps ensure data integrity.
Oracle RAC is composed of two or more instances. When a block of data is read from a
datafile by an instance within the cluster and another instance needs the same block, it is
cheaper to get the block image from the instance that has the block in its SGA than to read
it from disk. To enable inter-instance communication, Oracle RAC makes use of the
interconnect. The Global Enqueue Service (GES) monitors it, and the Instance enqueue
process manages cache fusion.
To understand: the GCS and GES are not themselves processes. They are implemented by
the LMD, LMS, and other RAC background processes coordinating with one
another on multiple nodes to handle global resources.
The GCS and GES maintain a Global Resource Directory to record information about
resources. The Global Resource Directory resides in memory, is distributed throughout the
cluster, and is available to all active instances. In this distributed architecture, each node
participates in the management of information in the directory. This distributed scheme
provides fault tolerance and enhanced runtime performance.
The GCS and GES ensure the integrity of the Global Resource Directory even if multiple
nodes fail. The shared database is always accessible if at least one instance is active after
recovery is completed. The fault tolerance of the resource directory also enables Real
Application Clusters instances to start and stop at any time, in any order.
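
The effect of this distributed resource management can be observed from any active instance. A small sketch using GV$SYSSTAT (statistic names as in 10g):

SQL> SELECT inst_id, name, value
  2  FROM gv$sysstat
  3  WHERE name IN ('gc cr blocks received', 'gc current blocks received');

Steadily increasing values indicate blocks being shipped between instances through Cache Fusion rather than being re-read from disk.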

Purpose of the ONS Daemon in RAC

The Oracle Notification Service (ONS) daemon is a daemon started by the CRS clusterware
as part of the nodeapps. There is one ONS daemon started per clustered node.
The ONS daemon receives a subset of published clusterware events via the local evmd and
racgimon clusterware daemons and forwards those events to application subscribers and to
the local listeners.
This is in order to facilitate:
a. FAN (Fast Application Notification), allowing applications to respond to database state changes.
b. the 10gR2 Load Balancing Advisory, the feature that permits load balancing across
different RAC nodes depending on the load on each node. The RDBMS MMON process
creates an advisory for the distribution of work every 30 seconds and forwards it via
racgimon and ONS to the listeners and applications.
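
To verify that the ONS daemon on a node is alive, onsctl can be used (a quick sketch; in 10g onsctl lives under $ORACLE_HOME/opmn/bin):

$ $ORACLE_HOME/opmn/bin/onsctl ping
$ $ORACLE_HOME/opmn/bin/onsctl debug

onsctl ping reports whether the daemon is running, and onsctl debug dumps its current connections and configuration.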

Oracle database background processes specific to RAC

Lock monitor (LMON) process:
The LMON process monitors all instances in a cluster to detect the failure of an instance. It
then facilitates the recovery of the global locks held by the failed instance. It is also
responsible for reconfiguring locks and other resources when instances leave or are added
to the cluster (as they fail and come back online, or as new instances are added to the
cluster in real time).
Lock manager daemon (LMD) process:
The LMD process handles lock manager service requests for the global cache service
(keeping the block buffers consistent between instances). It works primarily as a broker
sending requests for resources to a queue that is handled by the LMSn processes. The LMD
handles global deadlock detection/resolution and monitors for lock timeouts in the global
environment.
Lock manager server (LMSn) process:
In a RAC environment, each instance of Oracle is running on a different machine in a cluster,
and they all access, in a read-write fashion, the same exact set of database files. To achieve
this, the SGA block buffer caches must be kept consistent with respect to each other. This is
one of the main goals of the LMSn process. In earlier releases of Oracle Parallel Server (OPS),
this was accomplished via a ping. That is, if a node in the cluster needed a read-consistent
view of a block that was locked in exclusive mode by another node, the exchange of data
was done via a disk flush (the block was pinged). This was a very expensive operation just
to read data. Now, with the LMSn, this exchange is done via very fast cache-to-cache
exchange over the cluster's high-speed connection. You may have up to ten LMSn processes
per instance.
Its primary job is to transport blocks across the nodes for cache-fusion requests. If there is
a consistent-read request, the LMS process rolls back the block, makes a consistent-read
image of the block, and then ships this block across the HSI (High Speed Interconnect) to
the requesting process on the remote node.
Lock (LCK0) process:
This process is very similar in functionality to the LMD process described earlier, but it
handles requests for all global resources other than database block buffers.
Diagnosability daemon (DIAG) process:
The DIAG process is used exclusively in a RAC environment. It is responsible for monitoring
the overall health of the instance, and it captures information needed in the processing of
instance failures.
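
Which of these RAC-specific background processes are actually running can be checked from the instance itself. A minimal sketch against V$BGPROCESS (the paddr <> '00' predicate filters out processes that are not started):

SQL> SELECT name, description
  2  FROM v$bgprocess
  3  WHERE paddr <> '00'
  4  AND (name LIKE 'LM%' OR name LIKE 'LCK%' OR name = 'DIAG');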

Oracle Clusterware processes and Components

Oracle Clusterware processes for 10g
Cluster Synchronization Services (ocssd) - Manages cluster node membership and runs as
the oracle user; failure of this process results in cluster restart.
Cluster Ready Services (crsd) - The crsd process manages cluster resources (which could be
a database, an instance, a service, a listener, a virtual IP (VIP) address, an application
process, and so on) based on the resource's configuration information that is stored in the
OCR. This includes start, stop, monitor, and failover operations. This process runs as the
root user.
Event manager daemon (evmd) - A background process that publishes events that crs
creates.
Process Monitor Daemon (OPROCD) - This process monitors the cluster and provides I/O
fencing. OPROCD performs its check, stops running, and if the wakeup is beyond the
expected time, then OPROCD resets the processor and reboots the node. An OPROCD failure
results in Oracle Clusterware restarting the node. OPROCD uses the hangcheck timer on
Linux platforms.
RACG (racgmain, racgimon) - Extends clusterware to support Oracle-specific requirements
and complex resources. Runs server callout scripts when FAN events occur.
Oracle Clusterware Components
Voting Disk - Oracle RAC uses the voting disk to manage cluster membership by way of a
health check, and arbitrates cluster ownership among the instances in case of network
failures. The voting disk must reside on shared disk.
Oracle Cluster Registry (OCR) - Maintains cluster configuration information as well as
configuration information about any cluster database within the cluster. The OCR must
reside on shared disk that is accessible by all of the nodes in your cluster.
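
Both components can be checked from the command line (a quick sketch; run from a node with access to the clusterware home):

$ crsctl query css votedisk
$ ocrcheck

crsctl query css votedisk lists the configured voting disks, and ocrcheck reports the location, size, and integrity of the OCR.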

Add Node to a Cluster

Step 1: Account for Dependencies and Prerequisites
The new node should:
- have the same version of the operating system as the existing nodes, including all patches required for Oracle
- maintain the current naming convention
- have the same required rpms installed
- have the kernel parameters set to the same values as the other nodes
- have the same oracle user and group
- have the GID of the oracle user identical to that of the other RAC nodes
- have the same ulimits for the oracle user as the other nodes
1) Add the entries for the new host (public IP, private IP, and VIP) in /etc/hosts on all the
servers.
2) Set up user equivalence with SSH.
When adding nodes to the cluster, Oracle copies files from the node where the installation
was originally performed to the new node in the cluster. Such a copy process is performed
either by using the ssh protocol where available or by using remote copy (rcp). In order for
the copy operation to be successful, the oracle user on the RAC node must be able to log in
to the new RAC node without having to provide a password or passphrase.
Ensure passwordless command execution:
ssh rac1 hostname
ssh rac3 hostname
ssh rac1-priv hostname
ssh rac3-priv hostname
Note: When performing these tests for the first time, the operating system will display a key
and request the user to accept or decline. Enter Yes to accept and register the key. Tests
should be performed on all other nodes across all interfaces in the cluster except for the
VIP.
Step 2: Install Oracle Clusterware
Oracle Clusterware is already installed on the cluster; the task here is to add the new node
to the clustered configuration. This task is performed by executing the Oracle provided
utility called addNode.sh located in the Clusterware home's oui/bin directory. Oracle
Clusterware has two files (Oracle cluster repository, OCR; and Oracle Cluster
Synchronization service, CSS, voting disk) that contain information concerning the cluster
and the applications managed by the Oracle Clusterware. These files need to be updated
with the information concerning the new node. The first step in the clusterware installation
process is to verify if the new node is ready for the install.
Cluster Verification. In Oracle Database 10g Release 2, Oracle introduced a new utility called
the Cluster Verification Utility (CVU) as part of the clusterware software. Executing the utility
using the appropriate parameters determines the status of the cluster. At this stage, before
beginning installation of the Oracle Clusterware, you should perform two verifications:
If the hardware and operating system configuration is complete:
a) cluvfy stage -post hwos -n rac1,rac3
b) cluvfy stage -pre crsinst -n rac1,rac3
Step 3: Configure Oracle Clusterware
Using the OUI requires that the terminal from which the installer is run be X-windows
compatible. If not, an appropriate X-windows emulator should be installed and the emulator
invoked after setting the DISPLAY environment variable using the following syntax:
export DISPLAY=<client IP address>:0.0
The next step is to configure the Clusterware on the new node rac3.
For this, as mentioned earlier, Oracle has provided a new executable called addNode.sh
located in the <Clusterware Home>/oui/bin directory.
Execute the script <Clusterware Home>/oui/bin/addNode.sh.
Welcome - click on Next.
Specify Cluster Nodes to Add to Installation - In this screen, OUI lists existing nodes in the
cluster and in the bottom half of the screen lists the new node(s) information to be added in
the appropriate columns. Once the information is entered click on Next.
Public Node Name Private Node Name Virtual Host Name
rac3 rac3-priv rac3-vip
Cluster Node Addition Summary - Verify the new node is listed under the New Nodes
drilldown and click on Install.
Once all required clusterware components are copied from rac1 to rac3, OUI prompts to
execute three files:
/usr/app/oracle/oraInventory/orainstRoot.sh on node rac3
[root@rac3 oraInventory]# ./orainstRoot.sh
Changing permissions of /u01/app/oracle/oraInventory to 770.
Changing groupname of /u01/app/oracle/oraInventory to dba.
The execution of the script is complete
Execute $ORA_CRS_HOME/install/rootaddnode.sh on node rac1.
(The rootaddnode.sh script adds the new node information to the OCR using the srvctl
utility. Note the srvctl command with the nodeapps parameter at the end of the script
output below.)
[root@rac1 install]# ./rootaddnode.sh
Execute $ORA_CRS_HOME/root.sh on node rac3.
[root@rac3]# ./root.sh
Invoke VIPCA, also as root. (VIPCA will also configure the GSD and ONS resources on the
new node.)
Welcome - click on Next.
On completion of the Oracle Clusterware installation, the following files are created in their
respective directories.
Clusterware files:
[root@rac3 root]# ls -ltr /etc/init.d/init.*
The operating system provided inittab file is updated with the following entries.
[root@rac3 root]# tail -5 /etc/inittab
....
Click on OK after all the listed scripts have run on all nodes.
End of Installation Click on Exit.
Verify if the Clusterware has all the nodes registered using the olsnodes command.
[oracle@rac1 oracle]$ olsnodes
rac1
rac2
rac3
[oracle@rac1 oracle]$
Verify if the cluster services are started, using the crs_stat command.
[oracle@rac1 oracle]$ crs_stat -t
Name Type Target State Host
Verify if the VIP services are configured at the OS level. The virtual IP address is configured
and added to the OS network configuration and the network services are started. The VIP
configuration could be verified using the ifconfig command at the OS level.
[oracle@rac3 oracle]$ ifconfig -a
Note: eth0:1 indicates that it is a VIP address for the basic host eth0. When the node fails
eth0:1 will be moved to a surviving node in the cluster. The new identifier for the VIP on the
failed over server will be indicated by eth0:2 or higher, depending on what other nodes have
failed in the cluster and the VIP has migrated.
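
The clusterware view of the same resources can be confirmed with srvctl (a quick check; rac3 is the node name from this example):

[oracle@rac1 oracle]$ srvctl status nodeapps -n rac3

This reports the state of the VIP, GSD, ONS, and listener on the new node.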
Step 4: Install Oracle Software
The next step is to install the Oracle software on the new node. As mentioned earlier, Oracle
has provided a new executable called addNode.sh located in the $ORACLE_HOME/oui/bin
directory.
Execute the script $ORACLE_HOME/oui/bin/addNode.sh.
Welcome - click on Next.
Specify Cluster Nodes to Add to Installation - In this screen OUI lists existing nodes in the
cluster and in the bottom half of the screen lists the new node(s). Select the node rac3.
Once the information is entered, click on Next.
Cluster Node Addition Summary - Verify if the new node is listed under the New Nodes
drilldown and click on the Install button.
When copy of Oracle software to node rac3 is complete, OUI will prompt you to execute
/u01/app/oracle/product/10.2.0/db_1/root.sh script from another window as the root user
on the new node(s) in the cluster.
[root@rac3 db_1]# ./root.sh
Running Oracle10 root.sh script...
Step 5: Add New Instance(s)
DBCA has all the required options to add additional instances to the cluster.
Requirements:
Make a full cold backup of the database before commencing the upgrade process.
Oracle Clusterware should be running on all nodes.
Welcome screen - select Oracle Real Application Cluster database and click on Next.
Step 1 of 7: Operations - A list of all operations that can be performed using the DBCA are
listed. Select Instance Management and click on Next.
Step 2 of 7: Instance Management - A list of instance management operations that can be
performed are listed. Select Add an Instance and click on Next.
Step 3 of 7: List of cluster databases - A list of clustered databases running on the node are
listed. In this case the database running on node rac1 is devdb; select this database. In the
bottom part of the screen, DBCA requests you to Specify a user with SYSDBA system
privileges:
Username: sys
Password: < > and click on Next.
Step 4 of 7: List of cluster database instances - The DBCA lists all the instances currently
available on the cluster. Verify if all instances are listed and click on Next.
Step 5 of 7: Instance naming and node selection - DBCA lists the next instance name in the
series and requests the node on which to add the instance. In our example the next
instance name is devdb3 and the node name is rac3. Click on Next after making the
appropriate selection. At this stage there is a small pause before the next screen appears as
DBCA determines the current state of the new node and what services are configured on the
existing nodes.
Step 6 of 7: Database Services - If the current configuration has any database services
configured, this screen will appear (else is skipped). Make the appropriate selections and
click on Next when ready.
Step 7 of 7: Instance Storage - In this screen, DBCA will list the instance-specific files such
as undo tablespaces, redo log groups, and so on. Verify if all required files are listed and
click on Finish.
Database Configuration Assistant: Summary - After verifying the summary, click on OK to
begin the software installation.
DBCA verifies the new node rac3, and as the database is configured to use ASM, prompts
with the message ASM is present on the cluster but needs to be extended to the following
nodes: [rac3]. Do you want ASM to be extended? Click on Yes to add ASM to the new
instance.
In order to create and start the ASM instances on the new node, Oracle requires the
Listener to be present and started. DBCA prompts with requesting permission to configure
the listener using port 1521 and listener name LISTENER_rac3. Click on Yes if the default
port is good, else click on No and manually execute NetCA on rac3 to create the listener
using a different port.
Database Configuration Assistant progress screen - Once instance management is complete,
the user is prompted with the message "Do you want to perform another operation?" Click
on No to end.
At this stage, the following is true:
The clusterware has been installed on node rac3 and is now part of the cluster.
The Oracle software has been installed on node rac3.
The new ASM instance and the new Oracle instance devdb3 have been created and configured on rac3.
Verify that the node addition is successful.
Verify that all instances in the cluster are started using the V$ACTIVE_INSTANCES view
from any of the participating instances. For example:
SQL> select * from v$active_instances;
SQL> SELECT NAME,STATE,TYPE FROM V$ASM_DISKGROUP;
SQL> SELECT NAME FROM V$DATAFILE;
[oracle@rac1 oracle]$ srvctl status database -d devdb
Instance devdb3 is running on node rac3
[oracle@rac1 oracle]$ srvctl status service -d devdb
Step 6: Set the Environment and the TNS entries
For easy administration and navigation, you should define several different environment
variables in the login profile.
[oracle@rac3 oracle]$ more .bash_profile
export ORACLE_BASE=...
export ORACLE_HOME=..
export ORA_CRS_HOME=..
export PATH=..
export ORACLE_ADMIN=..
export TNS_ADMIN=..
Also add the new node's address to the appropriate connect descriptors in the client
tnsnames.ora file.
<conn_identifier> =
(DESCRIPTION =
(ADDRESS = (PROTOCOL = TCP)(HOST = rac1-vip)(PORT = 1521))
(ADDRESS = (PROTOCOL = TCP)(HOST = rac2-vip)(PORT = 1521))
(ADDRESS = (PROTOCOL = TCP)(HOST = rac3-vip)(PORT = 1521))
(LOAD_BALANCE = yes)
(CONNECT_DATA =
(SERVER = DEDICATED)
(SERVICE_NAME = ..)
)
)
If the servers are configured to use FAN features, add the new server address to
the ons.config files on all database servers.
The ons.config file is located in:
[oracle@oradb4 oracle]$ more $ORACLE_HOME/opmn/conf/ons.config
localport=6101
remoteport=6201
loglevel=3
useocr=on
nodes=rac2.dbalounge.com:6201,rac1.dbalounge.com:6201,rac3.dbalounge.com:6201,onsclient1.dbalounge.com:6200,onsclient2.dbalounge.com:6200

Step by Step Creation of ASM Instance

Overview
Automatic Storage Management (ASM) is a new feature in Oracle10g that relieves the DBA
of having to manually manage and tune the disks used by Oracle databases. ASM provides
the DBA with a file system and volume manager that makes use of an Oracle instance
(referred to as an ASM instance) and can be managed using either SQL or Oracle Enterprise
Manager.
Only one ASM instance is required per node. The same ASM instance can manage ASM
storage for all 10g databases running on the node.
When the DBA installs the Oracle10g software and creates a new database, creating an ASM
instance is a snap. The DBCA provides a simple check box and an easy wizard to create an
ASM instance as well as an Oracle database that makes use of the new ASM instance for
ASM storage. But what happens when the DBA is migrating to Oracle10g, or didn't opt to
use ASM when a 10g database was first created? The DBA will need to know how to
manually create an ASM instance, and that is what this article provides.
Configuring Oracle Cluster Synchronization Services (CSS)
Automatic Storage Management (ASM) requires the use of Oracle Cluster Synchronization
Services (CSS), and as such, CSS must be configured and running before attempting to use
ASM. The CSS service is required to enable synchronization between an ASM instance and
the database instances that rely on it for database file storage.
In a non-RAC environment, the Oracle Universal Installer will configure and start a single-node version of the CSS service. For Oracle Real Application Clusters (RAC) installations, the
CSS service is installed with Oracle Cluster Ready Services (CRS) in a separate Oracle home
directory (also called the CRS home directory). For single-node installations, the CSS service
is installed in and runs from the same Oracle home as the Oracle database.
Because CSS must be running before any ASM instance or database instance starts, Oracle
Universal Installer configures it to start automatically when the system starts. For Linux /
UNIX platforms, the Oracle Universal Installer writes the CSS configuration tasks to the
root.sh script, which is run by the DBA after the installation process.
With Oracle10g R1, CSS was always configured regardless of whether you chose to
configure ASM or not. On the Linux / UNIX platform, CSS was installed and configured via
the root.sh script. This caused a lot of problems since many did not know what this process
was, and for most of them, didn't want the CSS process running since they were not using
ASM.
Oracle listened carefully to the concerns (and strongly worded complaints) about the CSS
process and in Oracle10g R2, will only configure this process when it is absolutely
necessary. In Oracle10g R2, for example, if you don't choose to configure an ASM standalone instance or if you don't choose to configure a database that uses ASM storage, Oracle
will not automatically configure CSS in the root.sh script.
In the case where the CSS process is not configured to run on the node (see above), you
can make use of the $ORACLE_HOME/bin/localconfig script in Linux / UNIX or
%ORACLE_HOME%\bin\localconfig.bat batch file in Windows. For example in Linux, run the
following command as root to configure CSS outside of the root.sh script after the fact:
$ su
# $ORACLE_HOME/bin/localconfig all
Creating the ASM Instance
The following steps can be used to create a fully functional ASM instance named +ASM. The
node I am using in this example also has a regular 10g database running named PRODDB.
These steps should all be carried out by the oracle UNIX user account:
1. Create Admin Directories
We start by creating the admin directories from the ORACLE_BASE. The admin directories
for the existing database on this node, (PRODDB), is located at
$ORACLE_BASE/admin/PRODDB. The new +ASM admin directories will be created alongside
the PRODDB database:
mkdir -p $ORACLE_BASE/admin/+ASM/bdump
mkdir -p $ORACLE_BASE/admin/+ASM/cdump
mkdir -p $ORACLE_BASE/admin/+ASM/hdump
mkdir -p $ORACLE_BASE/admin/+ASM/pfile
mkdir -p $ORACLE_BASE/admin/+ASM/udump

2. Create Instance Parameter File

In this step, we will manually create an instance parameter file for the ASM instance. This is
actually an easy task, as most of the parameters that are used for a normal instance are not
used for an ASM instance. Note that you should be fine accepting the default size for the
database buffer cache, shared pool, and many of the other SGA memory structures. The only
exception is the large pool; I like to manually set this value to at least 12MB. In most cases,
the SGA memory footprint is less than 100MB. Let's start by creating the file init.ora and
placing that file in $ORACLE_BASE/admin/+ASM/pfile. The initial parameters to use for the
file are:
# vi $ORACLE_BASE/admin/+ASM/pfile/init.ora
asm_diskstring='/dev/raw/*'
background_dump_dest=/u01/app/oracle/admin/+ASM/bdump
core_dump_dest=/u01/app/oracle/admin/+ASM/cdump
user_dump_dest=/u01/app/oracle/admin/+ASM/udump
instance_type=asm
compatible=10.1.0.4.0
large_pool_size=12M
remote_login_passwordfile=exclusive
After creating the $ORACLE_BASE/admin/+ASM/pfile/init.ora file, UNIX users should create
the following symbolic link:
$ ln -s $ORACLE_BASE/admin/+ASM/pfile/init.ora $ORACLE_HOME/dbs/init+ASM.ora
3. Identify RAW Devices
Before starting the ASM instance, we should identify the RAW device(s) (UNIX) or logical
drives (Windows) that will be used as ASM disks. For the purpose of this article, I have four
RAW devices setup on Linux:
# ls -l /dev/raw/raw[1234]
crw-rw----  1 oracle  dba  162,  1 Jun  2 22:04 /dev/raw/raw1
crw-rw----  1 oracle  dba  162,  2 Jun  2 22:04 /dev/raw/raw2
crw-rw----  1 oracle  dba  162,  3 Jun  2 22:04 /dev/raw/raw3
crw-rw----  1 oracle  dba  162,  4 Jun  2 22:04 /dev/raw/raw4

Starting the ASM Instance

Once the instance parameter file is in place, it is time to start the ASM instance. It is
important to note that an ASM instance never mounts an actual database. The ASM instance
is responsible for mounting and managing disk groups.
# su - oracle
$ ORACLE_SID=+ASM; export ORACLE_SID
$ sqlplus "/ as sysdba"
SQL> startup
SQL> create spfile from pfile='/u01/app/oracle/admin/+ASM/pfile/init.ora';
File created.
SQL> shutdown
ASM instance shutdown
SQL> startup
ASM instance started
On Windows, the equivalent command would be:
SQL> create spfile from pfile='C:\oracle\product\10.1.0\admin\+ASM\pfile\init.ora';
You will notice when starting the ASM instance, we received the error:
ORA-15110: no diskgroups mounted
This error can be safely ignored.
Notice also that we created a server parameter file (SPFILE) for the ASM instance. This
allows Oracle to automatically record new disk group names in the asm_diskgroups instance
parameter, so that those disk groups can be automatically mounted whenever the ASM
instance is started.
Now that the ASM instance is started, all other Oracle database instances running on the
same node will be able to find it.
Verify RAW / Logical Disk Are Discovered
At this point, we have an ASM instance running, but no disk groups to speak of. ASM disk
groups are created from RAW (or logical) disks.
Available (candidate) disks for ASM are discovered by use of the asm_diskstring instance
parameter. This parameter contains the path(s) that Oracle will use to discover (or see)
these candidate disks. In most cases, you shouldn't have to set this value as the default
value is set for the supported platform.
The following table is a list of default values for asm_diskstring on supported platforms
when the value of the instance parameter is set to NULL (the value is not set):
Operating System      Default Search String
Solaris (32/64 bit)   /dev/rdsk/*
Windows NT/XP         \\.\orcldisk*
Linux (32/64 bit)     /dev/raw/*
HP-UX                 /dev/rdsk/*
Tru64 UNIX            /dev/rdisk/*
AIX                   /dev/rhdisk/*
I have four RAW devices set up on Linux. (You can follow "Creating SAN with Openfiler" in
this post to create the raw devices.)
# ls -l /dev/raw/raw[1234]
crw-rw---- 1 oracle dba 162, 1 Jun 2 22:04 /dev/raw/raw1
crw-rw---- 1 oracle dba 162, 2 Jun 2 22:04 /dev/raw/raw2
crw-rw---- 1 oracle dba 162, 3 Jun 2 22:04 /dev/raw/raw3
crw-rw---- 1 oracle dba 162, 4 Jun 2 22:04 /dev/raw/raw4
I now need to determine if Oracle can find these four disks. The view V$ASM_DISK can be
queried from the ASM instance to determine which disks are being used or may potentially
be used as ASM disks. Note that you must log into the ASM instance with SYSDBA
privileges. Here is the query that I ran from the ASM instance:
$ ORACLE_SID=+ASM; export ORACLE_SID
$ sqlplus "/ as sysdba"
SQL> SELECT group_number, disk_number, mount_status, header_status, state, path
2 FROM v$asm_disk;
Note the value of zero in the GROUP_NUMBER column for all four disks. This indicates that a
disk is available but hasn't yet been assigned to a disk group. The next section details the
steps for creating a disk group.
Creating Disk Groups
Disk Group Mirroring and Failure Groups
Before defining the type of mirroring within a disk group, you must group disks into failure
groups. A failure group is one or more disks within a disk group that share a common
resource, such as a disk controller, whose failure would cause the entire set of disks to be
unavailable to the group. In most cases, an ASM instance does not know the hardware and
software dependencies for a given disk. Therefore, unless you specifically assign a disk to a
failure group, each disk in a disk group is assigned to its own failure group. Once the failure
groups have been defined, you can define the mirroring for the disk group; the number of
failure groups available within a disk group can restrict the type of mirroring available for
the disk group. There are three types of mirroring available: external redundancy, normal
redundancy, and high redundancy.
External Redundancy - External redundancy requires only one disk location and assumes
that the disk is not critical to the ongoing operation of the database or that the disk is
managed externally with high-availability hardware such as a RAID controller.
Normal Redundancy - Normal redundancy provides two-way mirroring and requires at least
two failure groups within a disk group. Failure of one of the disks in a failure group does not
cause any downtime for the disk group or any data loss other than a slight performance hit
for queries against objects in the disk group; when all disks in the failure group are online,
read performance is typically improved because the requested data is available on more
than one disk.
High Redundancy - High redundancy provides three-way mirroring and requires at least
three failure groups within a disk group. The failure of disks in two out of the three failure
groups is for the most part transparent to the database users, as in normal redundancy
mirroring. Mirroring is managed at a very low level. Extents, not disks, are mirrored. In
addition, each disk will have a mixture of both primary and mirrored (secondary and
tertiary) extents on each disk. Although a slight amount of overhead is incurred for
managing mirroring at the extent level, it provides the advantage of spreading out the load
from the failed disk to all other disks instead of a single disk.
Disk Group Dynamic Rebalancing
Whenever you change the configuration of a disk group (whether you are adding or
removing a failure group or a disk within a failure group), dynamic rebalancing occurs
automatically to proportionally reallocate data from other members of the disk group to the
new member of the disk group. This rebalance occurs while the database is online and
available to users; any impact to ongoing database I/O can be controlled by adjusting the
value of the initialization parameter ASM_POWER_LIMIT to a lower value. Not only does
dynamic rebalancing free you from the tedious and often error-prone task of identifying hot
spots in a disk group, it also provides an automatic way to migrate an entire database from
a set of slower disks to a set of faster disks while the entire database remains online. Faster
disks are added as a new failure group in the existing disk group with the slower disks and
the automatic rebalance occurs. After the rebalance operations complete, the failure groups
containing the slower disks are dropped, leaving a disk group with only fast disks. To make
this operation even faster, both the add and drop operations can be initiated within the
same alter diskgroup command.
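
A sketch of such a migration in a single statement (the disk paths and the disk name PRODDB_DATA1_0002 are hypothetical placeholders for the new fast disks and one of the slow disks being retired):

SQL> ALTER DISKGROUP proddb_data1
  2  ADD FAILGROUP controller3 DISK '/dev/raw/raw5', '/dev/raw/raw6'
  3  DROP DISK PRODDB_DATA1_0002
  4  REBALANCE POWER 8;

The add and the drop are then resolved by a single rebalance pass instead of two.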
Create new DiskGroup
Now I will create a new disk group named PRODDB_DATA1 and assign all four discovered
disks to it. The disk group will be configured for NORMAL REDUNDANCY, which results in
two-way mirroring of all files within the disk group. Within the disk group, I will be
configuring two failure groups, which define two independent sets of disks that should never
contain more than one copy of mirrored data (mirrored extents).
The new disk group should be created from the ASM instance using the following SQL:
SQL> CREATE DISKGROUP PRODDB_data1 NORMAL REDUNDANCY
2 FAILGROUP controller1 DISK '/dev/raw/raw1', '/dev/raw/raw2'
3 FAILGROUP controller2 DISK '/dev/raw/raw3', '/dev/raw/raw4';

Diskgroup created.
Now, let's take a look at the new disk group and disk details:
SQL> select group_number, name, total_mb, free_mb, state, type
2 from v$asm_diskgroup;
SQL> select group_number, disk_number, mount_status, header_status, state, path, failgroup
2 from v$asm_disk;
Using Disk Groups
Finally, let's start making use of the new disk group! Disk groups can be used in place of
actual file names when creating database files, redo log members, control files, etc.
Let's now login to the database instance running on the node that will be making use of the
new ASM instance. I had a database instance already created and running on the node
named PRODDB. The database was created using the local file system for all database files,
redo log members, and control files:
$ ORACLE_SID=PRODDB; export ORACLE_SID
$ sqlplus "/ as sysdba"
SQL> @dba_files_all
Let's now create a new tablespace that makes use of the new disk group:
SQL> create tablespace users2 datafile '+PRODDB_DATA1' size 100m;
Tablespace created.
And that's it! The CREATE TABLESPACE command (above) uses a datafile named
+PRODDB_DATA1. Note that the plus sign (+) in front of the name PRODDB_DATA1
indicates to Oracle that this name is a disk group name, and not an operating system file
name. In this example, the PRODDB instance queries the ASM instance for a new file in that
disk group and uses that file for the tablespace data. Let's take a look at that new file
name:
SQL> @dba_files_all
APPENDIX A
Contents of dba_files_all.sql
SET LINESIZE 147
SET PAGESIZE 9999
SET VERIFY OFF
COLUMN tablespace     FORMAT a29             HEADING 'Tablespace Name / File Class'
COLUMN filename       FORMAT a64             HEADING 'Filename'
COLUMN filesize       FORMAT 99,999,999,999  HEADING 'File Size'
COLUMN autoextensible FORMAT a4              HEADING 'Auto'
COLUMN increment_by   FORMAT 99,999,999,999  HEADING 'Next'
COLUMN maxbytes       FORMAT 99,999,999,999  HEADING 'Max'

BREAK ON report
COMPUTE SUM OF filesize ON report
SELECT /*+ ordered */
d.tablespace_name tablespace
, d.file_name filename
, d.bytes filesize
, d.autoextensible autoextensible
, d.increment_by * e.value increment_by
, d.maxbytes maxbytes
FROM
sys.dba_data_files d , v$datafile v , (SELECT value FROM v$parameter WHERE name =
'db_block_size') e
WHERE
(d.file_name = v.name)
UNION
SELECT
d.tablespace_name tablespace
, d.file_name filename
, d.bytes filesize
, d.autoextensible autoextensible
, d.increment_by * e.value increment_by
, d.maxbytes maxbytes
FROM
sys.dba_temp_files d , (SELECT value FROM v$parameter WHERE name = 'db_block_size')
e
UNION
SELECT '[ ONLINE REDO LOG ]' , a.member , b.bytes , null , TO_NUMBER(null),
TO_NUMBER(null)
FROM v$logfile a , v$log b
WHERE a.group# = b.group#
UNION
SELECT '[ CONTROL FILE ]' , a.name , TO_NUMBER(null) , null , TO_NUMBER(null),
TO_NUMBER(null)
FROM v$controlfile a ORDER BY 1,2
/

Migrate / Convert Oracle Database from Non-ASM to ASM Using RMAN

The following method shows how a Non-ASM database can be migrated to ASM using RMAN:
Modify the parameter file of the target database as follows:
- ALTER DATABASE DISABLE BLOCK CHANGE TRACKING;
- Set the DB_CREATE_FILE_DEST and DB_CREATE_ONLINE_LOG_DEST_n parameters to
the relevant ASM disk groups.
- Remove the CONTROL_FILES parameter from the spfile so the control files will be moved
to the DB_CREATE_* destination and the spfile gets updated automatically.
ALTER SYSTEM SET db_create_file_dest='+DATA' SCOPE=SPFILE;
ALTER SYSTEM SET db_recovery_file_dest='+FLASH' SCOPE=SPFILE;
ALTER SYSTEM SET control_files='+DATA' SCOPE=SPFILE;
- If you are using a pfile, these parameters must be set to the appropriate ASM files or
aliases.
Shutdown the database
SQL> SHUTDOWN IMMEDIATE
Start the database in nomount mode.
Make sure the Environment is set properly
RMAN> CONNECT TARGET /
RMAN> STARTUP NOMOUNT
Restore the controlfile into the new location from the old location.
RMAN> RESTORE CONTROLFILE FROM 'old_control_file_name';
Mount the database.
RMAN> ALTER DATABASE MOUNT;
Copy the database into the ASM disk group.
RMAN> BACKUP AS COPY DATABASE FORMAT '+DATA';
Switch all datafile to the new ASM location.
RMAN> SWITCH DATABASE TO COPY;
Recover the database.
RMAN> RECOVER DATABASE;
Using SQL*Plus to migrate flashback logs, change tracking file and temp files:
SQL> ALTER DATABASE FLASHBACK OFF;
SQL> ALTER DATABASE FLASHBACK ON;
SQL> ALTER DATABASE OPEN;
Create New temporary tablespace in ASM disk group and Drop the old ones.
SQL> CREATE TEMPORARY TABLESPACE temp1 TEMPFILE '+DATA';
SQL> DROP TABLESPACE old_temporary_tablespace including contents and datafiles;
SQL> ALTER DATABASE ENABLE BLOCK CHANGE TRACKING;
Create new redo logs in the ASM disk group and delete the old ones (shown in the worked
example below).

A complete worked example of the migration follows:

SQL> ALTER SYSTEM SET db_create_file_dest='+DATA' SCOPE=SPFILE;
System altered.
SQL> ALTER SYSTEM SET db_recovery_file_dest='+FLASH' SCOPE=SPFILE;
System altered.
SQL> ALTER SYSTEM SET control_files='+DATA' SCOPE=SPFILE;
System altered.
SQL> SHUTDOWN IMMEDIATE
rac1-> rman
Recovery Manager: Release 10.2.0.1.0 - Production on Sun Nov 27 17:33:16 2011
Copyright (c) 1982, 2005, Oracle. All rights reserved.
RMAN> connect target /
connected to target database (not started)
RMAN> startup nomount;

RMAN> RESTORE CONTROLFILE FROM '<original location>/control02.ctl';


Starting restore at 27-NOV-11
using target database control file instead of recovery catalog
allocated channel: ORA_DISK_1
channel ORA_DISK_1: sid=156 devtype=DISK
channel ORA_DISK_1: copied control file copy
output filename=+DATA/stagedb/controlfile/backup.267.768333209
Finished restore at 27-NOV-11
RMAN>
RMAN> BACKUP AS COPY DATABASE FORMAT '+DATA';

Starting backup at 27-NOV-11


allocated channel: ORA_DISK_1
channel ORA_DISK_1: sid=156 devtype=DISK
channel ORA_DISK_1: starting datafile copy
input datafile fno=00001 name=/u01/app/oracle/oradata/stagedb/system01.dbf
output filename=+DATA/stagedb/datafile/system.268.768333503
tag=TAG20111127T175818 recid=1 stamp=768333584
channel ORA_DISK_1: datafile copy complete, elapsed time: 00:01:26
channel ORA_DISK_1: starting datafile copy
input datafile fno=00003 name=/u01/app/oracle/oradata/stagedb/sysaux01.dbf
output filename=+DATA/stagedb/datafile/sysaux.269.768333589
tag=TAG20111127T175818 recid=2 stamp=768333626
channel ORA_DISK_1: datafile copy complete, elapsed time: 00:00:46
channel ORA_DISK_1: starting datafile copy
input datafile fno=00002 name=/u01/app/oracle/oradata/stagedb/undotbs01.dbf
output filename=+DATA/stagedb/datafile/undotbs1.270.768333635
tag=TAG20111127T175818 recid=3 stamp=768333641
channel ORA_DISK_1: datafile copy complete, elapsed time: 00:00:15
channel ORA_DISK_1: starting datafile copy
input datafile fno=00004 name=/u01/app/oracle/oradata/stagedb/users01.dbf
output filename=+DATA/stagedb/datafile/users.271.768333651 tag=TAG20111127T175818
recid=4 stamp=768333653
channel ORA_DISK_1: datafile copy complete, elapsed time: 00:00:08
channel ORA_DISK_1: starting datafile copy
copying current control file
output filename=+DATA/stagedb/controlfile/backup.272.768333659
tag=TAG20111127T175818 recid=5 stamp=768333660
channel ORA_DISK_1: datafile copy complete, elapsed time: 00:00:03
channel ORA_DISK_1: starting full datafile backupset
channel ORA_DISK_1: specifying datafile(s) in backupset
including current SPFILE in backupset
channel ORA_DISK_1: starting piece 1 at 27-NOV-11
channel ORA_DISK_1: finished piece 1 at 27-NOV-11
piece handle=+DATA/stagedb/backupset/2011_11_27/nnsnf0_tag20111127t175818_0.273.768333663 tag=TAG20111127T175818 comment=NONE
channel ORA_DISK_1: backup set complete, elapsed time: 00:00:05
Finished backup at 27-NOV-11
RMAN>
RMAN> SWITCH DATABASE TO COPY;
datafile 1 switched to datafile copy "+DATA/stagedb/datafile/system.268.768333503"
datafile 2 switched to datafile copy "+DATA/stagedb/datafile/undotbs1.270.768333635"
datafile 3 switched to datafile copy "+DATA/stagedb/datafile/sysaux.269.768333589"
datafile 4 switched to datafile copy "+DATA/stagedb/datafile/users.271.768333651"

RMAN> RECOVER DATABASE;


Starting recover at 27-NOV-11
using channel ORA_DISK_1
starting media recovery
media recovery complete, elapsed time: 00:00:00
Finished recover at 27-NOV-11

RMAN> exit
Recovery Manager complete.
rac1-> sqlplus
SQL*Plus: Release 10.2.0.1.0 - Production on Sun Nov 27 18:04:35 2011
Copyright (c) 1982, 2005, Oracle. All rights reserved.
Enter user-name: / as sysdba
Connected to:
Oracle Database 10g Enterprise Edition Release 10.2.0.1.0 - Production
With the Partitioning, OLAP and Data Mining options
SQL> select flashback_on from v$database;
FLASHBACK_ON
------------------
NO
SQL> alter database open;
Database altered.
SQL> select tablespace_name from dba_tablespaces;
TABLESPACE_NAME
------------------------------
SYSTEM
UNDOTBS1
SYSAUX
TEMP
USERS

SQL> create temporary tablespace temp1 tempfile '+DATA';


Tablespace created.
SQL> ALTER DATABASE DEFAULT TEMPORARY TABLESPACE TEMP1;
Database altered.
SQL> DROP TABLESPACE TEMP INCLUDING CONTENTS AND DATAFILES;
Tablespace dropped.
SQL> select * from v$logfile;

    GROUP# STATUS  TYPE    MEMBER                                       IS_
---------- ------- ------- -------------------------------------------- ---
         3         ONLINE  /u01/app/oracle/oradata/stagedb/redo03.log   NO
         2         ONLINE  /u01/app/oracle/oradata/stagedb/redo02.log   NO
         1         ONLINE  /u01/app/oracle/oradata/stagedb/redo01.log   NO

SQL> alter database add logfile group 4 '+DATA';
Database altered.

SQL> alter database add logfile group 5 '+DATA';
Database altered.

SQL> alter database add logfile group 6 '+DATA';
Database altered.

SQL> alter system switch logfile;
Drop the old redo log files. Make sure that the redo log groups being dropped are not
CURRENT or ACTIVE:
SQL> select group#, status from v$log;
    GROUP# STATUS
---------- ----------------
         1 INACTIVE
         2 INACTIVE
         3 INACTIVE
         4 CURRENT
         5 UNUSED
         6 UNUSED
6 rows selected.
SQL> alter database drop logfile group 1;
Database altered.
SQL> alter database drop logfile group 2;
Database altered.
SQL> alter database drop logfile group 3;
Database altered.
SQL> select group#, status from v$log;
    GROUP# STATUS
---------- ----------------
         4 CURRENT
         5 UNUSED
         6 UNUSED

Oracle ASM Views and Usage - Quick Reference/Understanding

V$ASM_DISK
===================
In an ASM instance, contains one row for every disk discovered by the ASM instance,
including disks that are not part of any disk group.
This view performs disk discovery every time it is queried.
To find the free space in an ASM disk:
select group_number, disk_number, name, failgroup, create_date, path,
total_mb,free_mb from v$asm_disk;
In a DB instance, contains rows only for disks in the disk groups in use by that DB instance.
V$ASM_DISKGROUP
===================
In an ASM instance, describes a disk group (number, name, size related info, state, and
redundancy type).
To find the free space in an ASM disk group:
select group_number, name, type, state, total_mb, free_mb from
v$asm_diskgroup;
In a DB instance, contains one row for every ASM disk group mounted by the local ASM
instance.
This view performs disk discovery every time it is queried.
V$ASM_OPERATION
===================
In an ASM instance, contains one row for every active ASM long running operation executing
in the ASM instance.
To see the current ASM operations in Progress :
select group_number, operation, state, power, actual, sofar, est_work, est_rate,
est_minutes from v$asm_operation;
In a DB instance, contains no rows.
V$ASM_ALIAS
==============
In an ASM instance, contains one row for every alias present in every disk group mounted
by the ASM instance.
In a DB instance, contains no rows.
V$ASM_CLIENT
===============
In an ASM instance, identifies databases using disk groups managed by the ASM instance.
In a DB instance, contains information about the ASM instance if the database has any open
ASM files.
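For example, from the ASM instance (a quick sketch):

SQL> SELECT group_number, instance_name, db_name, status FROM v$asm_client;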

V$ASM_FILE
===============
In an ASM instance, contains one row for every ASM file in every disk group mounted by the
ASM instance.
In a DB instance, contains no rows.
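For example, to list the files and their sizes from the ASM instance (a quick sketch):

SQL> SELECT group_number, file_number, type, ROUND(bytes/1024/1024) AS mb
  2  FROM v$asm_file;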

Startup and Shutdown of ASM Instances

ASM instances are started and stopped in a similar way to normal database instances.
The options for the STARTUP command are:
FORCE - Performs a SHUTDOWN ABORT before restarting the ASM instance.
MOUNT - Starts the ASM instance and mounts the disk groups specified by the
ASM_DISKGROUPS parameter.
NOMOUNT - Starts the ASM instance without mounting any disk groups.
OPEN - This is not a valid option for an ASM instance.
The options for the SHUTDOWN command are:
NORMAL - The ASM instance waits for all connected ASM instances and SQL sessions to
exit then shuts down.
TRANSACTIONAL - Same as IMMEDIATE.
IMMEDIATE - The ASM instance waits for any SQL transactions to complete then shuts
down. It doesn't wait for sessions to exit.
SHUTDOWN NORMAL and SHUTDOWN IMMEDIATE will return an error if a database
instance is up (i.e., using the disk groups managed by the ASM instance).
SQL> shutdown immediate;
ORA-15097: cannot SHUTDOWN ASM instance with connected RDBMS instance
ABORT - The ASM instance shuts down instantly.
srvctl can be used to start all the ASM instances which are a part of the cluster.
Please refer to Srvctl-Major-Options-and-Usage
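
A minimal sketch of the srvctl commands involved (node name rac1 as used elsewhere in this document):

$ srvctl status asm -n rac1
$ srvctl stop asm -n rac1
$ srvctl start asm -n rac1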

ASM Instance Init Parameters - Quick View

ASM_DISKGROUPS
The ASM_DISKGROUPS initialization parameter specifies a list of the names of disk groups
that an ASM instance mounts at startup.
Eg :
SQL> show parameter asm_diskgroups
NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
asm_diskgroups                       string      DATA, FLASH
ASM_DISKSTRING
The ASM_DISKSTRING initialization parameter specifies a comma-delimited list of strings
that limits the set of disks that an ASM instance discovers. The discovery strings can include
wildcard characters. Only disks that match one of the strings are discovered.
Eg:
SQL> show parameter asm_diskstring
NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
asm_diskstring                       string      /dev/oracleasm/disks/VOL*
ASM_POWER_LIMIT
The ASM_POWER_LIMIT initialization parameter specifies the default power for disk
rebalancing. The default value is 1 and the range of allowable values is 0 to 11 inclusive. A
value of 0 disables rebalancing.
SQL> show parameter asm_power
NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
asm_power_limit                      integer     11
INSTANCE_TYPE
The INSTANCE_TYPE initialization parameter must be set to ASM for an ASM instance. This
is a required parameter and cannot be modified.
SQL> show parameter instance_type
NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
instance_type                        string      asm

SQL> show parameter instance_name
NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
instance_name                        string      +ASM1

Create, Alter, Mount, Dismount, Check, Resize, Rebalance ASM Disk Group
Create Disk Group:
Create Disk groups using the CREATE DISKGROUP statement and specify the level of
redundancy.
Disk group redundancy types:
NORMAL REDUNDANCY - Two-way mirroring, requiring two failure groups.
HIGH REDUNDANCY - Three-way mirroring, requiring three failure groups.
EXTERNAL REDUNDANCY - No mirroring for disks that are already protected using hardware
RAID or mirroring.
Example 1 : External Redundancy
SQL> create diskgroup DATA2 external redundancy disk '/dev/oracleasm/disks/VOL1' name
DATA_0002;
Example 2 : Normal Redundancy
SQL> CREATE DISKGROUP data NORMAL REDUNDANCY
FAILGROUP failure_group_1 DISK '/dev/oracleasm/disks/VOL2' NAME DATA_0003, '/dev/oracleasm/disks/VOL3' NAME DATA_0004,
FAILGROUP failure_group_2 DISK '/dev/oracleasm/disks/VOL4' NAME DATA_0005, '/dev/oracleasm/disks/VOL5' NAME DATA_0006;
Drop Disk Group:
Drop disk group using DROP DISKGROUP statement.
SQL> DROP DISKGROUP data INCLUDING CONTENTS;
Alter Disk Group:
Add or remove disks from disk groups Using ALTER DISKGROUP statement. You can also use
wildcard "*" to reference disks.
Add a disk.
SQL> ALTER DISKGROUP data ADD DISK '/dev/oracleasm/disks/VOL6';
Drop/remove a disk.
SQL> ALTER DISKGROUP data DROP DISK DATA_0006;
The UNDROP command can be used to undo only a pending drop of disks. After the drop
completes, you cannot revert.
SQL> ALTER DISKGROUP data UNDROP DISKS;
Diskgroup Rebalance:
Disk groups can be rebalanced manually using the REBALANCE clause, and you can override
the POWER clause default value.
SQL> ALTER DISKGROUP DATA REBALANCE POWER 8;
MOUNT and DISMOUNT DiskGroups:
Normally Disk groups are mounted at ASM instance startup and dismounted at shutdown.
Using MOUNT and DISMOUNT options you can make one or more Disk Groups available or
unavailable.
SQL> ALTER DISKGROUP data MOUNT;
SQL> ALTER DISKGROUP data DISMOUNT;
SQL> ALTER DISKGROUP ALL MOUNT;
SQL> ALTER DISKGROUP ALL DISMOUNT;
DiskGroup Check:
Use CHECK ALL to verify the internal consistency of disk group metadata and repair in case
of any error.
SQL> ALTER DISKGROUP data CHECK ALL;
Add a Directory to a Diskgroup
SQL> ALTER DISKGROUP data ADD DIRECTORY '+DATA/DEVDB/<dirname>';
DiskGroup resize:
Resize one or all disks in the disk group.
Resize all disks in a failure group.
SQL> ALTER DISKGROUP data RESIZE DISKS IN FAILGROUP failure_group_1 SIZE 1024G;
Resize a specific disk.
SQL> ALTER DISKGROUP data RESIZE DISK DATA_0006 SIZE 100G;
Resize all disks in a disk group.
SQL> ALTER DISKGROUP data RESIZE ALL SIZE 100G;
Verify the disk / disk group modifications from v$asm_diskgroup and v$asm_disk.
Verify rebalance operations from v$asm_operation.

Server-side Transparent Application Failover (TAF) in Oracle RAC

Oracle Database 10g Release 2 introduces server-side TAF when using services. After you
create a service, you can use the dbms_service.modify_service PL/SQL procedure to define
the TAF policy for the service. Only the BASIC method is supported. Note this is different
from the TAF policy (traditional client TAF) that is supported by srvctl and the EM Services
page.
If your service has a server-side TAF policy defined, then you do not have to encode TAF in
the client connection string.
If the instance to which a client is connected fails, the connection is failed over to another
instance in the cluster that supports the service. All restrictions of TAF still apply.
NOTE: both the client and server must be 10.2 and aq_ha_notifications must be set to true
for the service.
Sample code to modify service:
execute dbms_service.modify_service (
  service_name        => 'gl.dbalounge.com',
  aq_ha_notifications => true,
  failover_method     => dbms_service.failover_method_basic,
  failover_type       => dbms_service.failover_type_select,
  failover_retries    => 180,
  failover_delay      => 5,
  clb_goal            => dbms_service.clb_goal_long);
For a complete example of setting up TAF using services, refer to: Setup-of-Services-in-RAC-with-FAN-and-TAF

Transparent Application Failover (TAF) Setup on Client Side for Oracle RAC
Transparent Application Failover (TAF) is a client-side feature that allows clients to
reconnect to surviving nodes in the event of the failure of an instance. The reconnect
happens automatically from within the OCI (Oracle Call Interface) library.
Uncommitted transactions are rolled back, and server-side program variables and session
properties are lost.
In some cases the SELECT statements are automatically re-executed on the new connection
with the cursor positioned on the row on which it was positioned prior to the failover.
For high availability and scalability, Oracle provides the Transparent Application Failover
feature as part of Oracle Real Application Clusters (RAC).
The failover is configured in the tnsnames.ora file; the TAF settings are placed in the
CONNECT_DATA section of tnsnames.ora using the FAILOVER_MODE parameter.
TYPE: TAF supports three failover types:
1. SESSION failover - If a user's connection is lost, SESSION failover automatically
establishes a new session for the user on the backup node. This type of failover does not
attempt to recover selects. This failover is ideal for OLTP (online transaction processing)
systems, where transactions are small.
2. SELECT failover - If the connection is lost, Oracle Net establishes a connection to another
node and re-executes the SELECT statements with the cursor positioned on the row on
which it was positioned prior to the failover. This mode involves overhead on the client side,
as Oracle Net must keep track of SELECT statements. This approach is best for data
warehouse systems, where transactions are big and complex.
3. NONE - This setting is the default, and no failover functionality is provided. Use this
setting to prevent failover.
Example of a failover configuration:
DEVDB =
(DESCRIPTION =
(LOAD_BALANCE = ON)
(FAILOVER = ON)
(ADDRESS = (PROTOCOL = TCP)(HOST = RAC1-VIP)(PORT = 1521))
(ADDRESS = (PROTOCOL = TCP)(HOST = RAC2-VIP)(PORT = 1521))
(CONNECT_DATA =
(SERVICE_NAME = DEVDB.dbalounge.com)
(FAILOVER_MODE = (TYPE = SELECT)(METHOD = BASIC)(RETRIES = 10)(DELAY = 5))
)
)
METHOD: This parameter determines how failover occurs from the primary node to the backup node.
BASIC: Use this mode to establish connections at failover time; no work is done on the backup server until failover occurs.
PRECONNECT: Use this mode to pre-establish connections. PRECONNECT provides faster failover, but requires that the backup instance be capable of supporting all connections from every supported instance.
RETRIES: Use this parameter to specify the number of times to attempt to connect after a failover. If DELAY is specified but RETRIES is not, RETRIES defaults to five retry attempts.

DELAY: Use this parameter to specify the amount of time, in seconds, to wait between connect attempts. If RETRIES is specified but DELAY is not, DELAY defaults to one second.
Note that you can pre-establish a connection to reduce failover time by using the METHOD=PRECONNECT option, as sketched below.
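A minimal sketch of a PRECONNECT configuration follows; the alias names DEVDB_PRE and DEVDB_BKUP are illustrative. With PRECONNECT, the FAILOVER_MODE clause names a BACKUP alias to which a shadow connection is opened at connect time:
DEVDB_PRE =
(DESCRIPTION =
(ADDRESS = (PROTOCOL = TCP)(HOST = RAC1-VIP)(PORT = 1521))
(CONNECT_DATA =
(SERVICE_NAME = DEVDB.dbalounge.com)
(FAILOVER_MODE = (BACKUP = DEVDB_BKUP)(TYPE = SELECT)(METHOD = PRECONNECT))
)
)
DEVDB_BKUP =
(DESCRIPTION =
(ADDRESS = (PROTOCOL = TCP)(HOST = RAC2-VIP)(PORT = 1521))
(CONNECT_DATA = (SERVICE_NAME = DEVDB.dbalounge.com))
)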
To verify that TAF is correctly configured, query the FAILOVER_TYPE, FAILOVER_METHOD, and FAILED_OVER columns in the V$SESSION view (or GV$SESSION across all instances):
SQL> SELECT machine, failover_type, failover_method, failed_over FROM gv$session;
EXAMPLE:
========
SQL> SELECT inst_id,sid,username,FAILOVER_TYPE, FAILOVER_METHOD, FAILED_OVER
FROM GV$SESSION where FAILOVER_TYPE!='NONE'
/
   INST_ID        SID USERNAME   FAILOVER_TYPE FAILOVER_M FAI
---------- ---------- ---------- ------------- ---------- ---
         1        128 SHAILESH   SELECT        BASIC      NO
         1        137 SANDEEP    SELECT        BASIC      NO
         1        138 PUSHKAR    SELECT        BASIC      NO
         2        130 MUKESH     SELECT        BASIC      NO
         2        133 SANDEEP    SELECT        BASIC      NO
         2        135 RAJ        SELECT        BASIC      NO
         2        137 VIKAS      SELECT        BASIC      NO
7 rows selected.
Shut down instance 1.
SQL> shut immediate;
Database closed.
Database dismounted.
ORACLE instance shut down.

SQL> /

   INST_ID        SID USERNAME   FAILOVER_TYPE FAILOVER_M FAI
---------- ---------- ---------- ------------- ---------- ---
         2        127 SANDEEP    SELECT        BASIC      YES
         2        129 PUSHKAR    SELECT        BASIC      YES
         2        130 MUKESH     SELECT        BASIC      NO
         2        132 SHAILESH   SELECT        BASIC      YES
         2        133 SANDEEP    SELECT        BASIC      NO
         2        135 RAJ        SELECT        BASIC      NO
         2        137 VIKAS      SELECT        BASIC      NO
7 rows selected.

Oracle RAC Services Setup with Load Balancing, TAF and FAN
Services are entities that can be defined in Oracle RAC databases; they enable you to group database workloads and route work to the optimal instances assigned to offer the service.
To manage workloads, you can define services that you assign to a particular application or
to a subset of an application's operations. You can also group work by type under services.
For example, online users can be a service while batch processing can be another and
reporting can be yet another service type.
When you define a service, you define which instances normally support that service. These
are known as the PREFERRED instances. You can also define other instances to support a
service if the service's preferred instance fails. These are known as AVAILABLE instances.
1) Create a Service
Such a service can be created with:
srvctl add service -d DEVDB -s OLTP -r RAC1,RAC2 -a RAC3,RAC4
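To confirm the service definition and see where it is currently running, a quick check with srvctl:
srvctl config service -d DEVDB -s OLTP
srvctl status service -d DEVDB -s OLTP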
2) Create the TNS entry for the Service
A service called OLTP has been created with RAC1 and RAC2 as the preferred instances, and RAC3 and RAC4 as the available instances.
The client will connect with the following Oracle*Net alias:
OLTP =
(DESCRIPTION=
(LOAD_BALANCE=ON)
(ADDRESS_LIST=
(ADDRESS=(PROTOCOL=TCP)(HOST=rac1-vip)(PORT=1521))
(ADDRESS=(PROTOCOL=TCP)(HOST=rac2-vip)(PORT=1521))
(ADDRESS=(PROTOCOL=TCP)(HOST=rac3-vip)(PORT=1521))
(ADDRESS=(PROTOCOL=TCP)(HOST=rac4-vip)(PORT=1521))
)
(CONNECT_DATA=
(SERVICE_NAME=OLTP)))
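Before pointing the application at this alias, a quick way to confirm that it resolves and that a listener responds is:
$ tnsping OLTP
Note that tnsping only proves name resolution and listener reachability; it does not prove that the OLTP service itself is running.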
3) Configure Advanced Queuing on the server side:
FAN has two methods for publishing events to clients: the Oracle Notification Service (ONS), which is used by Java Database Connectivity (JDBC) clients including Oracle Application Server 10g, and Oracle Streams Advanced Queuing, which is used by Oracle Call Interface (OCI) and Oracle Data Provider for .NET (ODP.NET) clients. When using Advanced Queuing, you must enable the service to use the queue by setting AQ_HA_NOTIFICATIONS to true.
If you want to use FAN for OCI or ODP.NET clients, also set:
aq_tm_processes=1
4) Set the Connection Load Balancing goal (CLB_GOAL)
CLB_GOAL tells the listener which technique to use when performing connection-based load balancing:
SHORT - Use the SHORT connection load balancing method for applications that have short-lived connections. CLB_GOAL=SHORT is the setting typically used in conjunction with the Load Balancing Advisory.
LONG - Use the LONG connection load balancing method for applications that have long-lived connections. Use this for SQL*Forms, connection pools, or long-running sessions.
Service Quality goal (GOAL)
The service can also be configured to favor either the best response time (SERVICE_TIME) or the best throughput (THROUGHPUT):
SERVICE_TIME - Oracle favors the least loaded node, which is the one able to answer fastest.
THROUGHPUT - Requests are redirected to the node handling the highest number of calls per second; in theory, the one able to sustain the highest transaction rate (perhaps because it is a more powerful machine in the cluster).
SQL> execute dbms_service.modify_service (service_name => 'OLTP', goal => dbms_service.goal_service_time, clb_goal => dbms_service.clb_goal_long);
PL/SQL procedure successfully completed.
5) Configure the service for TAF and turn on AQ HA event notifications (for FAN) by using the DBMS_SERVICE package:
BEGIN
  DBMS_SERVICE.MODIFY_SERVICE(
    service_name        => 'OLTP',
    aq_ha_notifications => TRUE,
    failover_method     => DBMS_SERVICE.FAILOVER_METHOD_BASIC,
    failover_type       => DBMS_SERVICE.FAILOVER_TYPE_SESSION,
    failover_retries    => 180,
    failover_delay      => 5);
END;
/
6) The services can be started and stopped using the following commands.

srvctl start service -d DEVDB -s OLTP
srvctl stop service -d DEVDB -s OLTP
7) The client connection pool must be set up with the following properties in the connection string:
Load Balancing=true -- for the FAN runtime load balancing events
HA Events=true -- for the FCF (Fast Connection Failover) events
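For illustration, a minimal ODP.NET connection string using the OLTP alias might look as follows (broken across lines for readability; the credentials and pool sizes are arbitrary examples):
User Id=scott;Password=tiger;Data Source=OLTP;
Pooling=true;Min Pool Size=5;Max Pool Size=50;
Load Balancing=true;HA Events=true;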

Apply Oracle Database Patch using opatch and opatch options


Whenever we encounter Oracle bugs or internal errors (e.g. ORA-600, ORA-7445), we may need to apply a piece of code (provided by Oracle Support) to the Oracle binaries. This piece of code is known as a patch, and the process of applying it is known as patching.
When we apply a patch to an Oracle software installation, it updates the executable files, libraries, and object files in the Oracle home directory.
Patches are applied by using OPatch, a utility supplied by Oracle.
To download a patch: obtain patches from My Oracle Support, the Oracle Support Services web site.
To check the OPatch version:
[oracle@localhost OPatch]$ opatch version
Invoking OPatch 10.2.0.1.0
OPatch Version: 10.2.0.1.0
OPatch succeeded.
APPLYING A PATCH:
1) Read the README file included with the patch; look for any prerequisite steps, post-installation steps, or database-related changes.
Also, make sure that you have the OPatch version required by the patch.
2) Check the ORACLE_HOME environment variable.
3) Update the PATH environment variable:
[oracle@localhost OPatch]$ export PATH=$PATH:$ORACLE_HOME/OPatch
4) Make sure you have a good backup of the database.
5) Make a note of all invalid objects in the database prior to the patch (see the query below).
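A simple way to capture that list, for comparison after patching:
SQL> SELECT owner, object_name, object_type
     FROM dba_objects
     WHERE status = 'INVALID'
     ORDER BY owner, object_name;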
6) Shut down all Oracle processes running from that Oracle home, including the listener and the database instance.
7) Back up your Oracle home and inventory:

tar cvf - $ORACLE_HOME $ORACLE_HOME/oraInventory | gzip > Backup_oracle.tar.gz
8) Unzip the patch under $ORACLE_HOME/OPatch.
9) cd to the patch directory and run opatch apply to apply the patch:
$ORACLE_HOME/OPatch/opatch apply
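Once the apply completes, a reasonable sanity check is to confirm that the patch is recorded in the inventory and to recompile any objects the patch invalidated (utlrp.sql recompiles invalid objects):
[oracle@localhost OPatch]$ opatch lsinventory | grep <patch id number>
[oracle@localhost OPatch]$ sqlplus "/ as sysdba"
SQL> @?/rdbms/admin/utlrp.sql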
We can get all the options using opatch -help
Usage: opatch [ -help ] [ -r[eport] ] [ command ]
command := apply
lsinventory
prereq
query
rollback
util
version
To uninstall the patch:
[oracle@localhost OPatch]$ opatch rollback -id <patch id number>
For complete details of the OPatch options and their impact, refer to this link: opatch-options

Opatch Options
The patching procedure to apply a patch using OPatch has been covered in:
Applying database patch using opatch
The most commonly used OPatch commands are:
1. opatch apply ...
This command applies a patch from the patch directory. The OUI inventory is updated with
the patch's information.
2. opatch rollback ....
This is the command to rollback or undo a patch installation. The respective patch
information is removed from the inventory.
3. opatch lsinventory
lsinventory lists all the patches installed on the Oracle Home.
4. opatch lsinventory -detail
lsinventory -detail gives list of patches and products installed in the ORACLE_HOME.
5. opatch version
version option displays the version of OPatch installed.
6. opatch napply
napply applies all the patches in a directory. This is used while applying a patch that is a bundle of individual patches; napply eliminates the overhead of the administrator running opatch multiple times. The napply option skips subsets or duplicates if they are already installed.
7. opatch nrollback
nrollback rolls back the patches using the IDs specified.
8. opatch apply -minimize_downtime
This is specific to Real Application Clusters (RAC) enabled instances (DB tier patches). The
-minimize_downtime option allows you to apply a patch by bringing down one database
server instance at a time. OPatch applies the patch to the quiesced server instance, the
instance is brought back up, and then OPatch moves on to the next database server in a
Real Application Clusters pool.
9. opatch apply -force
-force overrides conflicts with an existing patch by removing the conflicting patch from the
system.
Caution: This option should be used only when the README of the patch explicitly states that it is safe.
10. opatch apply -invPtrLoc <...>
The -invPtrLoc option can be used to specify the oraInst.loc file in case it is not in the default location (e.g., /etc/oraInst.loc on Linux). The argument to this option is the location of the file.
11. opatch query
The query command can be used to find out useful information about a patch.
Syntax:
opatch query [-all] [-jre <LOC>] [-oh <LOC>]
             [-get_component] [-get_os] [-get_date] [-get_base_bug]
             [-is_rolling_patch] [-is_online_patch] [-has_sql]
             [<Patch Location>]
-all: Retrieves all information about a patch. This is equivalent to setting all available options.
-get_base_bug: Retrieves the bugs fixed by the patch.
-get_component: Retrieves the components the patch affects.
-get_date: Retrieves the patch creation date and time.
-is_online_patch: Reports true if the patch is an online patch; otherwise false.
-is_rolling_patch: Reports true if the patch is a rolling patch; otherwise false.
-oh <LOC>: Specifies the Oracle home directory to use instead of the default directory. This takes precedence over the ORACLE_HOME environment variable.
<Patch Location>: Indicates the path to the patch location. If you do not specify the location, OPatch assumes the current directory is the patch location.
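For example, before patching a RAC database you might check whether a staged patch can be applied in a rolling fashion (the staging path and patch number below are placeholders):
[oracle@localhost OPatch]$ opatch query -is_rolling_patch /u01/stage/1234567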
