Tuesday, January 22, 2008

Data Guard 9i Log Transportation on RAC

PURPOSE
-------

This article gives an overview of how to create a Data Guard
Configuration on Real Application Clusters (RAC). The Configuration
described here is for a Physical Standby Database.


SCOPE & APPLICATION
-------------------

You can see which initialisation parameters you have to set or change
and how the Log Transport is organized in a RAC Data Guard environment.

NOTE: The Data Guard Broker and its interfaces (DGMGRL and the Data
Guard Manager in Enterprise Manager) do not support Data Guard on
RAC environments in Oracle 9i releases.


1. Overview Archiving on RAC
----------------------------

You can run a RAC environment in either NOARCHIVELOG or ARCHIVELOG mode. If you
configure ARCHIVELOG mode, each instance creates its own Archivelogs from its
Log Threads. These Archivelogs can be stored either on the local private disks
of each node or on a formatted partition of the Shared Disk. Typically you
store them on the Shared Disk, so that each instance has access to all
Archivelogs. Alternatively, you can transfer the Archivelogs between the
primary instances via a Service: LOG_ARCHIVE_DEST_n='SERVICE=...'
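
If the Database does not yet run in ARCHIVELOG mode, here is a minimal
sketch of how to enable it on a 9i RAC Database (assuming an SPFILE is in
use; in 9i the log mode can only be changed while CLUSTER_DATABASE is
FALSE):

    -- on one instance; take the database out of cluster mode temporarily
    ALTER SYSTEM SET CLUSTER_DATABASE=FALSE SCOPE=SPFILE;
    -- shut down ALL instances, then on one instance:
    STARTUP MOUNT
    ALTER DATABASE ARCHIVELOG;
    ALTER SYSTEM SET CLUSTER_DATABASE=TRUE SCOPE=SPFILE;
    SHUTDOWN IMMEDIATE
    -- restart all instances
    STARTUP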


2. Create Standby Database
--------------------------

There are two types of Standby Databases supported for RAC: you can create
the Standby Database either on a Single-Node or on a Cluster system.
Since the Broker does not support RAC environments, you have to set up
the Standby Configuration manually.

2.1 Standby Database on a Single-Node
-------------------------------------

In case of creating the Standby Database on a Single-Node, you can mostly
proceed as for a normal Standby Database creation. That means, first of all you
take a full backup of your Primary Database and create a Standby Controlfile
(ALTER DATABASE CREATE STANDBY CONTROLFILE AS '<filename>'; from any
instance). Then you restore the Database including the Standby Controlfile on
your Standby Site.
Next, remember to adjust the corresponding initialisation parameters if needed
(eg. DB_FILE_NAME_CONVERT, STANDBY_ARCHIVE_DEST, CONTROL_FILES). For more
information about the correct settings, please also follow this Note for the
appropriate Network (TNSNAMES.ORA and LISTENER.ORA) settings. In the next step
we have to configure the Log Transport Services. This is done by setting the
LOG_ARCHIVE_DEST_n parameter on all primary instances. All the instances have
to archive their Logs to the Service of the Single-Node Standby System. The
Standby System collects a "pool" of Archivelogs, and on the basis of the SCN
it can determine which Archivelog from which Thread is the next one to apply.
So you have to set the next free LOG_ARCHIVE_DEST_n parameter to the Standby
Service. The other settings, such as which process to use for the transfer or
the type of transfer (SYNC or ASYNC, ...), depend on your preferred Protection
Mode (see the Init.ora Parameter "LOG_ARCHIVE_DEST_n" Reference Note). Once
this is all done, you can STARTUP MOUNT the Standby Database and set it to
RECOVERY MODE.
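
As a sketch of the essential settings, assume the Standby Service is
registered in the TNSNAMES.ORA of each Primary node under the
(hypothetical) alias STBY; which transfer attributes you choose depends
on your Protection Mode. Then each Primary Instance ships its Logs with:

    LOG_ARCHIVE_DEST_2='SERVICE=STBY ARCH SYNC'

and on the Standby you would start the Managed Recovery with:

    STARTUP NOMOUNT
    ALTER DATABASE MOUNT STANDBY DATABASE;
    ALTER DATABASE RECOVER MANAGED STANDBY DATABASE
      DISCONNECT FROM SESSION;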

2.2 Standby Database on a RAC-System
------------------------------------

2.2.1 Standby Database is also a Cluster (RAC-) System
------------------------------------------------------
It is also possible to create the Standby System on a RAC environment.
This provides much more comfort, scalability and performance. If you have two
identical systems, there should be no decrease in performance or availability
in case of a Switchover or Failover.
Typically, you have the same number of Instances (Nodes) on the Standby System
as on the Primary System.
First of all, you proceed as for a normal Standby Database creation: you take a
full backup of your Primary Database and create the Standby Controlfile from
any of the Instances. Then you prepare the Standby System for the Database
(create the RAW devices and logical links, configure your Hardware and install
the Oracle Software on each Node). After that, you are able to restore the
backup including the newly created Standby Controlfile. Also remember to set
the appropriate parameters in the PFILE or SPFILE for EACH instance correctly
(see 2.1 Standby Database on a Single-Node) and to configure the Network
parameters (TNSNAMES.ORA, LISTENER.ORA) corresponding to your requirements and
settings.
The greatest difference in comparison to the Single-Node System is that we now
have multiple Standby Instances, and only one of them can perform the Recovery.
Basically, every Primary Instance transports its Archivelogs to a corresponding
Standby Instance. The Standby Instance receiving the Logs then transfers them
to the Instance performing the Recovery. This is all handled by the
LOG_ARCHIVE_DEST_n parameter. Furthermore, it is recommended to create Standby
Redologs on the Standby Database for each Standby Instance.
For clarification, here is a small example:

We have a Primary Database with two Nodes and two Instances (A and B). We
equally have a Standby RAC environment, also containing the Standby Database
with two Nodes and Instances (C and D). ARCHIVELOG mode is enabled on the
Primary Database. Each Primary Instance archives its Redologs to a formatted
Partition of the Shared Disk, if a Cluster File System (CFS) is supported for
the used Platform; otherwise the Archivelogs are stored in each node's Private
Disk area. Furthermore, you must use different formats for naming the
Archivelogs, to prevent the instances from overwriting each other's logs. It
is recommended to use at least the %t in LOG_ARCHIVE_FORMAT to prevent this;
the %t represents the Thread Number the Log comes from.
Accordingly, we have the following relevant settings:

Instance A:
LOG_ARCHIVE_DEST_1='LOCATION=/u10/ARCHIVE'
LOG_ARCHIVE_FORMAT=arc_%s_%t

Instance B:
LOG_ARCHIVE_DEST_1='LOCATION=/u10/ARCHIVE'
LOG_ARCHIVE_FORMAT=arc_%s_%t

Next we have to add the Log Transport from each Primary Instance to the
corresponding Standby Instance. As we have created Standby Redologs and want
maximum Performance, we now have these settings:

Instance A:
LOG_ARCHIVE_DEST_1='LOCATION=/u10/ARCHIVE'
LOG_ARCHIVE_DEST_2='SERVICE=C LGWR ...'
LOG_ARCHIVE_FORMAT=arc_%s_%t

Instance B:
LOG_ARCHIVE_DEST_1='LOCATION=/u10/ARCHIVE'
LOG_ARCHIVE_DEST_2='SERVICE=D LGWR ...'
LOG_ARCHIVE_FORMAT=arc_%s_%t

Now we want to designate Instance D as the Recovering Instance. Therefore the
Archivelogs from Instance C have to be transferred to Instance D. Instance D
also performs the archiving to disk (again on the Shared Disk, so that the
Archivelogs are available to both Instances). The resulting settings for
Instances C and D are:

Instance C:
LOG_ARCHIVE_DEST_1='SERVICE=D ARCH SYNC'

Instance D:
LOG_ARCHIVE_DEST_1='LOCATION=/u10/ARCHIVE'

Once all this is done, you can STARTUP NOMOUNT and
ALTER DATABASE MOUNT STANDBY DATABASE on all Standby instances and put the
Recovering Instance into RECOVERY Mode.
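
A minimal sketch of these final steps (Instance D being the Recovering
Instance):

    -- on Instance C and Instance D:
    STARTUP NOMOUNT
    ALTER DATABASE MOUNT STANDBY DATABASE;

    -- on Instance D only:
    ALTER DATABASE RECOVER MANAGED STANDBY DATABASE
      DISCONNECT FROM SESSION;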

2.2.2 Standby Database is a Single Node System
----------------------------------------------
Another possibility for a RAC Standby Configuration is to place the Standby
Database on a Single-Node System.
Basically, the prerequisites are the same. The difference now is that all
Instances of the Primary Database have to ship their Archivelogs to one single
system. Using the information (SCNs) in the Archivelogs, the Managed Recovery
Process (MRP) on the Standby Database determines the correct order in which to
apply the Archivelogs from the different Threads.
So, in contrast to 2.2.1, we again have Instances A and B as the Primary
Instances in our example; the Single-Node Standby Instance is C. This results
in these settings for LOG_ARCHIVE_DEST_n:

Instance A and B:
LOG_ARCHIVE_DEST_1='LOCATION=/u10/ARCHIVE'
LOG_ARCHIVE_DEST_2='SERVICE=C LGWR ...'

Please also remember to create a separate Standby Redolog Thread for each
Primary Thread on the Standby Database if you use Standby Redologs (a sketch
follows below).
Instance C now receives all the Logs and applies them to the Standby Database.

Instance C:
LOG_ARCHIVE_DEST_1='LOCATION=/u10/ARCHIVE'
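
If you use Standby Redologs, a sketch of their creation (Group numbers,
paths and sizes are only illustrative) could be:

    ALTER DATABASE ADD STANDBY LOGFILE THREAD 1
      GROUP 5 ('/u10/STBYLOG/srl_1_5.log') SIZE 100M;
    ALTER DATABASE ADD STANDBY LOGFILE THREAD 2
      GROUP 6 ('/u10/STBYLOG/srl_2_6.log') SIZE 100M;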

2.2.3 Cross-Instance Archival
-----------------------------
In addition to the Parameters shown above, you can configure Cross-Instance
Archival. This means you can ship the Archivelogs from one Primary Instance to
another Primary Instance. This can be helpful for Gap resolution if one
Instance loses its Network connection or Archivelogs are deleted by mistake.
To enable Cross-Instance Archival, you simply take a free LOG_ARCHIVE_DEST_n
parameter and configure it to ship to another Primary Instance. For
Cross-Instance Archival, only the Archiver process (ARCH) is supported.
If we use our example from 2.2.1 again, the result would be:

Instance A:
LOG_ARCHIVE_DEST_1='LOCATION=/u10/ARCHIVE'
LOG_ARCHIVE_DEST_2='SERVICE=C LGWR ...'
LOG_ARCHIVE_DEST_3='SERVICE=B ARCH ...'
LOG_ARCHIVE_FORMAT=arc_%s_%t

If you also want to use Cross-Instance Archival on Instance B:

Instance B:
LOG_ARCHIVE_DEST_1='LOCATION=/u10/ARCHIVE'
LOG_ARCHIVE_DEST_2='SERVICE=D LGWR ...'
LOG_ARCHIVE_DEST_3='SERVICE=A ARCH ...'
LOG_ARCHIVE_FORMAT=arc_%s_%t
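
To verify that the Destinations are valid and working, you can, for
example, query V$ARCHIVE_DEST on each Primary Instance:

    SELECT DEST_ID, STATUS, DESTINATION
      FROM V$ARCHIVE_DEST
     WHERE STATUS <> 'INACTIVE';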


3. Gap Resolution and FAL
-------------------------

Basically, there are two modes of Gap Resolution:
- The automatic one, which is controlled by the Archiver and turned on
automatically. Automatic Gap resolution works proactively and can therefore
produce some Network overhead.
- The FAL (Fetch Archive Log) method, which requests specific missing Logs
again from a remote system. On the Receiving (Standby) Instance you publish
one or more FAL servers, from which it can request its Archivelogs, and a
FAL client name, i.e. the Net service name under which the FAL server can
reach the requesting Instance. Both are published via the initialisation
parameters FAL_SERVER and FAL_CLIENT on the Standby Instance.
FAL is not supported for Logical Standby Databases.
A typical configuration for our example 2.2.1 could be:

Instance C:
FAL_SERVER=A,B
FAL_CLIENT=C

Instance D:
FAL_SERVER=A,B
FAL_CLIENT=D

With Cross-Instance Archival (2.2.3) configured, Instance C could now also
receive missing Archivelogs from Instance B, and Instance D from Instance A.
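
On a Physical Standby Database, you can check for an outstanding Gap with
a query like the following (one row is shown per Thread with a Gap):

    SELECT THREAD#, LOW_SEQUENCE#, HIGH_SEQUENCE#
      FROM V$ARCHIVE_GAP;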


4. Protection Mode
------------------

All Protection Modes supported for Single-Node Systems are also supported for
RAC environments in 9.2. You only have to set the Log Transport Services
identically on each Primary Instance. Please refer to the Data Guard Concepts
and Administration Guide for more information about the specific settings.
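
As a sketch, assuming Maximum Availability and the Standby Service C from
our example: every Primary Instance gets the same Destination attributes,
and the Protection Mode is set once on the Primary (Standby Redologs are
required for this Mode):

    LOG_ARCHIVE_DEST_2='SERVICE=C LGWR SYNC AFFIRM'

    ALTER DATABASE SET STANDBY DATABASE TO MAXIMIZE AVAILABILITY;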


5. Switchover and Failover
--------------------------

Switchovers and Failovers can also be performed in RAC Data Guard
Configurations. Before a Switchover, all but one Primary Instance and all but
the Recovering Standby Instance must be shut down; then perform the COMMIT TO
SWITCHOVER on the Recovering Instance. The rest is the same as for a
Single-Node System. Of course, you can also Switchover/Failover from a RAC to
a Single-Node Database. Please also refer to the Data Guard Concepts and
Administration Guide for more information.
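
A minimal sketch of the Switchover commands themselves (all other
Instances on both sides already shut down; see the Guide for the complete
procedure):

    -- on the last remaining Primary Instance:
    ALTER DATABASE COMMIT TO SWITCHOVER TO PHYSICAL STANDBY;

    -- on the Recovering Standby Instance:
    ALTER DATABASE COMMIT TO SWITCHOVER TO PRIMARY;
    SHUTDOWN IMMEDIATE
    STARTUP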