Oracle-Database: Backup & Recovery6

Subject: Physical Backup and Recovery: An Insider's Perspective

PHYSICAL BACKUP AND RECOVERY: AN INSIDER'S PERSPECTIVE

INTRODUCTION
============

This article discusses the most common concepts of backup and
recovery of the Oracle database.

This article discusses physical backups and not logical backups.
(An export of the database is an example of a logical backup.)

Why do we need backups?
======================

The most important responsibility of a database administrator is to
prepare for the possibility of media, hardware and software failure.
Should any of these failures occur, the major goal is to make the
database available to users within an acceptable time, while ensuring
that committed data is undamaged.

What are the things that go wrong and eventually lead to recovery?
=================================================================

One or more database files are damaged.
One or more redo log files (including online) are damaged.
One or all control files are damaged.

What is recovery?
================

Restoring the files from backup and rolling forward in time is RECOVERY.

AN INTERNAL VIEW
================

Some administrative actions can destroy a database beyond repair. This is
particularly true of backup and recovery commands. To understand the
dangers of certain commands, a database administrator should have a
good understanding of the contents and purpose of the control file,
datafiles and log files. The basics are provided here.

Control file
============

A control file reflects the structure of a database at a particular point
in time. It contains the checkpoint information, the names of the log files
and data files, header information for those files, and the log sequence
number, which is very important for recovery purposes. Recovery applies
only the log files whose sequence number is greater than the log sequence
number in the control file.

Datafile information in control file:
------------------------------------

- Names of datafiles and log files with exact path.
- File size.
- Block size.(Oracle block size)
- Whether the datafile is online or offline.
- Whether the datafile was taken offline automatically or not.
- Whether the datafile belongs to the system tablespace or not.
- For each datafile, the log sequence number at which its tablespace
was taken offline.

Log file information in control file:
------------------------------------

- Name with exact path.
- File size.
- Block size. (O/S block size)
- Log sequence#
- Whether the file has been archived.

Information in the datafile header
==================================

- Log sequence number of next log file that could be applied.
- Whether an online backup is in progress.

Information in the log file header
==================================

- Log sequence#
- Archival information.

BACKUP
======

Offline Backup
==============

A backup taken when the database has been shut down normally is known as
an offline or cold backup. The datafiles, control file and
online redo log files must be copied with an operating system
copy utility. This is considered a complete backup of the
database. Any changes made after this backup will be unrecoverable if
the database is running in NOARCHIVELOG mode. All transactions are
recorded in the online redo log files whether or not archiving is enabled.
When redo logs are archived (ARCHIVELOG mode), ORACLE allows you to apply
these transactions after restoring files that were damaged (assuming an
active redo log file was not among the files damaged).

Whenever the structure of the database is changed, for example when a
datafile is added, a file is renamed, or a tablespace is created or dropped,
shut down the database and at least make a copy of the control file and
any newly added datafile. A complete backup of the database is
preferred.

Online Backup
=============

At sites where a database must operate 24 hours per day and it is
not feasible to take offline backups, Oracle provides an alternative:
physical backups can be performed while the database remains
available for both reading and updating. For this kind of backup the
database must be in ARCHIVELOG mode. Only the data files and the current
control file need to be backed up. Unlike offline backups, the unit of
an online backup is a tablespace, and any or all tablespaces can be backed
up whenever needed. Different datafiles can be backed up at different
times.

Procedure
---------

ALTER TABLESPACE ts_name BEGIN BACKUP

Then perform an operating system backup of all datafiles in that
tablespace. Once the backup is completed then it is very important
to issue the command:

ALTER TABLESPACE ts_name END BACKUP

All this must be done while the database is open.

Frequency of online backups
---------------------------

In order to determine how frequently to back up the files of a
database, balance the amount of time available for taking backups
and the time available for recovery after media failure. The time for
recovery depends on how old your most recent copy of the damaged file
is. The older your backup, the more redo log files need to be applied,
and the longer recovery will take.
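
The relationship between backup age and recovery work can be illustrated
with a small model. This is only a sketch in Python, not an Oracle tool:
the function name and the sequence numbers are hypothetical, and it
assumes that a restored file needs every log from the sequence recorded
in its header up to the current log.

```python
def logs_needed(backup_sequence, current_sequence):
    """Return the log sequence numbers to apply to a restored file.

    backup_sequence is the next log the restored file could accept
    (as stored in its header); current_sequence is the log in use now.
    Both values are illustrative, not read from a real database.
    """
    if backup_sequence > current_sequence:
        raise ValueError("backup cannot be newer than the current log")
    return list(range(backup_sequence, current_sequence + 1))


# A recent backup needs few logs; an old backup needs many.
recent = logs_needed(backup_sequence=118, current_sequence=120)
old = logs_needed(backup_sequence=40, current_sequence=120)
print(len(recent), len(old))
```

If any one of the listed logs is missing, the roll forward cannot pass
that point, which is why the archived logs are part of the backup set.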

Backup strategies should be tested before being used to protect a
production database. Ensure that backups of all datafiles and of all
necessary redo logs are kept, and that backups are restored correctly.
(If file compression is used, verify that the file is correct after
decompression.)

What happens between BEGIN BACKUP and END BACKUP?
------------------------------------------------

Once ALTER TABLESPACE ts_name BEGIN BACKUP is issued, the status
in the datafile header is changed to indicate that the datafile is
being backed up. Oracle stops recording the occurrence of checkpoints
in the header of the database files. This means that when a database
file is restored, it will have knowledge of the most recent checkpoint
that occurred BEFORE the backup, not any that occurred during the
backup. This way, the system will ask for the appropriate set of redo
log files to apply should recovery be needed. Since vital information
needed for recovery is recorded in the redo logs, these REDO LOGS are
considered part of the backup. Hence, while backing up the database
in this way, the database must be in ARCHIVELOG mode.

The status in the datafile header is not reset until END BACKUP is issued.
On END BACKUP, the system again begins noting the occurrence of
checkpoints in each file of the database. The checkpoint in the
datafile header is changed during the next log switch after END BACKUP
is issued. This information allows the tablespace to be
recovered as if the database had been offline when the backup took
place.
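
The behaviour described above can be pictured with a toy model. The
class below is a hypothetical Python sketch, not Oracle's actual header
layout: it only shows that checkpoints arriving between BEGIN BACKUP and
END BACKUP are not written to the header, so a copy taken in that window
still points at the checkpoint from before the backup.

```python
from dataclasses import dataclass


@dataclass
class DatafileHeader:
    """Toy stand-in for a datafile header (field names are assumptions)."""
    checkpoint: int = 0       # last checkpoint recorded in the header
    in_backup: bool = False   # the "online backup in progress" flag

    def begin_backup(self):
        self.in_backup = True

    def note_checkpoint(self, checkpoint):
        # While the backup flag is set, checkpoints are not recorded.
        if not self.in_backup:
            self.checkpoint = checkpoint

    def end_backup(self):
        self.in_backup = False


header = DatafileHeader(checkpoint=100)
header.begin_backup()
header.note_checkpoint(101)   # ignored: backup in progress
header.note_checkpoint(102)   # ignored: backup in progress
frozen = header.checkpoint    # what a restored copy would report
header.end_backup()
header.note_checkpoint(103)   # recorded again after END BACKUP
print(frozen, header.checkpoint)
```

A file restored from this backup would start recovery from checkpoint
100, so all redo generated during the copy is reapplied.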

Critical Files
==============

All the files belonging to the database are important. Along with
other tablespaces, special care should be taken to ensure that the
SYSTEM tablespace and tablespaces containing rollback segments are
protected by backups. If archiving is enabled, also back up the control
file and any new datafile immediately after adding the datafile to a
tablespace or after creating a tablespace. If media failure damages a
datafile that has not been backed up, recovering its tablespace is not
possible. After backing up the newly added datafile, include it in the
regular datafile backup rotation.

RECOVERY
========

Instance Failure
================

Instance failure is a hardware, software, or system failure that
prevents an instance from continuing work. It can be caused by
a CPU failure, an operating system failure, a power outage, failure
of one of the ORACLE background processes or a failure to access a
required database file when the file is not lost or damaged.

Instance Recovery
=================

Instance recovery is automatic. Restarting the database performs the
instance recovery. It involves three steps.

1) Rolling forward to recover:
- data that has not been recorded in the database
- the contents of rollback segments

2) Rolling back transactions that have been explicitly rolled
back or have not been committed.

3) Releasing any resources held by transactions in process at the
time of the failure.

Instance recovery is not necessary if the database is shut down
normally.

Media failure
=============

Media failure is a hardware, software, or system failure that prevents
reading or writing to files that are required to operate the database.
Media failure is a failure caused by the loss of the control file,
database files or redo log file.

Media Recovery
==============

What is needed for media recovery?
---------------------------------

The database must be operating in ARCHIVELOG mode. In addition, you must
have the latest backup of the database, all online redo log files, the
archived logs, and the current control file.

Performing Recovery
-------------------

You can use different commands to recover your database. They are:

1) RECOVER DATABASE

This command is used only with a current control file. The database must
be mounted, but not OPEN. The control file is compared with the
datafile headers, and the datafiles are brought up to date by applying
archived redo logs until the datafile headers match the control file
entries. Online redo logs are then applied to make the datafiles current.

Once the recovery is complete open the database with:

ALTER DATABASE OPEN

2) RECOVER DATAFILE

This command is used when the database is up and cannot be stopped. It
can also be used when the database is mounted. The tablespace that
contains the datafiles must be taken offline.

Issue the RECOVER DATAFILE command for the damaged file or files. You will
be prompted for log files. The changes will be applied only to these
files.

Once the media recovery is complete, the tablespace can be brought
online. This command allows 'multi-tasking' recovery: different datafiles
can be recovered in parallel from different sessions or terminals,
which is very useful when there are several datafiles to recover.

3) RECOVER TABLESPACE

The tablespace must be offline and the database must be OPEN. This command
recovers a single tablespace to the current state. It cannot be
used on the SYSTEM tablespace or on a tablespace that has rollback
segments with a status of "in use". If the database does not need to be
OPEN, you can recover using standard recovery (RECOVER DATABASE).

4) RECOVER DATABASE MANUAL

Manual recovery after media failure enables you to control how
many redo log files to apply to the database. This can be used to undo
an inadvertent change to the database by stopping recovery before the
change took place. The MANUAL option is needed for recovery with a backup
(old copy) of the control file when the current control file is not
available.

The database must be MOUNTed but not OPEN. After MOUNTing the database,
connect internal and issue the RECOVER DATABASE MANUAL command. You
will then be prompted for redo log files, beginning with the earliest one
recorded in the header of each database file. The recovery process will
continue to prompt for redo log files until CANCEL is typed when prompted
for the next redo log file. Recovery can be cancelled at any redo log.

5) RECOVER DATABASE UNTIL

This is the same as RECOVER DATABASE MANUAL, except that recovery can be
stopped at a specified point in time within a log file.
It cannot be used for recovery with an old copy (backup) of the control file.

6) RECOVER DATABASE MANUAL UNTIL

This can be used for recovery with an old copy (backup) of the control file.
Everything else is the same as for RECOVER DATABASE UNTIL.

Opening the Database
--------------------

For safety, before starting any recovery action, always back up the datafiles,
online logs, and control files. If space is a constraint, then at least
back up the online logs and control files.

Open the database with: ALTER DATABASE OPEN [NO]RESETLOGS

NORESETLOGS

The NORESETLOGS option does not clear the redo log files during startup
and allows the online redo logs to be used for recovery. It is used only
in the scenario where manual recovery was started, CANCEL was used, and
then RECOVER DATABASE was started.

RESETLOGS

CAUTION: Never use RESETLOGS unless necessary.

Once RESETLOGS is used then the redo log files cannot be used
and any completed transactions in those redo logs are lost!!

Before using the RESETLOGS option take an offline backup of the
database.

The RESETLOGS option clears all the online redo logs and modifies
all the online data files to indicate that no recovery is needed. After
resetting the redo logs, none of the existing log files or data file
backups can be used. In the control file, the log sequence number is
modified, which is very important for recovery purposes. Recovery
applies only the log files whose sequence number
is greater than the log sequence number in the control file. One has to
be very cautious when using the RESETLOGS option. It is important to
remember that all datafiles must be online; otherwise they will become
useless once the database is up.

After Recovery
--------------

Take an offline BACKUP after recovering the database using either of
the above options.

CONCLUSION
==========

Oracle provides a database administrator with many recovery
options to recover from various types of failure. Each option
depends on the failure and the available backup files. Hence, a
good backup strategy is essential for recovery.

Subject: Backup and Recovery Scenarios

PURPOSE

Describe various Backup and Recovery Scenarios

SCOPE & APPLICATION

All support analysts

RELATED DOCUMENTS

Backup and Recovery HandBook, Intro to DataServer Course Material
Backup and Recovery - an Overview

Backup

a) Consistent backups
A consistent backup means that all data files and control files are consistent
to a point in time, i.e. they have the same SCN. This is the only method of
backup when the database is in Noarchivelog mode.

b) Inconsistent backups
An inconsistent backup is possible only when the database is in Archivelog mode
and proper Oracle-aware software is used. Most default backup software cannot
back up open files. Special precautions need to be taken and testing needs to be
done. You must apply redo logs to the data files in order to restore the
database to a consistent state.

c) Database Archive mode
The database can run in either Archivelog mode or Noarchivelog mode.
When you first create the database, you specify whether it is to be in
Archivelog mode. Then, in the init.ora file, you set the parameter
log_archive_start=true so that archiving will start automatically on startup.
If the database was not created with Archivelog mode enabled, you can
enable it with the following commands while the database is mounted but
not open:
SVRMGR> alter database Archivelog;
SVRMGR> archive log start
SVRMGR> alter database open;
SVRMGR> archive log list
The last command shows the log mode and whether automatic archival is set.

d) Backup Methods
Essentially, there are two backup methods, hot and cold, also known as online
and offline, respectively.
A cold backup is one taken when the database is shut down.
A hot backup is one taken when the database is running.
Commands for a hot backup:
1. Svrmgr> alter database Archivelog;
Svrmgr> archive log start
Svrmgr> alter database open;
2. Svrmgr> archive log list
--This shows the oldest online log sequence. As a precaution,
always keep all archived log files starting from the oldest online log
sequence.
3. Svrmgr> Alter tablespace tablespace_name BEGIN BACKUP;

4. --Using an OS command, back up the datafile(s) of this tablespace.
5. Svrmgr> Alter tablespace tablespace_name END BACKUP;
--- Repeat steps 3, 4 and 5 for each tablespace.
6. Svrmgr> archive log list
---Do this again to obtain the current log sequence. You will want to make
sure you have a copy of this redo log file.
7. To force this log to be archived, issue
Svrmgr> ALTER SYSTEM SWITCH LOGFILE;
A better way to force this would be:
svrmgr> alter system archive log current;
8. Svrmgr> archive log list
This is done again to check that the log file has been archived and to find
the latest archived sequence number.
9. Back up all archived log files determined from steps 2 and 8.
Do not back up the online redo logs. These will contain the end-of-backup
marker and can cause corruption if used during recovery.
10. Back up the control file:
Svrmgr> Alter database backup controlfile to 'filename';
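
The ordering of the steps above matters more than the individual
commands, so it can help to generate the sequence rather than type it by
hand. The following Python sketch only builds a list of strings in the
order given above; the function name, the tablespace names and the
control file path are hypothetical, and nothing here connects to a
database or copies any file.

```python
def hot_backup_plan(tablespaces, controlfile_copy):
    """Return the hot backup steps, in order, as plain strings.

    Lines starting with "--" are reminders for operating system work;
    the other lines are the statements listed in the procedure above.
    """
    steps = ["archive log list"]  # note the oldest online log sequence
    for name in tablespaces:
        steps.append(f"alter tablespace {name} begin backup")
        steps.append(f"-- OS copy of the datafiles of {name}")
        steps.append(f"alter tablespace {name} end backup")
    steps.append("alter system archive log current")
    steps.append("archive log list")  # confirm the latest archived sequence
    steps.append("-- OS copy of the archived logs")
    steps.append(f"alter database backup controlfile to '{controlfile_copy}'")
    return steps


plan = hot_backup_plan(["users", "tools"], "/backup/control.bkp")
for line in plan:
    print(line)
```

Keeping each tablespace's BEGIN BACKUP and END BACKUP adjacent to its
copy step makes it harder to leave a tablespace in backup mode by
accident.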

e) Incremental backups
These are backups of only the blocks that have been modified since the
last backup. They are useful because they take less space and time.
There are two kinds of incremental backups:
cumulative and noncumulative.
A cumulative incremental backup includes all blocks that were changed since the
last backup at a lower level. It reduces the work during restoration because
a single backup contains all the changed blocks.
A noncumulative incremental backup includes only blocks that were changed
since the previous backup at the same or lower level.
Using rman, you issue the command "backup incremental level n".
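
The difference between the two kinds is only in which earlier backup
serves as the baseline. The Python sketch below is a simplified model
under stated assumptions: backups and block changes carry made-up
integer timestamps, and the function name and data are hypothetical. It
is not how rman tracks blocks internally.

```python
def blocks_to_back_up(history, changes, level, cumulative):
    """Pick the blocks an incremental backup at `level` would include.

    history: list of (time, level) tuples for earlier backups.
    changes: dict mapping block name -> time it was last modified.
    """
    def is_baseline(previous_level):
        # Cumulative: last backup at a lower level.
        # Noncumulative: last backup at the same or a lower level.
        if cumulative:
            return previous_level < level
        return previous_level <= level

    baseline_times = [t for t, lvl in history if is_baseline(lvl)]
    since = max(baseline_times) if baseline_times else 0
    return sorted(b for b, t in changes.items() if t > since)


history = [(10, 0), (20, 1)]            # level 0 at time 10, level 1 at time 20
changes = {"A": 5, "B": 15, "C": 25}    # block -> last modification time
cumulative = blocks_to_back_up(history, changes, level=1, cumulative=True)
noncumulative = blocks_to_back_up(history, changes, level=1, cumulative=False)
print(cumulative, noncumulative)
```

The cumulative backup repeats block B, which the earlier level 1 backup
already holds, so it is larger but lets a restore skip that earlier
level 1 backup.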

f) Support scenarios
When the database crashes, you now have a backup. You restore the backup and
then recover the database. Also, don't forget to take a backup of the control
file whenever there is a schema change.

RECOVERY
=========

There are several kinds of recovery you can perform, depending on the type of
failure and the kind of backup you have. Essentially, if you are not running in
archive log mode, then you can only restore the cold backup of the database and
you will lose any new data and changes made since that backup was taken.
If, however, the database is in Archivelog mode you will be able to restore the
database up to the time of failure.
There are three basic types of recovery:

1. Online Block Recovery.
This is performed automatically by Oracle (PMON). It occurs when a process dies
while changing a buffer. Oracle reconstructs the buffer using the online
redo logs and writes it to disk.
2. Thread Recovery.
This is also performed automatically by Oracle. It occurs when an instance
crashes while it has the database open. Oracle applies all the redo changes
in the thread that occurred since the last time the thread was checkpointed.
3. Media Recovery.
This is required when a data file is restored from backup. In this case the
checkpoint count in the data file is not equal to the checkpoint count in the
control file.
It is also required when a file was taken offline without a checkpoint and when
using a backup control file.

Now let's explain a little about Redo vs Rollback.

Redo information is recorded so that all changes that took place can be
repeated during recovery. Rollback information is recorded so that you can undo
changes that were made by a transaction but not committed. The redo logs
are used to roll forward all changes, both committed and uncommitted.
Then the undo information in the rollback segments is used to
roll back the uncommitted changes.
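
The two phases can be shown on a toy key/value "datafile". This Python
sketch is an illustration only: the record format and names are
invented, old values are captured while rolling forward (standing in for
rollback segment contents), and it assumes that concurrent transactions
do not touch the same key.

```python
def recover(datafile, redo_log):
    """Roll forward through redo_log, then roll back uncommitted work.

    redo_log holds ("change", txn, key, new_value) and ("commit", txn)
    records in the order they were generated.
    """
    data = dict(datafile)
    undo = {}          # txn -> [(key, old_value), ...], like a rollback segment
    committed = set()

    # Roll forward: reapply every change, committed or not.
    for record in redo_log:
        if record[0] == "change":
            _, txn, key, value = record
            undo.setdefault(txn, []).append((key, data.get(key)))
            data[key] = value
        else:
            committed.add(record[1])

    # Roll back: undo uncommitted transactions, newest change first.
    for txn, entries in undo.items():
        if txn in committed:
            continue
        for key, old_value in reversed(entries):
            if old_value is None:
                data.pop(key, None)   # key did not exist before the change
            else:
                data[key] = old_value
    return data


datafile = {"x": 1, "y": 1}
redo_log = [
    ("change", "t1", "x", 2),
    ("change", "t2", "y", 5),
    ("commit", "t1"),          # t2 never commits
]
result = recover(datafile, redo_log)
print(result)
```

Transaction t1 committed, so its change to x survives; t2 did not, so y
returns to its earlier value.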

Media Failure and Recovery in Noarchivelog Mode

In this case, your only option is to restore a backup of your Oracle
files.
The files you need are all datafiles, and control files.
You only need to restore the password file or parameter files if they are lost
or are corrupted.

Media Failure and Recovery in Archivelog Mode

In this case, there are several kinds of recovery you can perform, depending on
what has been lost. The three basic kinds of recovery are:
1. Recover database - here you use the recover database command and the database
must be closed and mounted. Oracle will recover all datafiles that are online.
2. Recover tablespace - use the recover tablespace command. The database can be
open but the tablespace must be offline.
3. Recover datafile - use the recover datafile command. The database can be
open but the specified datafile must be offline.

Note: You must have all archived logs since the backup you restored from,
or else you will not have a complete recovery.

a) Point in Time recovery:
A typical scenario is that you dropped a table at, say, noon and want to recover
it. You will have to restore the appropriate datafiles and do a point-in-time
recovery to a time just before noon.
Note: you will lose any transactions that occurred after noon.
After you have recovered until noon, you must open the database with resetlogs.
This resets the log sequence numbers, which protects the database
from having the unused redo logs applied later.

The four incomplete recovery scenarios all work the same:

Recover database until time '1999-12-01:12:00:00';
Recover database until cancel; (you type in cancel to stop)
Recover database until change n;
Recover database until cancel using backup controlfile;

Note: When performing an incomplete recovery, the datafiles must be online.
Do a select name, status from v$datafile to find out if there are any files
which are offline. If you were to perform a recovery on a database which has
tablespaces offline, and they had not been taken offline in a normal state, you
will lose them when you issue the open resetlogs command. This is because the
data file needs recovery from a point before the resetlogs option was used.

b) Recovery without control file
If you have lost the current control file, or the current control file is
inconsistent with the files that you need to recover, you need to recover either
by using a backup control file or by creating a new control file. You can also
recreate the control file based on the current one using the
'alter database backup controlfile to trace' command, which creates a script
for you to run to create a new one.
The 'recover database using backup controlfile' command must be used when using
a control file other than the current one. The database must then be opened with
the resetlogs option.

c) Recovery of missing datafile with rollback segment
The tricky part here is if you are performing online recovery. Otherwise you
can just use the recover datafile command. If you are performing an
online recovery, you must first ensure that you remove the parameter
rollback_segments from the init.ora file. Otherwise, Oracle will want to use
those rollback segments when opening the database, but cannot find them and
will not open.

Until you recover the datafiles that contain the rollback segments, you need to
create some temporary rollback segments in order for new transactions to work.
Even if other rollback segments are OK, they will have to be taken offline.
So, all the rollback segments that belong to the datafile need to be recovered.
If all the datafiles belonging to the tablespace rollback_data were lost, you
can now issue a recover tablespace rollback_data.
Next, bring the tablespace online and check the status of the rollback segments
by doing a select segment_name, status from dba_rollback_segs;
You will see the list of rollback segments that are in status Need Recovery.
Simply issue the alter rollback segment online command for each of them to
complete the recovery.
Don't forget to restore the rollback_segments parameter in the init.ora.

d) Recovery of missing datafile without rollback segment
There are three ways to recover in this scenario, as mentioned above.
1. recover database
2. recover datafile 'c:\orant\database\usr1orcl.ora'
3. recover tablespace user_data

e) Recovery with missing online redo logs
Missing online redo logs means that somehow you have lost your redo logs before
they had a chance to be archived. This means that crash recovery cannot be
performed, so media recovery is required instead. All datafiles will need to
be restored and rolled forward until the last available archived log file is
applied. This is thus an incomplete recovery, and as such, the recover
database command is necessary
(i.e. you cannot do a datafile or tablespace recovery).
As always, when an incomplete recovery is performed, you must open the database
with resetlogs.
Note: the best way to avoid this kind of loss is to mirror your online log
files.

f) Recovery with missing archived redo logs
If your archives are missing, the only way to recover the database is to
restore from your latest backup. You will have lost any
transactions that were recorded in the missing archived redo logs. Again, this
is why Oracle strongly suggests mirroring your online redo logs and keeping
duplicate copies of the archives.

g) Recovery with resetlogs option
The resetlogs option should be a last resort; however, as we have seen above,
it may be required after an incomplete recovery (recovery using a backup
control file, or a point-in-time recovery). It is imperative that you back
up the database immediately after you have opened it with resetlogs.
The reason is that Oracle updates the control file and resets the log numbers,
and you will not be able to recover from the old logs.
The next concern is what to do if the database crashes after you have opened it
with resetlogs but before you have had time to back it up.
How to recover?
Shut down the database
Back up all the datafiles and the control file
Startup mount
Alter database open resetlogs
This will work because you have a copy of a control file from after the
resetlogs point.

Media failure before a backup after resetlogs.

If a media failure occurs before a backup has been made after you opened the
database using resetlogs, you will most likely lose data.
The reason is that restoring a lost datafile from a backup taken prior to the
resetlogs gives an error that the file is from an earlier point in time,
and you no longer have the logs needed to recover it.

h) Recovery with corrupted/missing rollback segments.
If a rollback segment is missing or corrupted, you will not be able to open the
database. The first step is to find out what object is causing the rollback to
appear corrupted. If we can determine that, we can drop that object.
If we can't we will need to log an iTar to engage support.

So, how do we find out if it's actually a bad object?

1. Make sure that all tablespaces are online and all datafiles are online.
This can be checked through v$datafile, under the status column.
For tablespaces associated with the datafiles, look in dba_tablespaces.
If this doesn't show us anything, i.e., all are online, then

2. Put the following in the init.ora:
event = "10015 trace name context forever, level 10"

This event will generate a trace file that will reveal information about the
transaction Oracle is trying to roll back and most importantly, what object
Oracle is trying to apply the undo to.

Stop and start the database.

3. Check in the directory that is specified by the user_dump_dest parameter
(in the init.ora or show parameter command) for a trace file that was
generated at startup time.

4. In the trace file, there should be a message similar to:
error recovery tx(#,#) object #.

TX(#,#) refers to transaction information.
The object # is the same as the object_id in sys.dba_objects.

5. Use the following query to find out what object Oracle is trying to
perform recovery on.

select owner, object_name, object_type, status
from dba_objects where object_id =


Tuesday, July 24, 2007


Ayyu
