Tuesday, October 9, 2007

Database Cloning process in case of Shutdown abort

Prod database: NEP
Dev database: NEDEV

The Standard process was:

a. Execute the production backup based on the BCV policy.
b. Validate the production backup in the backup server: Startup the database and shutdown immediate
C. copy all the data files from the backup server to the development server and clone the database.

BUT

In this case the 2nd DBA (Paul) used shutdown abort during the backup validation process instead of clean shutdown because the "shutdown immediate" was taking long time.

I was not aware of shutdown abort.

Cloning process:

a. I started the copying process of data files and submitted the parallel jobs of sftp

ftp -n -v << EOF
open x.x.x.x
user oracle
binary
lcd /u01/oradata
cd /u01/oradata
mget *.dbf

Time taken: 3 hours approximately


b. I verified the count of data files. It was matching with the source.

Backup server: Source server: ls -l | awk '{ print $5 " " $9}' > source.txt
Development Server: local server: ls -l | awk '{ print $5 " " $9}' > dev.txt

diff source.txt dev.txt

I did not find any difference.

c. Prepared the control file script. create_ctl_nedev.sql

STARTUP NOMOUNT
CREATE CONTROLFILE set DATABASE "NEDEV" RESETLOGS FORCE LOGGING NOARCHIVELOG
MAXLOGFILES 192
MAXLOGMEMBERS 3
MAXDATAFILES 1024
MAXINSTANCES 32
MAXLOGHISTORY 584
LOGFILE
GROUP 1 (
'/oracle03/oradata/NEDEV/redo_g1_t1_m1.log',
'/oracle04/oradata/NEDEV/redo_g1_t1_m2.log'
) SIZE 100M,
GROUP 2 (
'/oracle03/oradata/NEDEV/redo_g2_t1_m1.log',
'/oracle04/oradata/NEDEV/redo_g2_t1_m2.log'
) SIZE 100M
-- STANDBY LOGFILE
DATAFILE
'/oracle07/oradata/NEDEV/system01.dbf',
'/oracle07/oradata/NEDEV/undotbs01.dbf',
'/oracle07/oradata/NEDEV/sysaux01.dbf',
'/oracle07/oradata/NEDEV/users01.dbf',
'/oracle07/oradata/NEDEV/undotbs02.dbf',
'/oracle07/oradata/NEDEV/s_ts_poaq_data.dbf',
'/oracle07/oradata/NEDEV/s_ts_cc_data01.dbf',
'/oracle07/oradata/NEDEV/s_ts_cc_indx01.dbf',
'/oracle07/oradata/NEDEV/s_fi_data1.dbf',
'/oracle07/oradata/NEDEV/s_ts_poaq_data02.dbf',
'/oracle07/oradata/NEDEV/s_ts_poaq_data03.dbf',
'/oracle07/oradata/NEDEV/s_ts_poaq_indx02.dbf',
CHARACTER SET AL32UTF8
;


e. Created the controlfiles.

sqlplus sys as sysdba
sqlplus> startup nomount
sqlplus> @create_ctl_nedev
controlfile created

sqlplus> recover database until cancel using controlfile;

I started applying the archive logs: I got the archive log information from the alert log of production database. The archive logs generated between begin and end backups.
log sequence numbers: 45675767 to 45675769

I applied the above archive logs and when recovery process prompted the 45675770 then I entered "cancel"

cancel

ORA-01547:Warning:Recover Succeeded but OPEN RESETLOGS would get error below
ORA-01194: file 1 needs more recovery to be consistent
ORA-01110: data file 1 '/oracle07/oradata/NEDEV/system01.dbf'

I checked the archive logs in the backup server and there were only 3 archive logs copied in the backup.. 45675767 to 45675769

The above message directed me to check the alert log in the backup server. I found that it was shutdown abort after the backup validation.

I called Paul and he confirmed that he did shutdown abort because the shutdown immediate was taking long time to go through.
We had the deadline to release the database to the application team.

Action Taken:

login to backup server

sqlplus sys as sysdba
SQL> startup
ORACLE instance started.

Total System Global Area 599785472 bytes
Fixed Size 1262200 bytes
Variable Size 192941448 bytes
Database Buffers 398458880 bytes
Redo Buffers 7122944 bytes
Database mounted.
Database opened.

sqlplus> shutdown immediate
Database closed.
Database dismounted.
ORACLE instance shut down.

The shutdown process finished in 45 minutes and it generated 10 more archive logs.

I copied the archive logs and finished the recovery process.

scp 456757*.arc oracle@nedev:/oracle07/archive
password:


sqlplus> recover database until cancel using backup controlfile;

Applied 456757670 to 456757680

.
.

when it prompted for 456757681 then I entered "cancel"
cancel

Media recovery completed
sqlplus>

9. alter database open resetlogs;
10. I added the temp file in the temporary tablespace.
12. Change the global_name
11. Dropped all the db links and restore the old one

Future Process:

After the above incident I placed the following process for cloning.

1. Validate the backup in the backup server
2. Shutdown immediate
3. Bring down NEDEV and clean the file systems. Delete all datafiles and redolog files.
4. Start the copy process.

- datafiles
- archive logs
- init files
- redologs ( Yes, redolog files )


5. Start the cloning process
login to development server

$hostname
nedev
$export ORACLE_SID=NEP
$sqlplus sys as sysdba
sqlplus> startup nomount pfile='/tmp/initNEP.ora'
sqlplus> @create_ctl.sql
STARTUP NOMOUNT
CREATE CONTROLFILE REUSE DATABASE "NEP" RESETLOGS FORCE LOGGING ARCHIVELOG
MAXLOGFILES 192
MAXLOGMEMBERS 3
MAXDATAFILES 1024
MAXINSTANCES 32
MAXLOGHISTORY 584
LOGFILE
GROUP 1 (
'/oracle03/oradata/NEP/redo_g1_t1_m1.log',
'/oracle04/oradata/NEP/redo_g1_t1_m2.log'
) SIZE 100M,
GROUP 2 (
'/oracle03/oradata/NEP/redo_g2_t1_m1.log',
'/oracle04/oradata/NEP/redo_g2_t1_m2.log'
) SIZE 100M
-- STANDBY LOGFILE
DATAFILE
'/oracle07/oradata/NEP/system01.dbf',
'/oracle07/oradata/NEP/undotbs01.dbf',
'/oracle07/oradata/NEP/sysaux01.dbf',
'/oracle07/oradata/NEP/users01.dbf',
'/oracle07/oradata/NEP/undotbs02.dbf',
'/oracle07/oradata/NEP/s_ts_poaq_data.dbf',
'/oracle07/oradata/NEP/s_ts_cc_data01.dbf',
'/oracle07/oradata/NEP/s_ts_cc_indx01.dbf',
'/oracle07/oradata/NEP/s_fi_data1.dbf',
'/oracle07/oradata/NEP/s_ts_poaq_data02.dbf',
'/oracle07/oradata/NEP/s_ts_poaq_data03.dbf',
'/oracle07/oradata/NEP/s_ts_poaq_indx02.dbf',
CHARACTER SET AL32UTF8
;

CREATE CONTROLFILE reuse DATABASE "NEP" NOPRESETLOGS ARCHIVELOG

sqlplus> alter database open;

Note: We brought up NEP in the development server. We had not the dev and production connectivity so we were not worried about db links.

sqlplus> alter databse backup controlfile to trace;

sqlplus> shutdown immediate

$export ORACLE_SID=NEDEV
$sqlplus sys as sysdba
sqlplus> startup nomount pfile='/tmp/initNEDEV.ora'

$ cat > create_ctl_nedev.sql

STARTUP NOMOUNT
CREATE CONTROLFILE set DATABASE "NEDEV" RESETLOGS FORCE LOGGING NOARCHIVELOG
MAXLOGFILES 192
MAXLOGMEMBERS 3
MAXDATAFILES 1024
MAXINSTANCES 32
MAXLOGHISTORY 584
LOGFILE
GROUP 1 (
'/oracle03/oradata/NEDEV/redo_g1_t1_m1.log',
'/oracle04/oradata/NEDEV/redo_g1_t1_m2.log'
) SIZE 100M,
GROUP 2 (
'/oracle03/oradata/NEDEV/redo_g2_t1_m1.log',
'/oracle04/oradata/NEDEV/redo_g2_t1_m2.log'
) SIZE 100M
-- STANDBY LOGFILE
DATAFILE
'/oracle07/oradata/NEDEV/system01.dbf',
'/oracle07/oradata/NEDEV/undotbs01.dbf',
'/oracle07/oradata/NEDEV/sysaux01.dbf',
'/oracle07/oradata/NEDEV/users01.dbf',
'/oracle07/oradata/NEDEV/undotbs02.dbf',
'/oracle07/oradata/NEDEV/s_ts_poaq_data.dbf',
'/oracle07/oradata/NEDEV/s_ts_cc_data01.dbf',
'/oracle07/oradata/NEDEV/s_ts_cc_indx01.dbf',
'/oracle07/oradata/NEDEV/s_fi_data1.dbf',
'/oracle07/oradata/NEDEV/s_ts_poaq_data02.dbf',
'/oracle07/oradata/NEDEV/s_ts_poaq_data03.dbf',
'/oracle07/oradata/NEDEV/s_ts_poaq_indx02.dbf',
CHARACTER SET AL32UTF8
;


sqlplus> @create_ctl_nedev.sql
Controlfile created

sqlplus> alter database open resetlogs;
sqlplus> select * from global_name;
GLOBAL_NAME
--------------------------------------------------------------------------------
NEP
sqlplus> alter database rename global_name to 'NEDEV';


6. Add the temporary files in the temporary tablespace
7. Drop database links and restore the old one.

No comments: