Whenever, Accidently power failure causes server shutdown, Shutdown abort and any other reason that cause database to go down in inconsistent way requires instance recovery. Before we go deep into this we need to understand working of SMON and SCN.
What is this SMON:- SMON is one of the MUST HAVE back ground process. And this is the process that recovers database from instance crash. SMON checks the SCN in all the data files headers and compare this with SCN stored in Control file. If these two matches means database is in consistent state and can be opened.
What is System Change Number (SCN):- This is the number (always increasing) incremented whenever a Commit has happened or AQ, SMON,PMON has completed their work. To get current SCN we need to use DBMS_FLASHBACK.Get_System_Change_Number. SCN is also used in Flashback Query
Eg Select * from scott.emp as of scn 1232097
SCN is checked whenever Oracle starts up the database. SMON compares the SCN of Checkpoint_Change#(v$database.Chechpoint_Change# with v$datafile.checkpoint_change# and v$datafile_header.checkpoint_change#) and if all the SCN are same then database is in consistent state and can be opened.
SCN plays a vital role for providing consistent reads. Basically, it works as follows: The query reads a db block. Beside other information block also contains the information regarding when last time block was changed.. If this SCN is greater than the SCN that was in place when (our) query began, it means that the block was changed after we have started our query. So we have to find an older version of the block. If this block is found in the rollback segments, we use it for our query.
The v$ views use change# to report SCN values. Hence, it can be argued that SCN means System Change Number, not System Commit Number.
When exporting, an SCN can be specified with the exp_flashback_scn export parameter.
More on SCN
There are 3 SCNs basically in control file
1. Checkpoint SCN
2. Stop SCN
3. Thread checkpoint SCN
Checkpoint SCN is the datafile checkpoint SCN when checkpoint happens for datafile. This checkpoint SCN is recorded in datafile header as well.
Stop SCN is the SCN which gets recoreded in control file when datafile is taken in begin backup mode or when datafile is taken offline. This is the checkpoint at a point when datafile header is freezed.
Thread Checkpoint SCN is the one related to online redo log files. This SCN gets generated when ever transaction get recoreded in online redo log file. When we shut down database with normal or immediate option, all these SCN are synchronized and made equal.
/
Eg.
SQL> select banner from v$version;
BANNER
----------------------------------------------------------------
Oracle Database 10g Enterprise Edition Release 10.2.0.1.0 - Prod
PL/SQL Release 10.2.0.1.0 - Production
CORE 10.2.0.1.0 Production
TNS for 32-bit Windows: Version 10.2.0.1.0 - Production
NLSRTL Version 10.2.0.1.0 – Production
Now! Check the Database checkpoint change number using V$DATABASE. With this number (CHECKPOINT_CHAGNE#) , all the number (CHECKPOINT_CHANGE#) of datafile header should match.
SQL> select name,checkpoint_change# from v$database;
NAME CHECKPOINT_CHANGE#
--------- ------------------
SEED 1151635
SQL> select name,checkpoint_change# from v$datafile;
Name Checkpoint_Change# Last_Change#
------------------- -------------------------------- --------------------------
D:\ORACLE\PRODUCT\10.2.0\ORADATA\SEED\SYSTEM01.DBF 1151635 1151635
D:\ORACLE\PRODUCT\10.2.0\ORADATA\SEED\UNDOTBS01.DBF 1151635 1151635
D:\ORACLE\PRODUCT\10.2.0\ORADATA\SEED\SYSAUX01.DBF 1151635 1151635
D:\ORACLE\PRODUCT\10.2.0\ORADATA\SEED\USERS01.DBF 1151635 1151635
D:\ORACLE\PRODUCT\10.2.0\ORADATA\SEED\EXAMPLE01.DBF 1151635 1151635
D:\ORACLE\PRODUCT\10.2.0\ORADATA\SEED\RMAN_TBS01.DBF 1151635 1151635
Now, The CHECKPOINT_CHANGE# of Database and Datafile Header are same and more importantly Start SCN (Checkpoint_change#) and Stop SCN (Last_Change#) are same means START and STOP SCN are same and this confirms that our database is consistent state and does not requires any kind of recovery.
So when you use shutdown immediate or Shutdown normal, Database goes down with consistent state. However, this is not the case with “Shutdown Abort” statement.
To understand fully, We need to understand actual working of “Shutdown” process, SCN ,Checkpoint_Change#, Last_Change#.
Shutdown Process:
Shutdown Immediate/Shutdown Normal: This is the most clean method to take database down. Using this statement, DB goes down in consistent state. Following are the stages:
Shutdown Immediate:
SQL> select open_mode from v$database;
OPEN_MODE
----------
READ WRITE
SQL> shutdown immediate;
Database closed.
Database dismounted.
ORACLE instance shut down.
Internally, First Oracle closes all the Background (SMON, PMON, CHKPT, DBWR, CJQ0,MMNL,MMON) processes . Secondly if Archival is enabled then it stops. Thirdly, All uncommitted transaction are rolled back, further logons stops. Checkpoint happens, All the blocks ( Changed, Committed) are flushed to Hard disk. If any trigger (System Trigger) is written on BEFORE SHUTDOWN system event than it fires.
Instances withers away, All Locks, Enaques (TX) are released. “ALTER DATABASE CLOSE NORMAL” and “ALTER DATABASE DISMOUNT” statements are executed. After all these happenings our database is now SHUT.
In this scenario, Before closing down Oracle does CHECKPOINT that makes database state consistent. Means no Online REDO logs will be required to open database. Once you open the database Checkpoint_Change# of Database (Controlfile),Datafile(s) and Datafile_Headers will be same.
Checkpoint change number of controlfile.
SQL> Select Checkpoint_Change#,name from v$database;
CHECKPOINT_CHANGE# NAME
---------------------------- -------------
1202269 SEED
Checkpoint change number of Datafile(s).
SQL>Select Checkpoint_Change#,name,last_change# from v$datafile;
CHECKPOINT_CHAGNE# NAME Last_Change#
--------------------- --------------------------------- -------------------------------
1202269 D:\ORACLE\PRODUCT\10.2.0\ORADATA\SEED\SYSTEM01.DBF 1151635
1202269 D:\ORACLE\PRODUCT\10.2.0\ORADATA\SEED\UNDOTBS01.DBF 1151635
1202269 D:\ORACLE\PRODUCT\10.2.0\ORADATA\SEED\SYSAUX01.DBF 1151635
1202269 D:\ORACLE\PRODUCT\10.2.0\ORADATA\SEED\USERS01.DBF 1151635
1202269 D:\ORACLE\PRODUCT\10.2.0\ORADATA\SEED\EXAMPLE01.DBF 1151635
1202269 D:\ORACLE\PRODUCT\10.2.0\ORADATA\SEED\RMAN_TBS01.DBF1151635
Checkpoint Change number in Header of Datafile(s).
SQL>Select dh.checkpoint_change#,df.name from v$datafile_header dh,v$datafile df
where df.file#=dh.file#
CHECKPOINT_CHAGNE# NAME CHECKPOINT_CHAGNE#
--------------------- --------------------------------- ----------------------------------------
1202269 D:\ORACLE\PRODUCT\10.2.0\ORADATA\SEED\SYSTEM01.DBF
1202269 D:\ORACLE\PRODUCT\10.2.0\ORADATA\SEED\UNDOTBS01.DBF
1202269 D:\ORACLE\PRODUCT\10.2.0\ORADATA\SEED\SYSAUX01.DBF
1202269 D:\ORACLE\PRODUCT\10.2.0\ORADATA\SEED\USERS01.DBF
1202269 D:\ORACLE\PRODUCT\10.2.0\ORADATA\SEED\EXAMPLE01.DBF
1202269 D:\ORACLE\PRODUCT\10.2.0\ORADATA\SEED\RMAN_TBS01.DBF
In this process, Every data file has the same CHECKPOINT_CHANGE# number as DATABASE has. So database does not need any kind of recovery (INSTANCE) recovery.
Now! See how SHUTDOWN ABORT just like a crash and why it needs recovery.
SQL> shutdown abort;
ORACLE instance shut down.
In this scenario, Database has not done anything to make things consistent instead it is just crashed.
However, unlike Media recovery where we need not to apply Archived REDO logs all the changes stored in Online REDO Log files will be re-applied and instance will recover itself.
In Other words START SCN != STOP SCN so recovery is required.
SQL> startup mount;
ORACLE instance started.
Total System Global Area 612368384 bytes
Fixed Size 1250428 bytes
Variable Size 230689668 bytes
Database Buffers 373293056 bytes
Redo Buffers 7135232 bytes
Database mounted.
SQL> Select name,Checkpoint_Change# from v$database;
NAME CHECKPOINT_CHANGE#
--------- ------------------
SEED 1260508
SQL>Select Checkpoint_Change#,name,last_change# from v$datafile;
CHECKPOINT_CHAGNE# NAME Last_Change#
--------------------- --------------------------------- -------------------------------
1260508 D:\ORACLE\PRODUCT\10.2.0\ORADATA\SEED\SYSTEM01.DBF Null
1260508 D:\ORACLE\PRODUCT\10.2.0\ORADATA\SEED\UNDOTBS01.DBF Null
1260508 D:\ORACLE\PRODUCT\10.2.0\ORADATA\SEED\SYSAUX01.DBF Null
1260508 D:\ORACLE\PRODUCT\10.2.0\ORADATA\SEED\USERS01.DBF Null
1260508 D:\ORACLE\PRODUCT\10.2.0\ORADATA\SEED\EXAMPLE01.DBF Null
1260508 D:\ORACLE\PRODUCT\10.2.0\ORADATA\SEED\RMAN_TBS01.DBF Null
Next time DBA starts up the database, it with out asking any recovery will automatically apply contents of Online REDO log files (ROLLFORD) and will open the database.
SQL> startup
ORACLE instance started.
Total System Global Area 612368384 bytes
Fixed Size 1250428 bytes
Variable Size 234883972 bytes
Database Buffers 369098752 bytes
Redo Buffers 7135232 bytes
Database mounted.
Database opened.
I have tried to put the instance recovery process in clear and precise words, but still if anything is missing then any correction, suggestion are most welcome.
Hi,
ReplyDeleteWe have faced an issue where the database is throwing the below error.
SQL> alter database open;
alter database open
*
ERROR at line 1:
ORA-01113: file 1 needs media recovery
ORA-01110: data file 1: '/txb7/appl/oracle/isdorchr/system01.dbf'
But my finding is the checkpoint scn is same for all the datafiles.
SQL> select checkpoint_change# from v$database;
CHECKPOINT_CHANGE#
-------------------------
11933366629687
SQL> select distinct checkpoint_change# from v$datafile_header;
CHECKPOINT_CHANGE#
-------------------------
11933366629687
SQL> select controlfile_change# from v$database;
CONTROLFILE_CHANGE#
-------------------------
11933366629687
Please guide me why these are same as according to my understanding since the datafile is asking for recovery the checkpoint scn's should be different.
Select Checkpoint_Change#,name,last_change# from v$datafile
ReplyDeletewhat output is given by this query?
SQL> alter database datafile 'e:\app\dell\oradata\sample11\users02.dbf' offline;
ReplyDeleteDatabase altered.
SQL> insert into scott.emp(empno,ename) values (1211,'jaiddd');
1 row created.
SQL> commit;
Commit complete.
SQL> alter system switch logfile;
System altered.
SQL> alter system switch logfile;
System altered.
SQL> alter database datafile 'e:\app\dell\oradata\sample11\users02.dbf' online;
alter database datafile 'e:\app\dell\oradata\sample11\users02.dbf' online
*
ERROR at line 1:
ORA-01113: file 6 needs media recovery
ORA-01110: data file 6: 'E:\APP\DELL\ORADATA\SAMPLE11\USERS02.DBF'
at this point of time
Select name,checkpoint_change# from v$database;
Name Checkpoint_change#
------------ ------------------------
SAMPLE11 1869152
select name,checkpoint_change#,last_change# from v$datafile Where name like '%USERS02.DBF'
NAME CHECKPOINT_CHANGE# LAST_CHANGE#
-------------------- ---------------------- ---------------------
E:\APP\DELL\ORADATA\SAMPLE11\USERS02.DBF 1869152 1869613 ----NOT SAME
SQL> RECOVER DATAFILE 6;
Media recovery complete.
Select name,checkpoint_change# from v$database;
Name Checkpoint_change#
------------ ------------------------
SAMPLE11 1869648
select name,checkpoint_change#,last_change# from v$datafile Where name like '%USERS02.DBF'
NAME CHECKPOINT_CHANGE# LAST_CHANGE#
-------------------- ---------------------- ---------------------
E:\APP\DELL\ORADATA\SAMPLE11\USERS02.DBF 1869648 1869648 ---- SAME
I hope it explains..
please let me know if you need more information this.