Step 1:
Fast-Start Failover: Installation. The Fast-Start Failover Observer (FSFO) software is installed automatically as part of the
standard installation whenever an Oracle 11g database home is created. When a complete database home installation is present,
the FSFO can be controlled through either DGMGRL or Oracle EM Grid Control. Alternatively,
the FSFO can be installed on its own by downloading the Oracle 11g Client installation software from otn.oracle.com and
then installing just the Oracle Client Administrator on the desired server. It’s important to note, however,
that an FSFO installed this way on a separate server can be managed only via the DGMGRL utility.
Fast-Start Failover: Basic Configuration. Since more than one physical standby database
can exist in a Data Guard configuration, the first thing I need to establish is which physical standby
database should be paired with the primary database if a fast-start failover is initiated. I’ll do that by
setting a value for the FastStartFailoverTarget property via the DGMGRL utility.
Note that I’ve also set the primary database as the fast-start failover target for the selected physical standby database:
DGMGRL> EDIT DATABASE orcl_primary SET PROPERTY FastStartFailoverTarget = 'orcl_stdby1';
DGMGRL> EDIT DATABASE orcl_stdby1 SET PROPERTY FastStartFailoverTarget = 'orcl_primary';
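Since the two settings above mirror each other, it can help to generate both commands from a single pair of database names so they cannot drift apart. The following is a minimal Python sketch under that assumption. The function name fsfo_target_commands is hypothetical, and the code only builds the command text shown above. It does not run DGMGRL.

```python
from typing import List


def fsfo_target_commands(primary: str, standby: str) -> List[str]:
    """Build the two mirrored EDIT DATABASE commands for one primary/standby pair."""
    template = "EDIT DATABASE {db} SET PROPERTY FastStartFailoverTarget = '{target}';"
    return [
        # The primary points at the standby ...
        template.format(db=primary, target=standby),
        # ... and the standby points back at the primary.
        template.format(db=standby, target=primary),
    ]


if __name__ == "__main__":
    for command in fsfo_target_commands("orcl_primary", "orcl_stdby1"):
        print(command)
```

The printed lines could then be pasted at the DGMGRL prompt or saved to a script file for review.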
Step 2:
Next, I’ll establish how long the Fast-Start Failover Observer should wait before it decides that the primary database
is unreachable by setting the FastStartFailoverThreshold property to 180 seconds:
DGMGRL> EDIT CONFIGURATION SET PROPERTY FastStartFailoverThreshold = '180';
Now that the basic fast-start failover configuration is completed, I can confirm its status with the SHOW FAST_START FAILOVER and SHOW DATABASE VERBOSE commands:
DGMGRL> show fast_start failover
Fast-Start Failover: DISABLED
Threshold: 90 seconds
Target: (none)
Observer: orcl_stdby1
Lag Limit: 30 seconds
Shutdown Primary: TRUE
Auto-reinstate: TRUE
Configurable Failover Conditions
Health Conditions:
Corrupted Controlfile YES
Corrupted Dictionary YES
Inaccessible Logfile NO
Stuck Archiver NO
Datafile Offline YES
Oracle Error Conditions:
(none)
DGMGRL> show database verbose orcl_primary;
Database
Name: orcl_primary
Role: PRIMARY
Enabled: YES
Intended State: TRANSPORT-ON
Instance(s):
orcl_primary
Properties:
DGConnectIdentifier = 'orcl_primary'
ObserverConnectIdentifier = ''
LogXptMode = 'ASYNC'
DelayMins = '0'
Binding = 'OPTIONAL'
MaxFailure = '0'
MaxConnections = '1'
ReopenSecs = '300'
NetTimeout = '30'
RedoCompression = 'DISABLE'
LogShipping = 'ON'
PreferredApplyInstance = ''
ApplyInstanceTimeout = '0'
ApplyParallel = 'AUTO'
StandbyFileManagement = 'AUTO'
ArchiveLagTarget = '0'
LogArchiveMaxProcesses = '4'
LogArchiveMinSucceedDest = '1'
DbFileNameConvert = ''
LogFileNameConvert = ''
FastStartFailoverTarget = 'orcl_stdby1'
StatusReport = '(monitor)'
InconsistentProperties = '(monitor)'
InconsistentLogXptProps = '(monitor)'
SendQEntries = '(monitor)'
LogXptStatus = '(monitor)'
RecvQEntries = '(monitor)'
HostName = '11gPrimary'
SidName = 'orcl_primary'
StandbyArchiveLocation = '/u01/app/oracle/flash_recovery_area/ORCL/'
AlternateLocation = ''
LogArchiveTrace = '0'
LogArchiveFormat = 'log_%s_%t_%r.arc'
LatestLog = '(monitor)'
TopWaitEvents = '(monitor)'
Current status for "orcl_primary":
SUCCESS
DGMGRL> show database verbose orcl_stdby1
Database
Name: orcl_stdby1
Role: PHYSICAL STANDBY
Enabled: YES
Intended State: APPLY-ON
Instance(s):
orcl_stdby1
Properties:
DGConnectIdentifier = 'orcl_stdby1'
ObserverConnectIdentifier = ''
LogXptMode = 'ASYNC'
DelayMins = '0'
Binding = 'OPTIONAL'
MaxFailure = '0'
MaxConnections = '1'
ReopenSecs = '300'
NetTimeout = '30'
RedoCompression = 'DISABLE'
LogShipping = 'ON'
PreferredApplyInstance = ''
ApplyInstanceTimeout = '0'
ApplyParallel = 'AUTO'
StandbyFileManagement = 'AUTO'
ArchiveLagTarget = '0'
LogArchiveMaxProcesses = '4'
LogArchiveMinSucceedDest = '1'
DbFileNameConvert = ''
LogFileNameConvert = '/u01/app/oracle/oradata/orcl/, /u01/app/oracle/oradata/stdby/'
FastStartFailoverTarget = 'orcl_primary'
StatusReport = '(monitor)'
InconsistentProperties = '(monitor)'
InconsistentLogXptProps = '(monitor)'
SendQEntries = '(monitor)'
LogXptStatus = '(monitor)'
RecvQEntries = '(monitor)'
HostName = '11gStdby'
SidName = 'orcl_stdby1'
StandbyArchiveLocation = '/u01/app/oracle/flash_recovery_area/STDBY/'
AlternateLocation = ''
LogArchiveTrace = '0'
LogArchiveFormat = 'log_%s_%t_%r.arc'
LatestLog = '(monitor)'
TopWaitEvents = '(monitor)'
Current status for "orcl_stdby1":
SUCCESS
Step 3:
Enable fast-start failover:
DGMGRL> ENABLE FAST_START FAILOVER;
Step 4: Activating the Fast-Start Failover Observer
Now that the configuration of FSFO is complete, all I need to do is enable fast-start failover via DGMGRL as shown below. Note that I’m also starting DGMGRL with the -logfile option so that Data Guard Broker activity from the command-line utility is logged and I can track any unexpected issues related to the FSFO’s performance or configuration:
[oracle@11gStdby ~]$ dgmgrl -logfile 11gStdby1_observer.log
DGMGRL for Linux: Version 11.1.0.6.0 - Production
Copyright (c) 2000, 2005, Oracle. All rights reserved.
Welcome to DGMGRL, type "help" for information.
DGMGRL> connect sys/oracle
Connected.
DGMGRL> ENABLE FAST_START FAILOVER;
Enabled.
Finally, it’s time to start up FSFO. Once again, I’ll use DGMGRL to start the Fast-Start Failover Observer process:
DGMGRL> START OBSERVER;
Once the FSFO is started, I can confirm that it’s been activated properly with the SHOW CONFIGURATION and SHOW DATABASE commands:
DGMGRL> show configuration verbose
Configuration
Name: MAA_orcl
Enabled: YES
Protection Mode: MaxPerformance
Databases:
orcl_primary - Primary database
orcl_stdby1 - Physical standby database
- Fast-Start Failover target
Fast-Start Failover: ENABLED
Threshold: 180 seconds
Target: orcl_stdby1
Observer: 11gStdby
Lag Limit: 30 seconds
Shutdown Primary: TRUE
Auto-reinstate: TRUE
Current status for "MAA_orcl":
Warning: ORA-16608: one or more databases have warnings
DGMGRL> show database orcl_primary
Database
Name: orcl_primary
Role: PRIMARY
Enabled: YES
Intended State: TRANSPORT-ON
Instance(s):
orcl_primary
Current status for "orcl_primary":
SUCCESS
DGMGRL> show database orcl_stdby1
Database
Name: orcl_stdby1
Role: PHYSICAL STANDBY
Enabled: YES
Intended State: APPLY-ON
Instance(s):
orcl_stdby1
Current status for "orcl_stdby1":
SUCCESS
DGMGRL> show fast_start failover
Fast-Start Failover: ENABLED
Threshold: 180 seconds
Target: orcl_stdby1
Observer: 11gStdby
Lag Limit: 30 seconds
Shutdown Primary: TRUE
Auto-reinstate: TRUE
Configurable Failover Conditions
Health Conditions:
Corrupted Controlfile YES
Corrupted Dictionary YES
Inaccessible Logfile NO
Stuck Archiver NO
Datafile Offline YES
Oracle Error Conditions:
(none)
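Checking this output by eye works for a one-off test, but the same check can be scripted. The sketch below is one possible approach in Python, assuming the SHOW FAST_START FAILOVER output has already been captured as text. The names parse_fsfo_status and fsfo_is_ready are hypothetical, and the parser handles only the simple "Name: value" lines shown above.

```python
from typing import Dict

# First lines of the SHOW FAST_START FAILOVER output shown above.
SAMPLE_OUTPUT = """\
Fast-Start Failover: ENABLED
Threshold: 180 seconds
Target: orcl_stdby1
Observer: 11gStdby
Lag Limit: 30 seconds
Shutdown Primary: TRUE
Auto-reinstate: TRUE
"""


def parse_fsfo_status(text: str) -> Dict[str, str]:
    """Turn each 'Name: value' line into a dictionary entry; skip other lines."""
    status = {}
    for line in text.splitlines():
        name, sep, value = line.partition(":")
        if sep and value.strip():
            status[name.strip()] = value.strip()
    return status


def fsfo_is_ready(status: Dict[str, str], expected_threshold: int) -> bool:
    """True when FSFO is enabled, has a target, and reports the expected threshold."""
    threshold = status.get("Threshold", "").split()
    return (
        status.get("Fast-Start Failover") == "ENABLED"
        and status.get("Target", "(none)") != "(none)"
        and bool(threshold)
        and threshold[0].isdigit()
        and int(threshold[0]) == expected_threshold
    )


if __name__ == "__main__":
    print(fsfo_is_ready(parse_fsfo_status(SAMPLE_OUTPUT), expected_threshold=180))
```

Comparing the reported threshold against the intended value is a deliberate choice here: it catches the case where fast-start failover is enabled but a property change did not take effect.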
Step 5:
Automatic Detection of Failover Conditions: An Example
Now that FSFO is fully configured and ready to detect a failover situation, I’ll use the same technique I used in the prior article about Data Guard failover to simulate a failure of the primary database: I’ll simply issue the kill -9 <pid> command against its System Monitor (SMON) background process. Once again, the death of the primary database is almost immediately recorded in its alert log:
. . .
Tue Aug 25 18:54:10 2009
Errors in file /u01/app/oracle/diag/rdbms/orcl_primary/orcl_primary/trace/orcl_primary_pmon_6166.trc:
ORA-00474: SMON process terminated with error
PMON (ospid: 6166): terminating the instance due to error 474
Instance terminated by PMON, pid = 6166
. . .
Just as before, the loss of connectivity to the primary database is reflected in the alert log of the corresponding physical standby database by its Remote File Server (RFS) background processes:
. . .
Tue Aug 25 18:54:49 2009
RFS[2]: Possible network disconnect with primary database
Tue Aug 25 18:54:49 2009
RFS[1]: Possible network disconnect with primary database
Tue Aug 25 18:55:49 2009
. . .
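Given the FastStartFailoverThreshold of 180 seconds set earlier, the termination timestamp in the primary database's alert log gives a rough idea of when a fast-start failover could first be expected. The Python sketch below does that arithmetic. The function name earliest_failover is hypothetical, and the result is only an approximation, since the Observer starts counting when it loses contact with the primary rather than at the exact moment the instance dies.

```python
from datetime import datetime, timedelta

# Timestamp layout used in the alert log excerpts above, e.g. "Tue Aug 25 18:54:10 2009".
ALERT_LOG_FORMAT = "%a %b %d %H:%M:%S %Y"


def earliest_failover(event_time: str, threshold_seconds: int) -> datetime:
    """Add the failover threshold to an alert-log timestamp."""
    stamp = datetime.strptime(event_time, ALERT_LOG_FORMAT)
    return stamp + timedelta(seconds=threshold_seconds)


if __name__ == "__main__":
    # Instance termination time from the primary's alert log, 180-second threshold.
    print(earliest_failover("Tue Aug 25 18:54:10 2009", 180))
```

For the timestamps in this test, the result is 18:57:10 on the same day.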
This time, however, there’s a dramatic difference! After approximately three minutes have elapsed, which matches the FastStartFailoverThreshold of 180 seconds, there’s a sudden flurry of activity at the physical standby site as the FSFO automatically detects the failure of the primary database. In Listing 7.1, I’ve captured the alert logs of both databases as well as the Data Guard Broker log entries to show all of the actions that Oracle 11g initiates during a fast-start failover. After the automatic failover is complete, the Data Guard configuration fully reflects the successful actions of the FSFO:
DGMGRL> show configuration verbose
Configuration
Name: MAA_orcl
Enabled: YES
Protection Mode: MaxPerformance
Databases:
orcl_stdby1 - Primary database
orcl_primary - Physical standby database (disabled)
- Fast-Start Failover target
Fast-Start Failover: ENABLED
Threshold: 180 seconds
Target: orcl_primary
Observer: 11gStdby
Lag Limit: 30 seconds
Shutdown Primary: TRUE
Auto-reinstate: TRUE
Current status for "MAA_orcl":
Warning: ORA-16608: one or more databases have warnings
DGMGRL> show database verbose orcl_stdby1
Database
Name: orcl_stdby1
OEM Name: orcl_11gStdby1
Role: PRIMARY
Enabled: YES
Intended State: TRANSPORT-ON
Instance(s):
orcl_stdby1
Properties:
DGConnectIdentifier = 'orcl_stdby1'
ObserverConnectIdentifier = ''
LogXptMode = 'ASYNC'
DelayMins = '0'
Binding = 'OPTIONAL'
MaxFailure = '0'
MaxConnections = '1'
ReopenSecs = '300'
NetTimeout = '30'
RedoCompression = 'DISABLE'
LogShipping = 'ON'
PreferredApplyInstance = ''
ApplyInstanceTimeout = '0'
ApplyParallel = 'AUTO'
StandbyFileManagement = 'AUTO'
ArchiveLagTarget = '0'
LogArchiveMaxProcesses = '4'
LogArchiveMinSucceedDest = '1'
DbFileNameConvert = ''
LogFileNameConvert = '/u01/app/oracle/oradata/orcl/, /u01/app/oracle/oradata/stdby/'
FastStartFailoverTarget = 'orcl_primary'
StatusReport = '(monitor)'
InconsistentProperties = '(monitor)'
InconsistentLogXptProps = '(monitor)'
SendQEntries = '(monitor)'
LogXptStatus = '(monitor)'
RecvQEntries = '(monitor)'
HostName = '11gStdby'
SidName = 'orcl_stdby1'
StandbyArchiveLocation = '/u01/app/oracle/flash_recovery_area/STDBY/'
AlternateLocation = ''
LogArchiveTrace = '0'
LogArchiveFormat = 'log_%s_%t_%r.arc'
LatestLog = '(monitor)'
TopWaitEvents = '(monitor)'
Current status for "orcl_stdby1":
Warning: ORA-16829: fast-start failover configuration is lagging
DGMGRL> show database verbose orcl_primary
Database
Name: orcl_primary
OEM Name: orcl_11gPrimary
Role: PHYSICAL STANDBY
Enabled: NO
Intended State: APPLY-ON
Instance(s):
orcl_primary
Properties:
DGConnectIdentifier = 'orcl_primary'
ObserverConnectIdentifier = ''
LogXptMode = 'ASYNC'
DelayMins = '0'
Binding = 'OPTIONAL'
MaxFailure = '0'
MaxConnections = '1'
ReopenSecs = '300'
NetTimeout = '30'
RedoCompression = 'DISABLE'
LogShipping = 'ON'
PreferredApplyInstance = ''
ApplyInstanceTimeout = '0'
ApplyParallel = 'AUTO'
StandbyFileManagement = 'AUTO'
ArchiveLagTarget = '0'
LogArchiveMaxProcesses = '4'
LogArchiveMinSucceedDest = '1'
DbFileNameConvert = ''
LogFileNameConvert = ''
FastStartFailoverTarget = 'orcl_stdby1'
StatusReport = '(monitor)'
InconsistentProperties = '(monitor)'
InconsistentLogXptProps = '(monitor)'
SendQEntries = '(monitor)'
LogXptStatus = '(monitor)'
RecvQEntries = '(monitor)'
HostName = '11gPrimary'
SidName = 'orcl_primary'
StandbyArchiveLocation = '/u01/app/oracle/flash_recovery_area/ORCL/'
AlternateLocation = ''
LogArchiveTrace = '0'
LogArchiveFormat = 'log_%s_%t_%r.arc'
LatestLog = '(monitor)'
TopWaitEvents = '(monitor)'
Current status for "orcl_primary":
Error: ORA-16661: the standby database needs to be reinstated
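The "Current status" lines in the output above are what show which database needs attention after the failover. As a rough illustration, the Python sketch below pulls those lines out of captured DGMGRL output and lists the databases reporting ORA-16661. The function names collect_statuses and needs_reinstate are hypothetical, and the parser assumes the status line immediately follows its "Current status for" header, as it does in the listings above.

```python
import re
from typing import Dict, List, Optional, Tuple

HEADER = re.compile(r'^Current status for "([^"]+)":\s*$')
DETAIL = re.compile(r"^(Warning|Error): (ORA-\d+): (.*)$")

# Status lines copied from the SHOW DATABASE VERBOSE output above.
SAMPLE_OUTPUT = """\
Current status for "orcl_stdby1":
Warning: ORA-16829: fast-start failover configuration is lagging
Current status for "orcl_primary":
Error: ORA-16661: the standby database needs to be reinstated
"""

Status = Tuple[str, Optional[str], str]


def collect_statuses(text: str) -> Dict[str, Status]:
    """Map each database name to (severity, ORA code or None, message)."""
    statuses: Dict[str, Status] = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        header = HEADER.match(line)
        if header:
            current = header.group(1)
            continue
        if current is None:
            continue
        detail = DETAIL.match(line)
        if detail:
            statuses[current] = (detail.group(1), detail.group(2), detail.group(3))
        elif line == "SUCCESS":
            statuses[current] = ("SUCCESS", None, "")
        current = None  # only the line right after the header is examined
    return statuses


def needs_reinstate(statuses: Dict[str, Status]) -> List[str]:
    """Names of databases whose status reports ORA-16661."""
    return sorted(name for name, status in statuses.items() if status[1] == "ORA-16661")


if __name__ == "__main__":
    print(needs_reinstate(collect_statuses(SAMPLE_OUTPUT)))
```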