DB2 - Problem description
Problem IC96369 | Status: Closed |
IN A TSA AUTOMATED HADR ENVIRONMENT, AN AUTOMATED HADR FAILOVER OCCURS IMMEDIATELY AFTER THE TAKEOVER HADR COMMAND IS RUN | |
product: | |
DB2 FOR LUW / DB2FORLUW / A50 - DB2 | |
Problem description: | |
A manual HADR takeover, issued by running the "db2 takeover hadr on db <dbname>" command from the standby host, can complete successfully and then be undone: TSA may initiate an automated HADR takeover immediately afterwards, switching the HADR standby and primary back to their original locations and leaving them in a disconnected state.

The following db2diag.log message should be seen on the standby host at the time that the manual HADR takeover command was issued:

    2013-09-26-14.17.39.603232-240 E1930E465           LEVEL: Event
    PID     : 2510                 TID : 47851748976960PROC : db2sysc
    INSTANCE: db2inst1             NODE : 000          DB   : HADRDB
    APPHDL  : 0-15                 APPID: *LOCAL.db2inst1.130926181739
    AUTHID  : DB2USER1
    EDUID   : 554                  EDUNAME: db2agent (HADRDB)
    FUNCTION: DB2 UDB, base sys utilities, sqeDBMgr::StartUsingLocalDatabase, probe:13
    START   : Received TAKEOVER HADR command.

Immediately after the above db2diag.log message, the following db2diag.log message should be seen:

    2013-09-26-14.17.41.260002-240 E2396E435           LEVEL: Warning
    PID     : 2510                 TID : 47851748976960PROC : db2sysc
    INSTANCE: db2inst1             NODE : 000          DB   : HADRDB
    APPHDL  : 0-15                 APPID: *LOCAL.db2inst1.130926181739
    AUTHID  : DB2USER1
    EDUID   : 554                  EDUNAME: db2agent (HADRDB)
    FUNCTION: DB2 Common, SQLHA APIs for DB2 HA Infrastructure, sqlhaLockHADRResource, probe:181

If this second db2diag.log message is not seen, then this issue is being hit.

The cause is that the IBM.Test ClusterInitiatedMove resource flag was not properly cleaned up as part of a prior TSA automated HADR takeover. To confirm whether this flag still exists, check the output of the following commands:

    export CT_MANAGEMENT_SCOPE=2
    lsrsrc IBM.Test | grep ClusterInitiatedMove

and

    ls /tmp | grep ClusterInitiatedMove

If the ClusterInitiatedMove flag resource or file exists, it will need to be manually removed. To remove the flag resource, run the following command as root:

    rmrsrc -s "Name like '%ClusterInitiatedMove%'" IBM.Test
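The db2diag.log check described above can be scripted. The following is a minimal sketch, not an IBM-supplied tool: the function name check_takeover_lock and its output strings are hypothetical, and "immediately after" is approximated as "anywhere after the most recent TAKEOVER HADR record", which is an assumption.

```shell
#!/bin/sh
# Hypothetical helper: reports whether the most recent
# "Received TAKEOVER HADR command" record in a db2diag.log is followed
# by an sqlhaLockHADRResource record.
# Prints lock-seen (exit 0), lock-missing (exit 1), or no-takeover (exit 2).
check_takeover_lock() {
    awk '
        # Each new takeover record resets the search for the lock record.
        /Received TAKEOVER HADR command/ { seen = 1; locked = 0; next }
        # Only count lock records that appear after a takeover record.
        seen && /sqlhaLockHADRResource/  { locked = 1 }
        END {
            if (!seen)       { print "no-takeover";  exit 2 }
            else if (locked) { print "lock-seen";    exit 0 }
            else             { print "lock-missing"; exit 1 }
        }
    ' "$1"
}

# Example (the path is a placeholder, not taken from this APAR):
#   check_takeover_lock /path/to/db2diag.log
```

A result of lock-missing corresponds to the condition described above as "this issue is being hit"; it is only a hint and should be confirmed by checking for the ClusterInitiatedMove flag.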
Problem Summary: | |
****************************************************************
* USERS AFFECTED:                                              *
* TSA Automated HADR environment                               *
****************************************************************
* PROBLEM DESCRIPTION:                                         *
* See Error Description                                        *
****************************************************************
* RECOMMENDATION:                                              *
* Upgrade to DB2 Cancun Release 10.5.0.4 (also known as Fix    *
* Pack 4) or higher                                            *
****************************************************************
Local Fix: | |
Run the following command as root, then retry the takeover: rmrsrc -s "Name like '%ClusterInitiatedMove%'" IBM.Test | |
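The check-then-remove sequence can be wrapped in a small script. This is a sketch under stated assumptions: lsrsrc and rmrsrc are the commands quoted in this APAR, while the function names, the FLAG_DIR and DRY_RUN variables, and the choice to set CT_MANAGEMENT_SCOPE=2 for rmrsrc as well are hypothetical. It defaults to a dry run and must be run as root to remove anything.

```shell
#!/bin/sh
# Sketch: detect and clear a leftover ClusterInitiatedMove flag.
# FLAG_DIR and DRY_RUN are hypothetical knobs, not part of DB2 or TSA.
FLAG_DIR="${FLAG_DIR:-/tmp}"

# True if an IBM.Test resource mentioning ClusterInitiatedMove exists.
flag_resource_exists() {
    ( export CT_MANAGEMENT_SCOPE=2; lsrsrc IBM.Test ) 2>/dev/null \
        | grep ClusterInitiatedMove >/dev/null
}

# True if a ClusterInitiatedMove file exists in FLAG_DIR (default /tmp).
flag_file_exists() {
    ls "$FLAG_DIR" 2>/dev/null | grep ClusterInitiatedMove >/dev/null
}

# Remove the flag resource, or only print the command when DRY_RUN=1.
clear_flag_resource() {
    if flag_resource_exists; then
        if [ "${DRY_RUN:-1}" = "1" ]; then
            printf '%s\n' "would run: rmrsrc -s \"Name like '%ClusterInitiatedMove%'\" IBM.Test"
        else
            # Assumption: the same management scope as the lsrsrc check.
            ( export CT_MANAGEMENT_SCOPE=2
              rmrsrc -s "Name like '%ClusterInitiatedMove%'" IBM.Test )
        fi
    else
        printf '%s\n' "no ClusterInitiatedMove resource found"
    fi
}

# Example, as root:
#   DRY_RUN=1 clear_flag_resource   # show what would be removed
#   DRY_RUN=0 clear_flag_resource   # remove, then retry the takeover
```

The script only reports a leftover flag file via flag_file_exists and does not delete it, because this APAR gives a removal command for the resource only.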
available fix packs: | |
DB2 Cancun Release 10.5.0.4 (also known as Fix Pack 4) for Linux, UNIX, and Windows | |
Solution | |
Problem was first fixed in DB2 Cancun Release 10.5.0.4 (also known as Fix Pack 4) | |
Workaround | |
not known / see Local fix | |
Timestamps | |
Date - problem reported : | 26.09.2013 |
Date - problem closed : | 11.09.2014 |
Date - last modified : | 11.09.2014 |
Problem solved at the following versions (IBM BugInfos) | |
Problem solved according to the fixlist(s) of the following version(s) | |
10.5.0.4 |