
DB2 - Problem description

Problem IC75196 Status: Closed

A CONTROLLED TAKEOVER IN A TSA / HADR ENV MAY BE FOLLOWED BY AN IMMEDIATE
FAILBACK IF THERE IS LATENCY IN CHANGES TO RESOURCES

product:
DB2 FOR LUW / DB2FORLUW / 970 - DB2
Problem description:
In a TSA / HADR environment, the state of the database is 
monitored by scripts located under 
/usr/sbin/rsct/sapolicies/db2/.  During a user-initiated HADR 
takeover, DB2 issues requests to TSA to create, lock, and unlock 
resources.  If those operations are not propagated quickly 
enough, the monitor scripts may report the database as down on 
both servers at a moment when no locks or flags are in place. 
If that happens, TSA issues a second takeover, by force.  A 
manual reintegration may then be required. 
 
Note: this issue only affects controlled takeover and it is 
expected to happen only in rare cases. 
 
 
 
 
Node 1 syslogs: 
 
## The database is reporting as online (return code 1): 
Jan 25 12:23:42 server-a user:debug 
/usr/sbin/rsct/sapolicies/db2/hadrV95_monitor.ksh[3735742]: 
Returning 1 : db2inst1 db2inst1 SAMPLE 
Jan 25 12:24:04 server-a user:debug 
/usr/sbin/rsct/sapolicies/db2/hadrV95_monitor.ksh[3735792]: 
Returning 1 : db2inst1 db2inst1 SAMPLE 
 
## The takeover was issued several seconds earlier but has not 
finished; the monitor script runs and reports offline (return 
code 2): 
Jan 25 12:24:26 server-a user:debug 
/usr/sbin/rsct/sapolicies/db2/hadrV95_monitor.ksh[983346]: 
Returning 2 : db2inst1 db2inst1 SAMPLE 
Jan 25 12:24:48 server-a user:debug 
/usr/sbin/rsct/sapolicies/db2/hadrV95_monitor.ksh[2752700]: 
Returning 2 : db2inst1 db2inst1 SAMPLE 
Jan 25 12:25:10 server-a user:debug 
/usr/sbin/rsct/sapolicies/db2/hadrV95_monitor.ksh[3866888]: 
Returning 2 : db2inst1 db2inst1 SAMPLE 
Jan 25 12:25:32 server-a user:debug 
/usr/sbin/rsct/sapolicies/db2/hadrV95_monitor.ksh[2228742]: 
Returning 2 : db2inst1 db2inst1 SAMPLE 
 
 
 
Node 2 syslogs: 
 
## This was the standby, so reporting offline is expected 
Jan 25 12:23:45 server-b user:debug 
/usr/sbin/rsct/sapolicies/db2/hadrV95_monitor.ksh[2294478]: 
Returning 2 : db2inst1 db2inst1 SAMPLE 
Jan 25 12:24:07 server-b user:debug 
/usr/sbin/rsct/sapolicies/db2/hadrV95_monitor.ksh[1508116]: 
Returning 2 : db2inst1 db2inst1 SAMPLE 
 
## The takeover starts and resource changes are made 
Jan 25 12:24:07 server-b user:debug root[2687334]: Entering 
/usr/sbin/rsct/sapolicies/db2/lockreqprocessed 
db2_db2inst1_db2inst1_SAMPLE-rg lock 
Jan 25 12:24:13 server-b user:debug root[2294526]: Exiting 
/usr/sbin/rsct/sapolicies/db2/lockreqprocessed 
db2_db2inst1_db2inst1_SAMPLE-rg lock: 1 
Jan 25 12:24:23 server-b user:debug root[1769768]: Entering 
/usr/sbin/rsct/sapolicies/db2/lockreqprocessed 
db2_db2inst1_db2inst1_SAMPLE-rg unlock 
Jan 25 12:24:23 server-b user:debug root[3211824]: Exiting 
/usr/sbin/rsct/sapolicies/db2/lockreqprocessed 
db2_db2inst1_db2inst1_SAMPLE-rg unlock: 0 
Jan 25 12:24:26 server-b user:debug root[2687360]: Entering 
/usr/sbin/rsct/sapolicies/db2/lockreqprocessed 
db2_db2inst1_db2inst1_SAMPLE-rg lock 
 
## The monitor on node 1 ran at this point and returned offline. 
 The last status for node 2 is also offline and we are still 
modifying the resources: 
 
Jan 25 12:24:29 server-b user:debug 
/usr/sbin/rsct/sapolicies/db2/hadrV95_monitor.ksh[2294596]: 
Returning 2 : db2inst1 db2inst1 SAMPLE 
Jan 25 12:24:32 server-b user:debug root[3473642]: Exiting 
/usr/sbin/rsct/sapolicies/db2/lockreqprocessed 
db2_db2inst1_db2inst1_SAMPLE-rg lock: 1 
Jan 25 12:24:32 server-b user:debug root[3146076]: Entering 
/usr/sbin/rsct/sapolicies/db2/lockreqprocessed 
db2_db2inst1_db2inst1_SAMPLE-rg unlock 
Jan 25 12:24:32 server-b user:notice 
/usr/sbin/rsct/sapolicies/db2/hadrV95_start.ksh[3211836]: 
Entering : db2inst1 db2inst1 SAMPLE 
Jan 25 12:24:32 server-b user:debug 
/usr/sbin/rsct/sapolicies/db2/hadrV95_start.ksh[2949514]: su - 
db2inst1 -c db2gcf -t 3600 -u -i db2inst1 -i db2inst1 -h SAMPLE 
-L 
Jan 25 12:24:32 server-b user:debug root[2163396]: Exiting 
/usr/sbin/rsct/sapolicies/db2/lockreqprocessed 
db2_db2inst1_db2inst1_SAMPLE-rg unlock: 0 
Jan 25 12:24:33 server-b user:notice 
/usr/sbin/rsct/sapolicies/db2/hadrV95_start.ksh[2163400]: 
Returning 0 : db2inst1 db2inst1 SAMPLE 
Jan 25 12:24:33 server-b user:debug root[3211870]: Entering 
/usr/sbin/rsct/sapolicies/db2/lockreqprocessed 
db2_db2inst1_db2inst1_SAMPLE-rg lock 
Jan 25 12:24:34 server-b user:debug 
/usr/sbin/rsct/sapolicies/db2/hadrV95_monitor.ksh[2687424]: 
Returning 1 : db2inst1 db2inst1 SAMPLE 
Jan 25 12:24:39 server-b user:debug root[2687428]: Exiting 
/usr/sbin/rsct/sapolicies/db2/lockreqprocessed 
db2_db2inst1_db2inst1_SAMPLE-rg lock: 1 
Jan 25 12:24:40 server-b user:debug root[3211884]: Entering 
/usr/sbin/rsct/sapolicies/db2/lockreqprocessed 
db2_db2inst1_db2inst1_SAMPLE-rg unlock 
Jan 25 12:24:40 server-b user:debug root[2753468]: Exiting 
/usr/sbin/rsct/sapolicies/db2/lockreqprocessed 
db2_db2inst1_db2inst1_SAMPLE-rg unlock: 0 
Jan 25 12:24:56 server-b user:debug 
/usr/sbin/rsct/sapolicies/db2/hadrV95_monitor.ksh[6553680]: 
Returning 1 : db2inst1 db2inst1 SAMPLE
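
The sequence in the two syslogs above can be checked mechanically. The following is a minimal Python sketch, not part of the TSA policy scripts, that replays monitor records from both nodes in time order and flags the moments at which the last known status of every node is offline (return code 2). It assumes each wrapped syslog record has first been joined into a single line; the function names, the sample list, and the year used to complete the timestamps are illustrative assumptions.

```python
import re
from datetime import datetime

# Matches a hadrV95_monitor.ksh record after the wrapped syslog lines are joined.
MONITOR_RE = re.compile(
    r"^(\w{3}\s+\d+ \d\d:\d\d:\d\d) (\S+) "
    r".*hadrV95_monitor\.ksh\[\d+\]:\s+Returning (\d+)"
)

def parse_monitor(lines, year=2011):
    """Yield (timestamp, host, return_code) for each monitor record.

    Syslog timestamps carry no year; the default here is an assumption.
    """
    for line in lines:
        m = MONITOR_RE.match(line)
        if m:
            ts = datetime.strptime(f"{year} {m.group(1)}", "%Y %b %d %H:%M:%S")
            yield ts, m.group(2), int(m.group(3))

def both_offline(events):
    """Return (timestamp, host) pairs at which every host's last status is 2."""
    last = {}   # host -> most recent return code
    hits = []
    for ts, host, rc in sorted(events):
        last[host] = rc
        if len(last) >= 2 and all(code == 2 for code in last.values()):
            hits.append((ts, host))
    return hits

# Monitor records taken from the two syslogs, joined to one line each.
MON = "/usr/sbin/rsct/sapolicies/db2/hadrV95_monitor.ksh"
TAIL = "db2inst1 db2inst1 SAMPLE"
SAMPLE_MONITOR_LOG = [
    f"Jan 25 12:24:04 server-a user:debug {MON}[3735792]: Returning 1 : {TAIL}",
    f"Jan 25 12:24:07 server-b user:debug {MON}[1508116]: Returning 2 : {TAIL}",
    f"Jan 25 12:24:26 server-a user:debug {MON}[983346]: Returning 2 : {TAIL}",
    f"Jan 25 12:24:29 server-b user:debug {MON}[2294596]: Returning 2 : {TAIL}",
    f"Jan 25 12:24:34 server-b user:debug {MON}[2687424]: Returning 1 : {TAIL}",
    f"Jan 25 12:24:48 server-a user:debug {MON}[2752700]: Returning 2 : {TAIL}",
]

if __name__ == "__main__":
    for ts, host in both_offline(parse_monitor(SAMPLE_MONITOR_LOG)):
        print(ts.strftime("%H:%M:%S"), host)
```

For the records shown, the sketch flags 12:24:26 (server-a) and 12:24:29 (server-b), which is the window in which both nodes report offline while the resources are still being modified.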
Problem Summary:
**************************************************************** 
* USERS AFFECTED:                                              * 
* Users of TSAMP HA solutions.                                 * 
**************************************************************** 
* PROBLEM DESCRIPTION:                                         * 
* See Problem Description above.                               * 
**************************************************************** 
* RECOMMENDATION:                                              * 
* Upgrade to DB2 Version 9.7 Fix Pack 5.                       * 
****************************************************************
Local Fix:
Verify the current OpState of the TSA resources and the actual 
DB2 HADR roles. If manual reintegration is required, issue "db2 
start hadr on db <dbname> as standby". 
 
If the issue is readily reproducible, there may be an underlying 
latency problem.  Resolve the latency problem to reduce the 
chance that the monitor scripts run in the middle of resource 
management.
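
One rough way to look for such latency is to measure how long each lockreqprocessed call takes according to syslog. The Python sketch below is an illustration only, not an IBM-provided tool. It pairs each "Entering" record with the next "Exiting" record for the same resource group and action and reports the elapsed seconds. It assumes wrapped syslog records have been joined into single lines and that all records fall on the same day; the function name and sample list are hypothetical, and the value after the colon on "Exiting" records is ignored.

```python
import re
from datetime import datetime

# Matches lockreqprocessed Entering/Exiting records after line joining.
LOCKREQ_RE = re.compile(
    r"^\w{3}\s+\d+ (\d\d:\d\d:\d\d) \S+ .*?(Entering|Exiting)\s+"
    r"\S*lockreqprocessed\s+(\S+) (lock|unlock)"
)

def lockreq_durations(lines):
    """Return (action, start_time, seconds) for each Entering/Exiting pair."""
    pending = {}     # (resource_group, action) -> start time of the open call
    durations = []
    for line in lines:
        m = LOCKREQ_RE.match(line)
        if not m:
            continue
        ts = datetime.strptime(m.group(1), "%H:%M:%S")
        key = (m.group(3), m.group(4))
        if m.group(2) == "Entering":
            pending[key] = ts
        elif key in pending:
            start = pending.pop(key)
            elapsed = int((ts - start).total_seconds())
            durations.append((m.group(4), start.strftime("%H:%M:%S"), elapsed))
    return durations

# Records taken from the node 2 syslog, joined to one line each.
CMD = "/usr/sbin/rsct/sapolicies/db2/lockreqprocessed"
RG = "db2_db2inst1_db2inst1_SAMPLE-rg"
SAMPLE_LOCK_LOG = [
    f"Jan 25 12:24:07 server-b user:debug root[2687334]: Entering {CMD} {RG} lock",
    f"Jan 25 12:24:13 server-b user:debug root[2294526]: Exiting {CMD} {RG} lock: 1",
    f"Jan 25 12:24:23 server-b user:debug root[1769768]: Entering {CMD} {RG} unlock",
    f"Jan 25 12:24:23 server-b user:debug root[3211824]: Exiting {CMD} {RG} unlock: 0",
    f"Jan 25 12:24:26 server-b user:debug root[2687360]: Entering {CMD} {RG} lock",
    f"Jan 25 12:24:32 server-b user:debug root[3473642]: Exiting {CMD} {RG} lock: 1",
]

if __name__ == "__main__":
    for action, start, seconds in lockreq_durations(SAMPLE_LOCK_LOG):
        print(f"{start} {action:<6} {seconds}s")
```

In the sample records, each lock request takes 6 seconds while the unlock returns within the same second. The monitor script on node 1 ran at 12:24:26, just as the second lock request was entered.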
available fix packs:
DB2 Version 9.7 Fix Pack 5 for Linux, UNIX, and Windows
DB2 Version 9.7 Fix Pack 6 for Linux, UNIX, and Windows
DB2 Version 9.7 Fix Pack 7 for Linux, UNIX, and Windows
DB2 Version 9.7 Fix Pack 8 for Linux, UNIX, and Windows
DB2 Version 9.7 Fix Pack 9 for Linux, UNIX, and Windows
DB2 Version 9.7 Fix Pack 9a for Linux, UNIX, and Windows
DB2 Version 9.7 Fix Pack 10 for Linux, UNIX, and Windows

Solution
First Fixed in Version 9.7 Fix Pack 5.
Workaround
not known / see Local fix
Timestamps
Date  - problem reported    : 23.03.2011
Date  - problem closed      : 23.12.2011
Date  - last modified       : 23.12.2011
Problem solved at the following versions (IBM BugInfos)
9.7.FP5
Problem solved according to the fixlist(s) of the following version(s)
9.7.0.5 FixList