DB2 - Problem description
Problem IC79599 | Status: Closed |
HADR STANDBY DATABASE CAN BE BROUGHT DOWN DUE TO A BADPAGE ERROR AFTER SUCCESSFUL REINTEGRATION | |
product: | |
DB2 FOR LUW / DB2FORLUW / 970 - DB2 | |
Problem description: | |
If the following steps happen in order, the HADR standby database can be brought down:

1. HADR is in peer state.
2. A forced takeover is executed on the standby while active transactions exist on table <tab1> on the primary, and the following message is logged in the db2diag.log:

2011-08-08-01.09.27.277225-420 I65530E321 LEVEL: Warning
PID : 20672 TID : 47731157748560PROC : db2hadrs (HADRDB)
INSTANCE: db2inst1 NODE : 000
FUNCTION: DB2 UDB, High Availability Disaster Recovery, hdrEduAcceptEvent, probe:20214
MESSAGE : Time out waiting for primary end of log

3. The new primary inserts or updates rows on table <tab1>.
4. The old primary reintegrates successfully as the new standby.
5. The new standby shuts down with a BADPAGE error when it tries to replay the data changes on table <tab1>. Diagnostic messages similar to these appear in the db2diag.log:

2011-08-08-01.11.23.845061-420 I80277E563 LEVEL: Severe
PID : 19982 TID : 47731157748560PROC : db2redow (HADRDB)
INSTANCE: db2inst1 NODE : 000 DB : HADRDB
APPHDL : 0-68 APPID: *LOCAL.DB2.110808081043
FUNCTION: DB2 UDB, data management, sqldRedoUpsert, probe:1884
MESSAGE : Free Space does not match during redo!
DATA #1 : Hexdump, 26 bytes
0x0000000228115658 : 01A2 0200 0400 0000 0D00 2E0B 0500 8000 ................
0x0000000228115668 : 0000 3A0B 0000 0D00 0100 ..:.......

2011-08-08-01.11.23.845277-420 I80841E448 LEVEL: Severe
PID : 19982 TID : 47731157748560PROC : db2redow (HADRDB)
INSTANCE: db2inst1 NODE : 000 DB : HADRDB
APPHDL : 0-68 APPID: *LOCAL.DB2.110808081043
FUNCTION: DB2 UDB, data management, sqldRedoUpsert, probe:1887
MESSAGE : Space used:
DATA #1 : Hexdump, 4 bytes
0x00007FFF6D0D0088 : 1100 0000 ....

2011-08-08-01.11.23.845351-420 I81290E464 LEVEL: Severe
PID : 19982 TID : 47731157748560PROC : db2redow (HADRDB)
INSTANCE: db2inst1 NODE : 000 DB : HADRDB
APPHDL : 0-68 APPID: *LOCAL.DB2.110808081043
FUNCTION: DB2 UDB, data management, sqldRedoUpsert, probe:1892
RETCODE : ZRC=0x87040001=-2029780991=SQLD_BADPAGE "Bad Data Page"
DIA8500C A data file error has occurred, record id is "".

2011-08-08-01.11.23.899138-420 E98543E372 LEVEL: Severe
PID : 19982 TID : 47731157748560PROC : db2redow (HADRDB)
INSTANCE: db2inst1 NODE : 000 DB : HADRDB
APPHDL : 0-68 APPID: *LOCAL.DB2.110808081043
FUNCTION: DB2 UDB, base sys utilities, sqleMarkDBad, probe:10
MESSAGE : ADM7518C "HADRDB " marked bad.

2011-08-08-01.11.23.899237-420 I98916E385 LEVEL: Severe
PID : 19982 TID : 47731157748560PROC : db2redow (HADRDB)
INSTANCE: db2inst1 NODE : 000 DB : HADRDB
APPHDL : 0-68 APPID: *LOCAL.DB2.110808081043
FUNCTION: DB2 UDB, base sys utilities, sqleMarkDBad, probe:210
MESSAGE : Database logging stopped due to mark db bad.

2011-08-08-01.11.23.911552-420 I100194E458 LEVEL: Severe
PID : 19982 TID : 47731157748560PROC : db2redow (HADRDB)
INSTANCE: db2inst1 NODE : 000 DB : HADRDB
APPHDL : 0-68 APPID: *LOCAL.DB2.110808081043
FUNCTION: DB2 UDB, data management, sqldRedo, probe:6291
RETCODE : ZRC=0x87040001=-2029780991=SQLD_BADPAGE "Bad Data Page"
DIA8500C A data file error has occurred, record id is "".

2011-08-08-01.11.23.911683-420 I100653E457 LEVEL: Severe
PID : 19982 TID : 47731157748560PROC : db2redow (HADRDB)
INSTANCE: db2inst1 NODE : 000 DB : HADRDB
APPHDL : 0-68 APPID: *LOCAL.DB2.110808081043
FUNCTION: DB2 UDB, data management, sqldmrdo, probe:783
RETCODE : ZRC=0x87040001=-2029780991=SQLD_BADPAGE "Bad Data Page"
DIA8500C A data file error has occurred, record id is "". | |
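To check a db2diag.log for the signature described above, a small script can look for the takeover timeout message and for SQLD_BADPAGE return codes raised by the redo worker. The following is a minimal sketch, not an IBM tool: the function names `zrc_to_signed` and `scan_diag_text` are hypothetical, and the matching relies only on the literal strings shown in the excerpts of this record. It also shows why the hexadecimal ZRC 0x87040001 and the decimal -2029780991 in the log are the same value read as a signed 32-bit integer.

```python
import re

# Literal strings copied from the db2diag.log excerpts in this record.
TIMEOUT_MSG = "Time out waiting for primary end of log"
BADPAGE_RE = re.compile(r"ZRC=(0x[0-9A-Fa-f]{8})=(-?\d+)=SQLD_BADPAGE")


def zrc_to_signed(hex_code: str) -> int:
    """Interpret a 32-bit ZRC hex string as a signed (two's complement) integer."""
    value = int(hex_code, 16)
    return value - (1 << 32) if value & 0x80000000 else value


def scan_diag_text(text: str) -> dict:
    """Report which parts of the failure signature appear in the given log text.

    Hypothetical helper: it only searches for strings, it does not parse
    the full db2diag.log record layout.
    """
    codes = sorted({match.group(1) for match in BADPAGE_RE.finditer(text)})
    return {
        "takeover_timeout": TIMEOUT_MSG in text,
        # db2redow is the process name shown in the BADPAGE entries above.
        "redo_badpage": bool(codes) and "db2redow" in text,
        "zrc_codes": codes,
    }
```

For example, `scan_diag_text(open("db2diag.log").read())` would return a dictionary whose `redo_badpage` entry is true when the redo failure is present; the file name here is only an illustration.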
Problem Summary: | |
****************************************************************
* USERS AFFECTED:                                              *
* HADR users on all platforms                                  *
****************************************************************
* PROBLEM DESCRIPTION:                                         *
* See Error Description                                        *
****************************************************************
* RECOMMENDATION:                                              *
* Upgrade to db2 v97fp6                                        *
**************************************************************** | |
Local Fix: | |
available fix packs: | |
DB2 Version 9.7 Fix Pack 6 for Linux, UNIX, and Windows | |
Solution | |
With DB2 Version 9.7 Fix Pack 6, in the scenario described in the Error Description, reintegration fails if the new standby is ahead of the new primary, so the bad page error does not occur. | |
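The rule stated in the Solution can be pictured as a guard that runs before the old primary is accepted as a standby. The sketch below is a conceptual model only: `LogPosition` and `may_reintegrate` are hypothetical names, and "ahead" is reduced to a comparison of two integers. This record does not describe how DB2 determines the positions internally.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LogPosition:
    """Hypothetical stand-in for how far a database has progressed in the log."""
    offset: int


def may_reintegrate(new_standby: LogPosition, new_primary: LogPosition) -> bool:
    """Model of the fixed behavior: refuse reintegration when the new standby
    (the old primary) is ahead of the new primary."""
    return new_standby.offset <= new_primary.offset
```

In this model, an old primary that wrote log records the standby never received before the forced takeover is rejected at reintegration time, rather than failing later during replay.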
Workaround | |
not known / see Local fix | |
Timestamps | |
Date - problem reported : | 03.11.2011 |
Date - problem closed : | 05.06.2012 |
Date - last modified : | 05.06.2012 |
Problem solved at the following versions (IBM BugInfos) | |
9.7.FP6 | |
Problem solved according to the fixlist(s) of the following version(s) | |
9.7.0.6 |
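A quick way to tell whether a given 9.7 level already contains this fix is to compare it with 9.7.0.6. The sketch below assumes the level is available as a dotted string in the same form as the fix list entry above; `parse_level` and `has_ic79599_fix` are hypothetical helper names. Releases other than 9.7 are reported as unknown because this record does not cover them.

```python
from typing import Optional, Tuple


def parse_level(level: str) -> Tuple[int, ...]:
    """Split a dotted level such as '9.7.0.6' into a tuple of integers."""
    return tuple(int(part) for part in level.strip().split("."))


# Level at which this record lists the problem as solved.
FIXED_LEVEL = parse_level("9.7.0.6")


def has_ic79599_fix(level: str) -> Optional[bool]:
    """Return True or False for 9.7 levels, None for any other release."""
    parsed = parse_level(level)
    if parsed[:2] != FIXED_LEVEL[:2]:
        return None  # not a 9.7 level; this record says nothing about it
    return parsed >= FIXED_LEVEL
```

Comparing integer tuples rather than strings keeps a level such as 9.7.0.10 ordered after 9.7.0.6.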