DB2 - Problem description
Problem IC77975 | Status: Closed |
HADR STANDBY DATABASE CAN BE BROUGHT DOWN DUE TO A BADPAGE ERROR AFTER SUCCESSFUL REINTEGRATION | |
product: | |
DB2 FOR LUW / DB2FORLUW / 910 - DB2 | |
Problem description: | |
If the following steps happen in order, the HADR standby database can be brought down:

1. HADR is in peer state.

2. A forced takeover is executed on the standby while active transactions exist on table <tab1> on the primary, and the message below is logged in the db2diag.log:

2011-08-08-01.09.27.277225-420 I65530E321 LEVEL: Warning
PID : 20672 TID : 47731157748560PROC : db2hadrs (HADRDB)
INSTANCE: db2inst1 NODE : 000
FUNCTION: DB2 UDB, High Availability Disaster Recovery, hdrEduAcceptEvent, probe:20214
MESSAGE : Time out waiting for primary end of log

3. The new primary inserts or updates rows in table <tab1>.

4. The old primary reintegrates as the new standby successfully.

5. The new standby shuts down with a BADPAGE error when it tries to replay the data changes on table <tab1>. Diag messages similar to these will appear in the db2diag.log:

2011-08-08-01.11.23.845061-420 I80277E563 LEVEL: Severe
PID : 19982 TID : 47731157748560PROC : db2redow (HADRDB)
INSTANCE: db2inst1 NODE : 000 DB : HADRDB
APPHDL : 0-68 APPID: *LOCAL.DB2.110808081043
FUNCTION: DB2 UDB, data management, sqldRedoUpsert, probe:1884
MESSAGE : Free Space does not match during redo!
DATA #1 : Hexdump, 26 bytes
0x0000000228115658 : 01A2 0200 0400 0000 0D00 2E0B 0500 8000 ................
0x0000000228115668 : 0000 3A0B 0000 0D00 0100 ..:.......

2011-08-08-01.11.23.845277-420 I80841E448 LEVEL: Severe
PID : 19982 TID : 47731157748560PROC : db2redow (HADRDB)
INSTANCE: db2inst1 NODE : 000 DB : HADRDB
APPHDL : 0-68 APPID: *LOCAL.DB2.110808081043
FUNCTION: DB2 UDB, data management, sqldRedoUpsert, probe:1887
MESSAGE : Space used:
DATA #1 : Hexdump, 4 bytes
0x00007FFF6D0D0088 : 1100 0000 ....

2011-08-08-01.11.23.845351-420 I81290E464 LEVEL: Severe
PID : 19982 TID : 47731157748560PROC : db2redow (HADRDB)
INSTANCE: db2inst1 NODE : 000 DB : HADRDB
APPHDL : 0-68 APPID: *LOCAL.DB2.110808081043
FUNCTION: DB2 UDB, data management, sqldRedoUpsert, probe:1892
RETCODE : ZRC=0x87040001=-2029780991=SQLD_BADPAGE "Bad Data Page"
DIA8500C A data file error has occurred, record id is "".

2011-08-08-01.11.23.899138-420 E98543E372 LEVEL: Severe
PID : 19982 TID : 47731157748560PROC : db2redow (HADRDB)
INSTANCE: db2inst1 NODE : 000 DB : HADRDB
APPHDL : 0-68 APPID: *LOCAL.DB2.110808081043
FUNCTION: DB2 UDB, base sys utilities, sqleMarkDBad, probe:10
MESSAGE : ADM7518C "HADRDB " marked bad.

2011-08-08-01.11.23.899237-420 I98916E385 LEVEL: Severe
PID : 19982 TID : 47731157748560PROC : db2redow (HADRDB)
INSTANCE: db2inst1 NODE : 000 DB : HADRDB
APPHDL : 0-68 APPID: *LOCAL.DB2.110808081043
FUNCTION: DB2 UDB, base sys utilities, sqleMarkDBad, probe:210
MESSAGE : Database logging stopped due to mark db bad.

2011-08-08-01.11.23.911552-420 I100194E458 LEVEL: Severe
PID : 19982 TID : 47731157748560PROC : db2redow (HADRDB)
INSTANCE: db2inst1 NODE : 000 DB : HADRDB
APPHDL : 0-68 APPID: *LOCAL.DB2.110808081043
FUNCTION: DB2 UDB, data management, sqldRedo, probe:6291
RETCODE : ZRC=0x87040001=-2029780991=SQLD_BADPAGE "Bad Data Page"
DIA8500C A data file error has occurred, record id is "".

2011-08-08-01.11.23.911683-420 I100653E457 LEVEL: Severe
PID : 19982 TID : 47731157748560PROC : db2redow (HADRDB)
INSTANCE: db2inst1 NODE : 000 DB : HADRDB
APPHDL : 0-68 APPID: *LOCAL.DB2.110808081043
FUNCTION: DB2 UDB, data management, sqldmrdo, probe:783
RETCODE : ZRC=0x87040001=-2029780991=SQLD_BADPAGE "Bad Data Page"
DIA8500C A data file error has occurred, record id is "". | |
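The message signatures quoted in this description can be searched for mechanically. The following is a minimal sketch, not an IBM tool: it assumes the db2diag.log text has already been read into a string and that each record begins on its own line with a timestamp of the form shown above. The function names `split_records` and `scan_log` are hypothetical; the search strings are copied from the excerpts in this APAR.

```python
import re

# Signature strings copied from the db2diag.log excerpts in this APAR.
TAKEOVER_TIMEOUT = "Time out waiting for primary end of log"
REDO_MISMATCH = "Free Space does not match during redo!"
BADPAGE = "SQLD_BADPAGE"
MARKED_BAD = "marked bad."

# Assumption: every record starts at the beginning of a line with a
# timestamp such as 2011-08-08-01.11.23.845061-420.
RECORD_START = re.compile(
    r"^\d{4}-\d{2}-\d{2}-\d{2}\.\d{2}\.\d{2}\.\d{6}[+-]\d{3}", re.MULTILINE
)


def split_records(text):
    """Split raw db2diag.log text into one string per record."""
    starts = [m.start() for m in RECORD_START.finditer(text)]
    ends = starts[1:] + [len(text)]
    return [text[s:e].strip() for s, e in zip(starts, ends)]


def scan_log(text):
    """Report which of the signatures from this APAR occur in the log text."""
    found = {
        "takeover_timeout": False,  # logged by db2hadrs during forced takeover
        "redo_mismatch": False,     # logged by db2redow during replay
        "badpage": False,           # logged by db2redow during replay
        "marked_bad": False,        # database marked bad afterwards
    }
    for record in split_records(text):
        if "db2hadrs" in record and TAKEOVER_TIMEOUT in record:
            found["takeover_timeout"] = True
        if "db2redow" in record and REDO_MISMATCH in record:
            found["redo_mismatch"] = True
        if "db2redow" in record and BADPAGE in record:
            found["badpage"] = True
        if MARKED_BAD in record:
            found["marked_bad"] = True
    return found
```

The takeover warning and the replay errors come from different steps of the scenario, so they will not necessarily appear in the same log file; the sketch therefore reports each signature separately instead of returning a single verdict.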
Problem Summary: | |
****************************************************************
* USERS AFFECTED:                                              *
* HADR users on all platforms                                  *
****************************************************************
* PROBLEM DESCRIPTION:                                         *
* Without the fix, customer could be exposed to the problem    *
* described in the APAR error description                      *
****************************************************************
* RECOMMENDATION:                                              *
* Upgrade to db2 v91fp11                                       *
**************************************************************** | |
Local Fix: | |
available fix packs: | |
DB2 Version 9.1 Fix Pack 11 for Linux, UNIX and Windows | |
Solution | |
The fix is included in DB2 Version 9.1 Fix Pack 11. With the fix, the standby database or instance will be brought down after a forced takeover, and can later be successfully reintegrated into the HADR pair. | |
Workaround | |
not known / see Local fix | |
BUG-Tracking | |
forerunner : APAR is sysrouted TO one or more of the following: IC78019 IC79599 follow-up : | |
Timestamps | |
Date - problem reported : | 08.08.2011 |
Date - problem closed : | 02.11.2011 |
Date - last modified : | 02.11.2011 |
Problem solved at the following versions (IBM BugInfos) | |
9.1.FP11 | |
Problem solved according to the fixlist(s) of the following version(s) | |
9.1.0.11 |
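As a rough illustration of how the fix level above might be checked, the sketch below compares a dotted level string against 9.1.0.11. It assumes the installed level is already available in the same dotted form as the fix list entry; how that string is obtained is outside this sketch. The names `FIX_LEVEL`, `parse_level` and `contains_fix` are hypothetical. Levels outside the 9.1 stream return `None`, because this record states a fix level only for 9.1.

```python
# "9.1.0.11" is the fix level listed in this APAR.
FIX_LEVEL = (9, 1, 0, 11)


def parse_level(level):
    """Turn a dotted level string such as '9.1.0.11' into a tuple of ints."""
    return tuple(int(part) for part in level.strip().split("."))


def contains_fix(level):
    """Return True/False for 9.1 levels, None for any other release stream."""
    parsed = parse_level(level)
    if parsed[:2] != FIX_LEVEL[:2]:
        # This record gives no fix level for other streams.
        return None
    return parsed >= FIX_LEVEL
```

For example, `contains_fix("9.1.0.10")` evaluates to `False` and `contains_fix("9.1.0.11")` to `True`.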