DB2 - Problem description
Problem IC68899 | Status: Closed |
HADR STANDBY MAY PANIC WITH SQLP_BADLOG ERROR 'TAILPAGE 0 DOES NOT MATCH PAGELSN XXXXXXXXXXXX AND FIRSTLSN XXXXXXXXXXXX' | |
product: | |
DB2 FOR LUW / DB2FORLUW / 970 - DB2 | |
Problem description: | |
An HADR Standby may panic when all of the following occur:

1) The HADR pair is in Peer state and on the last log page of a log file.
   This can be observed using 'db2pd -hadr -db db1' under PrimaryPg or StandbyPg. The value will be 1 less than LOGFILSIZ.
2) The HADR pair is disconnected.
   This could be for any reason, including a network glitch, an HADR timeout, or deactivation and reactivation of the Standby.
3) The HADR Primary does not generate any log records while the HADR pair reconnects.

In this situation, the Standby may panic with an entry in its db2diag.log similar to the following:

2010-01-20-05.51.00.822571+540 I125712485A542 LEVEL: Severe
PID : 156072 TID : 12500 PROC : db2sysc 0
INSTANCE: db2inst1 NODE : 000
EDUID : 12500 EDUNAME: db2hadrs (DB1) 0
FUNCTION: DB2 UDB, data protection services, sqlpgWriteToDisk, probe:909
MESSAGE : ZRC=0x8610000D=-2045771763=SQLP_BADLOG "Log File cannot be used"
DIA8414C Logging can not continue due to an error.
DATA #1 : <preformatted>
TailPage 0 does not match pagelsn 0356CCBA76B4 and firstlsn 0356CCBA8000

The signature of this problem is:
- TailPage 0
- firstlsn is the first LSN of the next log file
- pagelsn is less than firstlsn (and is the pagelsn of the last page of the prior log file)

Additionally, the HADR Primary's db2diag.log will contain an entry similar to the following:

2010-01-20-05.50.57.554083+540 I101672740A382 LEVEL: Warning
PID : 1343502 TID : 14394 PROC : db2sysc 0
INSTANCE: db2inst1 NODE : 000
EDUID : 14394 EDUNAME: db2hadrp (DB1) 0
FUNCTION: DB2 UDB, High Availability Disaster Recovery, hdrTransitionPtoNPeer, probe:10645
MESSAGE : near peer catchup starts at 00000356CCBA800C

Here the reported LSN is slightly past the firstlsn value reported in the Standby's db2diag.log (00000356CCBA800C vs 00000356CCBA8000).
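The signature described above can be checked mechanically against db2diag.log text. The following is a minimal Python sketch, not an IBM tool: the function name `matches_signature`, the regular expressions, and the `max_gap` threshold used to interpret "slightly past" are all assumptions made for illustration. It cannot confirm on its own that firstlsn is the first LSN of the next log file; it only checks TailPage 0, pagelsn < firstlsn, and (optionally) the Primary's catchup LSN.

```python
import re

# Assumed patterns, modelled on the two db2diag.log messages quoted above.
TAILPAGE_RE = re.compile(
    r"TailPage\s+(\d+)\s+does\s+not\s+match\s+pagelsn\s+([0-9A-Fa-f]+)"
    r"\s+and\s+firstlsn\s+([0-9A-Fa-f]+)"
)
CATCHUP_RE = re.compile(r"near peer catchup starts at\s+([0-9A-Fa-f]+)")


def matches_signature(standby_text, primary_text=None, max_gap=0x1000):
    """Return True if the log text is consistent with the signature above.

    max_gap is a hypothetical bound for "slightly past"; adjust as needed.
    """
    m = TAILPAGE_RE.search(standby_text)
    if m is None:
        return False
    tail_page = int(m.group(1))
    page_lsn = int(m.group(2), 16)
    first_lsn = int(m.group(3), 16)

    # Signature: TailPage 0 and pagelsn strictly below firstlsn.
    if tail_page != 0 or page_lsn >= first_lsn:
        return False
    if primary_text is None:
        return True

    # Optional cross-check: Primary's catchup LSN is slightly past firstlsn.
    c = CATCHUP_RE.search(primary_text)
    if c is None:
        return False
    catchup_lsn = int(c.group(1), 16)
    return first_lsn < catchup_lsn <= first_lsn + max_gap


if __name__ == "__main__":
    standby = ("TailPage 0 does not match pagelsn 0356CCBA76B4 "
               "and firstlsn 0356CCBA8000")
    primary = "MESSAGE : near peer catchup starts at 00000356CCBA800C"
    print(matches_signature(standby, primary))
```

A match only suggests that this APAR is a candidate explanation; the fix pack level of the instance should still be confirmed.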
Problem Summary: | |
Users affected: DB2 V9.7 FP2
Problem description: Same as the problem description above.
Recommendation: Upgrade to DB2 V9.7 FP3
Local Fix: | |
Restart the HADR Standby after the Primary has moved to the next log file | |
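Before restarting the Standby, it may help to confirm that the Primary has in fact advanced to a later log file. The following Python sketch assumes the log file names are read manually (for example from the Primary's db2pd HADR output) and follow the S followed by seven digits and .LOG pattern; the helper names `log_file_number` and `primary_moved_on` are hypothetical and not part of DB2.

```python
import re

# Assumed log file naming: "S" + 7 digits + ".LOG" (e.g. S0000123.LOG).
LOG_NAME_RE = re.compile(r"^S(\d{7})\.LOG$", re.IGNORECASE)


def log_file_number(name):
    """Extract the numeric part of a log file name such as S0000123.LOG."""
    m = LOG_NAME_RE.match(name.strip())
    if m is None:
        raise ValueError("unexpected log file name: %r" % name)
    return int(m.group(1))


def primary_moved_on(file_at_disconnect, current_primary_file):
    """True if the Primary is now on a later log file than at disconnect."""
    return log_file_number(current_primary_file) > log_file_number(file_at_disconnect)


if __name__ == "__main__":
    # Hypothetical file names, for illustration only.
    print(primary_moved_on("S0000123.LOG", "S0000124.LOG"))
```

This comparison ignores wrap-around of the log file number, which is an accepted simplification for a short-lived check like this one.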
available fix packs: | |
DB2 Version 9.7 Fix Pack 3 for Linux, UNIX, and Windows | |
Solution | |
First fixed in DB2 V9.7 FP3 | |
Workaround | |
not known / see Local fix | |
Timestamps | |
Date - problem reported : 27.05.2010
Date - problem closed : 24.09.2010
Date - last modified : 24.09.2010
Problem solved at the following versions (IBM BugInfos) | |
9.7.FP3 | |
Problem solved according to the fixlist(s) of the following version(s) | |
9.7.0.3 | |