DB2 - Problem description
Problem IT31892 | Status: Closed |
CRASH RECOVERY CAN FAIL AFTER INCORRECTLY REPLAYING EXTENT MOVEMENT LOG RECORDS WITH LARGE RECOVERY WINDOW | |
product: | |
DB2 FOR LUW / DB2FORLUW / B50 - DB2 | |
Problem description: | |
An extent move log record (OBJ_MOVE_ANCHOR and OBJ_MOVE_EXT) is replayed if the page storing an extent's location is older than the extent move log record, i.e. page LSN < log record LSN. Moving an extent requires two actions: moving the extent on disk and updating any cached page locations. This APAR addresses a bug where extent movement log records are skipped (i.e. the extent is not moved on disk), but cached page locations are still updated.
This is usually not a problem unless the recovery window is very long. A long recovery window increases the chance of updating cached page locations for extents that should not be moved but that also belong to a very old extent movement. This can cause pages to be read from the wrong position in the tablespace and can lead to read errors during recovery. One such error can be found below:

2020-01-22-13.12.41.408368-300 I29438705E1403 LEVEL: Error
PID : 162709 TID : 17690898919776 PROC : db2sysc 8
APPHDL : 0-33175 APPID: XXXX
EDUID : 4084 EDUNAME: db2redow (XXXX) 8
FUNCTION: DB2 UDB, buffer pool services, sqlbRedoUpdateEMP, probe:814
MESSAGE : ZRC=0x8402008F=-2080243569=SQLB_READERR
          "Non-critical I/O or verification error encountered during page read from disk."
DATA #1 : Object descriptor, PD_TYPE_SQLB_OBJECT_DESC, 104 bytes
Obj: {pool:16;obj:292;type:71} Parent={16;292}
lifeLSN: 00000000BA992AB2
tid: 0 0 0
extentAnchor: 5890216
initEmpPages: 0
poolPage0: 0
poolflags: 0x 3122
objectState: 0x 80027
lastSMP: 0
pageSize: 32768
extentSize: 4
bufferPoolID: 1
partialHash: 1176764432
objDescAttributes: 0
objDescEHLState: 0x0000100a801295df
bufferPool: 0x00001000b1d746e0
pdef: 0x0000100a72861b20
DATA #2 : Page ID, PD_TYPE_SQLB_PAGE_ID, 4 bytes
6990309
DATA #3 : Page ID, PD_TYPE_SQLB_PAGE_ID, 4 bytes
5
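The failure mode described above can be illustrated with a small toy model. This is only a sketch under stated assumptions, not Db2's actual redo code: the names `ExtentMoveRecord`, `redo_consistent`, and `redo_buggy`, and the use of plain dictionaries for the on-disk extent map and the cached locations, are all hypothetical. It shows the replay condition (page LSN < log record LSN) and what happens when the cached location is updated even though the on-disk move is skipped.

```python
from dataclasses import dataclass


@dataclass
class ExtentMoveRecord:
    """Hypothetical stand-in for an extent move log record."""
    lsn: int           # LSN of the log record
    extent: int        # identifier of the extent being moved
    new_location: int  # location the extent is moved to


def should_replay(page_lsn: int, record_lsn: int) -> bool:
    # Replay only if the page is older than the log record.
    return page_lsn < record_lsn


def redo_consistent(rec, page_lsn, on_disk, cache):
    # Both actions happen together, or neither happens.
    if should_replay(page_lsn, rec.lsn):
        on_disk[rec.extent] = rec.new_location
        cache[rec.extent] = rec.new_location


def redo_buggy(rec, page_lsn, on_disk, cache):
    # Models the defect: the on-disk move is skipped,
    # but the cached location is updated anyway.
    if should_replay(page_lsn, rec.lsn):
        on_disk[rec.extent] = rec.new_location
    cache[rec.extent] = rec.new_location


if __name__ == "__main__":
    # An old record (LSN 100) meets a newer page (LSN 200): replay is skipped.
    rec = ExtentMoveRecord(lsn=100, extent=7, new_location=12)
    on_disk, cache = {7: 40}, {7: 40}
    redo_buggy(rec, page_lsn=200, on_disk=on_disk, cache=cache)
    print("on disk:", on_disk[7], "cached:", cache[7])
```

In the buggy variant the cache ends up pointing at a location that does not match the on-disk state, which corresponds to the "pages read from the wrong position" symptom in the description.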
Problem Summary: | |
****************************************************************
* USERS AFFECTED:
* all
****************************************************************
* PROBLEM DESCRIPTION:
* See Error Description
****************************************************************
* RECOMMENDATION:
* Upgrade to Db2 11.5m3fp0 or later
****************************************************************
Local Fix: | |
Avoid running extent movement (ALTER TABLESPACE ... REDUCE MAX and ALTER TABLESPACE ... LOWER HIGH WATER MARK). If extent movement is urgently required, ensure that the recovery window is small before running it. To check this, run the following command and confirm that OLDEST_TX_LSN is close to CURRENT_LSN and that no indoubt transactions exist on the database:

db2 "select MEMBER, OLDEST_TX_LSN, CURRENT_LSN, NUM_INDOUBT_TRANS from table (mon_get_transaction_log(-2)) as t"

This reduces the chance that an old extent movement log record will be replayed incorrectly if recovery is required.
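The check described in the local fix could be scripted roughly as follows. This is a minimal sketch, not an IBM-provided tool: the function name `recovery_window_is_small` and the `max_lsn_gap` threshold are assumptions (the APAR does not define how close "close" is), and the rows are assumed to have already been fetched from the query above as tuples of integers in the order MEMBER, OLDEST_TX_LSN, CURRENT_LSN, NUM_INDOUBT_TRANS.

```python
def recovery_window_is_small(rows, max_lsn_gap):
    """Return True only if every member looks safe for extent movement.

    rows: iterable of (member, oldest_tx_lsn, current_lsn, num_indoubt_trans)
          tuples, assumed to hold integer values for every column.
    max_lsn_gap: hypothetical threshold chosen by the operator; the APAR
          does not specify a value.
    """
    for member, oldest_tx_lsn, current_lsn, num_indoubt_trans in rows:
        # Any indoubt transaction fails the check.
        if num_indoubt_trans != 0:
            return False
        # A large gap between the two LSNs means a long recovery window.
        if current_lsn - oldest_tx_lsn > max_lsn_gap:
            return False
    return True


if __name__ == "__main__":
    # Made-up sample rows for two members, for illustration only.
    sample = [(0, 1_000_000, 1_000_500, 0), (1, 2_000_000, 2_000_100, 0)]
    print(recovery_window_is_small(sample, max_lsn_gap=1_000))
```

The function is deliberately conservative: a single member with an indoubt transaction or a large LSN gap makes the whole check fail, since recovery on any member could replay an old extent movement log record.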
Solution | |
Workaround | |
not known / see Local fix | |
BUG-Tracking | |
forerunner :
follow-up : IT31893
Timestamps | |
Date - problem reported : 18.02.2020
Date - problem closed : 24.04.2020
Date - last modified : 24.04.2020
Problem solved at the following versions (IBM BugInfos) | |
Problem solved according to the fixlist(s) of the following version(s) |