DB2 - Problem description
Problem IC70082 | Status: Closed |
HWM TOO LOW DUE TO TIMING WINDOW WHEN NEW SMP EXTENT ALLOCATED, CAN LEAD TO DATA CORRUPTION | |
product: | |
DB2 FOR LUW / DB2FORLUW / 950 - DB2 | |
Problem description: | |
There is a very small timing window in which we cache the value of the last initialized SMP extent. We then use this cached value to calculate the high-water mark (HWM). If a new SMP extent is allocated within that timing window, we set the high-water mark to a value that is too low. Various operations will subsequently set it correctly, but if a tablespace reduce occurs first, it can remove extents that lie after the high-water mark yet are still in use by the tablespace. This results in data corruption.

One symptom is that at database startup the tablespace is put into OFFLINE state and the following error appears in the db2diag.log:

2010-07-09-04.39.06.621170+120 I420281399A2620 LEVEL: Severe
PID : 10563 TID : 27 PROC : db2sysc 0
INSTANCE: db2 NODE : 000 DB : SAMPLE
APPHDL : 0-973 APPID: *LOCAL.db2pc1.100709023906
AUTHID : DB2
EDUID : 27 EDUNAME: db2agent (SAMPLE) 0
FUNCTION: DB2 UDB, buffer pool services, sqlb_verify_page, probe:3
MESSAGE : ZRC=0x86020001=-2046689279=SQLB_BADP "page is bad"
          DIA8400C A bad page was encountered.
DATA #1 : String, 64 bytes
Error encountered trying to read a page - information follows :
DATA #2 : String, 23 bytes
Page verification error
DATA #3 : Page ID, PD_TYPE_SQLB_PAGE_ID, 4 bytes
11776007
DATA #4 : Object descriptor, PD_TYPE_SQLB_OBJECT_DESC, 72 bytes
    Obj: {pool:9;obj:65534;type:14} Parent={9;65534}
    lifeLSN: 000000000000
    tid: 0 0 0
    extentAnchor: 0
    initEmpPages: 0
    poolPage0: 0
    poolflags: 2122
    objectState: 0
    lastSMP: 0
    pageSize: 4096
    extentSize: 8
    bufferPoolID: 1
    partialHash: 4059955209
    bufferPool: 0x7ffffff9b713bb40
DATA #5 : Bitmask, 4 bytes
0x00000002
DATA #6 : Page header, PD_TYPE_SQLB_PAGE_HEAD, 48 bytes
    pageHead: {pool:0;obj:0;type:0} PPNum:0 OPNum:0
    begoff: 0
    datlen: 0
    pagebinx: 0
    revnum: 0
    pagelsn: 000000000000
    flag: 0
    signature: 0
    cbits1to31: 0
    cbits32to63: 0
CALLSTCK:
  [0] 0x7FFFFFFF7B017F30 __1cZsqlbLogReadAttemptFailure6FIpnQSQdDLB_OBJECT_DESC_IpnJSQdDLB_PAGE_ibLIpcpnMSQdDLB_GLOBALS__v_ + 0x150
  [1] 0x7FFFFFFF7B01C8B8 __1cQsqlb_verify_page6FpnJSQdDLB_PAGE_pnQSQdDLB_OBJECT_DESC_IIpnMSQdDLB_GLOBALS_pL_i_ + 0x598
  [2] 0x7FFFFFFF7B01965C sqlbReadPage + 0xE84
  [3] 0x7FFFFFFF7AFF9334 __1cTsqlbGetPageFromDisk6FpnLSQdDLB_FIX_CB_i_i_ + 0x2EC
  [4] 0x7FFFFFFF7AF42404 __1cHsqlbfix6FpnLSQdDLB_FIX_CB__i_ + 0xA1C
  [5] 0x7FFFFFFF7B07AFA0 __1cYsqlbFindNewHighWaterMark6FHIpnJSQdDLP_LSN8_LpnMSQdDLB_GLOBALS__i_ + 0xC38
  [6] 0x7FFFFFFF7B06F4F8 __1cQsqlbDMSStartPool6FpnMSQdDLB_GLOBALS_pnMSQdDLB_POOL_CB__i_ + 0x7B8

Another possible symptom is that some objects in the tablespace have data beyond the high-water mark, resulting in data corruption in those objects. Although the problem may remain undetected until the object is accessed, running db2dart /TS on the tablespace will catch it. | |
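The timing window described above can be illustrated with a small, deterministic toy model. This is only a sketch of the general "stale cached value" pattern: the class `ToyTablespace`, the constant `EXTENTS_PER_SMP`, and every method name are hypothetical and do not reflect DB2 internals. The `window` callback stands in for whatever a concurrent thread does between the cache read and the HWM calculation.

```python
from dataclasses import dataclass, field
from typing import Callable, Optional, Set

EXTENTS_PER_SMP = 4  # hypothetical: extents mapped by one SMP extent


@dataclass
class ToyTablespace:
    """Toy model only; layout and names are assumptions, not DB2 code."""
    last_smp: int = 0                       # last initialized SMP extent
    used_extents: Set[int] = field(default_factory=set)
    hwm: int = 0                            # extents >= hwm are treated as free

    def allocate_extent(self, extent: int) -> None:
        """Mark an extent as in use; may initialize a new SMP extent."""
        self.used_extents.add(extent)
        self.last_smp = max(self.last_smp, extent // EXTENTS_PER_SMP)

    def compute_hwm(self, window: Optional[Callable[[], None]] = None) -> int:
        """Compute the HWM from a cached copy of last_smp."""
        cached_smp = self.last_smp          # 1. cache the value
        if window is not None:
            window()                        # 2. concurrent work in the window
        # 3. only extents covered by the cached SMP extent are considered
        limit = (cached_smp + 1) * EXTENTS_PER_SMP
        covered = [e for e in self.used_extents if e < limit]
        self.hwm = max(covered) + 1 if covered else 0
        return self.hwm

    def reduce(self) -> Set[int]:
        """Release everything at or above the HWM; return in-use extents lost."""
        lost = {e for e in self.used_extents if e >= self.hwm}
        self.used_extents -= lost
        return lost


ts = ToyTablespace()
for extent in range(3):
    ts.allocate_extent(extent)

# Extent 5 needs a new SMP extent and is allocated inside the window.
hwm = ts.compute_hwm(window=lambda: ts.allocate_extent(5))
lost = ts.reduce()
print(hwm, lost)  # 3 {5}
```

In this model the reduce discards extent 5 even though it is in use, because the HWM was derived from the stale SMP value. Calling `compute_hwm()` again without the window would have produced 6, which mirrors the statement that later operations set the value correctly.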
Problem Summary: | |
****************************************************************
* USERS AFFECTED:
* ALL
****************************************************************
* PROBLEM DESCRIPTION:
* HWM TOO LOW DUE TO TIMING WINDOW WHEN NEW SMP EXTENT ALLOCATED, CAN LEAD TO DATA CORRUPTION
* See the problem description above for the details and the db2diag.log symptom.
****************************************************************
* RECOMMENDATION:
* Upgrade to DB2 Version 9.5 Fix Pack 6
**************************************************************** | |
Local Fix: | |
Contact service when the above occurs. | |
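Before contacting service, it can help to confirm that a db2diag.log entry matches the symptom described above. The following is a minimal sketch, not an IBM tool: the function name `find_hwm_bad_pages` and the matching rules (an entry that mentions `sqlb_verify_page`, `SQLB_BADP` and `sqlbFindNewHighWaterMark`) are assumptions derived from the sample entry in this APAR, and the embedded sample is an abbreviated copy of that entry.

```python
import re
from typing import Dict, List

# Each db2diag.log entry is assumed to start with a timestamp line
# such as "2010-07-09-04.39.06.621170+120 ...".
ENTRY_START = re.compile(
    r"^(?=\d{4}-\d{2}-\d{2}-\d{2}\.\d{2}\.\d{2}\.\d{6}[+-]\d{3}\s)",
    re.MULTILINE,
)
PAGE_ID = re.compile(r"PD_TYPE_SQLB_PAGE_ID, \d+ bytes\s+(\d+)")
POOL_ID = re.compile(r"Obj: \{pool:(\d+);")


def find_hwm_bad_pages(diag_text: str) -> List[Dict[str, object]]:
    """Return entries that look like the bad-page symptom in this APAR."""
    hits = []
    for entry in ENTRY_START.split(diag_text):
        if (
            "sqlb_verify_page" in entry
            and "SQLB_BADP" in entry
            and "sqlbFindNewHighWaterMark" in entry
        ):
            page = PAGE_ID.search(entry)
            pool = POOL_ID.search(entry)
            hits.append({
                "timestamp": entry.split(None, 1)[0],
                "pool": int(pool.group(1)) if pool else None,
                "page_id": int(page.group(1)) if page else None,
            })
    return hits


# Abbreviated copy of the entry quoted in the problem description.
SAMPLE = """\
2010-07-09-04.39.06.621170+120 I420281399A2620 LEVEL: Severe
FUNCTION: DB2 UDB, buffer pool services, sqlb_verify_page, probe:3
MESSAGE : ZRC=0x86020001=-2046689279=SQLB_BADP "page is bad"
DATA #3 : Page ID, PD_TYPE_SQLB_PAGE_ID, 4 bytes
11776007
DATA #4 : Object descriptor, PD_TYPE_SQLB_OBJECT_DESC, 72 bytes
    Obj: {pool:9;obj:65534;type:14} Parent={9;65534}
CALLSTCK:
  [5] 0x7FFFFFFF7B07AFA0 __1cYsqlbFindNewHighWaterMark6FHIpnJSQdDLP_LSN8_LpnMSQdDLB_GLOBALS__i_ + 0xC38
"""

hits = find_hwm_bad_pages(SAMPLE)
print(hits)
```

The `pool` value reported for a hit identifies the affected tablespace, which is the one a `db2dart /TS` check would be run against. A match only indicates that the entry resembles this symptom; it does not by itself prove that this APAR is the cause.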
available fix packs: | |
DB2 Version 9.5 Fix Pack 6a for Linux, UNIX, and Windows | |
Solution | |
Problem was first fixed in DB2 Version 9.5 Fix Pack 6 | |
Workaround | |
not known / see Local fix | |
BUG-Tracking | |
forerunner : APAR is sysrouted TO one or more of the following: IC70147, IC71080
follow-up : | |
Timestamps | |
Date - problem reported : | 21.07.2010 |
Date - problem closed : | 20.09.2010 |
Date - last modified : | 20.09.2010 |
Problem solved at the following versions (IBM BugInfos) | |
9.5. | |
Problem solved according to the fixlist(s) of the following version(s) |