DB2 - Problem description
Problem IC78289 | Status: Closed |
INSTANCE HUNG AT FUNCTION SQMFASTWRITERQUEUEMGR::ALLOCFASTWRITER RECORDS WHEN LOCKING EVENT MONITORS ARE ENABLED | |
product: | |
DB2 FOR LUW / DB2FORLUW / 970 - DB2 | |
Problem description: | |
When a locking event monitor is active and activity collection is enabled for deadlock, lock timeout or lock wait events, agents generating locking records may block waiting for fast writer records while holding the latch sqlrr_curr_activity_cb_latch or sqlrr_past_activity_cb_latch. This can result in a hang, because the fast writers (which release records for reuse) can in turn end up waiting, directly or indirectly, on those latches. | |
Problem Summary: | |
****************************************************************
USERS AFFECTED:
All users that use locking event monitors
****************************************************************
PROBLEM DESCRIPTION:

1) The majority of EDUs (including the PMODEL ones) are waiting for
(SQLO_LT_sqeAppServices__m_appServLatch) - Address: (c0000000246d04d8).
The holder is doing an index scan, and it is waiting for
(SQLO_LT_sqlrr_past_activity_cb__sqlrr_past_activity_cb_latch) -
Address: (c00000005f4653c8).

EDU name : db2agent (SB27069)
EDU ID : 590

(5) 0xc000000000419a70 nanosleep_sys
(6) 0xc000000000429950 nanosleep
(7) 0xc000000017dafab0 sqloSpinLockConflict
(8) 0xc00000001d2948a0 sqlplDumpLockTimeoutInfo
(9) 0xc000000017cfb9b0 sqlplnfd
(10) 0xc000000017825910 sqlplrqWithAttr
(11) 0xc00000001f39ca20 sqliLockUncond
(12) 0xc000000017797f10 sqliLockRec2
(13) 0xc00000001733c850 sqliProcLeafNotNorm
(14) 0xc00000001781fad0 sqlischf
(15) 0xc0000000177e54d0 sqliFirstTreeSearch
(16) 0xc000000017bca120 sqlirdk
(17) 0xc000000017d88320 sqldIndexFetch
(18) 0xc00000001799bd50 sqldRowFetch
(19) 0xc000000017999d10 sqlritaSimplePerm

Waiting on latch type:
(SQLO_LT_sqlrr_past_activity_cb__sqlrr_past_activity_cb_latch) -
Address: (c00000005f4653c8), Line: 1615, File: sqlpldwi.C
Holding Latch type: (SQLO_LT_SQLP_LHSH__xhshlatch) -
Address: (c0000000818788c8), Line: 550, File:
/view/db2_v97fp5_hpipf64_s110814/vbs/engn/sqp/inc/sqlpLockInternal.h
HoldCount: 1
Holding Latch type: (SQLO_LT_sqeAppServices__m_appServLatch) -
Address: (c0000000246d04d8), Line: 1366, File: sqlpldwi.C
HoldCount: 1
Holding Latch type: (SQLO_LT_sqeApplication__masterAppLatch) -
Address: (c00000005f4601c4), Line: 1383, File: sqlpldwi.C
HoldCount: 1

=====

2) The holder of
(SQLO_LT_sqlrr_past_activity_cb__sqlrr_past_activity_cb_latch) -
Address: (c00000005f4653c8) is another index scan. This EDU is waiting
for fast writer records but none are available, so it blocks on a
waitpost (which will be posted by a fast writer when it returns records
to the fast writer record pool):

EDU name : db2agent (SB27069)
EDU ID : 121

// looks like waiting for fast writers
(5) 0xc00000000041ac70 semtimedop_sys
(6) 0xc00000000042c7c0 semtimedop
(7) 0xc0000000172f49f0 sqloWaitEDUWaitPost
(8) 0xc00000001abe7ad0 sqmLockEvents::collectRequestorActivities
(9) 0xc000000017e4ba20 sqmLockEvents::collectLockEvent
(10) 0xc000000017df0580 sqlplfnd
(11) 0xc000000017a46f10 sqlplrq
(12) 0xc00000001f39c660 sqliLockUncond
(13) 0xc000000017dc7f40 sqliScanLeaf2
(14) 0xc0000000178099b0 sqlifnxt
(15) 0xc000000017bc94f0 sqlirdk
(16) 0xc000000017d88320 sqldIndexFetch
(17) 0xc00000001799bd50 sqldRowFetchP8
(18) 0xc000000017999d10 sqlritaSimplePerm

Holding Latch type:
(SQLO_LT_sqlrr_past_activity_cb__sqlrr_past_activity_cb_latch) -
Address: (c00000005f4653c8), Line: 1262, File: sqlm_lock_events.C
HoldCount: 1

=====

3) This instance has five fast writers. The first four have the same
stack, but the fifth one is different. In fact, the first four are
waiting for the fifth one, which is waiting on a
SQLO_LT_SQLP_LHSH__xhshlatch:

EDU name : db2fw0 (SB27069)
EDU ID : 52

EDU name : db2fw1 (SB27069)
EDU ID : 59

EDU name : db2fw2 (SB27069)
EDU ID : 56

EDU name : db2fw3 (SB27069)
EDU ID : 55

(5) 0xc00000000041abf0 semop_sys
(6) 0xc00000000042c560 semop
(7) 0xc000000017d982d0 SQLO_SLATCH_CAS::getConflictComplexEm
(8) 0xc000000017e839c0 SQLO_SLATCH_CAS64::getConflictEm
(9) 0xc0000000179a6000 sqloltch_notrack
(10) 0xc000000017e2f040 sqlbVerifyAndLatchPage
(11) 0xc000000017b01c40 sqlbfix
(12) 0xc000000017e20020 sqldGetPageForAppend
(13) 0xc00000001783b250 sqldInsertRow
(14) 0xc00000001789d3e0 sqldRowInsert
(15) 0xc000000017810ed0 sqlrinsr
(16) 0xc00000001aae0f00 sqmRecordTypeArray::processRecord

EDU name : db2fw4 (SB27069)
EDU ID : 57

(5) 0xc000000000419a70 nanosleep_sys
(6) 0xc000000000429950 nanosleep
(7) 0xc000000017dafab0 sqloSpinLockConflict
(8) 0xc000000017a45380 sqlplrq
(9) 0xc0000000197b5830 sqldFindSlotOnPage
(10) 0xc000000017e209c0 sqldFindAndLockSlotIfSpaceAvail
(11) 0xc000000017e1f6d0 sqldGetPageForAppend
(12) 0xc00000001783b250 sqldInsertRow
(13) 0xc00000001789d3e0 sqldRowInsert
(14) 0xc000000017810ed0 sqlrinsr
(15) 0xc00000001aae0f00 sqmRecordTypeArray::processRecord

Waiting on latch type: (SQLO_LT_SQLP_LHSH__xhshlatch) -
Address: (c0000000818788c8), Line: 505, File:
/view/db2_v97fp5_hpipf64_s110814/vbs/engn/sqp/inc/sqlpLockInternal.h

=====

4) The holder of the SQLO_LT_SQLP_LHSH__xhshlatch - Address:
(c0000000818788c8) - is the EDU described in step 1) (EDU ID 590). It is
waiting for
(SQLO_LT_sqlrr_past_activity_cb__sqlrr_past_activity_cb_latch) -
Address: (c00000005f4653c8), which is held by the EDU described in
step 2) (EDU ID 121). So 1) waits for a latch held by 2), 2) waits for
records that only the fast writers in 3) can return, and the fast
writers in 3) wait for a latch held by 1). DEADLATCH!

EDU name : db2agent (SB27069)
EDU ID : 590

(5) 0xc000000000419a70 nanosleep_sys
(6) 0xc000000000429950 nanosleep
(7) 0xc000000017dafab0 sqloSpinLockConflict
(8) 0xc00000001d2948a0 sqlplDumpLockTimeoutInfo
(9) 0xc000000017cfb9b0 sqlplnfd
(10) 0xc000000017825910 sqlplrqWithAttr
(11) 0xc00000001f39ca20 sqliLockUncond
(12) 0xc000000017797f10 sqliLockRec2
(13) 0xc00000001733c850 sqliProcLeafNotNorm
(14) 0xc00000001781fad0 sqlischf
(15) 0xc0000000177e54d0 sqliFirstTreeSearch
(16) 0xc000000017bca120 sqlirdk
(17) 0xc000000017d88320 sqldIndexFetch
(18) 0xc00000001799bd50 sqldRowFetch
(19) 0xc000000017999d10 sqlritaSimplePerm

Waiting on latch type:
(SQLO_LT_sqlrr_past_activity_cb__sqlrr_past_activity_cb_latch) -
Address: (c00000005f4653c8), Line: 1615, File: sqlpldwi.C
Holding Latch type: (SQLO_LT_SQLP_LHSH__xhshlatch) -
Address: (c0000000818788c8), Line: 550, File:
/view/db2_v97fp5_hpipf64_s110814/vbs/engn/sqp/inc/sqlpLockInternal.h
HoldCount: 1
Holding Latch type: (SQLO_LT_sqeAppServices__m_appServLatch) -
Address: (c0000000246d04d8), Line: 1366, File: sqlpldwi.C
HoldCount: 1
Holding Latch type: (SQLO_LT_sqeApplication__masterAppLatch) -
Address: (c00000005f4601c4), Line: 1383, File: sqlpldwi.C
HoldCount: 1
****************************************************************
RECOMMENDATION:
Upgrade to Version 9.7 Fix Pack 5
**************************************************************** | |
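The wait chain in the stack summary above can be checked mechanically. The following is a minimal sketch in Python, not DB2 code: it transcribes the "waiting on" and "holding" information from the trace into a wait-for graph and searches it for a cycle. The node labels and the `find_cycle` helper are hypothetical names introduced for illustration, and modelling EDU 121 as waiting on every fast writer is a simplifying assumption (it actually waits for any fast writer to return records to the pool).

```python
from typing import Dict, List, Optional

# Wait-for edges transcribed from the trace; "A": ["B"] means A cannot
# proceed until B makes progress. Labels are illustrative only.
WAITS_FOR: Dict[str, List[str]] = {
    # EDU 590 holds xhshlatch and wants sqlrr_past_activity_cb_latch (held by EDU 121).
    "agent EDU 590": ["agent EDU 121"],
    # EDU 121 holds sqlrr_past_activity_cb_latch and waits for fast writer records
    # (assumption: modelled as waiting on all five fast writers).
    "agent EDU 121": ["db2fw0", "db2fw1", "db2fw2", "db2fw3", "db2fw4"],
    # The first four fast writers are stuck behind the fifth.
    "db2fw0": ["db2fw4"],
    "db2fw1": ["db2fw4"],
    "db2fw2": ["db2fw4"],
    "db2fw3": ["db2fw4"],
    # db2fw4 wants xhshlatch, which EDU 590 holds.
    "db2fw4": ["agent EDU 590"],
}


def find_cycle(graph: Dict[str, List[str]]) -> Optional[List[str]]:
    """Return one cycle as a list of nodes (first == last), or None."""
    done = set()  # nodes fully explored without finding a cycle

    def visit(node: str, path: List[str]) -> Optional[List[str]]:
        if node in path:  # reached a node already on the current path
            return path[path.index(node):] + [node]
        if node in done:
            return None
        path.append(node)
        for nxt in graph.get(node, []):
            cycle = visit(nxt, path)
            if cycle:
                return cycle
        path.pop()
        done.add(node)
        return None

    for start in graph:
        cycle = visit(start, [])
        if cycle:
            return cycle
    return None


if __name__ == "__main__":
    cycle = find_cycle(WAITS_FOR)
    print(" -> ".join(cycle) if cycle else "no cycle")
```

In this model, removing the edges out of EDU 121 (that is, an agent that does not block for records while holding the activity latch) leaves the graph acyclic, which matches the problem description: the hang depends on the agent waiting for records with the latch held.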
Local Fix: | |
For deadlock collection use the DEADLOCKS WITH DETAILS event monitor rather than the LOCKING event monitor. For lock timeout collection set the DB2_CAPTURE_LOCKTIMEOUT registry variable to ON to generate a report file about lock timeouts rather than the LOCKING event monitor. | |
available fix packs: | |
DB2 Version 9.7 Fix Pack 5 for Linux, UNIX, and Windows | |
Solution | |
Problem was first fixed in Version 9.7 Fix Pack 5 | |
Workaround | |
not known / see Local fix | |
Timestamps | |
Date - problem reported : | 23.08.2011 |
Date - problem closed : | 21.12.2011 |
Date - last modified : | 09.08.2012 |
Problem solved at the following versions (IBM BugInfos) | |
9.7.FP5 | |
Problem solved according to the fixlist(s) of the following version(s) | |
9.7.0.5 |