DB2 - Problem description
Problem IT01915 | Status: Closed
HASH LATCH CONTENTION CAUSES POOR PERFORMANCE
Product:
DB2 FOR LUW / DB2FORLUW / A50 - DB2
Problem description:
This APAR applies to all platforms when both of the following conditions are met:

1) A LOCKLIST greater than 8000 pages is used.
2) Many applications are accessing one specific table concurrently via SELECT or DELETE queries, or via searched DELETE or CLOSE CURSOR operations against a cursor opened FOR READ ONLY.

If you suspect that you are hitting this problem, collect "db2pd -latches" at various intervals. The relevant columns of the output are column 2 (Holder), column 3 (Waiter) and column 6 (LatchType). If you see many lines of output that have a LatchType of SQLO_LT_SQLP_LHSH__hshlatch (lock manager hash table latch) and the same Holder value (EDU ID of the EDU holding the latch) with various Waiter values (EDU ID of an EDU waiting on the latch), then you might be hitting this issue. Note that multiple unique holders may be present, as is the case in this example.

    Address            Holder  Waiter  Filename           LOC LatchType                   HoldCount
    0x07000046EBBE0B58 798762  213934  sqlpLockInternal.h 554 SQLO_LT_SQLP_LHSH__hshlatch 0
    0x07000046EBBE0B58 798762  129857  sqlpLockInternal.h 554 SQLO_LT_SQLP_LHSH__hshlatch 0
    0x07000046EBBE0B58 798762  140132  sqlpLockInternal.h 554 SQLO_LT_SQLP_LHSH__hshlatch 0
    ... repeats many times ...
    ... with the same Holder value ...
    ... with varying Waiter values ...
    0x07000046EBBE0B58 798762  1186579 sqlpLockInternal.h 554 SQLO_LT_SQLP_LHSH__hshlatch 0
    0x07000046EBBE0B58 798762  1190691 sqlpLockInternal.h 554 SQLO_LT_SQLP_LHSH__hshlatch 0
    0x07000046EBBE0B58 798762  1200514 sqlpLockInternal.h 554 SQLO_LT_SQLP_LHSH__hshlatch 0
    0x07000046EBBE0B58 798762  1202054 sqlpLockInternal.h 554 SQLO_LT_SQLP_LHSH__hshlatch 0
    0x07000046EBBE0B58 798762  800313  sqlpLockInternal.h 554 SQLO_LT_SQLP_LHSH__hshlatch 0
    0x07000046EBBE0B58 798762  467072  sqlpLockInternal.h 554 SQLO_LT_SQLP_LHSH__hshlatch 0
    ... skip some other entries ...
    0x07000046EBBE0B58 1156509 213934  sqlpLockInternal.h 520 SQLO_LT_SQLP_LHSH__hshlatch 1
    0x07000046EBBE0B58 1156509 129857  sqlpLockInternal.h 520 SQLO_LT_SQLP_LHSH__hshlatch 1
    0x07000046EBBE0B58 1156509 140132  sqlpLockInternal.h 520 SQLO_LT_SQLP_LHSH__hshlatch 1
    ... repeats many times ...
    ... with the same Holder value (but different than the Holder value above) ...
    ... with varying Waiter values ...
    0x07000046EBBE0B58 1156509 148094  sqlpLockInternal.h 520 SQLO_LT_SQLP_LHSH__hshlatch 1
    0x07000046EBBE0B58 1156509 149379  sqlpLockInternal.h 520 SQLO_LT_SQLP_LHSH__hshlatch 1
    0x07000046EBBE0B58 1156509 160167  sqlpLockInternal.h 520 SQLO_LT_SQLP_LHSH__hshlatch 1
    ... skip other entries ...

Once you have confirmed from the "db2pd -latches" output that your environment might be suffering from this issue, you can collect additional information from agents to confirm that this specific problem is the issue in your environment. For each of the Holder values in the "db2pd -latches" output, collect "db2pd -stacks <holder_EDU_ID>" to dump the stack trace of the EDU holding the hash latch. This may need to be collected multiple times in order to capture an instance when the EDU is actively holding the latch.

The holder EDU stack that indicates the problem scenario looks like this:

    -------Frame------ ------Function + Offset------
    0x0900000045849C34 getConflictComplex__17SQLO_SLATCH_CAS64FCUl + 0x3D4
    0x090000004584A258 getConflict__17SQLO_SLATCH_CAS64FCUl + 0xD8
    0x0900000045EEA990 sqlplrl__FP9sqeBsuEduP14SQLP_LOCK_INFO + 0x3F0
    0x090000004685F3A0 sqldmclo__FP8sqeAgentPP8SQLD_CCBi + 0x1BA0
    0x09000000468545F0 sqlriclo__FP8sqlrr_cbP9sqlri_taoi + 0x550
    0x0900000045B9BA98 sqlricjp__FP8sqlrr_cbP12sqlri_opparmilT4 + 0x2B8
    0x0900000045B9B4C8 sqlricls_simple__FP8sqlrr_cbil + 0x1488
    0x09000000476F15AC sqlrr_process_close_request__FP8sqlrr_cbiN32 + 0x20C
    0x0900000046EFEEE8 sqlrr_close__FP14db2UCinterfaceP15db2UCCursorInfo + 0x208

In addition, for various Waiter values in the "db2pd -latches" output, collect "db2pd -stacks <waiter_EDU_ID>". Again, you may need to collect this multiple times in order to capture an instance when the EDU is actively waiting on the latch.

The waiter EDU stack that indicates the problem scenario looks like this:

    -------Frame------ ------Function + Offset------
    0x0900000045849C34 getConflictComplex__17SQLO_SLATCH_CAS64FCUl + 0x3D4
    0x090000004584A258 getConflict__17SQLO_SLATCH_CAS64FCUl + 0xD8
    0x090000004608EA20 sqlplrq__FP9sqeBsuEduP14SQLP_LOCK_INFO + 0x1EA0
    0x09000000460D8CF4 sqldLockTable__FP8sqeAgentP14SQLP_LOCK_INFOUiUsi + 0x114
    0x09000000468CE908 sqldScanOpen__FP8sqeAgentP14SQLD_SCANINFO1P14SQLD_SCANINFO2PPv + 0xB28
    0x0900000046856E30 sqlriopn__FP8sqlrr_cbP9sqlri_taoPi + 0x1550
    0x09000000477C4918 sqlrita__FP8sqlrr_cb + 0xC78
    0x09000000468F7278 sqlriSectInvoke__FP8sqlrr_cbP12sqlri_opparm + 0x3F8
    0x0900000046EF5908 sqlrr_process_fetch_request__FP14db2UCinterface + 0xC48
    0x0900000046EF7228 sqlrr_open__FP14db2UCinterfaceP15db2UCCursorInfo + 0xE08

If the primary conditions are met, and the holder EDU and waiter EDU stacks match those listed above, then you might obtain relief after applying the local fix or by upgrading to a newer level of DB2 that contains the fix for this APAR.

Local Fix

Apply the following registry setting and restart DB2.

    DB2_KEEPTABLELOCK=TRANSACTION
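The "db2pd -latches" check from the problem description can be approximated with a small shell filter. This is only a sketch, not an IBM-supplied tool: the function name summarize_hash_latch_waiters is hypothetical, and it assumes the seven-field row layout of the sample output above (Holder as the second field, Waiter as the third, LatchType as the sixth). For each Holder EDU ID it prints the number of distinct Waiter EDU IDs queued on the lock manager hash table latch, busiest holder first.

```shell
# Hypothetical helper: reads "db2pd -latches" output on stdin and prints
# "<holder_EDU_ID> <distinct_waiter_count>" for hash-latch rows only.
summarize_hash_latch_waiters() {
    awk '
        # Keep only rows whose sixth field is the lock manager hash table latch.
        $6 == "SQLO_LT_SQLP_LHSH__hshlatch" {
            key = $2 SUBSEP $3            # holder + waiter pair
            if (!(key in seen)) {         # count each waiter once per holder
                seen[key] = 1
                waiters[$2]++
            }
        }
        END {
            for (holder in waiters)
                printf "%s %d\n", holder, waiters[holder]
        }
    ' | sort -k2,2nr -k1,1n               # most waiters first
}
```

One possible use is to run `db2pd -latches | summarize_hash_latch_waiters` at intervals. A Holder that keeps appearing with many distinct waiters is a candidate for the "db2pd -stacks" collection described above.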
Problem Summary:
****************************************************************
* USERS AFFECTED:                                              *
* All users                                                    *
****************************************************************
* PROBLEM DESCRIPTION:                                         *
* See Error Description                                        *
****************************************************************
* RECOMMENDATION:                                              *
* Upgrade to DB2 V105FP4 or higher version.                    *
****************************************************************
Local Fix:
DB2_KEEPTABLELOCK=TRANSACTION
Available fix packs:
DB2 Cancun Release 10.5.0.4 (also known as Fix Pack 4) for Linux, UNIX, and Windows
Solution
Fixed in DB2 V105FP4 (Version 10.5 Fix Pack 4) and later versions.
Workaround
DB2_KEEPTABLELOCK=TRANSACTION
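As an illustration of applying the workaround, registry variables are normally set with db2set by the instance owner. The sketch below sets the value named in this APAR and lists it back. The guard around db2set and the grep check are additions for illustration only and are not part of the APAR text. The restart is left as a manual step, because stopping the instance interrupts connected applications.

```shell
# Run as the DB2 instance owner, with the instance environment loaded.
if command -v db2set >/dev/null 2>&1; then
    # Registry setting named in this APAR.
    db2set DB2_KEEPTABLELOCK=TRANSACTION
    # List the registry and confirm the variable is now present.
    db2set -all | grep DB2_KEEPTABLELOCK
    echo "Restart DB2 (db2stop, then db2start) for the setting to take effect."
else
    echo "db2set not found: load the DB2 instance environment first." >&2
fi
```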
Timestamps
Date - problem reported : 20.05.2014
Date - problem closed   : 10.11.2014
Date - last modified    : 18.11.2014
Problem solved at the following versions (IBM BugInfos)
Problem solved according to the fixlist(s) of the following version(s)
10.5.0.4