DB2 - Problem description
Problem IC96301 | Status: Closed |
HIGH USER CPU OR SYSTEM HANG MAY BE OBSERVED WHEN LARGE AMOUNT OF LOCK RESOURCES BEING USED BY MANY CONCURRENT TRANSACTIONS | |
product: | |
DB2 FOR LUW / DB2FORLUW / A10 - DB2 | |
Problem description: | |
When there are many concurrent transactions which are requesting a large amount of lock resources at the same time, you may observe high USER CPU or the server appeared to be hanging or be performed very slowly. When this issue happens, you may see a lot of agents looping at sqlpRefillAgentLRBCache function at the top of their stacks while holding the latch modifyGroupListLatch as below example: 0x0900000005072744 sqlpRefillAgentLRBCache__FCP9sqeBsuEduCP9SQLP_DBCBCP8SQLP_LCBCP9 SQLP_LTRNPUlCb + 0xB60 0x0900000004FACC5C sqlplrq__FP9sqeBsuEduP14SQLP_LOCK_INFO + 0x460 0x0900000004F678A4 @161@sqliScanLeaf2__FP7SQLI_CBP11SQLI_SFGLOB + 0x5CC 0x0900000004F69844 sqlifnxt__FP7SQLI_CBP11SQLI_SFGLOB + 0x26C 0x0900000004F62F80 sqlirdk__FP8sqeAgentP10SQLI_IXPCRP9SQLD_IXCBUlP8SQLD_KEYP12SQLI_ KEYDATAP9sqli_scanPi + 0x2EC 0x0900000004F64AB0 sqldIndexFetch__FP8SQLD_CCBP8SQLZ_RID + 0x67C 0x0900000004F51EA4 sqldRowFetch__FP8sqeAgentP8SQLD_CCBUlT3PP10SQLD_VALUEP8SQLZ_RIDT 3P12SQLD_ID_LISTP9SQLP_LSN8 + 0x133C 0x0900000004F4D338 sqlriFetch__FP8sqlrr_cbP9sqlri_taol + 0x2D4 0x0900000004F4C8EC sqlriNljnNonPiped__FP8sqlrr_cb + 0x328 0x0900000004F49A58 sqlrihsjn__FP8sqlrr_cb + 0x59C 0x0900000004F4DDC8 sqlriSectInvoke__FP8sqlrr_cbP12sqlri_opparm + 0x24 0x0900000004D67868 sqlrr_process_fetch_request__FP14db2UCinterface + 0xF60 0x090000000579F9F8 sqlrr_open__FP14db2UCinterfaceP15db2UCCursorInfo + 0x17F0 <LatchInformation> Holding Latch type: (SQLO_LT_SQLP_LCB__modifyGroupListLatch) - Address: (0xa00060000db0e68), Line: 842, File: sqlplrq.C HoldCount: 1 </LatchInformation> The issue is more likely to occurs at an environment where the db parameters LOCKLIST and MAXLOCKS are configured with very high values or are set to AUTOMATIC but has grown to a very large size in STMM case. For example, below are the output showing the LOCKLIST is set to AUTOMATIC in a STMM environment, but the size has grown to almost 10G bytes. LOCKLIST (4KB) AUTOMATIC(2527424) AUTOMATIC(2527424) | |
Problem Summary: | |
**************************************************************** * USERS AFFECTED: * * ALL * **************************************************************** * PROBLEM DESCRIPTION: * * See Error Description * **************************************************************** * RECOMMENDATION: * * Upgrade to DB2 10.1 Fix Pack 5 * **************************************************************** | |
Local Fix: | |
-Reduce the requirement of locking resources and commit the transactions as soon as possible. -Use a lower isolation level such as CS/UR if possible. -Set LOCKLIST to a fixed size but with a smaller value so lock escalation can happen earlier. | |
Solution | |
First fixed in DB2 10.1 Fix Pack 5 | |
Workaround | |
not known / see Local fix | |
BUG-Tracking | |
forerunner : APAR is sysrouted TO one or more of the following: IC99682 follow-up : | |
Timestamps | |
Date - problem reported : Date - problem closed : Date - last modified : | 26.09.2013 13.02.2017 13.02.2017 |
Problem solved at the following versions (IBM BugInfos) | |
Problem solved according to the fixlist(s) of the following version(s) | |
10.1.0.5 |