DB2 - Problem description
Problem IT07864 | Status: Closed
DB2 CRASHES AFTER SETTING DB2_RESOURCE_POLICY.
Product:
DB2 FOR LUW / DB2FORLUW / A10 - DB2
Problem description:
In some configurations, the use of DB2_RESOURCE_POLICY enables memory affinity for NUMA exploitation. As part of this exploitation, various internal data structures are split into multiple regions that are allocated from different nodes. However, these regions are initialized incorrectly, which results in memory corruption.

Affected configurations must have more than 5 resources configured or automatically detected. This information can be found in db2diag.log after running db2start (see the "Number of resource bindings" entry):

2015-03-12-12.12.24.271249-240 E6184A2642 LEVEL: Event
PID : 56819936 TID : 258 PROC : db2wdog_DB2NAME
INSTANCE: memmerto NODE : 000
HOSTNAME: p7302
EDUID : 258 EDUNAME: db2wdog_DB2NAME
FUNCTION: DB2 UDB, oper system services, sqloInitializeResourcePolicy, probe:20
DATA #1 : String, 2279 bytes
RESOURCE POLICY
Number of DB resource policies = 1
DATABASE RESOURCE POLICY
Database name = !GLBPOL!
Method = 1
Number of resource groups = 1
Number of resource bindings = 32
Round robin resource binding = 1

Two different behaviours may be seen as a result of this corruption:

1) Messages in db2diag.log regarding memory corruption during database deactivation

2015-03-06-11.25.39.902046-300 E541790A1016 LEVEL: Critical
PID : 6292082 TID : 2058 KTID : 96010371 PROC : db2sysc
INSTANCE: memmerto NODE : 000 DB : SAMPLE
APPHDL : 0-7 APPID: *LOCAL.memmerto.150306162453
AUTHID : MEMMERTO HOSTNAME: p7302
EDUID : 2058 EDUNAME: db2agent (PMR56607)
FUNCTION: DB2 UDB, SQO Memory Management, sqloDiagnoseFreeBlockFailure, probe:10
MESSAGE : ADM14001C An unexpected and critical error has occurred: "Panic". The instance may have been shutdown as a result. "Automatic" FODC (First Occurrence Data Capture) has been invoked and diagnostic information has been recorded in directory "/home/memmerto/sqllib/db2dump/FODC_Panic_2015-03-06-11.25.39.890878_0000/". Please look in this directory for detailed evidence about what happened and contact IBM support if necessary to diagnose the problem.

-------Frame------ ------Function + Offset------
0x090000004D94261C sqle_panic__Fi + 0xA7C
0x090000004D958010 sqloCrashOnCriticalMemoryValidationFailure + 0x50
0x090000004D97E298 diagnoseMemoryCorruptionAndCrash__13SQLO_MEM_POOLFUlCPCcCb + 0x3F8
0x090000005823CB74 markAllAllocatedBlocksInvalid__17SqloChunkSubgroupCFv + 0x134
0x090000004D982A28 markAllAllocatedBlocksInvalid__13SQLO_MEM_POOLCFv + 0x88
0x090000004D96BBA0 sqloPurgeMemoryInSubPool + 0x400
0x090000004D96C334 sqloFreeMemorySubPool + 0x134
0x090000005004BBAC sqldTermDBCB__FP16sqeLocalDatabaseUl + 0x58C
0x090000004E3078B8 CleanDB__16sqeLocalDatabaseFbP5sqlca + 0x838
0x090000004E30FBD0 ExecuteDBShutdown__16sqeLocalDatabaseFP8sqeAgentPbP5sqlcai + 0x830
0x090000004E322320 TermDbConnect__16sqeLocalDatabaseFP8sqeAgentP5sqlcai + 0x3060
0x090000004E3A77AC AppStopUsing__14sqeApplicationFP8sqeAgentUcP5sqlca + 0x1A2C
0x09000000558BE774 sqleStartDb__FsP8SQLE_BWAP10sqledbdescP13sqledbdescextT1PcT2iT1lUl + 0x1414

2) Runtime traps due to corrupted latches

2015-03-05-16.24.57.464727-300 E33801A3679 LEVEL: Severe (OS)
PID : 58917198 TID : 2402 PROC : db2sysc
INSTANCE: memmerto NODE : 000 DB : SAMPLE
APPHDL : 0-8 APPID: *LOCAL.memmerto.150305212457
AUTHID : MEMMERTO HOSTNAME: p7302
EDUID : 2402 EDUNAME: db2agent (SAMPLE)
FUNCTION: DB2 UDB, SQO Latch Tracing, SQLO_SLATCH_CAS64::releaseConflict, probe:300
MESSAGE : ZRC=0x870F00FB=-2029059845=SQLO_SLATCH_ERROR_HELDX_WITH_SHARED_HOLDERS "shared latch found with both exclusive and shared holders. Latch likely corrupt."
CALLED : OS, -, unspecified_system_function

-------Frame------ ------Function + Offset------
0x090000000E6BB9D4 sqle_panic__Fi + 0x520
0x090000000B3661B4 dumpDiagInfoAndPanic__17SQLO_SLATCH_CAS64CFCPCcCUiCUlT3ClT3CiT1T3T7 + 0x308
0x090000000B365E44 releaseConflict__17SQLO_SLATCH_CAS64Fv@OL@21943 + 0x60
0x090000000CF2C9CC releaseConflict__17SQLO_SLATCH_CAS64Fv + 0x44
0x090000000CCF8DEC sqldScanOpen__FP8sqeAgentP14SQLD_SCANINFO1P14SQLD_SCANINFO2PPv + 0x1C0
0x090000000C6C8808 sqlrlini__FP8sqlrr_cbb + 0x380
0x090000000DB36804 sqlrr_appl_init__FP8sqeAgentP5sqlca + 0x5474
0x090000000DB05348 InitEngineComponents__14sqeApplicationFcP8sqeAgentP8SQLE_BWAP5sqlcaP22SQLESRSU_STATUS_VECTORT1 + 0x2CC4
0x090000000DAF9608 AppStartUsing__14sqeApplicationFP8SQLE_BWAP8sqeAgentcT3P5sqlcaPc + 0x820
0x090000000D680DBC AppLocalStart__14sqeApplicationFP14db2UCinterface + 0x540
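To check whether an instance falls into the affected range, the "Number of resource bindings" value can be pulled out of db2diag.log with a small script. The following is a minimal sketch only, not an IBM tool. The function names, the SAMPLE text, and the threshold constant are hypothetical, and it assumes that the binding count is the number to compare against the "more than 5 resources" condition described above.

```python
import re

# Assumption: the "more than 5 resources" condition from the problem
# description is compared against the "Number of resource bindings" value.
AFFECTED_THRESHOLD = 5

# Matches lines such as "Number of resource bindings = 32".
BINDINGS_RE = re.compile(r"Number of resource bindings\s*=\s*(\d+)")


def resource_binding_counts(diag_text):
    """Return every 'Number of resource bindings' value found in the text."""
    return [int(m.group(1)) for m in BINDINGS_RE.finditer(diag_text)]


def is_at_risk(diag_text, threshold=AFFECTED_THRESHOLD):
    """Return True if any entry reports more than `threshold` bindings."""
    return any(n > threshold for n in resource_binding_counts(diag_text))


# Excerpt of the resource policy entry shown in the problem description.
SAMPLE = """\
RESOURCE POLICY
Number of DB resource policies = 1
DATABASE RESOURCE POLICY
Database name = !GLBPOL!
Method = 1
Number of resource groups = 1
Number of resource bindings = 32
Round robin resource binding = 1
"""

print(resource_binding_counts(SAMPLE), is_at_risk(SAMPLE))
```

To apply it to a real instance, read the contents of db2diag.log into a string and pass that string to is_at_risk. A True result only means the configuration matches the condition described here; it does not confirm that corruption has occurred.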
Problem Summary:
****************************************************************
* USERS AFFECTED:                                              *
* ALL                                                          *
****************************************************************
* PROBLEM DESCRIPTION:                                         *
* See Error Description                                        *
****************************************************************
* RECOMMENDATION:                                              *
* Upgrade to DB2 Version 10.1 Fix Pack 5                       *
****************************************************************
Local Fix:
Discontinue the use of DB2_RESOURCE_POLICY if your configuration is at risk of being affected.
Solution
Problem was first fixed in DB2 Version 10.1 Fix Pack 5
Workaround
not known / see Local fix
Timestamps
Date - problem reported : 23.03.2015
Date - problem closed : 13.07.2015
Date - last modified : 13.07.2015
Problem solved at the following versions (IBM BugInfos)
Problem solved according to the fixlist(s) of the following version(s)
10.1.0.5