DB2 - Problem description
Problem IC64760 | Status: Closed |
DB2 INSTANCE MAY TRAP DUE TO RARE TIMING CONDITIONS INVOLVING TRUSTED ROUTINES | |
product: | |
DB2 FOR LUW / DB2FORLUW / 950 - DB2 | |
Problem description: | |
**NOTE: This APAR does not address all scenarios in which these symptoms may occur. See APAR IC77204, which covers additional scenarios.**

DB2 may trap after a rare, unpredictable sequence of events involving trusted routines.

Most trusted routines link with and load the DB2 client library (libdb2), which creates an "application-side" private memory manager in the DB2 server process. This process-wide memory manager survives until all agents unload their routines. Routines can be cached for long periods of time and are unloaded only at specific points in processing (for example, when processing a disconnect). When the application-side memory manager/area is deallocated, all of its memory is usually decommitted (zeroed) and always becomes available for reuse for a different purpose, so the memory contents will change at some point.

The problem can be introduced shortly after agent recycling/reuse. Depending on the agent's history, its private "server-side" thread-specific memory area may be deallocated on agent recycling, which is normal. Depending on the agent's history and the new request (one involving a trusted routine), the agent may then end up allocating all of its subsequent important server-side memory from the wrong, application-side memory manager. This incorrect allocation is the defect that the APAR fixes.

The incorrect allocation may still not result in any visible problem. The agent may continue to allocate memory, unrelated to trusted routine execution, from the wrong memory area. The application-side memory manager may persist for a long period of time (unrelated to the "problem" agent's processing), which keeps the wrongly allocated memory valid. If the "problem" agent is fully recycled or terminates during that period, the problem is not exposed.

If, however, the application-side memory manager is deallocated while the agent is in its problem state, important memory areas are deallocated and the agent will trap. Various symptoms may occur, depending on where the agent is executing.

The problem is more exposed on DPF systems due to differences in agent reuse/recycling.

Below is an example of what may appear in a DB2 diagnostic trap file; other stack traces are possible.
<StackTrace>
-------Frame------ ------Function + Offset------
0x09000000061FE3B8 sqloCrashOnCriticalMemoryValidationFailure + 0x1C
0x0900000006206618 sqloCrashOnCriticalMemoryValidationFailure@glue5C2 + 0x1C
0x09000000060FAB10 sqlofmblkEx + 0x1C
0x09000000068213A8 sqldReleaseWorkAreaMem__FP8sqeAgentiUlPUl + 0x1B4
0x0900000006821130 sqldTermAgent__FP8sqeAgent + 0xC8
0x090000000682186C AppDisassoc__8sqeAgentFiT1 + 0x198
0x09000000068DA1D0 sqleUCagentConnCleanup__FP8sqeAgentP14db2UCconHandlebT3 + 0xAC
0x09000000068D983C RunEDU__8sqeAgentFv + 0x218
0x09000000068D949C EDUDriver__9sqzEDUObjFv + 0x94
0x09000000068DF390 sqloEDUEntry + 0x4
</StackTrace> | |
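The timing dependency described in the problem description can be illustrated with a toy model. The Python sketch below is not DB2 code and does not reflect DB2 internals: `Pool`, `Agent`, and `run_scenario` are hypothetical names invented here, and "trap" is simply a string returned when the agent touches memory whose owning pool has already been torn down. It shows only why the wrong allocation is harmless until the application-side memory manager is deallocated before the agent terminates.

```python
class Pool:
    """Toy memory manager: hands out blocks and invalidates them on teardown."""

    def __init__(self, name):
        self.name = name
        self.blocks = []

    def alloc(self, label):
        # Each block remembers which pool it came from and whether it is valid.
        block = {"label": label, "pool": self.name, "valid": True}
        self.blocks.append(block)
        return block

    def destroy(self):
        # Deallocating the pool invalidates every block it ever handed out.
        for block in self.blocks:
            block["valid"] = False
        self.blocks.clear()


class Agent:
    """Toy agent that allocates its work area from whichever pool it is bound to."""

    def __init__(self, pool):
        self.pool = pool
        self.work_area = None

    def allocate_work_area(self):
        self.work_area = self.pool.alloc("work area")

    def terminate(self):
        # Releasing memory that is no longer valid stands in for the trap.
        return "clean" if self.work_area["valid"] else "trap"


def run_scenario(wrong_pool, app_pool_freed_first):
    """Return 'trap' or 'clean' for one ordering of events (hypothetical model)."""
    server_pool = Pool("server-side")
    app_pool = Pool("application-side")
    # wrong_pool=True models the defect: the agent is bound to the app-side pool.
    agent = Agent(app_pool if wrong_pool else server_pool)
    agent.allocate_work_area()
    if app_pool_freed_first:
        app_pool.destroy()  # the last routine is unloaded while the agent is still live
    return agent.terminate()


if __name__ == "__main__":
    print(run_scenario(wrong_pool=True, app_pool_freed_first=True))    # trap
    print(run_scenario(wrong_pool=True, app_pool_freed_first=False))   # clean
    print(run_scenario(wrong_pool=False, app_pool_freed_first=True))   # clean
```

In this model, only the combination of the wrong pool and an early pool teardown produces a trap, which matches the description of the problem as rare and timing-dependent.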
Problem Summary: | |
****************************************************************
* USERS AFFECTED:                                              *
* All systems                                                  *
****************************************************************
* PROBLEM DESCRIPTION:                                         *
* See error description                                        *
****************************************************************
* RECOMMENDATION:                                              *
* Upgrade to a level of DB2 containing the APAR                *
**************************************************************** | |
Local Fix: | |
WORKAROUND: Invoke a trusted routine through a persistent connection on every database partition, for example:
export DB2NODE=<db partition>
db2 connect to <database>
db2 values dayofweek"(current timestamp)" | |
available fix packs: | |
DB2 Version 9.5 Fix Pack 6a for Linux, UNIX, and Windows | |
Solution | |
Problem first fixed in DB2 Version 9.5 Fix Pack 6 | |
Workaround | |
not known / see Local fix | |
BUG-Tracking | |
forerunner : APAR is sysrouted TO one or more of the following: IC64773 follow-up : | |
Timestamps | |
Date - problem reported : | 20.11.2009 |
Date - problem closed : | 14.06.2010 |
Date - last modified : | 03.11.2011 |
Problem solved at the following versions (IBM BugInfos) | |
9.5.FP6 | |
Problem solved according to the fixlist(s) of the following version(s) |