DB2 - Problem description
Problem IC73855 | Status: Closed |
When working with Federation Server, the DB2 instance may trap while handling ENOMEM (12) "There is not enough memory available now." | |
product: | |
DB2 FOR LUW / DB2FORLUW / 910 - DB2 | |
Problem description: | |
Instance crashed while running a SQL statement with the following stack:

0x09000000030DFFE0 @124@9@ossLockTestGet__FPVi + 0x18
0x09000000030E00DC sqloXlatchAIX__FP12sqloSpinLockUlCPCcCUl@glueE1 + 0x94
0x090000000308567C sqlerTakeLibLatch__FP27SQLER_LOADED_LIB_HASH_TABLEiT2 + 0x58
0x09000000030857A8 sqlerLibraryLoad + 0x64
0x090000000418F420 sqlriFedOneTimeInitRtn__FP8sqlrr_cbP10sqlri_ufob + 0x308
0x090000000418EE0C sqlriFedInvokeInvoker__FP10sqlri_ufobP14sqlqg_Fmp_Info + 0xB4
0x090000000358C0CC sqlqg_Call_FMP_Thread__FP17sqlqg_FMP_RequestPP15sqlqg_FMP_Reply + 0x394
0x0900000003C4A544 sqlqgDeleteServer__FPP14UnfencedServer16Sqlqg_DeleteType + 0x2BC
0x0900000002833D88 sqlqgFindServer__FPUc12Sqlqg_OpTypePP14UnfencedServerUi + 0x3C4
0x09000000041C378C sqlqgOpen__FP12sqlri_rquery + 0x9EC
0x090000000421E074 sqlri_djx_rta__FP8sqlrr_cb + 0x460
0x09000000030D4020 sqlriunn__FP8sqlrr_cbP10sqlri_stob + 0x24
0x09000000030D419C sqlriset__FP8sqlrr_cb + 0x78
0x09000000030CAF28 sqlriSectInvoke__FP8sqlrr_cbP12sqlri_opparm + 0x4
0x0900000003019240 sqlrr_process_fetch_request__FP14db2UCinterface + 0x110
0x0900000002E825FC sqlrr_open__FP14db2UCinterfaceP15db2UCCursorInfo + 0x2C
0x0900000002F72518 sqljs_ddm_opnqry__FP14db2UCinterfaceP14sqljsDDMObject + 0xA94

Prior to this crash, there were many errors in db2diag.log reported "No Memory Available", e.g.:

2010-12-07-13.42.40.631290+480 E1791023315A732 LEVEL: Error (OS)
PID : 1355956 TID : 1 PROC : db2agent (SAMPLE) 0
INSTANCE: db2inst1 NODE : 000 DB : SAMPLE
APPHDL : 0-525 APPID: GA990B88.A410.11C700223017
AUTHID : HUJINPEI
FUNCTION: DB2 UDB, oper system services, sqloAIXLoadModuleTryShr, probe:130
CALLED : OS, -, dlopen
OSERR : ENOMEM (12) "There is not enough memory available now."
MESSAGE : Attempt to load specified library failed.
DATA #1 : Library name or path, 41 bytes
/hujinpei/db2inst1/sqllib/lib64/libdb2qgstp.a
DATA #2 : shared library load flags, PD_TYPE_LOAD_FLAGS, 4 bytes

2010-12-07-13.42.40.649557+480 I1791025558A485 LEVEL: Error
PID : 1355956 TID : 1 PROC : db2agent (SAMPLE) 0
INSTANCE: db2inst1 NODE : 000 DB : SAMPLE
APPHDL : 0-525
AUTHID : HUJINPEI
FUNCTION: DB2 UDB, runtime interpreter, sqlriFedOneTimeInitRtn, probe:40
RETCODE : ZRC=0x8B0F0000=-1961951232=SQLO_NOMEM
"No Memory Available (reason code is id of requested heap)"
DIA8300C A memory heap error has occurred.

2010-12-07-13.43.37.708714+480 E1791083624A1533 LEVEL: Warning (OS)
PID : 1355956 TID : 1 PROC : db2agent (SAMPLE) 0
INSTANCE: db2inst1 NODE : 000 DB : SAMPLE
APPHDL : 0-1417 APPID: GA990B88.CE10.11C700223137 <app 1417 saw NOMEM>
AUTHID : HUJINPEI
FUNCTION: DB2 UDB, SQO Memory Management, sqloLogMemoryCondition, probe:100
CALLED : OS, -, malloc
OSERR : ENOMEM (12) "There is not enough memory available now."
MESSAGE : Private memory and/or virtual address space exhausted, or data ulimit exceeded
DATA #1 : Soft data resource limit, PD_TYPE_RLIM_DATA_CUR, 8 bytes
251657728
DATA #2 : Requested size, PD_TYPE_MEM_REQUESTED_SIZE, 8 bytes
266240
DATA #3 : Current set size, PD_TYPE_SET_SIZE, 8 bytes
215678976
CALLSTCK:
[0] 0x0900000002DE3340 sqloLogMemoryCondition + 0x26C
[1] 0x0900000002DE3018 sqloLogMemoryCondition@glue236 + 0x74
[2] 0x09000000031A1F3C sqlogmblkEx + 0xC
[3] 0x09000000041E9600 allocMgr__13sqlqg_memPoolFP19sqlqg_memEntityInfo + 0x7C
[4] 0x09000000041E8F68 getMgrs__13sqlqg_memPoolFP19sqlqg_memEntityInfoP17sqlqg_memMgrsInfo + 0x1A0
[5] 0x09000000041E7E54 initialize_server__14UnfencedServerFP11Server_Infoi + 0x260
[6] 0x09000000041E77BC initialize_server__25UnfencedRelational_ServerFP11Server_Infoi + 0x670
[7] 0x0900000003C4AB88 get_server__7WrapperFPUcP11Server_InfoPi + 0x14C
[8] 0x09000000028340D4 sqlqgFindServer__FPUc12Sqlqg_OpTypePP14UnfencedServerUi + 0x710
[9] 0x09000000041C378C sqlqgOpen__FP12sqlri_rquery + 0x9EC
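The last diag entry mentions a soft data resource limit and a possible "data ulimit exceeded" condition. As a rough illustration only, the sketch below shows how a process on a POSIX system can read its own data-segment limits with `getrlimit(RLIMIT_DATA, ...)`. The helper names `read_data_limits` and `print_limit` are hypothetical and are not part of DB2; whether the value DB2 logs as PD_TYPE_RLIM_DATA_CUR is obtained this way is an assumption.

```cpp
#include <sys/resource.h>  // getrlimit, RLIMIT_DATA, RLIM_INFINITY (POSIX)
#include <cassert>
#include <cstdio>

// Hypothetical helper (not a DB2 function): reads the calling process's
// soft and hard data-segment limits. Returns true on success; on failure
// the outputs are left untouched.
bool read_data_limits(rlim_t *soft, rlim_t *hard) {
    assert(soft != nullptr && hard != nullptr);
    struct rlimit rl;
    if (getrlimit(RLIMIT_DATA, &rl) != 0) {
        return false;
    }
    *soft = rl.rlim_cur;  // soft limit: the one enforced on allocations
    *hard = rl.rlim_max;  // hard limit: ceiling for raising the soft limit
    return true;
}

// Hypothetical helper: prints one limit, spelling out the "unlimited"
// sentinel instead of printing its raw numeric value.
void print_limit(const char *label, rlim_t value) {
    if (value == RLIM_INFINITY) {
        std::printf("%s: unlimited\n", label);
    } else {
        std::printf("%s: %llu bytes\n", label,
                    static_cast<unsigned long long>(value));
    }
}
```

Calling `read_data_limits` from a small `main` and passing the results to `print_limit` shows the limits in effect for that process, which can then be compared against the soft limit reported in db2diag.log.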
Problem Summary: | |
****************************************************************
* USERS AFFECTED:
* ALL
****************************************************************
* PROBLEM DESCRIPTION:
* Instance crashed while running a SQL statement with the
* following stack:
*
* 0x09000000030DFFE0 @124@9@ossLockTestGet__FPVi + 0x18
* 0x09000000030E00DC sqloXlatchAIX__FP12sqloSpinLockUlCPCcCUl@glueE1 + 0x94
* 0x090000000308567C sqlerTakeLibLatch__FP27SQLER_LOADED_LIB_HASH_TABLEiT2 + 0x58
* 0x09000000030857A8 sqlerLibraryLoad + 0x64
* 0x090000000418F420 sqlriFedOneTimeInitRtn__FP8sqlrr_cbP10sqlri_ufob + 0x308
* 0x090000000418EE0C sqlriFedInvokeInvoker__FP10sqlri_ufobP14sqlqg_Fmp_Info + 0xB4
* 0x090000000358C0CC sqlqg_Call_FMP_Thread__FP17sqlqg_FMP_RequestPP15sqlqg_FMP_Reply + 0x394
* 0x0900000003C4A544 sqlqgDeleteServer__FPP14UnfencedServer16Sqlqg_DeleteType + 0x2BC
* 0x0900000002833D88 sqlqgFindServer__FPUc12Sqlqg_OpTypePP14UnfencedServerUi + 0x3C4
* 0x09000000041C378C sqlqgOpen__FP12sqlri_rquery + 0x9EC
* 0x090000000421E074 sqlri_djx_rta__FP8sqlrr_cb + 0x460
* 0x09000000030D4020 sqlriunn__FP8sqlrr_cbP10sqlri_stob + 0x24
* 0x09000000030D419C sqlriset__FP8sqlrr_cb + 0x78
* 0x09000000030CAF28 sqlriSectInvoke__FP8sqlrr_cbP12sqlri_opparm + 0x4
* 0x0900000003019240 sqlrr_process_fetch_request__FP14db2UCinterface + 0x110
* 0x0900000002E825FC sqlrr_open__FP14db2UCinterfaceP15db2UCCursorInfo + 0x2C
* 0x0900000002F72518 sqljs_ddm_opnqry__FP14db2UCinterfaceP14sqljsDDMObject + 0xA94
*
* Prior to this crash, there were many errors in db2diag.log
* reported "No Memory Available", e.g.:
*
* 2010-12-07-13.42.40.631290+480 E1791023315A732 LEVEL: Error (OS)
* PID : 1355956 TID : 1 PROC : db2agent (BASS_DW) 0
* INSTANCE: dwinst NODE : 000 DB : BASS_DW
* APPHDL : 0-525 APPID: GA990B88.A410.11C700223017
* AUTHID : TYMX
* FUNCTION: DB2 UDB, oper system services, sqloAIXLoadModuleTryShr, probe:130
* CALLED : OS, -, dlopen
* OSERR : ENOMEM (12) "There is not enough memory available now."
* MESSAGE : Attempt to load specified library failed.
* DATA #1 : Library name or path, 41 bytes
* /dwhome/dwinst/sqllib/lib64/libdb2qgstp.a
* DATA #2 : shared library load flags, PD_TYPE_LOAD_FLAGS, 4 bytes
*
* 2010-12-07-13.42.40.649557+480 I1791025558A485 LEVEL: Error
* PID : 1355956 TID : 1 PROC : db2agent (BASS_DW) 0
* INSTANCE: dwinst NODE : 000 DB : BASS_DW
* APPHDL : 0-525
* AUTHID : TYMX
* FUNCTION: DB2 UDB, runtime interpreter, sqlriFedOneTimeInitRtn, probe:40
* RETCODE : ZRC=0x8B0F0000=-1961951232=SQLO_NOMEM
* "No Memory Available (reason code is id of requested heap)"
* DIA8300C A memory heap error has occurred.
*
* 2010-12-07-13.43.37.708714+480 E1791083624A1533 LEVEL: Warning (OS)
* PID : 1355956 TID : 1 PROC : db2agent (BASS_DW) 0
* INSTANCE: dwinst NODE : 000 DB : BASS_DW
* APPHDL : 0-1417 APPID: GA990B88.CE10.11C700223137 <app 1417 saw NOMEM>
* AUTHID : TYMX
* FUNCTION: DB2 UDB, SQO Memory Management, sqloLogMemoryCondition, probe:100
* CALLED : OS, -, malloc
* OSERR : ENOMEM (12) "There is not enough memory available now."
* MESSAGE : Private memory and/or virtual address space exhausted, or data ulimit exceeded
* DATA #1 : Soft data resource limit, PD_TYPE_RLIM_DATA_CUR, 8 bytes
* 251657728
* DATA #2 : Requested size, PD_TYPE_MEM_REQUESTED_SIZE, 8 bytes
* 266240
* DATA #3 : Current set size, PD_TYPE_SET_SIZE, 8 bytes
* 215678976
* CALLSTCK:
* [0] 0x0900000002DE3340 sqloLogMemoryCondition + 0x26C
* [1] 0x0900000002DE3018 sqloLogMemoryCondition@glue236 + 0x74
* [2] 0x09000000031A1F3C sqlogmblkEx + 0xC
* [3] 0x09000000041E9600 allocMgr__13sqlqg_memPoolFP19sqlqg_memEntityInfo + 0x7C
* [4] 0x09000000041E8F68 getMgrs__13sqlqg_memPoolFP19sqlqg_memEntityInfoP17sqlqg_memMgrsInfo + 0x1A0
* [5] 0x09000000041E7E54 initialize_server__14UnfencedServerFP11Server_Infoi + 0x260
* [6] 0x09000000041E77BC initialize_server__25UnfencedRelational_ServerFP11Server_Infoi + 0x670
* [7] 0x0900000003C4AB88 get_server__7WrapperFPUcP11Server_InfoPi + 0x14C
* [8] 0x09000000028340D4 sqlqgFindServer__FPUc12Sqlqg_OpTypePP14UnfencedServerUi + 0x710
* [9] 0x09000000041C378C sqlqgOpen__FP12sqlri_rquery + 0x9EC
****************************************************************
* RECOMMENDATION:
* Upgrade to v9.1 FP11
****************************************************************
Local Fix: | |
available fix packs: | |
DB2 Version 9.1 Fix Pack 11 for Linux, UNIX and Windows | |
Solution | |
In sqlriFedOneTimeInitRtn, add a check in the else block, as shown below, for the case where librariesTable is NULL. This can avoid the trap.

    if (djfmp_appCB->dj_handle == NULL)
    {
        ......
    } else {
        <Check librariesTable here and throw 901 if null>
        // Look up the datajoiner library for the invokercb (it's already
        // in the hash table)
        rc = sqlerLibraryLoad ("",   // don't care about the routinename
                               "",   // or specific name
                               libPath,
                               (char *)QUERY_GATEWAY_STP_LIBRARY,
                               "",   // don't care about the func name
                               l_sqlr_rcb->agent_private_cbp->librariesTable,
                               &invptr->pLoadedLib,
                               FALSE,
                               invptr,
                               NULL,
                               l_sqlr_rcb->sqlca);
        ......
    } //@ed242033tjv
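The idea of the fix, checking the table pointer before handing it to a routine that dereferences it, can be sketched in isolation. This is a minimal illustration under stated assumptions, not the DB2 code: `LibTable`, `AgentCb`, `table_lookup`, `find_loaded_library` and the return-code constants are all hypothetical stand-ins, and the value -901 merely echoes the "901" mentioned in the placeholder above.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical stand-ins; none of these are DB2 types.
struct LibTable {
    const char *names[4];  // names of already-loaded libraries
    std::size_t count;     // number of valid entries in names
};

struct AgentCb {
    LibTable *libraries_table;  // may be null if earlier setup failed
};

const int kOk = 0;
const int kNotFound = 1;
const int kInternalError = -901;  // assumption: echoes the "901" in the text

// Unguarded lookup: dereferences the table unconditionally, so a null
// table here would crash. Callers must rule that out first.
static int table_lookup(const LibTable *table, const char *name) {
    assert(table != NULL);
    for (std::size_t i = 0; i < table->count; ++i) {
        if (std::strcmp(table->names[i], name) == 0) {
            return kOk;
        }
    }
    return kNotFound;
}

// Guarded entry point: reports an error instead of touching a null table.
int find_loaded_library(const AgentCb *cb, const char *name) {
    if (cb == NULL || cb->libraries_table == NULL) {
        return kInternalError;
    }
    return table_lookup(cb->libraries_table, name);
}
```

With the guard in place, a control block whose table was never set up yields an error code the caller can report, rather than a trap inside the lookup.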
Workaround | |
not known / see Local fix | |
BUG-Tracking | |
forerunner : APAR is sysrouted TO one or more of the following: IC75480
follow-up : | |
Timestamps | |
Date - problem reported : 12.01.2011
Date - problem closed : 07.03.2011
Date - last modified : 26.04.2011
Problem solved at the following versions (IBM BugInfos) | |
9.0.1, 9.1.FP11 | |
Problem solved according to the fixlist(s) of the following version(s) | |
9.1.0.11 |