DB2 - Problem description
Problem IC95496 | Status: Closed
DB2 MAY TRAP IN sqleCleanupFedDaemonAnchor DURING MULTIPLE LOGICAL NODES STARTUP
product:
DB2 FOR LUW / DB2FORLUW / A50 - DB2
Problem description:
This timing-related trap happens only on an MLN (multiple logical nodes) DPF setup where the nodes start up at different times, usually when one node starts much later than the rest.

The node where the trap occurs is still starting up and has not yet initialized the federated fmp daemon anchor. Meanwhile, another MLN has completed startup, including starting up global event monitors. An event monitor can dispatch a request to the trapping MLN, where a subagent is deployed to service the request. This subagent can be interrupted at any time and drive an application termination. Part of the application termination is cleanup of the federated daemon anchor; if the anchor has not been initialized yet, this cleanup can lead to the trap.

The stack trace of the crashed subagent looks like the following:

sqleCleanupFedDaemonAnchor__Fv + 0xD0
sqleFedFMPManagerFv + 0x150
TermApplication__14sqeApplicationFv + 0x5C
DestroyAppl__14sqeAppServicesFP14sqeApplication + 0x49C
sqleProcessSubRequest__FP8sqeAgent + 0x298
RunEDU__8sqeAgentFv + 0x308
EDUDriver__9sqzEDUObjFv + 0x78
sqloEDUEntry + 0x4

In the db2diag.log, the trapping node (i.e. node 4) initiates db2start much later than node 0, where the subagent request was dispatched.

2013-07-11-02.01.54.471940+480 E217431820A314 LEVEL: Event
PID : 234888 TID : 1 PROC : db2star2
INSTANCE: db2inst2 NODE : 000
EDUID : 1
FUNCTION: DB2 UDB, base sys utilities, DB2StartMain, probe:911
MESSAGE : ADM7513W Database manager has started.
START : DB2 DBM

...

2013-07-11-02.02.12.276740+480 E217723221A314 LEVEL: Event
PID : 135176 TID : 1 PROC : db2star2
INSTANCE: db2inst2 NODE : 004
EDUID : 1
FUNCTION: DB2 UDB, base sys utilities, DB2StartMain, probe:911
MESSAGE : ADM7513W Database manager has started.
START : DB2 DBM

2013-07-11-02.02.12.329472+480 I217724005A531 LEVEL: Error
PID : 415072 TID : 2057 PROC : db2sysc 4
INSTANCE: db2inst2 NODE : 004
APPHDL : 0-69 APPID: *N0.DB2.130710180215
AUTHID : DB2INST2
EDUID : 2057 EDUNAME: db2agent (idle) 4
FUNCTION: DB2 UDB, base sys utilities, sqleagnt_sigsegvh, probe:1
MESSAGE : Error in agent servicing application with coor_node:
DATA #1 : Hexdump, 2 bytes
0x070000000E009330 : 0000
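To check a db2diag.log for the staggered startup described above, the ADM7513W records can be compared per node. The following is a minimal Python sketch, not an IBM tool. The function names (split_records, start_times, startup_lag) and the SAMPLE text are illustrative assumptions, and the parsing assumes each record begins with a timestamp header line followed by its field lines, as in the excerpts above.

```python
import re
from datetime import datetime

# Header line of a record: timestamp, UTC offset in minutes, record id, level.
HEADER = re.compile(
    r"^(\d{4}-\d{2}-\d{2}-\d{2}\.\d{2}\.\d{2}\.\d{6})[+-]\d+\s+\S+\s+LEVEL:\s*(\w+)"
)
NODE = re.compile(r"NODE\s*:\s*(\d+)")

# Hypothetical sample built from the two startup records quoted in this report.
SAMPLE = """\
2013-07-11-02.01.54.471940+480 E217431820A314 LEVEL: Event
PID : 234888 TID : 1 PROC : db2star2
INSTANCE: db2inst2 NODE : 000
EDUID : 1
FUNCTION: DB2 UDB, base sys utilities, DB2StartMain, probe:911
MESSAGE : ADM7513W Database manager has started.
START : DB2 DBM

2013-07-11-02.02.12.276740+480 E217723221A314 LEVEL: Event
PID : 135176 TID : 1 PROC : db2star2
INSTANCE: db2inst2 NODE : 004
EDUID : 1
FUNCTION: DB2 UDB, base sys utilities, DB2StartMain, probe:911
MESSAGE : ADM7513W Database manager has started.
START : DB2 DBM
"""


def split_records(text):
    """Group lines into records; a new record begins at each header line."""
    records, current = [], None
    for line in text.splitlines():
        if HEADER.match(line):
            current = [line]
            records.append(current)
        elif current is not None and line.strip():
            current.append(line)
    return ["\n".join(lines) for lines in records]


def start_times(text):
    """Map node number to the local time of its ADM7513W startup event."""
    started = {}
    for record in split_records(text):
        if "ADM7513W" not in record:
            continue
        node = NODE.search(record)
        if node is None:
            continue
        stamp = HEADER.match(record).group(1)
        started[int(node.group(1))] = datetime.strptime(
            stamp, "%Y-%m-%d-%H.%M.%S.%f"
        )
    return started


def startup_lag(text):
    """Seconds by which each node's startup trails the earliest node."""
    started = start_times(text)
    if not started:
        return {}
    earliest = min(started.values())
    return {
        node: (when - earliest).total_seconds()
        for node, when in sorted(started.items())
    }


if __name__ == "__main__":
    for node, lag in startup_lag(SAMPLE).items():
        print(f"node {node}: started {lag:.1f} s after the first node")
```

For the records quoted in this report, node 4 reports "Database manager has started" roughly 17.8 seconds after node 0, and the sqleagnt_sigsegvh error on node 4 follows within the same second. The sketch compares timestamps as written and ignores the UTC offset, which is adequate only when all nodes log in the same time zone.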
Problem Summary:
USERS AFFECTED: All platforms
PROBLEM DESCRIPTION: See the problem description above.
RECOMMENDATION: Upgrade to DB2 Version 10.5 fix pack 7
Local Fix:
available fix packs:
DB2 Cancun Release 10.5.0.4 (also known as Fix Pack 4) for Linux, UNIX, and Windows
Solution
The fix will be included in DB2 Version 10.5 fix pack 7.
Workaround
not known / see Local fix
Timestamps
Date - problem reported : 27.08.2013
Date - problem closed : 19.01.2016
Date - last modified : 27.04.2016
Problem solved at the following versions (IBM BugInfos)
Problem solved according to the fixlist(s) of the following version(s)
10.5.0.4
10.5.0.7