DB2 - Problem description
Problem IC84086 | Status: Closed |
DB2 MAY HANG DUE TO LATCH ON SQLO_LT_SQLE_FEDFMP_APP_CB__FMPAPPLATCH AFTER -1131 AND -1042 ERRORS. | |
product: | |
DB2 FOR LUW / DB2FORLUW / 970 - DB2 | |
Problem description: | |
DB2 might experience a hang in which an agent is holding the SQLO_LT_SQLE_FEDFMP_APP_CB__FMPAPPLATCH latch. Other agents waiting on this latch may themselves hold latches that are crucial for database functioning (for example SQLO_LT_sqeLocalDatabase__dblatch or SQLO_LT_SQLP_TENTRY__tranEntryLatch), which halts the whole system.

The db2diag.log might show errors like:

sqleFedFMPManager::disconnect, probe:30
RETCODE : ZRC=0xFFFFFB95=-1131

and

sqleFedFMPManager::returnFmpToPool, probe:30
RETCODE : ZRC=0xFFFFFBEE=-1042

The following is an example of stacks of holder and waiter when this problem happens:

EDU name : db2agntdp (DW ) 0
EDU ID : 348
<LatchInformation>
Holding Latch type: (SQLO_LT_SQLE_FEDFMP_APP_CB__fmpAppLatch) - Address: (0x200dfc4a8), Line: 481, File: sqle_fed_fmp.C HoldCount: 2
</LatchInformation>
0x00002AAAABF43973 sqloWaitEDUWaitPost + 0x0195 (/db2home/db2inst1/sqllib/lib64/libdb2e.so.1)
0x00002AAAABB58298 _ZN8sqeAgent14WaitAgentEventEPji + 0x0012 (/db2home/db2inst1/sqllib/lib64/libdb2e.so.1)
0x00002AAAABB7576B _ZN16sqeAgentServices14GetNextRequestEPjP8sqeAgent + 0x034d (/db2home/db2inst1/sqllib/lib64/libdb2e.so.1)
0x00002AAAABB556DD _ZN8sqeAgent6RunEDUEv + 0x0443 (/db2home/db2inst1/sqllib/lib64/libdb2e.so.1)
0x00002AAAAC2249C0 _ZN9sqzEDUObj9EDUDriverEv + 0x00a6 (/db2home/db2inst1/sqllib/lib64/libdb2e.so.1)
0x00002AAAAC224917 _Z10sqlzRunEDUPcj + 0x0009

EDU name : db2agent (DW) 0
EDU ID : 313
<LatchInformation>
Waiting on latch type: (SQLO_LT_SQLE_FEDFMP_APP_CB__fmpAppLatch) - Address: (0x200dfc4a8), Line: 481, File: sqle_fed_fmp.C
Holding Latch type: (SQLO_LT_SQLP_TENTRY__tranEntryLatch) - Address: (0x2aaad4138268), Line: 748, File: /view/db2_v97fp4_linuxamd64_s110330/vbs/engn/include/sqlpi_inlines.h HoldCount: 1
Holding Latch type: (SQLO_LT_SQLP_SAVEPOINTS__spLatch) - Address: (0x2aaad4137d40), Line: 851, File: /view/db2_v97fp4_linuxamd64_s110330/vbs/engn/include/sqlpi_inlines.h HoldCount: 1
</LatchInformation>
0x00002AAAAC8F27E5 _ZN17sqleFedFMPManager14getFmpAppLatchEP18sqle_FedFMP_app_cbj + 0x0063 (/db2home/db2inst1/sqllib/lib64/libdb2e.so.1)
0x00002AAAAC8F24DB _ZN17sqleFedFMPManager14validateFedFmpEP18sqle_FedFMP_app_cbbP14sqlqg_Fmp_InfoP13sqlerFmpParmsP14sqlerFmpHandlePi + 0x006b (/db2home/db2inst1/sqllib/lib64/libdb2e.so.1)
0x00002AAAAC8F2027 _ZN17sqleFedFMPManager14getFmpFromPoolEP13sqlerFmpParmsP14sqlqg_Fmp_InfoP14sqlerFmpHandle + 0x0081 (/db2home/db2inst1/sqllib/lib64/libdb2e.so.1)

A trace when the issue occurs might show the following:

17170 | | | | sqlerFedInvokeFencedRoutine entry [eduid 7457 eduname db2agent]
17172 | | | | | sqleFedFMPManager::getFmpFromPool entry [eduid 7457 eduname db2agent]
17174 | | | | | | sqleFedFMPManager::validateFedFmp entry [eduid 7457 eduname db2agent]
17176 | | | | | | | sqleFedFMPManager::getFmpAppLatch entry [eduid 7457 eduname db2agent]
17179 | | | | | | | sqleFedFMPManager::getFmpAppLatch exit
17180 | | | | | | sqleFedFMPManager::validateFedFmp data [probe 1]
17183 | | | | | | | sqleFedFMPManager::addLogRecord entry [eduid 7457 eduname db2agent]
17185 | | | | | | | sqleFedFMPManager::addLogRecord exit
17187 | | | | | | sqleFedFMPManager::validateFedFmp exit [rc = 1]
17189 | | | | | sqleFedFMPManager::getFmpFromPool exit
17191 | | | | | sqlerSendFmpStart entry [eduid 7457 eduname db2agent]
17227 | | | | | | sqlerRtnWriteFencedArgData entry [eduid 7457 eduname db2agent]
17232 | | | | | | sqlerRtnWriteFencedArgData exit
17238 | | | | | sqlerSendFmpStart data [probe 42]
17243 | | | | | sqlerSendFmpStart exit
17247 | | | | | sqeAgent::AgentBreathingPoint entry [eduid 7457 eduname db2agent]
17249 | | | | | | sqeAgent::QueryInterrupt entry [eduid 7457 eduname db2agent]
17253 | | | | | | sqeAgent::QueryInterrupt data [probe 70]
17255 | | | | | | sqeAgent::QueryInterrupt exit
17257 | | | | | sqeAgent::AgentBreathingPoint exit [rc = 1]
17259 | | | | | sqlerInterruptFmp entry [eduid 7457 eduname db2agent]
17261 | | | | | | sqlerInterruptThreadedFmp entry [eduid 7457 eduname db2agent]
17264 | | | | | | | sqlerMasterThreadReq entry [eduid 7457 eduname db2agent]
17268 | | | | | | | sqlerMasterThreadReq exit [rc = 0xFFFFFBEE = -1042]
17270 | | | | | | sqlerInterruptThreadedFmp error [probe 10] [ ZRC = 0xFFFFFBEE = -1042]
17272 | | | | | | sqlerInterruptThreadedFmp exit [rc = 0xFFFFFBEE = -1042]
17274 | | | | | sqlerInterruptFmp exit [rc = 0xFFFFFBEE = -1042]
17276 | | | | sqlerFedInvokeFencedRoutine error [probe 70]
17281 | | | | sqlerFedInvokeFencedRoutine exit [rc = 0xFFFFFBEE = -1042]
17284 | | | sqlriFedInvokeInvoker exit [rc = 0x8012006D = -2146303891 = SQLR_CA_BUILT]

Note that this might look similar to APAR IC83795, but this problem can still occur with the fix for IC83795 applied.

This APAR only applies to Federated environments.
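The holder/waiter relationship in the latch dump above, and the hex-to-decimal reading of the ZRC values, can be checked mechanically. The following is a minimal sketch, not an IBM tool: the function names (zrc_to_signed, latch_roles, contended) are hypothetical, and the regular expression assumes the dump text looks like the excerpt in this APAR ("EDU ID : n" followed by "Holding Latch type" / "Waiting on latch type" entries).

```python
import re


def zrc_to_signed(hex_text: str) -> int:
    """Read a 32-bit hex code such as '0xFFFFFB95' as a signed integer."""
    value = int(hex_text, 16) & 0xFFFFFFFF
    # Two's complement: values with the top bit set are negative.
    return value - 0x100000000 if value & 0x80000000 else value


# One pattern for both token kinds, so the scan works whether the dump is
# on separate lines or flattened into a single line (assumed layout).
TOKEN = re.compile(
    r"EDU ID\s*:\s*(?P<edu>\d+)"
    r"|(?P<kind>Holding Latch|Waiting on latch) type:\s*\((?P<name>\w+)\)"
    r"\s*-\s*Address:\s*\((?P<addr>0x[0-9A-Fa-f]+)\)"
)


def latch_roles(dump: str) -> dict:
    """Map each latch address to the EDU IDs that hold it and wait on it."""
    roles = {}
    current_edu = None
    for match in TOKEN.finditer(dump):
        if match.group("edu") is not None:
            # Latch entries that follow belong to this EDU.
            current_edu = int(match.group("edu"))
            continue
        entry = roles.setdefault(
            match.group("addr"),
            {"name": match.group("name"), "holders": [], "waiters": []},
        )
        side = "holders" if match.group("kind") == "Holding Latch" else "waiters"
        entry[side].append(current_edu)
    return roles


def contended(roles: dict) -> dict:
    """Keep only latches that have at least one holder and one waiter."""
    return {a: r for a, r in roles.items() if r["holders"] and r["waiters"]}
```

Applied to the excerpt above, zrc_to_signed("0xFFFFFB95") gives -1131 and zrc_to_signed("0xFFFFFBEE") gives -1042, and contended(latch_roles(text)) reports address 0x200dfc4a8 as held by EDU 348 and waited on by EDU 313.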
Problem Summary: | |
****************************************************************
* USERS AFFECTED:                                              *
* ALL                                                          *
****************************************************************
* PROBLEM DESCRIPTION:                                         *
* See Error Description                                        *
****************************************************************
* RECOMMENDATION:                                              *
* Upgrade to DB2 Version 9.7 and Fix Pack 7                    *
****************************************************************
Local Fix: | |
available fix packs: | |
DB2 Version 9.7 Fix Pack 7 for Linux, UNIX, and Windows | |
Solution | |
Problem was first fixed in DB2 Version 9.7 and Fix Pack 7 | |
Workaround | |
not known / see Local fix | |
Timestamps | |
Date - problem reported : 12.06.2012
Date - problem closed : 20.10.2012
Date - last modified : 20.10.2012
Problem solved at the following versions (IBM BugInfos) | |
9.7. | |
Problem solved according to the fixlist(s) of the following version(s) | |
9.7.0.7 |
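
To compare an installed level against the fix level listed above, a dotted-version comparison is enough. This is a minimal sketch under stated assumptions: the level is supplied by the caller as a dotted string such as "9.7.0.4", the names FIXED_LEVEL, parse_level and has_fix are hypothetical, and levels outside the 9.7 stream return None because this APAR text does not say where the fix ships in other releases.

```python
FIXED_LEVEL = (9, 7, 0, 7)  # from the fix list above


def parse_level(text: str) -> tuple:
    """Turn a dotted level string such as '9.7.0.4' into a tuple of ints."""
    return tuple(int(part) for part in text.strip().split("."))


def has_fix(level_text: str):
    """Return True/False for 9.7 levels, None for any other release stream."""
    level = parse_level(level_text)
    if level[:2] != FIXED_LEVEL[:2]:
        return None  # not covered by this APAR text
    # Pad short inputs such as '9.7' with zeros before comparing.
    padded = level + (0,) * (4 - len(level))
    return padded >= FIXED_LEVEL
```

For example, has_fix("9.7.0.4") is False (the level seen in the build paths of the latch dump), while has_fix("9.7.0.7") is True.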