DB2 - Problem Description
Problem IC68616 | Status: Closed |
HEALTH MONITOR FAILS TO RESTART IF IT CRASHES | |
Product: | |
DB2 FOR LUW / DB2FORLUW / 910 - DB2 | |
Problem description: | |
The Health Monitor may crash after a backup utility run, after a REORG utility run, or after the database (DB2 instance) has been stopped and restarted. Once the Health Monitor has crashed, the message "Health monitor already in use." is reported in the db2diag.log. This may happen a few minutes, a day, or a week later. The Health Monitor crash can later cause the DB2 instance to crash as well.

Excerpt from db2diag.log:

2010-04-05-00.34.17.676000-300 I1H937 LEVEL: Event
PID : 7168 TID : 8348 PROC : db2syscs.exe
INSTANCE: DB2TOOLS NODE : 000
FUNCTION: DB2 UDB, RAS/PD component, pdLogInternal, probe:120
START : New Diagnostic Log file
DATA #1 : Build Level, 128 bytes
Instance "DB2TOOLS" uses "32" bits and DB2 code release "SQL09018"
with level identifier "02090107".
Informational tokens are "DB2 v9.1.800.1023", "s090823", "WR21437", Fix Pack "8".
DATA #2 : System Info, 1564 bytes
System: WIN32_NT SCOOBY Service Pack 2 5.2 x86 Family 15, model 2, stepping 5
CPU: total:4 online:4 Cores per socket:1 Threading degree per core:2
Physical Memory(MB): total:3072 free:1734 available:1734
Virtual Memory(MB): total:5895 free:3272
Swap Memory(MB): total:2823 free:1538
Information in this record is only valid at the time when this file was created
(see this record's time stamp)

2010-04-05-00.34.17.660000-300 E941H420 LEVEL: Warning
PID : 7168 TID : 8348 PROC : db2syscs.exe
INSTANCE: DB2TOOLS NODE : 000
FUNCTION: DB2 UDB, routine_infrastructure, sqlerReturnFmpToPool, probe:999
DATA #1 : String, 34 bytes
Removing FMP from pool FMP handle:
DATA #2 : sqlerFmpHandle, PD_SQLER_TYPE_FMP_HANDLE, 12 bytes
fmpPid: 2356
pFmpEntry: 0x01cc8eec

2010-04-05-00.34.17.754000-300 E1363H2098 LEVEL: Warning
PID : 7168 TID : 8348 PROC : db2syscs.exe
INSTANCE: DB2TOOLS NODE : 000
FUNCTION: DB2 UDB, routine_infrastructure, sqlerReturnFmpToPool, probe:1000
DATA #1 : String, 10 bytes
Fmp Entry:
DATA #2 : sqlerFmpThreadList, PD_SQLER_TYPE_FMP_THREAD_LIST, 364 bytes
fmpTid: 0
next Ptr: 0x00000000
prev Ptr: 0x00000000
agentCB Ptr: 0x00000000
fmpRow Ptr: 0x01cc8ee0
ipcResources Ptr: 0x01cc8e30
useCount: 0
nestLevel: 0
refreshThreadClass: 0
assocBroken: 0
threadFlags: 0x00000000
fmpComHandle: 0x00000000
sendBuffer: 0x00000000
recvBuffer: 0x00000000
bufferSize: 0
bytesSent: 0
bytesReceived: 0
sendLength: 0
bufferPosition: 0
fmpConditions:
0x01CC8EEC : 0000 0000 0000 0000 0000 0000 0000 0000 ................
0x01CC8EFC : 0000 0000 0000 0000 0000 0000 0000 0000 ................
0x01CC8F0C : 0000 0000 0000 0000 0000 0000 0000 0000 ................
0x01CC8F1C : 0000 0000 0000 0000 0000 0000 0000 0000 ................
0x01CC8F2C : 0000 0000 0000 0000 0000 0000 0000 0000 ................
0x01CC8F3C : 0000 0000 0000 0000 0000 0000 0000 0000 ................
0x01CC8F4C : 0000 0000 0000 0000 0000 0000 0000 0000 ................
0x01CC8F5C : 0000 0000 0000 0000 0000 0000 0000 0000 ................
0x01CC8F6C : 0000 0000 0000 0000 0000 0000 0000 0000 ................
0x01CC8F7C : 0000 0000 0000 0000 0000 0000 0000 0000 ................
0x01CC8F8C : 0000 0000 0000 0000 0000 0000 0000 0000 ................
0x01CC8F9C : 0000 0000 0000 0000 0000 0000 0000 0000 ................
0x01CC8FAC : 0000 0000 0000 0000 0000 0000 0000 0000 ................
0x01CC8FBC : 0000 0000 0000 0000 0000 0000 0000 0000 ................
0x01CC8FCC : 0000 0000 0000 0000 0000 0000 0000 0000 ................
0x01CC8FDC : 0000 0000 0000 0000 0000 0000 0000 0000 ................
0x01CC8FEC : 0000 0000 0000 0000 0000 0000 0000 0000 ................
0x01CC8FFC : 0000 0000 0000 0000 0000 0000 ............

2010-04-05-00.34.17.754000-300 E3463H2531 LEVEL: Warning
PID : 7168 TID : 8348 PROC : db2syscs.exe
INSTANCE: DB2TOOLS NODE : 000
FUNCTION: DB2 UDB, routine_infrastructure, sqlerReturnFmpToPool, probe:1001
DATA #1 : String, 8 bytes
Fmp Row:
DATA #2 : sqlerFmpRow, PD_SQLER_TYPE_FMP_ROW, 416 bytes
fmpPid: 2356
fmpPoolList Ptr: 0x00000000
fmpForcedList Ptr: 0x00000000
nextFmpCB Ptr: 0x00000000
prevFmpCB Ptr: 0x00000000
fmpIPCList Ptr: 0x01cc90a0
stateFlags: 0x00000028
numFmp32Attaches: 0
numActiveThreads: 0
numPoolThreads: 0
fmpCodePage: 0
fmpRowUseCount: 0
active: 0x01
rowLoaderValidate: 0x00
ipcLatch:
0x01CC8EE4 : 0000 8A01 ....
rowLatch:
0x01CC8EE8 : 0000 8B01 ....
fmpAgentList:
0x01CC8EEC : 0000 0000 0000 0000 0000 0000 0000 0000 ................
0x01CC8EFC : 0000 0000 0000 0000 0000 0000 0000 0000 ................
0x01CC8F0C : 0000 0000 0000 0000 0000 0000 0000 0000 ................
0x01CC8F1C : 0000 0000 0000 0000 0000 0000 0000 0000 ................
0x01CC8F2C : 0000 0000 0000 0000 0000 0000 0000 0000 ................
0x01CC8F3C : 0000 0000 0000 0000 0000 0000 0000 0000 ................
0x01CC8F4C : 0000 0000 0000 0000 0000 0000 0000 0000 ................
0x01CC8F5C : 0000 0000 0000 0000 0000 0000 0000 0000 ................
0x01CC8F6C : 0000 0000 0000 0000 0000 0000 0000 0000 ................
0x01CC8F7C : 0000 0000 0000 0000 0000 0000 0000 0000 ................
0x01CC8F8C : 0000 0000 0000 0000 0000 0000 0000 0000 ................
0x01CC8F9C : 0000 0000 0000 0000 0000 0000 0000 0000 ................
0x01CC8FAC : 0000 0000 0000 0000 0000 0000 0000 0000 ................
0x01CC8FBC : 0000 0000 0000 0000 0000 0000 0000 0000 ................
0x01CC8FCC : 0000 0000 0000 0000 0000 0000 0000 0000 ................
0x01CC8FDC : 0000 0000 0000 0000 0000 0000 0000 0000 ................
0x01CC8FEC : 0000 0000 0000 0000 0000 0000 0000 0000 ................
0x01CC8FFC : 0000 0000 0000 0000 0000 0000 0000 0000 ................
0x01CC900C : 0000 0000 0000 0000 0000 0000 0000 0000 ................
0x01CC901C : 0000 0000 0000 0000 0000 0000 0000 0000 ................
0x01CC902C : 0000 0000 0000 0000 0000 0000 E08E CC01 ................
0x01CC903C : 308E CC01 0000 0000 0000 0000 0000 0000 0...............
0x01CC904C : 0000 0000 0000 0000 0000 0000 ............

2010-04-05-00.34.17.801000-300 I5996H301 LEVEL: Warning
PID : 7168 TID : 8348 PROC : db2syscs.exe
INSTANCE: DB2TOOLS NODE : 000
FUNCTION: DB2 UDB, routine_infrastructure, sqlerQueryHmonExistence, probe:99
MESSAGE : Health Monitor Process crashed.

2010-04-05-00.34.27.801000-300 I6299H291 LEVEL: Severe
PID : 7168 TID : 8348 PROC : db2syscs.exe
INSTANCE: DB2TOOLS NODE : 000
FUNCTION: DB2 UDB, routine_infrastructure, sqlerGetHmonFmp, probe:30
MESSAGE : Health monitor already in use.

The "Health monitor already in use." message is then repeated many times.

A db2agent may eventually crash with a trap whose stack looks like this:

<StackTrace>
-------Frame------ ------Function + Offset------
0x09000000050219F8 sqlerHmonInitCommsLayer__FP11sqlerFmpRowP13sqle_agent_cb + 0x9C
0x0900000003C29B94 sqlerHmonSendRequest__FP21sqlerHmonFmpReqStructP13sqle_agent_cb + 0x150
0x0900000003C4C260 sqlmhEstimateCollSize__FP21sqlmhCollRequestParamPUi + 0x84
0x0900000003C4C00C hmonSnapAddCollSize__FP20sqlmhMonSnapSizeArgsPcUiPUi + 0xA4
0x0900000003C452F0 hmonSnapAproxNodeSz__FP20sqlmhMonSnapSizeArgsP9SNodeCB_tPUiUl + 0x43C
0x0900000003C40770 hmonSnapSzDB__FP20sqlmhMonSnapSizeArgsP14HmonMainStructPUcUiPUi + 0xF0
0x0900000003C3FE48 hmonSnapGetSz__FP20sqlmhMonSnapSizeArgs + 0x2F8
0x0900000003C3F4B0 sqlmhmonszagnt__FUi13sqm_entity_idP6sqlmaiPUiT1P5sqlca + 0x1EC
0x0900000002FB7490 sqlmonszbackend__FP10sqle_db2ra + 0x868
0x0900000002C827AC sqlesrvr__FP14db2UCinterface + 0x18C
0x0900000003929048 sqleMappingFnServer__FP5sqldaP5sqlca + 0x170
0x09000000039289F8 sqlerKnownProcedure__FiPcPiP5sqldaT4P13sqlerFmpTableP13sqle_agent_cbP5sqlca + 0x210
0x0900000003927D60 sqlerCallDL__FP14db2UCinterfaceP9UCstpInfo + 0x288
0x0900000003928130 sqljs_ddm_excsqlstt__FP14db2UCinterfaceP14sqljsDDMObject + 0xF8
0x090000000384E9BC sqljsParseRdbAccessed__FP13sqljsDrdaAsCbP14sqljsDDMObjectP14db2UCinterface - 0x48
0x090000000384E7BC sqljsParse__FP13sqljsDrdaAsCbP14db2UCinterface + 0x28C
0x090000000384D8BC @48@sqljsSqlam__FP14db2UCinterfaceP13sqle_agent_cbb + 0xD4
0x09000000036F46DC @48@sqljsDriveRequests__FP13sqle_agent_cbP11UCconHandle + 0xA4
0x09000000036F4528 @48@sqljsDrdaAsInnerDriver__FP17sqlcc_init_structb + 0xD0
0x09000000036F42C4 sqljsDrdaAsDriver__FP17sqlcc_init_struct + 0xA0
0x09000000036EBA44 sqleRunAgent__FPcUi + 0x36C
0x09000000037493A8 sqloCreateEDU__FPFPcUi_vPcUlP13SQLO_EDU_INFOPi + 0x270
0x09000000036E6004 sqloSpawnEDU + 0x234
0x09000000036E59D8 sqleCreateNewAgent__FiP8sqlekrcbP17sqlcc_init_structP16sqlkdRqstRplyFmtP18sqle_master_app_cbT1P20agentPoolLatchVectorP16SQLO_EDUWAITPOSTP17sqle_connect_infoPP13sqle_agent_cb + 0x28C
0x09000000036E3F50 sqleGetAgentFromPool__FiP17sqlcc_init_structT1P12sqlz_app_hdlP16sqlkdRqstRplyFmtP17sqle_connect_info + 0x82C
0x09000000036E335C sqleGetAgent__FiP17sqlcc_init_structT1P12sqlz_app_hdlP16sqlkdRqstRplyFmtT1 + 0x2E4
0x0900000003745C2C sqlccipcconnmgr_child__FPcUi + 0x340
0x09000000037493A8 sqloCreateEDU__FPFPcUi_vPcUlP13SQLO_EDU_INFOPi + 0x270
0x09000000036E6004 sqloSpawnEDU + 0x234
0x090000000392B1EC sqlccipcconnmgr__FP15SQLCC_CONNMGR_TP9sqlf_kcfd + 0x168
0x0000000100004AB4 sqleInitSysCtlr__FPi + 0xD30
0x00000001000031D4 sqleSysCtlr__Fv + 0x32C
0x0900000003747058 @49@sqloSystemControllerMain__FCUiPFv_iPFi_vPPvCPi + 0x580
0x090000000375101C sqloRunInstance + 0xA8
0x00000001000029F4 DB2main + 0x92C
0x0000000100003D50 main + 0x10
</StackTrace> | |
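To check whether an instance has hit this pattern, the db2diag.log can be scanned for the two messages quoted in the excerpt. The following is a minimal sketch, not an IBM tool: the function name `summarize_hmon_messages` is hypothetical, and it assumes that each log record starts with a timestamp in the format seen in the excerpt.

```python
import re

# Message texts as they appear in the db2diag.log excerpt of this APAR.
# Compared case-insensitively because the capitalization of "Monitor" varies.
CRASH_MSG = "health monitor process crashed"
IN_USE_MSG = "health monitor already in use"

# Assumption: records start with a timestamp like 2010-04-05-00.34.17.801000-300
TIMESTAMP_RE = re.compile(
    r"^\d{4}-\d{2}-\d{2}-\d{2}\.\d{2}\.\d{2}\.\d{6}[+-]\d{3}"
)


def summarize_hmon_messages(log_text):
    """Count Health Monitor messages and remember when the first crash was logged."""
    crashed = 0
    already_in_use = 0
    first_crash = None
    current_timestamp = None
    for line in log_text.splitlines():
        match = TIMESTAMP_RE.match(line.strip())
        if match:
            # Remember the timestamp of the record we are currently inside.
            current_timestamp = match.group(0)
        lowered = line.lower()
        if CRASH_MSG in lowered:
            crashed += 1
            if first_crash is None:
                first_crash = current_timestamp
        if IN_USE_MSG in lowered:
            already_in_use += 1
    return {
        "crashed": crashed,
        "already_in_use": already_in_use,
        "first_crash": first_crash,
    }
```

A crash count above zero followed by a growing "already in use" count would match the Windows symptom described here. Reading the file (for example with `open(path, errors="replace").read()`) is left to the caller, because the location of db2diag.log depends on the instance configuration.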
Problem summary: | |
****************************************************************
* USERS AFFECTED:
* All platforms supported by DB2.
****************************************************************
* PROBLEM DESCRIPTION:
* The Health Monitor process never comes up again once it has
* crashed.
*
* On Windows:
*
* The db2diag.log will have entries such as:
*
* 2010-04-05-00.34.17.801000-300 I5996H301 LEVEL: Warning
* PID : 7168 TID : 8348 PROC : db2syscs.exe
* INSTANCE: DB2TOOLS NODE : 000
* FUNCTION: DB2 UDB, routine_infrastructure,
* sqlerQueryHmonExistence, probe:99
* MESSAGE : Health Monitor Process crashed.
*
* 2010-04-05-00.34.27.801000-300 I6299H291 LEVEL: Severe
* PID : 7168 TID : 8348 PROC : db2syscs.exe
* INSTANCE: DB2TOOLS NODE : 000
* FUNCTION: DB2 UDB, routine_infrastructure,
* sqlerGetHmonFmp, probe:30
* MESSAGE : Health monitor already in use.
*
* The "Health monitor already in use." message is then repeated
* many times.
*
* On Unix platforms:
*
* After the Health Monitor crashes, the db2diag.log shows:
*
* 2011-06-14-05.39.35.143026-240 E10434E400 LEVEL: Error
* PID : 28613 TID : 47879456549184 PROC : db2wdog
* INSTANCE: ginaulak NODE : 000
* EDUID : 2 EDUNAME: db2wdog
* FUNCTION: DB2 UDB, base sys utilities,
* sqleChildCrashHandler, probe:15
* DATA #1 : <preformatted>
* Health Monitor Process crashed. Process id: 28690, OSS term
* code: 0x102, signal: 9
*
* The "Health Monitor already in use" message may not appear in
* the db2diag.log.
*
* The ps command no longer shows the db2acd process, even though
* the process is expected to restart itself 15 minutes after
* crashing.
****************************************************************
* RECOMMENDATION:
* Upgrade to DB2 Version 9.1 Fix Pack 10 to fix the problem.
**************************************************************** | |
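On Unix platforms, the symptom above can be checked by looking for the db2acd process in the output of `ps`. The sketch below is an illustration only. The helper names `db2acd_running` and `current_ps_output` are hypothetical, and it assumes that the process appears in `ps -ef` output with `db2acd` as a separate token, optionally with a path prefix.

```python
import subprocess


def db2acd_running(ps_output):
    """Return True if any line of ps output contains a db2acd command token."""
    for line in ps_output.splitlines():
        for token in line.split():
            # Compare only the basename so "/path/to/db2acd" also matches.
            if token.rsplit("/", 1)[-1] == "db2acd":
                return True
    return False


def current_ps_output():
    """Capture the process list; assumes a Unix system that provides 'ps -ef'."""
    completed = subprocess.run(
        ["ps", "-ef"], capture_output=True, text=True, check=True
    )
    return completed.stdout
```

Calling `db2acd_running(current_ps_output())` more than 15 minutes after the crash record was logged and getting `False` would be consistent with this APAR. The token match is deliberately simple and could also match unrelated commands that take `db2acd` as an argument.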
Local-Fix: | |
The only way to restart the Health Monitor is to restart the DB2 instance. | |
Available fix packs: | |
DB2 Version 9.1 Fix Pack 10 for Linux, UNIX and Windows | |
Solution | |
This has been fixed in DB2 Version 9.1 Fix Pack 10. With this fix, the Health Monitor is expected to restart itself 15 minutes after crashing. | |
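Whether an instance already runs at the fixed level can be read from the "Informational tokens" line, which appears in the build-level record of the db2diag.log excerpt in this APAR and is also printed by the `db2level` command. The following sketch parses that line. The function names and the constant are hypothetical, and the check is only meaningful for Version 9.1 installations.

```python
import re

# Per this APAR, the fix is delivered in DB2 Version 9.1 Fix Pack 10.
FIXED_IN_FIX_PACK = 10


def parse_build_level(text):
    """Extract (version string, fix pack number) from an informational tokens line."""
    version = re.search(r'"DB2 v([\d.]+)"', text)
    fix_pack = re.search(r'Fix Pack\s+"(\d+)', text)
    if version is None or fix_pack is None:
        return None
    return version.group(1), int(fix_pack.group(1))


def has_hmon_restart_fix(text):
    """Return True/False for Version 9.1 input, or None if it cannot be decided."""
    parsed = parse_build_level(text)
    if parsed is None:
        return None
    version, fix_pack = parsed
    if not version.startswith("9.1."):
        # The fix pack number in this APAR only applies to Version 9.1.
        return None
    return fix_pack >= FIXED_IN_FIX_PACK
```

For the build level quoted in the excerpt (`"DB2 v9.1.800.1023"`, Fix Pack `"8"`), the function returns `False`, which matches an instance that is still exposed to the problem.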
Workaround | |
The only way to restart the Health Monitor is to restart the DB2 instance. | |
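The workaround amounts to stopping and starting the instance with the standard `db2stop` and `db2start` commands. The sketch below wraps them. The wrapper names are hypothetical, and it assumes it is run as the instance owner in a shell where the DB2 environment is already set up (on Windows, a DB2 command window). Restarting the instance disconnects all applications, so this would normally be scheduled.

```python
import subprocess


def restart_commands(force=False):
    """Build the command sequence for an instance restart."""
    # 'db2stop force' also stops the instance when applications are connected.
    stop = ["db2stop", "force"] if force else ["db2stop"]
    return [stop, ["db2start"]]


def restart_instance(force=False, run=subprocess.run):
    """Run db2stop followed by db2start, raising if either command fails."""
    for command in restart_commands(force):
        # check=True makes subprocess.run raise CalledProcessError on a nonzero exit.
        run(command, check=True)
```

The `run` parameter exists only so the sequence can be exercised without a DB2 installation. In normal use, `restart_instance()` or `restart_instance(force=True)` is enough.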
Additional data | |
Date problem reported: | 2010-05-12 |
Date problem closed: | 2011-06-16 |
Date of last change: | 2011-06-16 |
Problem fixed as of the following versions (IBM BugInfos) | |
9.1.FP10 | |
Problem fixed according to the FixList in version | |
9.1.0.10 |