DB2 - Problem description
Problem IC64024 | Status: Closed
HADR: CRASH ON OLD STANDBY AFTER STOPPING AND HADR TAKE OVER BY FORCE ZRC=0X870F0009=-2029060087=SQLO_EOF
Product: DB2 FOR LUW / DB2FORLUW / 970 - DB2
Problem description:
If the HADR Primary is stopped, the current log is archived and closed. This information is sent to the Standby. If the Standby is also stopped and HADR takeover by force is then run on it, the log that should have been closed may continue to be used on this server instead of a new log being started. A rollback in this situation leads to a crash with the following stack:

<StackTrace>
-------Frame------ ------Function + Offset------
0x09000000029C3E4C sqloDumpEDU + 0x14
0x0900000002DAFD5C sqlzerdm__FP20sqle_agent_privatecbP5sqlcaPci + 0x5E0
0x0900000002DACC88 sqlrerdm__FUii + 0xFC
0x0900000002CEAFC8 sqlrrbck_dps__FP8sqlrr_cbiN22P22sqlxa_call_info_structP9SQLP_GXID + 0x3C8
0x0900000002CE9100 sqlrrbck__FP8sqlrr_cbiN22P22sqlxa_call_info_struct + 0x394
0x0900000002D564E0 sqlrr_rollback__FP7UCintfc + 0xEC
0x0900000002A24E04 sqljs_ddm_rdbrllbck__FP7UCintfcP14sqljsDDMObject + 0x170
0x0900000002A28778 sqljsParseRdbAccessed__FP13sqljsDrdaAsCbP14sqljsDDMObjectP7UCintfc + 0xF8
0x0900000002A283BC sqljsParse__FP13sqljsDrdaAsCbP7UCintfc + 0x2B0
0x0900000002A2D5CC sqljsSqlam__FP7UCintfcP13sqle_agent_cbb + 0x260
0x0900000002A2DCA4 sqljsDriveRequests__FP13sqle_agent_cbP11UCconHandle + 0xA0
0x0900000002A2DB04 sqljsDrdaAsInnerDriver__FP17sqlcc_init_structb + 0xD4
0x0900000002A2D838 sqljsDrdaAsDriver__FP17sqlcc_init_struct + 0xA8
0x0900000002BC7968 sqleRunAgent__FPcUi + 0x2B8
0x09000000029C0BD8 sqloCreateEDU__FPFPcUi_vPcUlP13SQLO_EDU_INFOPi + 0x250
0x09000000029C0614 sqloSpawnEDU + 0x268
0x0900000002BC6490 sqleCreateNewAgent__FiP8sqlekrcbP17sqlcc_init_structP16sqlkdRqstRplyFmtP18sqle_master_app_cbT1P20agentPoolLatchVectorP16SQLO_EDUWAITPOSTP17sqle_connect_infoPP13sqle_agent_cb + 0x318
0x0900000002BC5D24 sqleGetAgentFromPool__FiP17sqlcc_init_structT1P12sqlz_app_hdlP16sqlkdRqstRplyFmtP17sqle_connect_info + 0x41C
0x0900000002BC57F0 sqleGetAgent__FiP17sqlcc_init_structT1P12sqlz_app_hdlPv + 0x22C
0x0900000004126D18 sqlcctcpconnmgr_child__FPcUi + 0xEE4
0x09000000029C0BD8 sqloCreateEDU__FPFPcUi_vPcUlP13SQLO_EDU_INFOPi + 0x250
0x09000000029C0614 sqloSpawnEDU + 0x268
0x090000000412539C sqlcctcpconnmgr__FP15SQLCC_CONNMGR_TP9sqlf_kcfd + 0x774
0x0900000004124970 sqlcctcp_start_listen + 0x48
0x09000000028A2F4C sqlccstrts__FP9sqlf_kcfdP15SQLCC_CONNMGR_TRUs + 0x190
0x0000000100004758 sqleInitSysCtlr__FPi + 0xEA4
0x0000000100002EDC sqleSysCtlr__Fv + 0x58
0x090000000303F1B8 sqloRunInstance + 0x7D8
0x0000000100002A60 DB2main + 0x8BC
0x0000000100003880 main + 0x10
</StackTrace>

Messages like the following appear in the db2diag.log:

2008-08-24-02.40.30.159924+000 I3612884A776 LEVEL: Error
PID : 647248 TID : 1 PROC : db2loggr (EBI) 0
INSTANCE: db2ebi NODE : 000 DB : EBI
FUNCTION: DB2 UDB, oper system services, sqloReadBlocks, probe:200
MESSAGE : ZRC=0x870F0009=-2029060087=SQLO_EOF "the data does not exist"
          DIA8506C Unexpected end of file was reached.
DATA #1 : File handle, PD_TYPE_SQO_FILE_HDL, 8 bytes
0x0FFFFFFFFFFFD580 : 0000 0003 0000 0000 ........
DATA #2 : unsigned integer, 4 bytes
89536
DATA #3 : unsigned integer, 8 bytes
4294877763
DATA #4 : unsigned integer, 4 bytes
12
DATA #5 : File Offset, 8 bytes
17591819317248
DATA #6 : File Offset, 8 bytes
366739456
DATA #7 : unsigned integer, 8 bytes
0

2008-08-24-02.40.30.189671+000 I3613661A1101 LEVEL: Error
PID : 647248 TID : 1 PROC : db2loggr (EBI) 0
INSTANCE: db2ebi NODE : 000 DB : EBI
FUNCTION: DB2 UDB, data protection, sqlpgarl, probe:2440
MESSAGE : Bp 78000002401e000 blkOffSet 89536 ReadCount 4294877763 blksRead 0 FCBp:
DATA #1 : Hexdump, 144 bytes
0x07800011A1F07E60 : 0000 1000 0000 0000 0000 0000 0000 0001 ................
0x07800011A1F07E70 : 0000 0003 0000 0000 FFFF FFFF 0000 0000 ................
0x07800011A1F07E80 : 0000 0000 0000 0054 FFFF FFFF FFFF FFFF .......T........
0x07800011A1F07E90 : 0000 0050 0000 0000 0006 0007 0000 2001 ...P.......... .
0x07800011A1F07EA0 : 4942 4D4C 4F47 6159 9FC7 D000 0003 4DE7 IBMLOGaY......M.
0x07800011A1F07EB0 : 0004 0000 0000 0001 48B0 1DDD 3F3D 279F ........H...?='.
0x07800011A1F07EC0 : 48B0 2443 0000 0000 0000 0000 0000 0000 H.$C............
0x07800011A1F07ED0 : 0000 0000 0000 0000 0101 0030 0000 0000 ...........0....
0x07800011A1F07EE0 : 0000 0000 0000 0000 0006 0000 0000 0000 ................
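To check whether a system has hit this signature, one option is to scan db2diag.log for records that combine the ZRC, the sqloReadBlocks function and the db2loggr process shown in the excerpt above. The following is a minimal sketch, not an IBM-provided tool: the function names zrc_to_signed and find_eof_records are hypothetical, and the record splitting assumes that each db2diag.log record starts with a timestamp at the beginning of a line. It also shows how the hexadecimal ZRC maps to the signed decimal value printed in the message.

```python
import re
from typing import List

# Signature values taken from the db2diag.log excerpt in this problem report.
ZRC_HEX = "0x870F0009"

# Assumption: every record begins with a timestamp such as
# 2008-08-24-02.40.30.159924+000 at the start of a line.
RECORD_START = re.compile(
    r"^\d{4}-\d{2}-\d{2}-\d{2}\.\d{2}\.\d{2}\.\d{6}[+-]\d{3}", re.MULTILINE
)


def zrc_to_signed(zrc_hex: str) -> int:
    """Interpret a 32-bit ZRC given in hex as a signed integer."""
    value = int(zrc_hex, 16)
    return value - (1 << 32) if value & 0x80000000 else value


def find_eof_records(log_text: str) -> List[str]:
    """Return records reporting SQLO_EOF from sqloReadBlocks in db2loggr."""
    starts = [m.start() for m in RECORD_START.finditer(log_text)]
    ends = starts[1:] + [len(log_text)]
    hits = []
    for begin, end in zip(starts, ends):
        record = log_text[begin:end]
        if (
            "ZRC=" + ZRC_HEX in record
            and "sqloReadBlocks" in record
            and "db2loggr" in record
        ):
            hits.append(record.strip())
    return hits
```

A match only indicates that the same error was logged. Whether it was caused by the takeover sequence described here still has to be confirmed from the order of events on the Primary and the Standby.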
Problem Summary:
USERS AFFECTED: ALL
PROBLEM DESCRIPTION: HADR: CRASH ON OLD STANDBY AFTER STOPPING AND HADR TAKE OVER BY FORCE ZRC=0X870F0009=-2029060087=SQLO_EOF
If the HADR Primary is stopped, the current log is archived and closed. This information is sent to the Standby. If the Standby is also stopped and HADR takeover by force is then run on it, the log that should have been closed may continue to be used on this server instead of a new log being started. A rollback in this situation leads to a crash. The stack trace and db2diag.log messages are the same as those listed under Problem description.
RECOMMENDATION: Upgrade to DB2 Version 9.7 Fix Pack 1.
Local Fix:
If the Primary has been stopped, avoid stopping the Standby before running HADR takeover by force.
Available fix packs:
DB2 Version 9.7 Fix Pack 1 for Linux, UNIX, and Windows
Solution
The problem was first fixed in DB2 Version 9.7 Fix Pack 1.
Workaround
Not known; see Local Fix.
Timestamps
Date - problem reported : 22.10.2009
Date - problem closed : 09.03.2010
Date - last modified : 09.03.2010
Problem solved at the following versions (IBM BugInfos)
9.7.
Problem solved according to the fixlist(s) of the following version(s)
9.7.0.1