DB2 - Problem description
Problem IC99037 | Status: Closed |
AN OS ERROR THAT PREVENTS WRITING TO THE NEXT EXTENT HEADER IN THE LOGS CAN RESULT IN CRASH RECOVERY FAILURE | |
product: | |
DB2 FOR LUW / DB2FORLUW / 970 - DB2 | |
Problem description: | |
From the db2diag.log you will see messages similar to the following:

The initial point when writing to the next log file (sqlpgOpenNextExt) fails due to an OS error (EMFILE):

2013-09-09-18.39.45.759246-240 I191447263E1435 LEVEL: Error (OS)
PID : 32212 TID : 47432960305472PROC : db2sysc 0
INSTANCE: db2inst1 NODE : 000
EDUID : 5498 EDUNAME: db2loggw (SAMPLE) 0
FUNCTION: DB2 Common, OSSe, ossGetDiskInfo, probe:130
MESSAGE : ECF=0x9000002D=-1879048147=ECF_FILE_PROCESS_MAX
          The maximum number of file per process has already been reached
CALLED : OS, -, fopen OSERR: EMFILE (24)
DATA #1 : String, 12 bytes
/proc/mounts
DATA #2 : String, 72 bytes
/home/db2inst1/NODE0000/SQL00001/SQLOGDIR/S0000022.LOG
DATA #3 : unsigned integer, 8 bytes
1050882
CALLSTCK: (Static functions may not be resolved correctly, as they are resolved to the nearest symbol)
[0] 0x00002B22E67BD93C pdOSSeLoggingCallback + 0x21E
[1] 0x00002B22EB292F46 /u01/home/db2inst1/sqllib/lib64/libdb2osse.so.1 + 0x1D3F46
[2] 0x00002B22EB292E91 ossLogSysRC + 0x113
[3] 0x00002B22EB2AAD10 ossGetDiskInfo + 0xCFA
[4] 0x00002B22E6D2B60D sqloFetchAndStoreFSInfoInFileHandle + 0x12B
[5] 0x00002B22E6D2A0AB sqloopenp + 0x15D7
[6] 0x00002B22E6D3BBDA _Z8sqlpgolfPcP10SQLPG_XHDRP9SQLP_LFCBm + 0xAA
[7] 0x00002B22E6D636FF _Z8sqlpgoleP9SQLP_DBCBPcPP9SQLP_LECBjm + 0x50F
[8] 0x00002B22E6D6B022 _Z16sqlpgOpenNextExtP9SQLP_DBCBmPP9SQLP_LECBmS3_ + 0x94
[9] 0x00002B22E6D6A401 /u01/home/db2inst1/sqllib/lib64/libdb2e.so.1 + 0x130C401

***
and eventually the database crashes:

(...)

2013-09-09-18.39.46.077399-240 I191463519E420 LEVEL: Error
PID : 32212 TID : 47432960305472PROC : db2sysc 0
INSTANCE: db2inst1 NODE : 000
EDUID : 5498 EDUNAME: db2loggw (SAMPLE) 0
FUNCTION: DB2 UDB, data protection services, sqlpgasn2, probe:1880
RETCODE : ZRC=0x850F0006=-2062614522=SQLO_FHNL "TOO MANY OPEN FILES"
          DIA8306C Too many files were opened.

2013-09-09-18.39.46.077636-240 I191463940E341 LEVEL: Severe
PID : 32212 TID : 47432960305472PROC : db2sysc 0
INSTANCE: db2inst1 NODE : 000
EDUID : 5498 EDUNAME: db2loggw (SAMPLE) 0
FUNCTION: DB2 UDB, data protection services, sqlpgasn2, probe:4330
MESSAGE : Logging can not continue.

*****
The subsequent problem in crash recovery for this same database after the above crash:

2013-09-09-18.40.33.184468-240 I266827622E449 LEVEL: Warning
PID : 32212 TID : 47432889002304PROC : db2sysc 0
INSTANCE: db2inst1 NODE : 000 DB : SAMPLE
APPHDL : 0-39177 APPID: 10.136.39.163.47754.130909224032
AUTHID : SAMPLE
EDUID : 5262 EDUNAME: db2agent (SAMPLE) 0
FUNCTION: DB2 UDB, base sys utilities, sqledint, probe:30
MESSAGE : Crash Recovery is needed.

2013-09-09-18.41.03.510780-240 E266828072E479 LEVEL: Event
PID : 32212 TID : 47432889002304PROC : db2sysc 0
INSTANCE: db2inst1 NODE : 000 DB : SAMPLE
APPHDL : 0-39177 APPID: 10.136.39.163.47754.130909224032
AUTHID : SAMPLE
EDUID : 5262 EDUNAME: db2agent (SAMPLE) 0
FUNCTION: DB2 UDB, base sys utilities, sqeLocalDatabase::FirstConnect, probe:1000
START : DATABASE: SAMPLE : ACTIVATED: NO

2013-09-09-18.41.03.520160-240 I266828552E512 LEVEL: Warning
PID : 32212 TID : 47432889002304PROC : db2sysc 0
INSTANCE: db2inst1 NODE : 000 DB : SAMPLE
APPHDL : 0-39177 APPID: 10.136.39.163.47754.130909224032
AUTHID : SAMPLE
EDUID : 5262 EDUNAME: db2agent (SAMPLE) 0
FUNCTION: DB2 UDB, recovery manager, sqlpresr, probe:410
MESSAGE : Crash recovery started. LowtranLSN 000000256001C70B MinbuffLSN 000000255B3DA7FA

2013-09-09-18.41.03.520639-240 E266829065E466 LEVEL: Warning
PID : 32212 TID : 47432889002304PROC : db2sysc 0
INSTANCE: db2inst1 NODE : 000 DB : SAMPLE
APPHDL : 0-39177 APPID: 10.136.39.163.47754.130909224032
AUTHID : SAMPLE
EDUID : 5262 EDUNAME: db2agent (SAMPLE) 0
FUNCTION: DB2 UDB, recovery manager, sqlpresr, probe:410
MESSAGE : ADM1530E Crash recovery has been initiated.

(...)

2013-09-09-18.41.14.457878-240 I266833242E522 LEVEL: Warning
PID : 32212 TID : 47432889002304PROC : db2sysc 0
INSTANCE: db2inst1 NODE : 000 DB : SAMPLE
APPHDL : 0-39177 APPID: 10.136.39.163.47754.130909224032
AUTHID : SAMPLE
EDUID : 5262 EDUNAME: db2agent (SAMPLE) 0
FUNCTION: DB2 UDB, recovery manager, sqlprecm, probe:4000
MESSAGE : DIA2051W Forward phase of crash recovery has completed. Next LSN is "0000002561D30010".

2013-09-09-18.41.14.534750-240 I266833765E561 LEVEL: Severe
PID : 32212 TID : 47432909973824PROC : db2sysc 0
INSTANCE: db2inst1 NODE : 000
EDUID : 5884 EDUNAME: db2loggw (SAMPLE) 0
FUNCTION: DB2 UDB, data protection services, sqlpgWriteToDisk, probe:909
MESSAGE : ZRC=0x8610000D=-2045771763=SQLP_BADLOG "Log File cannot be used"
          DIA8414C Logging can not continue due to an error.
DATA #1 : <preformatted>
diffPage 125000 TailPage 0 does not match pagePso 159771048024 and firstLso 159261548001

***********
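Each return code in the excerpt above appears in three forms: a hexadecimal value, a negative decimal value, and a symbolic name. The decimal form is the hexadecimal value read as a signed 32-bit integer. The following is a minimal Python sketch of that conversion, useful when searching logs that show only one of the two numeric forms. The helper name to_signed32 is illustrative and not part of any DB2 tool.

```python
def to_signed32(value: int) -> int:
    """Interpret an unsigned 32-bit value as a two's-complement signed integer."""
    value &= 0xFFFFFFFF
    return value - 0x100000000 if value & 0x80000000 else value


# Codes copied from the db2diag.log excerpt in this problem description.
codes = {
    "ECF_FILE_PROCESS_MAX": 0x9000002D,
    "SQLO_FHNL": 0x850F0006,
    "SQLP_BADLOG": 0x8610000D,
}

for name, code in codes.items():
    print(f"{name}: 0x{code:08X} = {to_signed32(code)}")
```

The printed decimal values match the ones logged next to ECF= and ZRC= in the excerpt.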
Problem Summary: | |
****************************************************************
* USERS AFFECTED:                                              *
* Any                                                          *
****************************************************************
* PROBLEM DESCRIPTION:                                         *
* See Error Description                                        *
****************************************************************
* RECOMMENDATION:                                              *
* Upgrade to V9.7 Fixpack 10 or later                          *
****************************************************************
Local Fix: | |
Once you are in this state, a restore will be required. As a preventative measure for the future, if the error is related to an OS tunable (maximum open files in this case), you can increase the OS limit and/or decrease usage by DB2 where possible (MAXFILOP in DB2) to reduce the chance of a repeat occurrence. | |
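To judge how close a process is to the open-file limit before EMFILE occurs, the number of open descriptors can be compared with the soft limit. The following is a minimal Python sketch, assuming a Linux system (the excerpt references /proc/mounts) and permission to read the /proc entries of the target process. The function names are illustrative and not part of DB2.

```python
import os


def parse_max_open_files(limits_text):
    """Return the soft 'Max open files' limit from /proc/<pid>/limits text, or None."""
    for line in limits_text.splitlines():
        if line.startswith("Max open files"):
            fields = line.split()
            # Fields: 'Max' 'open' 'files' <soft> <hard> 'files'
            soft = fields[3]
            return None if soft == "unlimited" else int(soft)
    return None


def fd_usage(pid):
    """Return (open_descriptors, soft_limit) for a Linux process."""
    open_fds = len(os.listdir(f"/proc/{pid}/fd"))
    with open(f"/proc/{pid}/limits") as f:
        soft_limit = parse_max_open_files(f.read())
    return open_fds, soft_limit
```

For example, fd_usage(32212) would inspect the db2sysc process from the excerpt, run as the instance owner or root. A descriptor count that stays close to the soft limit suggests raising the OS limit or lowering MAXFILOP.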
Solution | |
Problem first fixed in V9.7 Fixpack 10 | |
Workaround | |
not known / see Local fix | |
Timestamps | |
Date - problem reported : | 28.01.2014 |
Date - problem closed : | 26.11.2014 |
Date - last modified : | 26.11.2014 |
Problem solved at the following versions (IBM BugInfos) | |
9.7.FP10 | |
Problem solved according to the fixlist(s) of the following version(s) | |
9.7.0.10 |