DB2 - Problem description
Problem IC76505 | Status: Closed |
DATABASE CRASHED WHILE CLOSING THE TEMP FILE HANDLE DURING AN UNDO OPERATION. | |
product: | |
DB2 FOR LUW / DB2FORLUW / 950 - DB2 | |
Problem description: | |
During an UNDO operation, DB2 hit a disk full condition on the temporary tablespace while trying to close a file handle. The database was then marked bad and brought down. Here is an example from the db2diag.log:

2011-xx-xx-xx.xx.xx.xxxxxx-xxx xxxxxxxxxxxxx LEVEL: Error (OS)
PID : xxxx TID : xxxxxxxxxxxx PROC : db2sysc xx
INSTANCE: db2inst1 NODE : 0xx DB : SAMPLE
APPHDL : xxxxxx APPID: xx.xxx.xxx.xx.xxxxx.xxxxxxxxxxx
AUTHID : xxxxx
EDUID : xxxxx EDUNAME: db2agntp (SAMPLE) xx
FUNCTION: DB2 UDB, oper system services, sqlobufreset, probe:10
MESSAGE : ZRC=0x850F000C=-2062614516=SQLO_DISK "Disk full."
          DIA8312C Disk was full.
CALLED : OS, -, fsync OSERR: ENOSPC (28)
DATA #1 : File handle, PD_TYPE_SQO_FILE_HDL, 8 bytes
0x0000002AA2BFB328 : 8A00 0000 8002 0000 ........
DATA #2 : String, 105 bytes
Search for ossError*Analysis probe point after this log entry for further
self-diagnosis of this problem.

2011-xx-xx-xx.xx.xx.xxxxxx-xxx xxxxxxxxxxxxx LEVEL: Error (OS)
PID : xxxx TID : xxxxxxxxxxxx PROC : db2sysc xx
INSTANCE: db2inst1 NODE : 0xx DB : SAMPLE
APPHDL : xxxxxx APPID: xx.xxx.xxx.xx.xxxxx.xxxxxxxxxxx
AUTHID : xxxxx
EDUID : xxxxx EDUNAME: db2agntp (SAMPLE) xx
FUNCTION: DB2 Common, OSSe, ossErrorIOAnalysis, probe:100
CALLED : OS, -, fsync OSERR: ENOSPC (28)
DATA #1 : String, 132 bytes
A total of 4 analysis will be performed :
 - User info
 - ulimit info
 - Target file info
 - File system
Target file handle = 138
DATA #2 : String, 190 bytes
Real user ID of current process = xxxxx
Effective user ID of current process = xxxxx
Real group ID of current process = xxxx
Effective group ID of current process = xxxx
DATA #3 : String, 362 bytes
Current process limits (unit in bytes except for nofiles) :
mem (S/H) = unlimited / unlimited
core (S/H) = 0 / unlimited
cpu (S/H) = unlimited / unlimited
data (S/H) = unlimited / unlimited
fsize (S/H) = unlimited / unlimited
nofiles (S/H) = 65534 / 65534
stack (S/H) = 10485760 / unlimited
rss (S/H) = unlimited / unlimited
DATA #4 : String, 261 bytes
Target File Information :
Size = 16384
Link = No
Reference path = N/A
Type = 0x8000
Permissions = rw-------
UID = xxxxx
GID = xxxx
Last modified time = 1298765603
DATA #5 : String, 432 bytes
File System Information of the target file :
Block size = 32768 bytes
Total size = 402653184000 bytes
Free size = 32768 bytes
Total # of inodes = 24802560
FS name = xxxxxx:/xxx/xxxxxxxx/xxxxxxxxx
Mount point = /x/xxxxxx/xxx/xxxxxxxx/xxxxxxxxxxxxxxx
FSID = 27
FS type name = nfs
DIO/CIO mount opt = None
Device type = N/A
FS type = 0x6969
CALLSTCK:
[0] 0x0000002A9684F555 pdOSSeLoggingCallback + 0x91
[1] 0x0000002A9BB37C0B /xxx/.exec/x86_64.linux.2.6.glibc.2.3 /lib64/libdb2osse.so.1 + 0x1AEC0B
[2] 0x0000002A9BB38FEB ossLogSysRC + 0xBF
[3] 0x0000002A9BB2C6E0 /xxx/.exec/x86_64.linux.2.6.glibc.2.3 /lib64/libdb2osse.so.1 + 0x1A36E0
[4] 0x0000002A9BB2ACF1 ossErrorAnalysis + 0x25
[5] 0x0000002A97EBFAC4 sqloSystemErrorHandler + 0x6C0
[6] 0x0000002A96C2A3A7 sqlobufreset + 0xFB
[7] 0x0000002A968B2A94 sqlbWritePage + 0x1E0
[8] 0x0000002A98A3A187 _Z19sqlbGetPageFromDiskP11SQLB_FIX_CBi + 0x173D
[9] 0x0000002A98A2B843 _Z7sqlbfixP11SQLB_FIX_CB + 0xB73

2011-xx-xx-xx.xx.xx.xxxxxx-xxx xxxxxxxxxxxxx LEVEL: Severe
PID : xxxx TID : xxxxxxxxxxxx PROC : db2sysc xx
INSTANCE: db2inst1 NODE : 0xx DB : SAMPLE
APPHDL : xxxxxx APPID: xx.xxx.xxx.xx.xxxxx.xxxxxxxxxxx
AUTHID : xxxxx
EDUID : xxxxx EDUNAME: db2agntp (SAMPLE) xx
FUNCTION: DB2 UDB, buffer pool services, sqlbForceNewPagesToDisk, probe:1235
MESSAGE : ZRC=0x850F000C=-2062614516=SQLO_DISK "Disk full."
          DIA8312C Disk was full.
[...]

2011-xx-xx-xx.xx.xx.xxxxxx-xxx xxxxxxxxxxxxx LEVEL: Severe
PID : xxxx TID : xxxxxxxxxxxx PROC : db2sysc xx
INSTANCE: db2inst1 NODE : 0xx DB : SAMPLE
APPHDL : xxxxxx APPID: xx.xxx.xxx.xx.xxxxx.xxxxxxxxxxx
AUTHID : xxxxx
EDUID : xxxxx EDUNAME: db2agntp (SAMPLE) xx
FUNCTION: DB2 UDB, buffer pool services, SqlbFhdlTbl::closeOneFile, probe:1000
DATA #1 : String, 38 bytes
Obj={pool:15;obj:2;type:16} State=x45

2011-xx-xx-xx.xx.xx.xxxxxx-xxx xxxxxxxxxxxxx LEVEL: Severe
PID : xxxx TID : xxxxxxxxxxxx PROC : db2sysc xx
INSTANCE: db2inst1 NODE : 0xx DB : SAMPLE
APPHDL : xxxxxx APPID: xx.xxx.xxx.xx.xxxxx.xxxxxxxxxxx
AUTHID : xxxxx
EDUID : xxxxx EDUNAME: db2agntp (SAMPLE) xx
FUNCTION: DB2 UDB, buffer pool services, SqlbFhdlTbl::closeOneFile, probe:0
DATA #1 : Object descriptor, PD_TYPE_SQLB_OBJECT_DESC, 72 bytes
Obj: {pool:15;obj:2;type:16} Parent={9;24}
lifeLSN: 01EA0C9C96B2
tid: 0 0 0
extentAnchor: 0
initEmpPages: 0
poolPage0: 0
poolflags: 111
objectState: 45
lastSMP: 0
pageSize: 16384
extentSize: 32
bufferPoolID: 4
partialHash: 268566543
bufferPool: 0x0000002ace3a3800
[...]

2011-xx-xx-xx.xx.xx.xxxxxx-xxx xxxxxxxxxxxxx LEVEL: Severe
PID : xxxx TID : xxxxxxxxxxxx PROC : db2sysc xx
INSTANCE: db2inst1 NODE : 0xx DB : SAMPLE
APPHDL : xxxxxx APPID: xx.xxx.xxx.xx.xxxxx.xxxxxxxxxxx
AUTHID : xxxxx
EDUID : xxxxx EDUNAME: db2agntp (SAMPLE) xx
FUNCTION: DB2 UDB, data management, sqldReorgCleanup, probe:10

2011-xx-xx-xx.xx.xx.xxxxxx-xxx xxxxxxxxxxxxx LEVEL: Severe
PID : xxxx TID : xxxxxxxxxxxx PROC : db2sysc xx
INSTANCE: db2inst1 NODE : 0xx DB : SAMPLE
APPHDL : xxxxxx APPID: xx.xxx.xxx.xx.xxxxx.xxxxxxxxxxx
AUTHID : xxxxx
EDUID : xxxxx EDUNAME: db2agntp (SAMPLE) xx
FUNCTION: DB2 UDB, data management, sqldmund, probe:719
RETCODE : ZRC=0x850F000C=-2062614516=SQLO_DISK "Disk full."
          DIA8312C Disk was full.
[...]

2011-xx-xx-xx.xx.xx.xxxxxx-xxx xxxxxxxxxxxxx LEVEL: Severe
PID : xxxx TID : xxxxxxxxxxxx PROC : db2sysc xx
INSTANCE: db2inst1 NODE : 0xx DB : SAMPLE
APPHDL : x-xxxx APPID: xx.xxx.xxx.xx.xxxxx.xxxxxxxxxxx
AUTHID : xxxxx
EDUID : xxxxx EDUNAME: db2agntp (SAMPLE) xx
FUNCTION: DB2 UDB, data management, sqldmund, probe:719
MESSAGE : Error during UNDO of LSN:
DATA #1 : Hexdump, 6 bytes
0x0000002AA2BFBE12 : 01EA 0C9C 96B2

2011-xx-xx-xx.xx.xx.xxxxxx-xxx xxxxxxxxxxxxx LEVEL: Severe
PID : xxxx TID : xxxxxxxxxxxx PROC : db2sysc xx
INSTANCE: db2inst1 NODE : 0xx DB : SAMPLE
APPHDL : x-xxxx APPID: xx.xxx.xxx.xx.xxxxx.xxxxxxxxxxx
AUTHID : xxxxx
EDUID : xxxxx EDUNAME: db2agntp (SAMPLE) xx
FUNCTION: DB2 UDB, data management, sqldmund, probe:719
RETCODE : ZRC=0x850F000C=-2062614516=SQLO_DISK "Disk full."
          DIA8312C Disk was full.

2011-0x-xx-xx.xx.xx.xxxxxx-xxx xxxxxxxxxxxxx LEVEL: Severe
PID : xxxx TID : xxxxxxxxxxxx PROC : db2sysc xx
INSTANCE: db2inst1 NODE : 0xx DB : SAMPLE
APPHDL : x-xxxx APPID: xx.xxx.xxx.xx.xxxxx.xxxxxxxxxxx
AUTHID : xxxxx
EDUID : xxxxx EDUNAME: db2agntp (SAMPLE) xx
FUNCTION: DB2 UDB, data management, sqldmund, probe:719
MESSAGE : Error during UNDO of log record:
DATA #1 : Dumped object of size 5000 bytes at offset 0, 59 bytes
/xxx/xxxxxx/xxxxxxxx/sqllib/db2dump/xxxx.xxxxx.xxx.dump.bin

This APAR prevents the database from being marked bad when a disk full condition is hit on a temporary tablespace during an UNDO operation. | |
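The numeric fields in the log entries above can be cross-checked with a little arithmetic. The following is a minimal sketch, not a DB2 tool: the helper name `to_signed32` is hypothetical, and the constants are copied from the log excerpt. It shows that the decimal ZRC value is the hexadecimal one read as a signed 32-bit integer, that OS error 28 is ENOSPC, and that the file system reported exactly one free block.

```python
import errno


def to_signed32(value: int) -> int:
    """Interpret an unsigned 32-bit value as a two's-complement signed integer.

    Hypothetical helper for illustration; not part of any DB2 API.
    """
    value &= 0xFFFFFFFF
    return value - (1 << 32) if value & 0x80000000 else value


# ZRC as printed in the MESSAGE/RETCODE lines of the log excerpt.
zrc_hex = 0x850F000C
zrc_signed = to_signed32(zrc_hex)
print(zrc_signed)  # -2062614516, matching the decimal form in the log

# OSERR from the failing fsync call; 28 is ENOSPC on Linux.
print(errno.errorcode[28])

# File system figures from DATA #5 of the ossErrorIOAnalysis entry.
block_size = 32768
free_size = 32768
free_blocks = free_size // block_size
print(free_blocks)  # 1 free block left on the file system
```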
Problem Summary: | |
****************************************************************
* USERS AFFECTED:                                              *
* Users with insufficient TEMP disk space.                     *
****************************************************************
* PROBLEM DESCRIPTION:                                         *
* Without this APAR, customer is exposed to the issue as       *
* described in the "ERROR DESCRIPTION" section.                *
****************************************************************
* RECOMMENDATION:                                              *
* Upgrade to DB2 Version 9.5, Fixpack 9.                       *
**************************************************************** | |
Local Fix: | |
Be certain to have enough TEMP disk space to eliminate the possibility of hitting a 'disk full' condition. | |
available fix packs: | |
DB2 Version 9.5 Fix Pack 9 for Linux, UNIX, and Windows | |
Solution | |
First fixed in DB2 Version 9.5, Fixpack 9. | |
Workaround | |
Be certain to have enough TEMP disk space to eliminate the possibility of hitting a 'disk full' condition. | |
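One way to act on this workaround is to monitor free space on the file system that holds the temporary tablespace containers. The sketch below is only an illustration under stated assumptions: the function name `temp_space_ok`, the path, and the threshold are all hypothetical, and the right threshold depends on the workload. It uses only the Python standard library call `shutil.disk_usage`.

```python
import shutil


def temp_space_ok(path: str, min_free_bytes: int) -> bool:
    """Return True if the file system holding `path` has at least
    `min_free_bytes` free.

    Hypothetical helper; `path` should point at the directory that holds
    the temporary tablespace containers.
    """
    usage = shutil.disk_usage(path)  # named tuple: total, used, free (bytes)
    return usage.free >= min_free_bytes


if __name__ == "__main__":
    # Assumed values for illustration only: check the current directory
    # against a 1 GiB threshold.
    container_path = "."
    threshold = 1024 ** 3
    if not temp_space_ok(container_path, threshold):
        print(f"WARNING: less than {threshold} bytes free under {container_path}")
```

A check like this could run from a scheduler and raise an alert before the file system fills up; it does not replace applying the fix pack.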
BUG-Tracking | |
forerunner : APAR is sysrouted TO one or more of the following: IC79958 IC84087 IC84653 follow-up : | |
Timestamps | |
Date - problem reported : | 20.05.2011 |
Date - problem closed : | 09.03.2012 |
Date - last modified : | 09.03.2012 |
Problem solved at the following versions (IBM BugInfos) | |
9.5.FP9 | |
Problem solved according to the fixlist(s) of the following version(s) | |
9.5.0.9 |