DB2 - Problem description
Problem IC75553 | Status: Closed
LOAD WITH COPY YES OPTION MAY HANG, IF THE LOAD COPY DEVICE DOES NOT HAVE ENOUGH SPACE WITH DPF ENV.
product:
DB2 FOR LUW / DB2FORLUW / 970 - DB2
Problem description:
When running a LOAD operation with the COPY YES option, if the load copy device runs out of space while the load copy file is being written, the LOAD may hang instead of returning an appropriate error.

The db2diag.log will indicate that the LOAD EDU db2lmw ran into an "out of space" problem before the LOAD hung, for example:

  2011-03-28-10.14.02.485751+540 E84157A1103 LEVEL: Error (OS)
  PID : 397750 TID : 9824 PROC : db2sysc 1
  INSTANCE: db2dpf NODE : 001
  EDUID : 9824 EDUNAME: db2lmw0 1
  FUNCTION: DB2 UDB, oper system services, sqlowrite, probe:60
  MESSAGE : ZRC=0x850F000C=-2062614516=SQLO_DISK "Disk full."
            DIA8312C Disk was full.
  CALLED : OS, -, write
  OSERR : ENOSPC (28) "No space left on device"

A DB2 trace of the hanging LOAD shows two things.

In db2lmr, the function sqluCSerializableSocket::iSelectSocketForIO() calls sqloPdbSelectSocket() in a loop without exiting, for example (line numbers, timestamps, and similar fields have been removed from the trace output):

  | sqluMCReadFromDevice entry
  | | sqluReadFromSocketDevice entry
  | | | sqlusCFormattedUserDataBuffer::iFillFromIO entry
  | | | | sqluCSerializableSocket::iNext entry
  | | | | | sqluCSerializableSocket::iSelectSocketForIO entry
  | | | | | | sqloPdbSelectSocket entry
  | | | | | | sqloPdbSelectSocket exit
  | | | | | | sqloPdbSelectSocket entry
  | | | | | | sqloPdbSelectSocket exit
  ...

The load controller agent, which calls sqlulPollMsg() in a loop, detects that a child EDU is still running, so it cannot exit, and the LOAD as a whole cannot terminate:

  pid = 1114168 tid = 7455 node = 1
  ...
  | sqlulTerminate exit
  | sqlogmblkEx entry [eduid 7455 eduname db2agent]
  | | sqloGetPrivatePoolHandle entry [eduid 7455 eduname db2agent]
  | | sqloGetPrivatePoolHandle exit
  | sqlogmblkEx mbt [Marker:PD_OSS_ALLOCATED_MEMORY ]
  | sqlogmblkEx exit
  | DIAG_NOTE data [probe 0]
  | | sqlofmblkEx entry [eduid 7455 eduname db2agent]
  | | sqlofmblkEx mbt [Marker:PD_OSS_FREED_MEMORY ]
  | | sqlofmblkEx exit
  | | sqluGetNumActiveChildren entry [eduid 7455 eduname db2agent]
  | | | sqloDoesEDUExist entry [eduid 7455 eduname db2agent]
  | | | sqloDoesEDUExist exit [rc = 1]
  | | | sqloDoesEDUExist entry [eduid 7455 eduname db2agent]
  | | | sqloDoesEDUExist exit [rc = 1]
  | | | sqloDoesEDUExist entry [eduid 7455 eduname db2agent]
  | | | sqloDoesEDUExist exit [rc = 1]
  | | | sqloDoesEDUExist entry [eduid 7455 eduname db2agent]
  | | | sqloDoesEDUExist exit
  | | sqluGetNumActiveChildren exit
  | | sqlorest entry [eduid 7455 eduname db2agent]
  | | sqlorest data [probe 10]
  | | sqlorest exit
  | | sqluGetNumActiveChildren entry [eduid 7455 eduname db2agent]
  | | | sqloDoesEDUExist entry [eduid 7455 eduname db2agent]
  | | | sqloDoesEDUExist exit [rc = 1]
  | | | sqloDoesEDUExist entry [eduid 7455 eduname db2agent]
  | | | sqloDoesEDUExist exit
  | | | sqloDoesEDUExist entry [eduid 7455 eduname db2agent]
  | | | sqloDoesEDUExist exit
  | | sqluGetNumActiveChildren exit
  | | sqlorest entry [eduid 7455 eduname db2agent]
  | | sqlorest data [probe 10]
  | | sqlorest exit
  | | sqluGetNumActiveChildren entry [eduid 7455 eduname db2agent]
  | | | sqloDoesEDUExist entry [eduid 7455 eduname db2agent]
  | | | sqloDoesEDUExist exit [rc = 1]
  | | sqluGetNumActiveChildren exit
  | | sqlorest entry [eduid 7455 eduname db2agent]
  | | sqlorest data [probe 10]
  | | sqlorest exit
  | | sqluGetNumActiveChildren entry [eduid 7455 eduname db2agent]
  ...
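The db2diag.log signature described above (a db2lmw EDU reporting SQLO_DISK / ENOSPC) can be searched for mechanically when a LOAD appears to be stuck. The following is a minimal sketch, not an IBM tool: the function names are hypothetical, and the rule that each record begins with a timestamp at the start of a line is an assumption based on the sample entry in this APAR.

```python
import re

# Assumption: every db2diag.log record starts with a timestamp of this shape
# at the beginning of a line, as in the sample entry quoted in the APAR.
RECORD_START = re.compile(r"^\d{4}-\d{2}-\d{2}-\d{2}\.\d{2}\.\d{2}\.\d{6}", re.M)


def split_records(text):
    """Split db2diag.log text into records, one per leading timestamp."""
    starts = [m.start() for m in RECORD_START.finditer(text)]
    return [text[a:b] for a, b in zip(starts, starts[1:] + [len(text)])]


def find_load_copy_disk_full(text):
    """Return timestamps of records where a db2lmw EDU reported 'disk full'."""
    hits = []
    for rec in split_records(text):
        is_load_writer = re.search(r"EDUNAME:\s*db2lmw", rec) is not None
        is_disk_full = "SQLO_DISK" in rec or "ENOSPC" in rec
        if is_load_writer and is_disk_full:
            hits.append(RECORD_START.match(rec).group(0))
    return hits


# Sample record copied from the APAR text.
SAMPLE = """\
2011-03-28-10.14.02.485751+540 E84157A1103 LEVEL: Error (OS)
PID : 397750 TID : 9824 PROC : db2sysc 1
INSTANCE: db2dpf NODE : 001
EDUID : 9824 EDUNAME: db2lmw0 1
FUNCTION: DB2 UDB, oper system services, sqlowrite, probe:60
MESSAGE : ZRC=0x850F000C=-2062614516=SQLO_DISK "Disk full."
          DIA8312C Disk was full.
CALLED : OS, -, write
OSERR : ENOSPC (28) "No space left on device"
"""

if __name__ == "__main__":
    print(find_load_copy_disk_full(SAMPLE))
```

A hit only shows that the out-of-space condition occurred; whether the LOAD is actually hanging still has to be confirmed separately, for example with a DB2 trace as described above.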
Problem Summary:
USERS AFFECTED: ALL
PROBLEM DESCRIPTION: When running a LOAD operation with the COPY YES option, if the load copy device runs out of space while the load copy file is being written, the LOAD may hang instead of returning an appropriate error. The db2diag.log entry and the DB2 trace excerpts are identical to those given under "Problem description".
RECOMMENDATION: Upgrade to DB2 UDB Version 9.7 Fix Pack 5.
Local Fix:
available fix packs:
DB2 Version 9.7 Fix Pack 5 for Linux, UNIX, and Windows
Solution
Problem was first fixed in DB2 UDB Version 9.7 Fix Pack 5.
Workaround
not known / see Local fix
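The APAR lists no workaround. As a precaution on levels without the fix, the free space on the load copy target can be checked before LOAD ... COPY YES is started. The sketch below is only an illustration and not an IBM-documented workaround: the function name, the size estimate, and the 10% margin are assumptions. In a DPF environment the check would have to run on every host that writes a load copy file.

```python
import shutil


def has_room_for_copy(copy_dir, expected_bytes, margin=0.10):
    """Return True if copy_dir has expected_bytes plus a safety margin free.

    expected_bytes is the caller's own estimate of the load copy size;
    nothing here is derived from DB2 itself.
    """
    free = shutil.disk_usage(copy_dir).free
    return free >= expected_bytes * (1 + margin)


if __name__ == "__main__":
    # Hypothetical usage: check the current directory for 5 GiB of headroom.
    print(has_room_for_copy(".", 5 * 1024**3))
```

A check like this cannot rule out the hang, because other writers may fill the device while the LOAD runs; it only reduces the chance of starting a LOAD that is certain to run out of space.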
BUG-Tracking
forerunner : APAR is sysrouted TO one or more of the following: IC76279 follow-up :
Timestamps
Date - problem reported : 04.04.2011
Date - problem closed : 12.12.2011
Date - last modified : 12.12.2011
Problem solved at the following versions (IBM BugInfos)
9.7.FP5
Problem solved according to the fixlist(s) of the following version(s)
9.7.0.5