DB2 - Problem description
| Problem IC97746 | Status: Closed |
SQLO_LATCH_ERROR_EXPECTED_HELD ERROR LEADS TO DB2 CRASH IN WINDOWS | |
| product: | |
DB2 FOR LUW / DB2FORLUW / 970 - DB2 | |
| Problem description: | |
The database crashes when DB2 tries to unlock a latch that is
either invalid or already unlocked.
The following error message is logged in db2diag.log :
yyyy-mm-dd-23.30.00.450000+480 xxxxxxx LEVEL: Severe (OS)
PID : 28436 TID : 22152 PROC :
db2fmp64.exe
INSTANCE: Instance_name NODE : 000 DB :
Database_name
APPID : *LOCAL.<dbname>.131017145330
EDUID : 22152
FUNCTION: DB2 UDB, SQO Latch Tracing,
SQLO_SLATCH_CAS32::releaseConflictComplex, probe:360
MESSAGE :
ZRC=0x870F011E=-2029059810=SQLO_LATCH_ERROR_EXPECTED_HELD
"expected latch to be held."
CALLED : OS, -, unspecified_system_function
DATA #1 : String, 39 bytes
Attempting to unlock an invalid latch:
DATA #2 : File name, 16 bytes
sqloLatchCAS32.C
DATA #3 : Source file line number, 8 bytes
1503
DATA #4 : Codepath, 8 bytes
0
DATA #5 : String, 114 bytes
0x00000000: {
held X: 0
reserved for X: 0
shared holders: 0
shared waiter: 0
exclusive waiter: 0
}
DATA #6 : LatchMode, PD_TYPE_LATCH_MODE, 8 bytes
0x0 (invalid mode)
DATA #7 : String, 560 bytes
{
state = 0x00000000
= {
held X: 0
reserved for X: 0
shared holders: 0
shared waiter: 0
exclusive waiter: 0
}
flags = 32768 (No X Starvation)
numXPostsPending = 0
m_pSemPoolElement = 0x0000000000000000
cs = {
lock = { 0x80 [ locked ] }
flags = 0x0000
identity = SLATCH_DEFAULT::critical_section (26)
}
}
DATA #8 : Hexdump, 24 bytes
0x00000001804DEA58 : 0000 0000 2283 0000 8000 1A00 0000 0000
...."...........
0x00000001804DEA68 : 0000 0000 0000 0000
........
CALLSTCK:
[0] 0x000000018012B8C3 pdLogSysRC + 0x393
[1] 0x0000000180011FA4 SQLO_SLATCH_CAS32::dumpDiagInfoAndPanic
+ 0x1E2
[2] 0x00000001800124A0
SQLO_SLATCH_CAS32::releaseConflictComplex +
0x3BC
[3] 0x0000000180012097 SQLO_SLATCH_CAS32::releaseConflict +
0x39
[4] 0x0000000180063BB8 sqlogmt2 + 0x24E
[5] 0x000000000226A760 sqm_timestamp + 0x2C
[6] 0x000000000244EA37 sqlmon_conn::stmt_start + 0x49D
[7] 0x00000000023E52B8 sqlmon_acb::agent_stmt_start + 0x51E
...and then the database shutdown (crash) message is seen.
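To check whether a given db2diag.log contains this signature, the ZRC value from the MESSAGE field can be matched directly. The following is a minimal Python sketch, not an IBM tool: the function names are hypothetical, and it assumes the ZRC always appears as `ZRC=0x` followed by eight hex digits, as in the entry above. It also shows how the hex ZRC maps to the signed decimal value printed next to it.

```python
import re

# ZRC of SQLO_LATCH_ERROR_EXPECTED_HELD, taken from the entry above.
LATCH_ZRC = 0x870F011E

# Assumption: the ZRC is always logged as "ZRC=0x" plus 8 hex digits.
ZRC_PATTERN = re.compile(r"ZRC=0x([0-9A-Fa-f]{8})")


def zrc_to_signed(value):
    """Interpret a 32-bit ZRC as a signed integer, as db2diag.log prints it."""
    value &= 0xFFFFFFFF
    return value - (1 << 32) if value & 0x80000000 else value


def find_zrcs(log_text):
    """Return every ZRC found in the log text as an unsigned integer."""
    return [int(m.group(1), 16) for m in ZRC_PATTERN.finditer(log_text)]


def has_latch_error(log_text):
    """True if the log text contains the latch error described in this APAR."""
    return LATCH_ZRC in find_zrcs(log_text)
```

For example, `has_latch_error(open("db2diag.log").read())` would flag a log that contains the entry shown above; the file path is a placeholder.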
Analysis :
In most cases, the above latch error is logged after the failure
of a procedure that runs in the Administrative Task Scheduler
(ATS). The failed procedure may also log an error similar to the
following in db2diag.log :
yyyy-mm-dd-23.15.06.504000+480 XXXX LEVEL: Error
PID : 28436 TID : 22736 PROC :
db2fmp64.exe
INSTANCE: Instance_name NODE : 000 DB :
Database_name
APPID : *LOCAL.<database_name>.131017151501
EDUID : 22736
FUNCTION: DB2 UDB, Administrative Task Scheduler,
AtsTask::executeTask, probe:600
MESSAGE : ZRC=0xFFFFFE4A=-438
DATA #1 : <preformatted>
[IBM][CLI Driver][DB2/NT64] SQL0438N Application raised error
or warning with diagnostic text: "SQL0438N Application raised
error or warning with diagnostic text: "C". SQLSTATE=55555
Analysis shows that this type of latch problem is caused by an
unwanted latch re-initialization in a timestamp-related function
in DB2.
The fix resolves this latch re-initialization in the failing
function, which avoids the crash. | |
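The pattern described in the analysis (an ATS task failure followed later by the latch error) can be looked for in a db2diag.log with a short script. The Python sketch below is illustrative only. It assumes that each record starts with a timestamp such as `2013-10-17-23.30.00.450000+480` (the dates are masked as yyyy-mm-dd in the excerpts above), and it assumes that matching on the same PID is a useful correlation, since both excerpts show PID 28436. The function names are hypothetical.

```python
import re

# Assumed start of a db2diag.log record, e.g. 2013-10-17-23.30.00.450000+480
RECORD_START = re.compile(
    r"^\d{4}-\d{2}-\d{2}-\d{2}\.\d{2}\.\d{2}\.\d{6}[+-]\d{3}", re.MULTILINE
)
# \b keeps this from matching the "PID" inside "APPID".
PID = re.compile(r"\bPID\s*:\s*(\d+)")


def split_records(log_text):
    """Split the log text into records, one per leading timestamp."""
    starts = [m.start() for m in RECORD_START.finditer(log_text)]
    bounds = starts + [len(log_text)]
    return [log_text[a:b] for a, b in zip(bounds, bounds[1:])]


def latch_errors_after_ats_failure(log_text):
    """Return the PID of each latch error preceded by an ATS task failure
    in the same process (records are assumed to be in chronological order)."""
    failed_pids = set()
    hits = []
    for record in split_records(log_text):
        match = PID.search(record)
        if match is None:
            continue  # record without a PID field; skip it
        pid = match.group(1)
        if "AtsTask::executeTask" in record:
            failed_pids.add(pid)
        elif "SQLO_LATCH_ERROR_EXPECTED_HELD" in record and pid in failed_pids:
            hits.append(pid)
    return hits
```

A non-empty result suggests the log matches the sequence described in this APAR; an empty result does not rule the problem out, because the analysis says the ATS failure precedes the latch error only in most cases.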
| Problem Summary: | |
****************************************************************
* USERS AFFECTED:                                              *
* ALL                                                          *
****************************************************************
* PROBLEM DESCRIPTION:                                         *
* See Problem Description above                                *
****************************************************************
* RECOMMENDATION:                                              *
* Upgrade to DB2 Version 9.7 Fix pack 10.                      *
**************************************************************** | |
| Local Fix: | |
There are 3 possible workarounds for the problem :
1> Set DB2_FMP_COMM_HEAPSZ=0
2> Set DB2_ATS_ENABLE=NO
3> Run the stored procedure in question in a single-threaded FMP process, i.e. as NOT THREADSAFE. | |
| Solution | |
First fixed in Version 9.7 Fix pack 10. | |
| Workaround | |
not known / see Local fix | |
| Timestamps | |
Date - problem reported : 18.11.2013
Date - problem closed : 01.12.2014
Date - last modified : 01.12.2014 | |
| Problem solved at the following versions (IBM BugInfos) | |
9.7.FP10 | |
| Problem solved according to the fixlist(s) of the following version(s) | |
| 9.7.0.10 |