
DB2 - Problem description

Problem IC90996 Status: Closed

SQL0952N : INCORRECT TIMEOUT VALUE OF -1 LEADS TO NODE FAILURES AND
INTERMITTENT "LOG STATE MARKED BAD" ERRORS

product:
DB2 FOR LUW / DB2FORLUW / A10 - DB2
Problem description:
- This problem occurs intermittently in DPF (multi-partition)
environments.

- You will notice interrupts (SQLCODE -952) on non-catalog nodes
and rollbacks (SQLCODE -1229) on the catalog node, accompanied by
the following db2diag.log messages:

On non-catalog nodes:
 
2013-02-27-19.42.XXX         XXXX         LEVEL: Error
PID     : 23330818             TID : 140509         PROC : db2sysc 22
INSTANCE: db2inst1             NODE : 015           DB   : SAMPLE
APPHDL  : 0-22                 APPID: xxx.xxx.xxx.xxx.xxxxx.xxxxxxxx
AUTHID  : user                 HOSTNAME: AAAAAA
EDUID   : 140509               EDUNAME: db2agntp (SAMPLE) 15
FUNCTION: DB2 UDB, data protection services, SQLP_DBCB::setLogState, probe:5005
DATA #1 : <preformatted>
Error detected during initialization.  As a result, for precautionary
reasons the database log state has been marked bad.
 
2013-02-27-19.42.XXX         XXXX         LEVEL: Severe
PID     : 23330818             TID : 140509         PROC : db2sysc 22
INSTANCE: db2inst1             NODE : 015           DB   : SAMPLE
APPHDL  : 0-22                 APPID: xxx.xxx.xxx.xxx.xxxxx.xxxxxxxx
AUTHID  : user                 HOSTNAME: AAAAAA
EDUID   : 140509               EDUNAME: db2agntp (SAMPLE) 15
FUNCTION: DB2 UDB, base sys utilities, sqeLocalDatabase::FirstConnect, probe:8721
DATA #1 : SQLCA, PD_DB2_TYPE_SQLCA, 136 bytes
 sqlcaid : SQLCA     sqlcabc: 136   sqlcode: -952   sqlerrml: 0
 sqlerrmc:
 sqlerrp : SQLEDINT
 sqlerrd : (1) 0x00000000      (2) 0x00000000      (3) 0x00000000
           (4) 0x00000000      (5) 0x00000000      (6) 0x00000016
 sqlwarn : (1)      (2)      (3)      (4)        (5)       (6)
           (7)      (8)      (9)      (10)        (11)
 sqlstate:
 
- The first trigger of the problem can be found in db2diag.log,
where the catalog node detects an FCM connection failure, reported
as a timeout, while trying to communicate with the non-catalog node:
 
2013-02-27-19.42.XXX         XXXX         LEVEL: Error
PID     : 23330818             TID : 140509         PROC : db2sysc 22
INSTANCE: db2inst1             NODE : 0             DB   : SAMPLE
APPHDL  : 0-22                 APPID: xxx.xxx.xxx.xxx.xxxxx.xxxxxxxx
AUTHID  : user                 HOSTNAME: AAAAAA
EDUID   : 1800                 EDUNAME: db2fcms 0
FUNCTION: DB2 UDB, fast comm manager, sqkfNetworkServices::detectNodeFailure, probe:15
DATA #1 : <preformatted>
Detected failure for node 15 - time elapsed: 4294967295; max timeout: 500; link state: 4
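
To look for this signature in a db2diag.log, the message text above can be matched with a regular expression. The following is only a minimal sketch: the function name parse_node_failure and the field names are hypothetical, and the pattern assumes the message wording is exactly as quoted in this report (possibly wrapped across lines).

```python
import re

# Pattern for the detectNodeFailure message text quoted above.
# Whitespace is normalized first, so wrapped lines also match.
FAILURE_RE = re.compile(
    r"Detected failure for node (?P<node>\d+) - "
    r"time elapsed: (?P<elapsed>\d+); "
    r"max timeout: (?P<max_timeout>\d+); "
    r"link state: (?P<link_state>\d+)"
)

UINT32_MAX = 0xFFFFFFFF  # 4294967295, the value seen in this report


def parse_node_failure(text):
    """Return the numeric fields of the message, or None if it is absent."""
    normalized = " ".join(text.split())
    match = FAILURE_RE.search(normalized)
    if match is None:
        return None
    fields = {name: int(value) for name, value in match.groupdict().items()}
    # Flag the elapsed value described in this report (all bits set).
    fields["suspect_elapsed"] = fields["elapsed"] == UINT32_MAX
    return fields


sample = (
    "Detected failure for node 15 - time elapsed: 4294967295; max \n"
    "timeout: 500; link state: 4"
)
print(parse_node_failure(sample))
```

An entry with suspect_elapsed set matches the symptom described here; other elapsed values may indicate a genuine communication timeout instead.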
 
By default the max timeout is 500: the default values of 10
seconds (CONN_ELAPSE) and 5 (MAX_CONNRETRIES) convert to 500
seconds.

So in the example above, node 0 reports that it could not reach
node 15 for more than 500 seconds.
The reported time elapsed is 4294967295, which is 0xFFFFFFFF in
hex, i.e. -1 when interpreted as a signed 32-bit value.
This incorrect value is the trigger of the FCM failures, which
result in interrupts on non-catalog nodes, -1229 errors on the
catalog node, and the log state being marked bad.
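
The conversion from 4294967295 to -1 can be checked with a short script. This is an illustrative sketch only: it assumes the elapsed time is held in a 32-bit field, which the value suggests but this report does not state, and the helper name as_signed_32 is hypothetical.

```python
import struct


def as_signed_32(value):
    """Reinterpret an unsigned 32-bit integer as a signed two's-complement one."""
    return struct.unpack("<i", struct.pack("<I", value))[0]


elapsed = 4294967295
print(hex(elapsed))           # 0xffffffff
print(as_signed_32(elapsed))  # -1
```

Printed as an unsigned number, -1 is far larger than the max timeout of 500, so the comparison against the timeout reports a node failure.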
 
In this way the node becomes unreachable because of a timing
problem in DB2.
Problem Summary:
**************************************************************** 
* USERS AFFECTED:                                              * 
* ALL                                                          * 
**************************************************************** 
* PROBLEM DESCRIPTION:                                         * 
* See Problem Description above.                               * 
**************************************************************** 
* RECOMMENDATION:                                              * 
* Upgrade to DB2 Version 10.1 Fix Pack 3.                      * 
****************************************************************
Local Fix:
N/A.
available fix packs:
DB2 Version 10.1 Fix Pack 3 for Linux, UNIX, and Windows
DB2 Version 10.1 Fix Pack 4 for Linux, UNIX, and Windows
DB2 Version 10.1 Fix Pack 3a for Linux, UNIX, and Windows
DB2 Version 10.1 Fix Pack 6 for Linux, UNIX, and Windows

Solution
First fixed in DB2 Version 10.1 Fix Pack 3.
Workaround
not known / see Local fix
BUG-Tracking
forerunner  : APAR is sysrouted TO one or more of the following: IC95228 
follow-up : 
Timestamps
Date  - problem reported    : 20.03.2013
Date  - problem closed      : 19.11.2013
Date  - last modified       : 19.11.2013
Problem solved at the following versions (IBM BugInfos)
10.1.0.3 FixList
Problem solved according to the fixlist(s) of the following version(s)
10.1.0.3 FixList