DB2 - Problem description

Problem IC86957 Status: Closed

PURESCALE MEMBER CRASH RECOVERY AND CONCURRENT APPLICATION ON ANOTHER
MEMBER MAY HANG

product:
DB2 FOR LUW / DB2FORLUW / A10 - DB2
Problem description:
A hang could occur during the undo phase of member crash recovery 
(MCR). 
 
This could happen when MCR undoes one or more transactions that 
held an X table lock and changed indexes enough to cause index 
page splits or page deletes, while concurrent applications on 
other members are accessing the same index with an IN table 
lock. 
 
For example, it could occur as follows: 
- On member 1, txn1 was doing a large delete, escalated to an X 
table lock, and did many index page deletes, some of which 
required an X tree lock and some of which required an IX tree 
lock. 
- Member 1 crashes. MCR undoes one of the page deletes that got 
an IX tree lock and continues to undo more of these. 
- During MCR, txn2 on member 2 does an operation that requires 
only an IN table lock. It traverses the index, sees an empty 
index page that MCR is about to undelete, and requests an S 
tree lock. Such operations include UR index scans and RUNSTATS 
with ALLOW WRITE ACCESS, among others. The S request waits 
behind the IX tree lock held by MCR. 
- MCR on member 1 tries to undo one of the page deletes that got 
an X tree lock. This request currently waits behind txn2, so 
MCR and txn2 both wait forever. 
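
The wait cycle in the example above can be illustrated with a toy 
model. This is only a sketch and not DB2's lock manager: the 
function names, the three-mode compatibility table and the strict 
first-in-first-out queue (a new request waits behind every earlier 
waiter) are assumptions made for illustration. 

```python
# Toy model only; all names here are hypothetical and not DB2 internals.

# Standard compatibility of the three tree lock modes named in the text:
# (held mode, requested mode) -> can both be granted at the same time?
COMPATIBLE = {
    ("IX", "IX"): True,  ("IX", "S"): False, ("IX", "X"): False,
    ("S", "IX"): False,  ("S", "S"): True,   ("S", "X"): False,
    ("X", "IX"): False,  ("X", "S"): False,  ("X", "X"): False,
}


def wait_for_edges(holders, queue):
    """Build wait-for edges (waiter, blocker) for one lock.

    holders: list of (owner, mode) currently granted.
    queue:   FIFO list of (owner, mode) still waiting.
    Assumption: a waiter is blocked by every incompatible holder owned
    by someone else, and by every waiter queued ahead of it.
    """
    edges = set()
    for i, (owner, mode) in enumerate(queue):
        for h_owner, h_mode in holders:
            if h_owner != owner and not COMPATIBLE[(h_mode, mode)]:
                edges.add((owner, h_owner))
        for earlier_owner, _ in queue[:i]:
            if earlier_owner != owner:
                edges.add((owner, earlier_owner))
    return edges


def has_cycle(edges):
    """Return True if the wait-for graph contains a cycle (a deadlock)."""
    graph = {}
    for waiter, blocker in edges:
        graph.setdefault(waiter, set()).add(blocker)

    def reachable(start, target, seen):
        for nxt in graph.get(start, ()):
            if nxt == target:
                return True
            if nxt not in seen:
                seen.add(nxt)
                if reachable(nxt, target, seen):
                    return True
        return False

    return any(reachable(node, node, set()) for node in graph)


# State from the example: MCR holds IX, txn2 asks for S, then MCR asks for X.
holders = [("MCR", "IX")]
queue = [("txn2", "S"), ("MCR", "X")]
edges = wait_for_edges(holders, queue)
print(sorted(edges))
print(has_cycle(edges))
```

In this model txn2 waits for MCR because S is incompatible with 
the IX that MCR holds, and the X request from MCR waits for txn2 
because it is queued behind it, which closes the cycle. 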
 
If this occurs, you will see entries such as the following in 
the db2diag.log: 
FUNCTION: DB2 UDB, recovery manager, sqlpresr, probe:1305 
MESSAGE : Member crash recovery started. 
 
FUNCTION: DB2 UDB, recovery manager, sqlprDbBackwardPhase, 
probe:1100 
MESSAGE : Start of undo phase. 
 
But you will not see the corresponding entry: 
FUNCTION: DB2 UDB, recovery manager, sqlpresr, probe:3105 
MESSAGE : ADM1525I  Member crash recovery has completed 
successfully. 
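
The check described above can be sketched as a small text scan. 
This is a minimal sketch under stated assumptions: it matches only 
the message strings quoted in this description, the function name 
is hypothetical, and the layout of a real db2diag.log may call for 
more careful record parsing. A missing completion message can also 
mean that recovery is still running, so a True result is a hint to 
look at the stack traces and lock waits, not proof of a hang. 

```python
# Hypothetical helper; it only looks for the message text quoted above.

def mcr_undo_not_completed(log_text):
    """Return True if the most recent 'Member crash recovery started'
    entry is followed by 'Start of undo phase' but not by ADM1525I."""
    start = log_text.rfind("Member crash recovery started")
    if start == -1:
        return False  # no member crash recovery recorded at all
    tail = log_text[start:]
    return "Start of undo phase" in tail and "ADM1525I" not in tail


# Small in-memory sample built from the entries quoted in this description.
sample = (
    "FUNCTION: DB2 UDB, recovery manager, sqlpresr, probe:1305\n"
    "MESSAGE : Member crash recovery started.\n"
    "\n"
    "FUNCTION: DB2 UDB, recovery manager, sqlprDbBackwardPhase,\n"
    "probe:1100\n"
    "MESSAGE : Start of undo phase.\n"
)
print(mcr_undo_not_completed(sample))
```

To run it against a real file, read the file contents into a 
string and pass that string to the function. 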
 
Both member crash recovery and the concurrent application will 
be waiting with sqlilidx in their stack trace. 
 
The member performing crash recovery holds the SQLP_TREELOCK 
lock in IX mode, the concurrent application on the other member 
is waiting for it in S mode, and the transaction being undone by 
member crash recovery is waiting for it in X mode.
Problem Summary:
**************************************************************** 
* USERS AFFECTED:                                              * 
* pureScale users                                              * 
**************************************************************** 
* PROBLEM DESCRIPTION:                                         * 
* See Error Description                                        * 
**************************************************************** 
* RECOMMENDATION:                                              * 
* Upgrade to DB2 Version 10.1 Fix Pack 2.                      * 
****************************************************************
Local Fix:
Force the concurrent application that is waiting for the 
SQLP_TREELOCK lock in S mode on the member that is not doing 
member crash recovery to allow the crash recovery to complete.
available fix packs:
DB2 Version 10.1 Fix Pack 2 for Linux, UNIX, and Windows
DB2 Version 10.1 Fix Pack 3 for Linux, UNIX, and Windows
DB2 Version 10.1 Fix Pack 4 for Linux, UNIX, and Windows
DB2 Version 10.1 Fix Pack 3a for Linux, UNIX, and Windows
DB2 Version 10.1 Fix Pack 6 for Linux, UNIX, and Windows

Solution
The problem is first fixed in DB2 Version 10.1 Fix Pack 2.
Workaround
not known / see Local fix
Timestamps
Date  - problem reported    : 02.10.2012
Date  - problem closed      : 17.12.2012
Date  - last modified       : 17.12.2012
Problem solved at the following versions (IBM BugInfos)
Problem solved according to the fixlist(s) of the following version(s)
10.1.0.2 FixList
10.5.0.2 FixList