DB2 - Problem description
Problem IC86957 | Status: Closed |
PURESCALE MEMBER CRASH RECOVERY AND CONCURRENT APPLICATION ON ANOTHER MEMBER MAY HANG | |
product: | |
DB2 FOR LUW / DB2FORLUW / A10 - DB2 | |
Problem description: | |
During the undo phase of member crash recovery (MCR), a hang could occur. This could happen when MCR is undoing transaction(s) that held an X table lock and made enough index changes that index page splits or page deletes were done, while concurrent applications on other members are accessing the index with an IN table lock.

For example, it could occur if:
- Member 1: txn1 was doing a large delete, escalated to an X table lock, and has done many index page deletes, some of which required an X tree lock and some of which needed an IX tree lock.
- Member 1 crashes. MCR has undone one of the page deletes that got an IX tree lock and continues to undo more of these.
- During MCR, txn2 on member 2 is doing an operation that requires only an IN table lock, traverses the index, sees an empty index page that is about to be undeleted by MCR, and requests an S tree lock. This includes UR index scans and RUNSTATS allow write mode ... This request waits behind the IX tree lock held by MCR.
- MCR on member 1 tries to undo one of the page deletes that got an X tree lock. This request currently waits behind txn2, so both MCR and txn2 wait forever.

If this occurs, you will see entries such as the following in the db2diag.log:

FUNCTION: DB2 UDB, recovery manager, sqlpresr, probe:1305
MESSAGE : Member crash recovery started.

FUNCTION: DB2 UDB, recovery manager, sqlprDbBackwardPhase, probe:1100
MESSAGE : Start of undo phase.

But you will not see the corresponding entry:

FUNCTION: DB2 UDB, recovery manager, sqlpresr, probe:3105
MESSAGE : ADM1525I Member crash recovery has completed successfully.

Both member crash recovery and the concurrent application will be waiting with sqlilidx in their stack trace. The member performing crash recovery holds the SQLP_TREELOCK lock in IX mode, the concurrent application on the other member is waiting for it in S mode, and the transaction being undone by member crash recovery is waiting for the lock in X mode. | |
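The wait cycle described above can be illustrated with a small model. This is only a sketch under stated assumptions: the compatibility table covers just the three tree-lock modes named in this description using standard intent-lock rules, the lock queue is treated as strictly first-in first-out, and the names `COMPATIBLE`, `wait_edges`, and `has_deadlock` are hypothetical. It does not reproduce DB2's internal lock manager.

```python
# Simplified compatibility for the three tree-lock modes named above.
# Assumption: standard intent-lock rules, keyed as (held mode, requested mode).
COMPATIBLE = {
    ("IX", "IX"): True,  ("IX", "S"): False, ("IX", "X"): False,
    ("S", "IX"): False,  ("S", "S"): True,   ("S", "X"): False,
    ("X", "IX"): False,  ("X", "S"): False,  ("X", "X"): False,
}


def wait_edges(holders, queue):
    """Return (waiter, blocker) pairs for a strict FIFO lock queue.

    holders: dict mapping owner -> granted mode
    queue:   list of (owner, requested mode), oldest request first
    """
    edges = set()
    for pos, (owner, mode) in enumerate(queue):
        # Blocked by any other holder whose granted mode is incompatible.
        for holder, held in holders.items():
            if holder != owner and not COMPATIBLE[(held, mode)]:
                edges.add((owner, holder))
        # Assumed FIFO rule: also blocked by every earlier waiter.
        for earlier, _ in queue[:pos]:
            if earlier != owner:
                edges.add((owner, earlier))
    return edges


def has_deadlock(edges):
    """Detect a cycle in the wait-for graph with a depth-first search."""
    graph = {}
    for waiter, blocker in edges:
        graph.setdefault(waiter, set()).add(blocker)

    def visit(node, path):
        if node in path:
            return True
        return any(visit(nxt, path | {node}) for nxt in graph.get(node, ()))

    return any(visit(start, frozenset()) for start in graph)


# Scenario from the description: MCR holds IX, txn2 queues for S,
# then MCR queues for X behind txn2.
holders = {"MCR": "IX"}
queue = [("txn2", "S"), ("MCR", "X")]
edges = wait_edges(holders, queue)
print(sorted(edges))
print(has_deadlock(edges))
```

In this model txn2 waits on MCR because S is incompatible with the granted IX, and MCR waits on txn2 only because of the assumed FIFO ordering, which is why removing txn2 from the queue breaks the cycle.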
Problem Summary: | |
USERS AFFECTED: Purescale users
PROBLEM DESCRIPTION: See Error Description
RECOMMENDATION: Upgrade to DB2 version 10 fixpack 2. | |
Local Fix: | |
On the member that is not performing member crash recovery, force the concurrent application that is waiting for the SQLP_TREELOCK lock in S mode. This allows the crash recovery to complete. | |
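Before forcing an application, it may help to confirm the db2diag.log symptom from the problem description. The following is a minimal sketch, assuming the log has been read into a string and that plain substring matching on the three messages quoted in the description is good enough; the function name `mcr_stuck_in_undo` is hypothetical. A positive result only means the latest recovery has started its undo phase without logging completion, which is also true of a recovery that is still progressing normally, so the stack traces and lock waits described above are still needed to confirm the hang.

```python
# Message fragments quoted in the problem description.
STARTED = "Member crash recovery started."
UNDO = "Start of undo phase."
DONE = "ADM1525I"


def mcr_stuck_in_undo(diag_text):
    """Return True if the latest MCR start reached undo with no completion."""
    start = diag_text.rfind(STARTED)
    if start == -1:
        return False  # no member crash recovery recorded at all
    tail = diag_text[start:]
    return UNDO in tail and DONE not in tail


# Sample text built from the entries shown in the description.
hung_log = (
    "FUNCTION: DB2 UDB, recovery manager, sqlpresr, probe:1305\n"
    "MESSAGE : Member crash recovery started.\n"
    "FUNCTION: DB2 UDB, recovery manager, sqlprDbBackwardPhase, probe:1100\n"
    "MESSAGE : Start of undo phase.\n"
)
finished_log = hung_log + (
    "FUNCTION: DB2 UDB, recovery manager, sqlpresr, probe:3105\n"
    "MESSAGE : ADM1525I Member crash recovery has completed successfully.\n"
)

print(mcr_stuck_in_undo(hung_log))
print(mcr_stuck_in_undo(finished_log))
```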
available fix packs: | |
DB2 Version 10.1 Fix Pack 2 for Linux, UNIX, and Windows | |
Solution | |
The problem is first fixed in DB2 version 10 fixpack 2. | |
Workaround | |
not known / see Local fix | |
Timestamps | |
Date - problem reported : | 02.10.2012 |
Date - problem closed : | 17.12.2012 |
Date - last modified : | 17.12.2012 |
Problem solved at the following versions (IBM BugInfos) | |
Problem solved according to the fixlist(s) of the following version(s) | |
10.1.0.2 | |
10.5.0.2 |