DB2 - Problem Description

Problem IC65460 Status: Closed

DB2 HA FAILS MONITORING FILESYSTEMS WHEN I/O ERRORS PRESENT.

Product:
DB2 FOR LUW / DB2FORLUW / 970 - DB2
Problem description:
In an integrated HA solution environment, when an I/O problem 
occurs in the system, the mount may remain in the Unknown state 
and no failover occurs. 
 
The scenario is as follows: 
 
1.  An I/O problem occurs in the system. 
2.  TSA calls the monitor script for the filesystem (registered 
as an IBM.Application). 
3.  The monitor script (provided by DB2 HA) attempts to touch a 
file on the filesystem (after verifying the fs is mounted.) 
4.  The touch generates an I/O error. 
5.  In the event of an I/O error, the monitor script then 
calls the stop script to attempt to make sure the fs 
is unmounted. 
6.  The stop script attempts to umount the fs.  But in this 
case, there is a PID accessing the filesystem, preventing the 
umount. 
7.  The stop script retries up to 9 more times (sleeping 
for 10 seconds between each try). 
8.  After the third try (29 seconds after TSA kicked off the 
monitor script), TSA kills the monitor script for exceeding the 
monitor script timeout period as registered for the resource. 
This also kills off the child process (stop script) before it 
can get through its 10 tries to umount. 
9.  Since TSA killed the monitor script, the resource state is 
'Unknown'. 
10.  TSA takes no action on a resource with an unknown state. 
Instead it will start the cycle again by calling the monitor 
script. 
11.  On the affected node, this continues until the machine is 
rebooted.
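
The timing in steps 7 and 8 can be checked with a little arithmetic. The following is a minimal sketch, not part of the DB2 HA scripts: the function names are hypothetical, each umount attempt is assumed to take no measurable time, and the 9-second offset of the first attempt is inferred from the statement that the third try happens 29 seconds after the monitor script starts.

```python
def attempts_before_timeout(first_attempt_at, sleep_s, max_attempts, monitor_timeout_s):
    """Start times (in seconds) of the umount attempts that begin
    before the monitor script is killed at monitor_timeout_s."""
    times = [first_attempt_at + i * sleep_s for i in range(max_attempts)]
    return [t for t in times if t < monitor_timeout_s]


def time_of_last_attempt(first_attempt_at, sleep_s, max_attempts):
    """Start time of the final attempt; the monitor timeout has to be
    larger than this for the whole retry loop to run."""
    return first_attempt_at + (max_attempts - 1) * sleep_s


# Values from the scenario: 10 tries, 10 s sleep, 30 s monitor timeout.
# The 9 s offset is an assumption derived from "third try at 29 seconds".
print(attempts_before_timeout(9, 10, 10, 30))  # [9, 19, 29]
print(time_of_last_attempt(9, 10, 10))         # 99
```

Under these assumptions only three of the ten attempts start before the 30-second limit, and the monitor timeout would need to exceed roughly 99 seconds for the retry loop to finish.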
Problem summary:
****************************************************************
* USERS AFFECTED:                                              *
* DB2/TSA user                                                 *
****************************************************************
* PROBLEM DESCRIPTION:                                         *
* In an integrated HA solution environment, when an I/O        *
* problem occurs in the system, the mount may remain in the    *
* Unknown state and no failover occurs.                        *
* The scenario is as follows:                                  *
* 1.  An I/O problem occurs in the system.                     *
* 2.  TSA calls the monitor script for the filesystem          *
*     (registered as an IBM.Application).                      *
* 3.  The monitor script (provided by DB2 HA) attempts to      *
*     touch a file on the filesystem (after verifying the fs   *
*     is mounted).                                             *
* 4.  The touch generates an I/O error.                        *
* 5.  In the event of an I/O error, the monitor script then    *
*     calls the stop script to attempt to make sure the fs is  *
*     unmounted.                                               *
* 6.  The stop script attempts to umount the fs, but in this   *
*     case a PID is accessing the filesystem, preventing the   *
*     umount.                                                  *
* 7.  The stop script retries up to 9 more times (sleeping     *
*     for 10 seconds between each try).                        *
* 8.  After the third try (29 seconds after TSA kicked off     *
*     the monitor script), TSA kills the monitor script for    *
*     exceeding the monitor script timeout period registered   *
*     for the resource. This also kills the child process      *
*     (stop script) before it can get through its 10 tries     *
*     to umount.                                               *
* 9.  Since TSA killed the monitor script, the resource        *
*     state is 'Unknown'.                                      *
* 10. TSA takes no action on a resource with an unknown        *
*     state. Instead it starts the cycle again by calling      *
*     the monitor script.                                      *
* 11. On the affected node, this continues until the machine   *
*     is rebooted.                                             *
****************************************************************
* RECOMMENDATION:                                              *
* Upgrade to v97fp2.                                           *
****************************************************************
Local-Fix:
Available fix packs:
DB2 Version 9.7 Fix Pack 2 for Linux, UNIX, and Windows
DB2 Version 9.7 Fix Pack 3 for Linux, UNIX, and Windows
DB2 Version 9.7 Fix Pack 3a for Linux, UNIX, and Windows
DB2 Version 9.7 Fix Pack 4 for Linux, UNIX, and Windows
DB2 Version 9.7 Fix Pack 5 for Linux, UNIX, and Windows
DB2 Version 9.7 Fix Pack 6 for Linux, UNIX, and Windows
DB2 Version 9.7 Fix Pack 7 for Linux, UNIX, and Windows
DB2 Version 9.7 Fix Pack 8 for Linux, UNIX, and Windows
DB2 Version 9.7 Fix Pack 9a for Linux, UNIX, and Windows
DB2 Version 9.7 Fix Pack 9 for Linux, UNIX, and Windows
DB2 Version 9.7 Fix Pack 10 for Linux, UNIX, and Windows

Solution
The monitor timeout is only 30 seconds. The fix is either to 
increase the mount monitor timeout to a value larger than 30, so 
that the soft unmount can complete, or to add a force option to 
mountV95_stop.ksh that bypasses the soft unmount when the mount 
monitor encounters an I/O error and calls the mount stop script. 
 
The fix in v97fp2 will contain this new "force" option.
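
The control flow of the second option can be sketched as follows. This is an illustration only and not the contents of mountV95_stop.ksh: the function `stop_mount` and its parameters are hypothetical, the two unmount operations are passed in as callables, and the record does not say how the real force option unmounts the filesystem.

```python
import time
from typing import Callable


def stop_mount(
    mount_point: str,
    try_umount: Callable[[str], bool],    # hypothetical: one soft umount attempt
    force_umount: Callable[[str], bool],  # hypothetical: forced unmount
    force: bool = False,
    max_attempts: int = 10,
    sleep_s: float = 10.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Return True if the filesystem ends up unmounted."""
    if force:
        # Called from the monitor after an I/O error: skip the soft
        # retry loop so the call returns well inside the monitor timeout.
        return force_umount(mount_point)
    for attempt in range(max_attempts):
        if try_umount(mount_point):
            return True
        if attempt < max_attempts - 1:
            sleep(sleep_s)  # wait before the next soft attempt
    return False


# Demo with stand-in callables (no real umount, no real sleeping).
waits = []
busy = lambda mp: False    # soft umount always fails: a PID holds the fs
forced = lambda mp: True   # forced unmount succeeds
print(stop_mount("/db2fs", busy, forced, sleep=waits.append), sum(waits))
print(stop_mount("/db2fs", busy, forced, force=True))
```

In the demo the soft path accumulates 90 seconds of sleep and still fails, while the forced path returns immediately, which is the behavior the monitor script needs in order to stay under its timeout.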
Workaround
None known / see Local-Fix
Further data
Date - problem reported : 2010-01-07
Date - problem closed   : 2010-05-25
Date - last modified    : 2010-05-25
Problem fixed as of the following versions (IBM BugInfos)
9.7.FP2
Problem fixed according to the FixList in version
9.7.0.2 FixList