DB2 - Problem description
Problem IC94574 | Status: Closed |
MAKE HADRVXX_MONITOR.KSH SCRIPT MORE ROBUST TO REDUCE FALSE NEGATIVES | |
Product: | |
DB2 FOR LUW / DB2FORLUW / 970 - DB2 | |
Problem description: | |
** This APAR applies only to integrated HA solutions **

On memory-constrained or very busy systems, ps behaves unpredictably and may return the process name in square brackets. As a result, there is a chance that this check returns 0:

    p_pid=$(ps -u ${candidate_P_instance?} -o args | grep -c "^db2sysc")

which in turn makes the hadrVxx_monitor.ksh script return a status of 2, i.e. the HADR resource is down.

Additionally, the following line fills the p_out variable with the filtered ps output. Because the variable is later echoed unquoted, all of the ps output ends up on one line, which is then grepped:

    p_out=$(ps -u ${candidate_P_instance?} -o args | egrep "^db2sysc|^\[db2sysc\]"| grep -v defunct)

When that single line contains a db2sysc entry in brackets, the script returns with a 0 return code, which places HADR in an unknown state.
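The line-collapsing effect described above can be reproduced without a running DB2 instance. The following is a minimal shell sketch, assuming a hand-written two-line sample in place of real ps output. The sample data and the variable names unquoted_lines, collapsed_count and quoted_count are illustrative and are not part of hadrVxx_monitor.ksh.

```shell
# Simulated "ps -o args" output: one real db2sysc line and one bracketed entry.
# (Hypothetical sample data; on a real system this would come from ps.)
p_out='db2sysc 0
[db2sysc]'

# Disable filename globbing for this demo so that the unquoted word
# [db2sysc] is never treated as a bracket pattern by the shell.
set -f

# Unquoted expansion: the shell word-splits $p_out and echo joins the
# words with single spaces, so everything arrives on ONE line.
unquoted_lines=$(echo $p_out | wc -l | tr -d ' ')

# Line-oriented filtering of the collapsed line: the single line contains
# "[db2sysc]", so grep -v drops it entirely and the real process is lost.
# "|| true": grep exits non-zero when the count is 0, which would abort
# a script running under "set -e".
collapsed_count=$(echo $p_out | grep -v "\[db2sysc\]" | grep -c "db2sysc" || true)

# Quoted expansion keeps one line per process, so only the bracketed
# line is filtered out.
quoted_count=$(echo "$p_out" | grep -v "\[db2sysc\]" | grep -c "db2sysc" || true)

set +f

echo "lines=$unquoted_lines collapsed=$collapsed_count quoted=$quoted_count"
# prints: lines=1 collapsed=0 quoted=1
```

With the collapsed form the real db2sysc process is counted as absent even though it is running, which is the false negative this APAR addresses.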
Problem summary: | |
USERS AFFECTED:
    All users of the HADR TSA scripts.
PROBLEM DESCRIPTION:
    See Error Description
RECOMMENDATION:
    Upgrade to DB2 v9.7 FP9.
Local-Fix: | |
To avoid these "false negatives", the above lines have been modified as follows:

OLD

    # We need to checking if the instance start is complete here. To minimize
    # the dependency on the checking for the db2 start script, we first check
    # if there is a db2sysc process. If not, no need to go further. If there is
    # a db2sysc process, check if the script is still running.
    p_pid=$(ps -u ${candidate_P_instance?} -o args | grep -c "^db2sysc")
    if [ $p_pid -eq 0 ]; then
       logger -i -p info -t $PROGNAME "$0 $DB2INSTANCE is offline, hadr monitor must return offline"
       rc=2
       return $rc
    elif [ $p_pid -ne 0 ]; then
       isInstanceStartRunning=$(ps -ef | grep db2V97_start.ksh | grep ${candidate_P_instance?} | wc -l)
       if [[ $isInstanceStartRunning -ne 0 ]]; then
          logger -i -p notice -t $0 "db2sysc started but Instance start still running...return OFFLINE for HADR database ${DB2HADRDBNAME?}"
          rc=2
          return $rc
       fi
    fi

NEW

    # We need to checking if the instance start is complete here. To minimize
    # the dependency on the checking for the db2 start script, we first check
    # if there is a db2sysc process. If not, no need to go further. If there is
    # a db2sysc process, check if the script is still running.
    p_out=$(ps -u ${candidate_P_instance?} -o args | egrep "^db2sysc|^\[db2sysc\]"| grep -v defunct)
    p_pid=$(echo $p_out | grep -v "\[db2sysc\]" | grep -c "db2sysc")
    if [ $p_pid -eq 0 ]; then
       p_pid=$(echo $p_out | grep -c "\[db2sysc\]")
       if [[ $p_pid != 0 ]]; then
          log err "ps returns only [db2sysc]: returning 0"
          rc=0
       else
          logger -i -p info -t $PROGNAME "$0 $DB2INSTANCE is offline, hadr monitor must return offline"
          rc=2
       fi
       return $rc
    elif [ $p_pid -ne 0 ]; then
       isInstanceStartRunning=$(ps -ef | grep db2V97_start.ksh | grep ${candidate_P_instance?} | wc -l)
       if [[ $isInstanceStartRunning -ne 0 ]]; then
          logger -i -p notice -t $0 "db2sysc started but Instance start still running...return OFFLINE for HADR database ${DB2HADRDBNAME?}"
          rc=2
          return $rc
       fi
    fi

OLD

    p_out=$(ps -u ${candidate_P_instance?} -o args | egrep "^db2sysc|^\[db2sysc\]"| grep -v defunct)
    p_pid=$(echo $p_out | grep -v "\[db2sysc\]" | grep -c "db2sysc")
    if [ $p_pid -eq 0 ]; then
       p_pid=$(echo $p_out | grep -c "\[db2sysc\]")
       if [[ $p_pid != 0 ]]; then
          log err "ps returns only [db2sysc]: returning 0"
          rc=0

NEW

    p_out=$(ps -u ${candidate_P_instance?} -o args | egrep "^db2sysc|^\[db2sysc\]"| grep -v defunct)
    p_pid=$(echo $p_out | tr " " "\n" | grep -v "\[db2sysc\]" | grep -c "db2sysc")
    if [ $p_pid -eq 0 ]; then
       p_pid=$(echo $p_out | grep -c "\[db2sysc\]")
       if [[ $p_pid != 0 ]]; then
          log err "ps returns only [db2sysc]: returning 0"
          rc=0
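The effect of the added tr " " "\n" step can be tried in isolation. The sketch below assumes a hypothetical helper, classify_db2sysc, that takes a captured ps output string as its argument and prints a label instead of setting rc. The helper, its labels and the sample inputs are illustrative only and do not appear in hadrVxx_monitor.ksh. The filtering pipelines are copied from the final NEW snippet, with "|| true" added so the sketch also runs under "set -e".

```shell
# Hypothetical helper, not part of hadrVxx_monitor.ksh: applies the final
# NEW filtering logic to a captured ps output string and prints a label.
# The body is a subshell "( ... )" so that "set -f" stays local to it.
classify_db2sysc() (
    set -f    # keep the unquoted word [db2sysc] from being globbed
    p_out=$1

    # tr puts every word on its own line, so grep -v removes only the
    # bracketed word instead of the whole collapsed line.
    p_pid=$(echo $p_out | tr " " "\n" | grep -v "\[db2sysc\]" | grep -c "db2sysc" || true)

    if [ "$p_pid" -eq 0 ]; then
        p_pid=$(echo $p_out | grep -c "\[db2sysc\]" || true)
        if [ "$p_pid" -ne 0 ]; then
            echo "unknown"     # corresponds to rc=0 in the script
        else
            echo "offline"     # corresponds to rc=2 in the script
        fi
    else
        echo "continue"        # a real db2sysc exists; the script goes on
    fi
)

# Real process plus a bracketed entry: the real process is now counted.
classify_db2sysc 'db2sysc 0
[db2sysc]'                     # prints: continue

# Only the bracketed entry is visible.
classify_db2sysc '[db2sysc]'   # prints: unknown

# No db2sysc entry at all.
classify_db2sysc ''            # prints: offline
```

Quoting the variable (echo "$p_out") would also preserve the original line breaks. The sketch keeps the unquoted form only because that is what the fix listed above uses.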
Available fix packs: | |
DB2 Version 9.7 Fix Pack 9 for Linux, UNIX, and Windows | |
Solution | |
Fixed in DB2 v9.7 FP9. | |
Workaround | |
None known / see Local-Fix | |
Bug tracking | |
Predecessor: APAR is sysrouted TO one or more of the following: IC97638 IC97667 | |
Successor: | |
Additional data | |
Date problem reported: | 01.08.2013 |
Date problem closed: | 23.12.2013 |
Date of last change: | 23.12.2013 |
Problem fixed as of the following versions (IBM BugInfos) | |
9.7.FP9 | |
Problem fixed according to the FixList in version | |
9.7.0.9 | |