DB2 - Problem description

Problem IC97638 Status: Closed

MAKE HADRVXX_MONITOR.KSH SCRIPT MORE ROBUST TO REDUCE FALSE NEGATIVES

product:
DB2 FOR LUW / DB2FORLUW / A50 - DB2
Problem description:
** This APAR applies only to integrated HA solutions **

On memory-constrained or very busy systems, ps behaves
unpredictably and may return the process name in square
brackets.
Hence there is a chance that this check:

p_pid=$(ps -u ${candidate_P_instance?} -o args | grep -c "^db2sysc")

returns 0, which in turn makes the hadrVxx_monitor.ksh script
return a status of 2, i.e. the HADR resource is down.

Additionally, the following line with the p_out variable ends up
storing all of the ps output on one line, which is then grepped.
When that line contains a db2sysc entry in brackets, the script
returns a return code of 0, which places HADR in an unknown state:

p_out=$(ps -u ${candidate_P_instance?} -o args | egrep "^db2sysc|^\[db2sysc\]"| grep -v defunct)
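
The two failure modes described above can be reproduced with a minimal sketch that runs the same kind of pipelines against canned text instead of a live ps call. The sample ps output and the variable names (sample_ps, bracket_only, and the counters) are assumptions made for illustration and are not part of hadrVxx_monitor.ksh. grep -F is swapped in for the escaped bracket pattern so that the literal text [db2sysc] is matched.

```shell
# Demo precaution: disable globbing so the unquoted [db2sysc] below
# is not treated as a filename pattern by the shell.
set -f

# Hypothetical ps output: ps reports only the bracketed name.
bracket_only='[db2sysc]'

# Hypothetical ps output: one normal entry plus one bracketed entry.
sample_ps='db2sysc 0
[db2sysc]'

# Failure mode 1: the original check counts lines starting with
# "db2sysc", so a bracketed entry is not counted.
orig_count=$(printf '%s\n' "$bracket_only" | grep -c "^db2sysc" || true)

# Failure mode 2: an unquoted expansion joins all lines into one.
quoted_lines=$(echo "$sample_ps" | grep -c '')
unquoted_lines=$(echo $sample_ps | grep -c '')

# Because everything is on one line, grep -v drops the normal entry
# together with the bracketed one, and the count becomes 0.
joined_count=$(echo $sample_ps | grep -v -F "[db2sysc]" | grep -c "db2sysc" || true)

echo "orig_count=$orig_count"
echo "quoted_lines=$quoted_lines unquoted_lines=$unquoted_lines"
echo "joined_count=$joined_count"
```

With the sample data, the normal db2sysc entry is present but is not counted once the output has been joined onto a single line.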
Problem Summary:
**************************************************************** 
* USERS AFFECTED:                                              * 
* All                                                          * 
**************************************************************** 
* PROBLEM DESCRIPTION:                                         * 
* See Error Description                                        * 
**************************************************************** 
* RECOMMENDATION:                                              * 
* Upgrade to DB2 Cancun Release 10.5.0.4 (also known as Fix    * 
* Pack 4) or higher.                                           * 
****************************************************************
Local Fix:
To avoid these "false negatives", the above lines have been 
modified as follows: 
 
OLD

   # We need to checking if the instance start is complete here. To minimize
   # the dependency on the checking for the db2 start script, we first check
   # if there is a db2sysc process. If not, no need to go further. If there is
   # a db2sysc process, check if the script is still running.
   p_pid=$(ps -u ${candidate_P_instance?} -o args | grep -c "^db2sysc")
   if [ $p_pid -eq 0 ]; then
      logger -i -p info -t $PROGNAME "$0 $DB2INSTANCE is offline, hadr monitor must return offline"
      rc=2
      return $rc
   elif [ $p_pid -ne 0 ]; then
      isInstanceStartRunning=$(ps -ef | grep db2V97_start.ksh | grep ${candidate_P_instance?} | wc -l)
      if [[ $isInstanceStartRunning -ne 0 ]]; then
         logger -i -p notice -t $0 "db2sysc started but Instance start still running...return OFFLINE for HADR database ${DB2HADRDBNAME?}"
         rc=2
         return $rc
      fi
   fi
 
 
NEW

   # We need to checking if the instance start is complete here. To minimize
   # the dependency on the checking for the db2 start script, we first check
   # if there is a db2sysc process. If not, no need to go further. If there is
   # a db2sysc process, check if the script is still running.
   p_out=$(ps -u ${candidate_P_instance?} -o args | egrep "^db2sysc|^\[db2sysc\]"| grep -v defunct)
   p_pid=$(echo $p_out | grep -v "\[db2sysc\]" | grep -c "db2sysc")
   if [ $p_pid -eq 0 ]; then
      p_pid=$(echo $p_out | grep -c "\[db2sysc\]")
      if [[ $p_pid != 0 ]]; then
         log err "ps returns only [db2sysc]: returning 0"
         rc=0
      else
         logger -i -p info -t $PROGNAME "$0 $DB2INSTANCE is offline, hadr monitor must return offline"
         rc=2
      fi
      return $rc
   elif [ $p_pid -ne 0 ]; then
      isInstanceStartRunning=$(ps -ef | grep db2V97_start.ksh | grep ${candidate_P_instance?} | wc -l)
      if [[ $isInstanceStartRunning -ne 0 ]]; then
         logger -i -p notice -t $0 "db2sysc started but Instance start still running...return OFFLINE for HADR database ${DB2HADRDBNAME?}"
         rc=2
         return $rc
      fi
   fi
 
 
OLD

   p_out=$(ps -u ${candidate_P_instance?} -o args | egrep "^db2sysc|^\[db2sysc\]"| grep -v defunct)
   p_pid=$(echo $p_out | grep -v "\[db2sysc\]" | grep -c "db2sysc")
   if [ $p_pid -eq 0 ]; then
      p_pid=$(echo $p_out | grep -c "\[db2sysc\]")
      if [[ $p_pid != 0 ]]; then
         log err "ps returns only [db2sysc]: returning 0"
         rc=0
 
 
NEW

   p_out=$(ps -u ${candidate_P_instance?} -o args | egrep "^db2sysc|^\[db2sysc\]"| grep -v defunct)
   p_pid=$(echo $p_out | tr " " "\n" | grep -v "\[db2sysc\]" | grep -c "db2sysc")
   if [ $p_pid -eq 0 ]; then
      p_pid=$(echo $p_out | grep -c "\[db2sysc\]")
      if [[ $p_pid != 0 ]]; then
         log err "ps returns only [db2sysc]: returning 0"
         rc=0
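
The effect of the tr step in the last NEW fragment can be sketched as follows. classify_db2sysc is a hypothetical helper written for this illustration and is not a function in hadrVxx_monitor.ksh; it takes already-captured ps output as its argument and prints which branch the check would take. The sample inputs are assumed, and grep -F is used here for the literal [db2sysc] text in place of the escaped pattern.

```shell
# Demo precaution: disable globbing so the unquoted [db2sysc]
# is not treated as a filename pattern by the shell.
set -f

# Hypothetical helper: classify captured ps output.
#   prints "rc=0"     when only a bracketed [db2sysc] entry is seen
#   prints "rc=2"     when no db2sysc entry is seen at all
#   prints "continue" when a normal db2sysc entry is seen
classify_db2sysc() {
   p_out=$1
   # tr puts every word on its own line, so removing the bracketed
   # entry no longer removes the normal entry along with it.
   p_pid=$(echo $p_out | tr " " "\n" | grep -v -F "[db2sysc]" | grep -c "db2sysc" || true)
   if [ "$p_pid" -eq 0 ]; then
      p_pid=$(echo $p_out | grep -c -F "[db2sysc]" || true)
      if [ "$p_pid" -ne 0 ]; then
         echo "rc=0"
      else
         echo "rc=2"
      fi
   else
      echo "continue"
   fi
}

# Assumed sample inputs.
classify_db2sysc 'db2sysc 0'
classify_db2sysc '[db2sysc]'
classify_db2sysc 'db2sysc 0
[db2sysc]'
classify_db2sysc ''
```

The third call is the case that differs between the two versions: with the tr step the normal entry is still counted even though a bracketed entry is also present.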
available fix packs:
DB2 Cancun Release 10.5.0.4 (also known as Fix Pack 4) for Linux, UNIX, and Windows
DB2 Version 10.5 Fix Pack 9 for Linux, UNIX, and Windows

Solution
Fixed in DB2 Cancun Release 10.5.0.4 (also known as Fix Pack 4)
Workaround
not known / see Local fix
Timestamps
Date  - problem reported    : 13.11.2013
Date  - problem closed      : 16.10.2014
Date  - last modified       : 16.10.2014
Problem solved at the following versions (IBM BugInfos)
Problem solved according to the fixlist(s) of the following version(s)
10.5.0.4 FixList