DB2 - Problem description
Problem IC94865 | Status: Closed |
IN A LARGE DPF/HA CLUSTER, IF ALL RESOURCES ARE STARTED AT ONCE IT IS POSSIBLE THAT CERTAIN RESOURCES ARE NOT BROUGHT ONLINE | |
product: | |
DB2 FOR LUW / DB2FORLUW / 970 - DB2 | |
Problem description: | |
In a large DB2 DPF environment configured for high availability, if all resources are ordered online simultaneously, certain DB2 resources may not start successfully. In the lssam output, affected DB2 partition resources are displayed as "Offline" while the overlying Resource Group (RG) is displayed as "Pending Online".

Example lssam output of a DB2 partition resource group in this state:

|- Pending online IBM.ResourceGroup:db2_db2inst1_115-rg Nominal=Online
   |- Offline IBM.Application:db2_db2inst1_115-rs
      |- Offline IBM.Application:db2_db2inst1_115-rs:bcu_node17
      '- Offline IBM.Application:db2_db2inst1_115-rs:bcu_node21

Examples of commands that attempt to bring all resources online simultaneously:

db2start
chrg -o online -s 1=1
hastartdb2 (applicable only to ISAS)

When the start orders are issued, TSA-MP calls the "/usr/sbin/rsct/sapolicies/db2/db2V97_start.ksh" script for each partition in order to start its corresponding resource. As each resource is started, a call that queries for status forces all other nodes in the cluster to run a monitor command against all other partition resources. As a result, the start activity causes multiple monitors to run in parallel, which creates race conditions that interfere with TSA-MP's ability to capture all the needed return codes. Because of this, TSA-MP is not able to send start orders for all partitions as expected.
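To check whether a cluster is showing the symptom described above, the lssam output can be scanned for resource groups in the "Pending online" state. The following is a minimal sketch, not part of the APAR: the function name find_pending_groups is hypothetical, and it assumes lssam-style text is supplied on standard input in the same layout as the example above.

```shell
#!/bin/sh
# Hypothetical helper (not shipped with DB2 or TSA-MP): print the name of
# every resource group that lssam-style text on stdin reports as
# "Pending online". Matching is case-insensitive on the state text.
find_pending_groups() {
    awk '
        tolower($0) ~ /pending online/ && /IBM\.ResourceGroup:/ {
            for (i = 1; i <= NF; i++) {
                if ($i ~ /^IBM\.ResourceGroup:/) {
                    name = $i
                    # Strip the class prefix, leaving only the group name.
                    sub(/^IBM\.ResourceGroup:/, "", name)
                    print name
                }
            }
        }
    '
}
```

One way to use it is to pipe live or saved lssam output into the function, for example `lssam | find_pending_groups`. A group listed here whose member resources stay "Offline" matches the state shown in the example.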
Problem Summary: | |
USERS AFFECTED:
DB2 DPF and TSAMP

PROBLEM DESCRIPTION:
See Error Description

RECOMMENDATION:
As root, comment out (or delete) the following line from the /usr/sbin/rsct/sapolicies/db2/db2V97_start.ksh file:

runact -s "Name like 'db2_%${DB2INSTANCE?}%'" IBM.Application refreshOpState 2> /dev/null

as it is not needed and is the source of this problem.
Local Fix: | |
As root, comment out (or delete) the following line from the /usr/sbin/rsct/sapolicies/db2/db2V97_start.ksh file:

runact -s "Name like 'db2_%${DB2INSTANCE?}%'" IBM.Application refreshOpState 2> /dev/null

as it is not needed and is the source of this problem.
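The local fix can be applied by hand in an editor. As an illustration only, the sketch below comments the line out with sed after saving a backup copy. The function name comment_out_refresh and the ".bak" suffix are hypothetical, and the sketch assumes the runact command sits on a single line of the script, as quoted above.

```shell
#!/bin/sh
# Hypothetical helper (not part of the APAR): comment out the
# "runact ... refreshOpState" line in the file given as $1.
# Assumes the command occupies a single line, as quoted in the local fix.
comment_out_refresh() {
    file=$1
    # Keep a backup so the change can be reverted.
    cp -p "$file" "$file.bak" || return 1
    # Prefix only lines that are not already commented out with "# ".
    # Redirecting into the original file keeps its permissions.
    sed '/^[[:space:]]*runact .*refreshOpState/ s/^/# /' "$file.bak" > "$file"
}
```

Run as root, a call might look like `comment_out_refresh /usr/sbin/rsct/sapolicies/db2/db2V97_start.ksh`. Because the script is called for each partition across the cluster, the copy on every node would presumably need the same change.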
available fix packs: | |
DB2 Version 9.7 Fix Pack 9 for Linux, UNIX, and Windows | |
Solution | |
Obtain DB2 9.7 fixpack 9 | |
Workaround | |
not known / see Local fix | |
BUG-Tracking | |
forerunner : APAR is sysrouted TO one or more of the following: IC95596 IC95661
follow-up :
Timestamps | |
Date - problem reported : 13.08.2013
Date - problem closed : 17.12.2013
Date - last modified : 17.12.2013
Problem solved at the following versions (IBM BugInfos) | |
9.7.FP9 | |
Problem solved according to the fixlist(s) of the following version(s) | |
9.7.0.9 | |