DB2 - Problem description

Problem IC66800 Status: Closed

DB2STOP TAKES A LONG TIME ON HADR SYSTEM IF STANDBY IS OFFLINE AND DATABASE
NOT ACTIVATED

product:
DB2 FOR LUW / DB2FORLUW / 910 - DB2
Problem description:
On a HADR system where the standby is offline and the database on 
the primary is not activated, the first connection to the database 
is the one that activates it. If we run db2 connect to <database 
name>, or db2 start hadr on <database name> as primary without the 
"by force" option, the connection tries to start HADR and connect 
to the standby. It eventually times out after the HADR_TIMEOUT 
interval and fails with SQL1768N Unable to start HADR. Reason 
code = "7". 
 
2010-02-03-04.29.45.699578+000 I2838967A388       LEVEL: Warning 
PID     : 606706               TID  : 1           PROC : db2agent (FRH) 0 
INSTANCE: db2frh               NODE : 000 
APPHDL  : 0-171                APPID: *LOCAL.db2frh.100203042357 
AUTHID  : DB2FRH 
FUNCTION: DB2 UDB, High Availability Disaster Recovery, hdrEduStartup, probe:21151 
MESSAGE : Info: HADR Startup has begun. 
 
2010-02-03-04.30.16.734635+000 I2842396A552       LEVEL: Error 
PID     : 999878               TID  : 1           PROC : db2hadrp (FRH) 0 
INSTANCE: db2frh               NODE : 000 
FUNCTION: DB2 UDB, High Availability Disaster Recovery, hdrEduP, probe:20390 
MESSAGE : HADR primary did not establish connection with standby within timeout 
          and will shut down. BY FORCE option required to start primary without 
          standby. Timeout seconds = 
DATA #1 : Hexdump, 4 bytes 
0x07800001D52CD008 : 0000 001E 
 
2010-01-14-06.35.07.374445+000 I5172086A471       LEVEL: Error 
PID     : 1077874              TID  : 1           PROC : db2agent (AB7) 0 
INSTANCE: db2ab7               NODE : 000 
APPHDL  : 0-8                  APPID: *LOCAL.db2ab7.100114063322 
AUTHID  : DB2AB7 
FUNCTION: DB2 UDB, High Availability Disaster Recovery, hdrEduStartup, probe:21300 
MESSAGE : HADR EDU sqlcode: 
DATA #1 : Hexdump, 4 bytes 
0x000000011121526C : FFFF F918 
.... 
 
2010-01-14-06.35.07.374514+000 I5172558A419       LEVEL: Severe 
PID     : 1077874              TID  : 1           PROC : db2agent (AB7) 0 
INSTANCE: db2ab7               NODE : 000 
APPHDL  : 0-8                  APPID: *LOCAL.db2ab7.100114063322 
AUTHID  : DB2AB7 
FUNCTION: DB2 UDB, base sys utilities, sqledint, probe:230 
DATA #1 : Hexdump, 4 bytes 
0x000000011121526C : FFFF F918 
.... 
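 
The hexdump payloads in the entries above can be read as 4-byte integers. The following is a minimal Python sketch, not part of the original report. It assumes the bytes are printed in big-endian order (the diagnostics do not state the byte order), and decode_hexdump is a hypothetical helper name.

```python
def decode_hexdump(payload: str, signed: bool = False) -> int:
    """Decode a db2diag-style hexdump payload such as '0000 001E'.

    Assumption: the bytes are printed in big-endian order.
    """
    raw = bytes.fromhex(payload.replace(" ", ""))
    return int.from_bytes(raw, byteorder="big", signed=signed)


# Payload that follows "Timeout seconds =" in the hdrEduP entry.
timeout_seconds = decode_hexdump("0000 001E")

# Payload that follows "HADR EDU sqlcode:"; SQLCODEs are signed values.
sqlcode = decode_hexdump("FFFF F918", signed=True)

print(timeout_seconds, sqlcode)
```

Under that assumption the values are 30 and -1768. This is consistent with the SQL1768N error and with the gap of roughly 31 seconds between the two FRH timestamps (04.29.45 and 04.30.16).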
 
If many connection attempts are issued, they are all serialized 
until the database is activated: 
1. The currently active connection that is trying to start HADR 
holds the database latch. That application waits until the HADR 
timeout is reached. 
2. All other connections that are trying to start HADR are 
queued up behind the database latch in a serialized fashion. 
 
If we run db2stop force in this scenario, it might take a long 
time, depending on how many connections have been queued to 
activate the database (they will all fail with the HADR timeout 
error SQL1768N). 
 
 
When "db2stop force" kicks in, it will detect the number of 
applications that need to be forced: 
 
 
FUNCTION: DB2 UDB, base sys utilities, sqeAppServices::ExecuteStopForce, probe:1000 
DATA #1 : String, 47 bytes 
[Force]->Number of applications to be forced : 
DATA #2 : Hexdump, 4 bytes 
0x0FFFFFFFFFFFD698 : 0000 0004 
.... 
 
db2stop force waits until all queued-up applications respond, and 
only then is the database actually stopped. This might take a long 
time and could be perceived as db2stop force being hung.
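
To get a feel for how long the wait can be, here is a rough Python sketch, not part of the original report. It assumes that every queued connection waits one full HADR timeout, one after another, which is a simplification of the serialization described above. The function name worst_case_stop_wait is hypothetical, and the two sample values come from different diagnostic entries, combined here only for illustration.

```python
def worst_case_stop_wait(queued_connections: int, hadr_timeout_s: int) -> int:
    """Rough upper bound, in seconds, on how long db2stop force may wait.

    Assumption: each queued connection waits one full HADR timeout,
    and the connections time out one after another (serialized).
    """
    if queued_connections < 0 or hadr_timeout_s < 0:
        raise ValueError("both arguments must be non-negative")
    return queued_connections * hadr_timeout_s


# "Number of applications to be forced" payload: 0000 0004
forced_apps = int("00000004", 16)

# "Timeout seconds =" payload: 0000 001E
timeout_s = int("0000001E", 16)

print(worst_case_stop_wait(forced_apps, timeout_s))
```

With 4 applications and a 30 second timeout, the estimate is about 120 seconds. The real wait may be shorter or longer, so treat this only as an order-of-magnitude guide.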
Problem Summary:
**************************************************************** 
* USERS AFFECTED:                                              * 
* ALL                                                          * 
**************************************************************** 
* PROBLEM DESCRIPTION:                                         * 
* DB2STOP TAKES A LONG TIME ON HADR SYSTEM IF STANDBY IS       * 
* OFFLINE AND DATABASE NOT ACTIVATED                           * 
**************************************************************** 
* RECOMMENDATION:                                              * 
* Upgrade to DB2 Version 9.1 Fixpack 10                        * 
****************************************************************
Local Fix:
When the standby is offline, issue db2 start hadr on <database 
name> as primary by force to activate the database on the primary. 
This avoids the timeout waits, and a db2stop force, if needed, 
responds more quickly.
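
The local fix could be scripted along the following lines. This is a hedged Python sketch, not an IBM-provided tool: the helper names are hypothetical, SAMPLE is a placeholder database name, and the "db" keyword is added on the assumption that the command follows the START HADR ON DATABASE (or DB) syntax, which the text above abbreviates. By default the sketch only prints the command instead of running it.

```python
import shutil
import subprocess
from typing import List


def build_start_hadr_command(db_name: str) -> List[str]:
    """Build the argument list for starting HADR as primary by force."""
    name = db_name.strip()
    if not name or any(ch.isspace() for ch in name):
        raise ValueError("database name must be a single non-empty token")
    return ["db2", "start", "hadr", "on", "db", name,
            "as", "primary", "by", "force"]


def start_primary_by_force(db_name: str, dry_run: bool = True) -> int:
    """Print the command (dry run) or execute it if the db2 CLP is on PATH."""
    cmd = build_start_hadr_command(db_name)
    if dry_run or shutil.which("db2") is None:
        print(" ".join(cmd))
        return 0
    # Run the DB2 command line processor and hand back its exit code.
    return subprocess.run(cmd, check=False).returncode


# Dry run with a placeholder database name.
start_primary_by_force("SAMPLE")
```

Keeping the dry run as the default means the command can be reviewed before it is executed on a primary whose standby is offline.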
available fix packs:
DB2 Version 9.1 Fix Pack 10  for Linux, UNIX and Windows
DB2 Version 9.1 Fix Pack 11  for Linux, UNIX and Windows
DB2 Version 9.1 Fix Pack 12  for Linux, UNIX and Windows

Solution
Problem was first fixed in DB2 Version 9.1 Fixpack 10
Workaround
not known / see Local fix
BUG-Tracking
forerunner  : APAR is sysrouted TO one or more of the following: IC67509 IC67511 IC67514 IC67515 IC68048 
follow-up : 
Timestamps
Date  - problem reported    : 02.03.2010
Date  - problem closed      : 04.05.2010
Date  - last modified       : 08.03.2012
Problem solved at the following versions (IBM BugInfos)
9.1.,
9.1.FP10
Problem solved according to the fixlist(s) of the following version(s)
9.1.0.10 FixList