DB2 - Problem description

Problem IC71861	Status: Closed
DB2 HADR PAIR CAN HANG WHILE PROCESSING AN INFORMATIONAL LOG RECORD ON STANDBY
product:
DB2 FOR LUW / DB2FORLUW / 970 - DB2
Problem description:
A DB2 HADR pair can hang, showing the connect status "Congested" in the db2pd -hadr output:

    Database Partition 0 -- Database SAMPLE -- Active --

    HADR Information:
    Role    State  SyncMode  HeartBeatsMissed  LogGapRunAvg (bytes)
    Primary Peer   Nearsync  0                 991669

    ConnectStatus  ConnectTime                            Timeout
    Congested      Wed Sep 8 20:31:26 2010 (1283970686)   120

The output on the standby will show that the buffer is 100% full. The problem is caused while processing an informational log record on the STANDBY system.

Note: The 'Congested' state is just an external symptom. A 'Congested' state does not always indicate a hang.

A typical stack of db2redom in this situation will be:

    Thread 51 (Thread 0x2aaac17fe940 (LWP 12900)):
    #0 0x000000333a4d517a in semtimedop () from /lib64/libc.so.6
    #1 0x00002aaaabca8d8b in sqloWaitEDUWaitPost () from /home/inst01/sqllib/lib64/libdb2e.so.1
    #2 0x00002aaaad25ed66 in sqlprWaitDuringPRec(sqeAgent, SQLO_EDUWAITPOST) () from /home/inst01/sqllib/lib64/libdb2e.so.1
    #3 0x00002aaaad25c6c6 in sqlpPRecReadLog(sqeAgent, SQLP_ACB, SQLP_DBCB) () from /home/inst01/sqllib/lib64/libdb2e.so.1
    #4 0x00002aaaad24e388 in sqlpParallelRecovery(sqeAgent, sqlca) () from /home/inst01/sqllib/lib64/libdb2e.so.1
    #5 0x00002aaaac5ec2b4 in sqleSubCoordProcessRequest(sqeAgent) () from /home/inst01/sqllib/lib64/libdb2e.so.1
    #6 0x00002aaaab8d3d8e in sqeAgent::RunEDU() () from /home/inst01/sqllib/lib64/libdb2e.so.1
    #7 0x00002aaaabf7af94 in sqzEDUObj::EDUDriver() () from /home/inst01/sqllib/lib64/libdb2e.so.1
    #8 0x00002aaaabf7aeeb in sqlzRunEDU(char*, unsigned int) () from /home/inst01/sqllib/lib64/libdb2e.so.1
    #9 0x00002aaaabcf6d62 in sqloEDUEntry () from /home/inst01/sqllib/lib64/libdb2e.so.1
    #10 0x000000333b00673d in start_thread () from /lib64/libpthread.so.0
    #11 0x000000333a4d3d1d in clone () from /lib64/libc.so.6

A normal idle stack would look like:

    sqlpPRecReadLog -> sqlpshrScanNext -> sqlorest (etc.)

whereas the hang shows:

    sqlpPRecReadLog -> sqlprWaitDuringPRec -> sqloWaitEDUWaitPost
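The distinction between the idle and the hang call chains can be checked mechanically against a captured db2redom stack. The following is only a minimal sketch: the function name classify_redo_stack and the labels it returns are hypothetical, it assumes a gdb-style stack with the innermost frame first (as in the trace shown in this description), and it looks only at the function called directly from sqlpPRecReadLog. A match is a hint that the stack resembles the reported pattern, not proof of a hang.

```python
import re

# Matches gdb-style frames such as:
#   "#2 0x00002aaaad25ed66 in sqlprWaitDuringPRec(sqeAgent, ...) () from ..."
# and captures the function name that follows "in".
FRAME_RE = re.compile(r"#\d+\s+0x[0-9a-fA-F]+\s+in\s+([A-Za-z_][\w:]*)")


def classify_redo_stack(stack_text):
    """Classify a db2redom stack using the two signatures from this APAR.

    Returns "hang-signature", "idle", or "unknown" (hypothetical labels).
    """
    # Frames are listed innermost first, so the function called by
    # sqlpPRecReadLog is the frame immediately before it in the list.
    frames = FRAME_RE.findall(stack_text)
    if "sqlpPRecReadLog" not in frames:
        return "unknown"
    idx = frames.index("sqlpPRecReadLog")
    callee = frames[idx - 1] if idx > 0 else None
    if callee == "sqlprWaitDuringPRec":
        return "hang-signature"
    if callee == "sqlpshrScanNext":
        return "idle"
    return "unknown"
```

Because the pattern does not rely on line breaks, the function also works on a stack that has been pasted as a single line.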
Problem Summary:
USERS AFFECTED: ALL
PROBLEM DESCRIPTION: See Problem Description above.
RECOMMENDATION: Upgrade to DB2 Version 9.7 Fix Pack 4.
Local Fix:
The fewer redo workers you have, the more likely this problem is to be hit. You can use DB2BPVARS to configure the number of redo workers as described below.

Step 1: Set DB2BPVARS to point to the file that contains the new value (you can use whatever filename you want):

    db2set DB2BPVARS=/home/userid/bpvars.txt

Step 2: Add one line to this file:

    PREC_NUM_AGENTS=5

NOTE: The value '5' includes 4 workers and a master. If you want to try 6 (or 8) workers, set this value to 7 (or 9).

The file then looks like this:

    $ cat /home/userid/bpvars.txt
    PREC_NUM_AGENTS=5

NOTE: The database needs to be recycled for this value to be picked up.
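The off-by-one between the number of workers and the configured value is easy to get wrong, so the arithmetic from the note can be written out. This is a small sketch only: the helper names prec_num_agents and bpvars_line are hypothetical, and the only rule applied is the one stated here, namely that the value counts the workers plus one master.

```python
def prec_num_agents(workers):
    """Return the PREC_NUM_AGENTS value for a desired number of redo workers.

    Per the note in this local fix, the value includes the workers
    plus one master, so it is always workers + 1.
    """
    if workers < 1:
        raise ValueError("at least one redo worker is required")
    return workers + 1


def bpvars_line(workers):
    """Build the single line to place in the file that DB2BPVARS points to."""
    return f"PREC_NUM_AGENTS={prec_num_agents(workers)}"


if __name__ == "__main__":
    # Print the line for each worker count mentioned in the local fix.
    for n in (4, 6, 8):
        print(n, "workers ->", bpvars_line(n))
```

The output line still has to be written to the file named in Step 1, and the database recycled, before it takes effect.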
available fix packs:
DB2 Version 9.7 Fix Pack 4 for Linux, UNIX, and Windows
DB2 Version 9.7 Fix Pack 5 for Linux, UNIX, and Windows
DB2 Version 9.7 Fix Pack 6 for Linux, UNIX, and Windows
DB2 Version 9.7 Fix Pack 7 for Linux, UNIX, and Windows
DB2 Version 9.7 Fix Pack 8 for Linux, UNIX, and Windows
DB2 Version 9.7 Fix Pack 9 for Linux, UNIX, and Windows
DB2 Version 9.7 Fix Pack 9a for Linux, UNIX, and Windows
DB2 Version 9.7 Fix Pack 10 for Linux, UNIX, and Windows
Solution
First fixed in DB2 Version 9.7 Fix Pack 4.
Workaround
not known / see Local fix
Timestamps
Date - problem reported : 13.10.2010
Date - problem closed : 03.05.2011
Date - last modified : 03.05.2011
Problem solved at the following versions (IBM BugInfos)
9.7.FP4
Problem solved according to the fixlist(s) of the following version(s)
9.7.0.4