
Informix - Problem description

Problem IT04342 Status: Closed

THE EVIDENCE.SH CAN CAUSE LIMITED CONNECTIVITY OR A COMPLETE INSTANCE BLOCK
WHEN AUDITING IS TURNED ON

product:
INFORMIX SERVER / 5725A3900 / C10 - IDS 12.10
Problem description:
If auditing is turned on (at any level) and your instance
hits an assertion failure that triggers the SYSALARMPROGRAM
($INFORMIXDIR/etc/evidence.sh by default), the instance may
become unresponsive to new connection requests, or even get
completely stuck, for 6 or more minutes.
When auditing is turned on, the onstat command sends its
command line arguments to the onmode_mon thread in the server to
be written into the audit trail. If the assertion failure occurs
in a thread running on cpuvp 1, that cpuvp is blocked (it
waits for SYSALARMPROGRAM to finish) and cannot serve the
onmode_mon thread, which is bound to it. As a result, the
onmode_mon thread cannot accept the command line arguments sent
by the onstat commands called from SYSALARMPROGRAM.
In this situation each onstat waits for the onmode_mon thread
to become available. If that does not happen within 5 seconds,
the onstat gives up and continues to print the requested output.
As the default SYSALARMPROGRAM calls onstat ~73 times, the total
time the script runs is at least 73 x 5 = 365 seconds. During
this time all the threads bound to cpuvp 1 (onmode_mon, listeners
and others) cannot run. If you have only one cpuvp configured,
the whole instance is blocked, which may have adverse effects.
For example, in a MACH11 cluster environment managed by a
connection manager (CM), this may lead to a split-brain
situation (two primaries in the cluster): the CM initiates a
failover because it cannot reach the blocked old primary, and
promotes one of the secondaries to a new primary without
killing the old one.
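
The stall time described above is a simple product of the number of
onstat calls and the 5-second wait per call. A minimal sketch of that
arithmetic follows; the function name estimate_stall_seconds is
hypothetical and is not part of evidence.sh or any Informix tool.

```shell
#!/bin/sh
# Hypothetical helper: lower bound on how long the alarm script runs
# when every onstat call has to wait out its full timeout.
estimate_stall_seconds() {
    calls=$1      # number of onstat invocations made by the alarm script
    timeout=$2    # seconds each onstat waits for onmode_mon before giving up
    echo $(( calls * timeout ))
}

# Figures taken from the problem description: ~73 calls, 5 seconds each.
echo "default script:      $(estimate_stall_seconds 73 5) s"
# With DO_ONSTAT_A=on the description states 8 calls.
echo "with DO_ONSTAT_A=on: $(estimate_stall_seconds 8 5) s"
```

These are lower bounds only, since the time onstat needs to print its
output comes on top of the waits.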
Problem Summary:
**************************************************************** 
* USERS AFFECTED:                                              * 
* All users                                                    * 
**************************************************************** 
* PROBLEM DESCRIPTION:                                         * 
* See Error Description                                        * 
**************************************************************** 
* RECOMMENDATION:                                              * 
* Update to IDS-12.10.xC5                                      * 
****************************************************************
Local Fix:
A partial workaround:
- Make sure you have at least 2 cpuvps configured.
- If you are using the default SYSALARMPROGRAM, find the
"DO_ONSTAT_A=off" line in it and change it to "DO_ONSTAT_A=on".
This reduces the number of onstat calls from 73 to 8, so the
time needed to complete the script should drop from 365 to ~40
seconds.
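
The second step of the workaround can be scripted. The sketch below
assumes the setting appears literally as "DO_ONSTAT_A=off" in the
script, as the text above states; the function name enable_onstat_a
is hypothetical. It is demonstrated on a scratch file rather than on
a real $INFORMIXDIR/etc/evidence.sh.

```shell
#!/bin/sh
# Hypothetical helper: switch DO_ONSTAT_A from off to on in an alarm
# script, leaving a backup copy next to it as <script>.bak.
enable_onstat_a() {
    script=$1
    cp "$script" "$script.bak" || return 1
    # Write back into the original file so its permissions are kept.
    sed 's/DO_ONSTAT_A=off/DO_ONSTAT_A=on/' "$script.bak" > "$script"
}

# Demonstration on a scratch file standing in for the alarm script.
demo=$(mktemp)
printf '%s\n' '#!/bin/sh' 'DO_ONSTAT_A=off' > "$demo"
enable_onstat_a "$demo"
grep DO_ONSTAT_A "$demo"
rm -f "$demo" "$demo.bak"
```

Before applying this to a real installation, check the resulting file
by hand, since the layout of evidence.sh may differ between versions.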
Solution
Problem Fixed In IDS-12.10.xC5
Workaround
not known / see Local fix
Timestamps
Date  - problem reported    : 11.09.2014
Date  - problem closed      : 16.10.2015
Date  - last modified       : 16.10.2015
Problem solved at the following versions (IBM BugInfos)
Problem solved according to the fixlist(s) of the following version(s)
12.10.xC5 FixList
12.10.xC5.W1 FixList