Informix - Problem description

Problem IT31461 Status: Closed

ON WINDOWS, LOSING TRACK OF A CPU VP'S NUM_READY_THREADS CAN BURN 100% CPU
CYCLES ON OTHERWISE IDLE SYSTEM

product:
INFORMIX SERVER / 5725A3900 / C10 - IDS 12.10
Problem description:
On a seemingly idle Windows IDS server, it's possible to have a cpu vp
using 100% cpu.

For instance, on a 12 cpu Windows IDS 12.10.TC11 server, we were able to
get stacks for the cpu vps from a memory dump.

The stacks for 1cpu, 8cpu, 9cpu, 10cpu, 11cpu, 12cpu, 13cpu,
14cpu, 16cpu, 17cpu, 18cpu:

    oninit.exe!net_aio_poll(void *hPort, int timeout) Line 173
    oninit.exe!NT_P(_VP *v) Line 1640
    oninit.exe!NT_idle_loop(_VP*i_vp, unsigned int bz, int wakeup) Line 5210
    oninit.exe!NT_idle_processor() Line 5124
    oninit.exe!startup() Line 177

The stack for 15cpu is slightly different:

    oninit.exe!net_aio_poll(void *hPort, int timeout) Line 173
    oninit.exe!NT_yield_processor_mvp() Line 18070
    oninit.exe!NT_idle_processor() Line 5107
    oninit.exe!startup() Line 177

Looking at Process Explorer, we could see that the oninit.exe thread for
15cpu was running at 100%.

The underlying issue here is that the vp struct associated with 15cpu has
a positive num_ready_threads value, but there are no threads in its ready
queue(s).  This keeps the idle vp from ever sleeping, because it constantly
thinks there is a thread ready to run when there isn't.
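
The mismatch described above can be illustrated with a small toy model.
This is only a sketch and not Informix source: the ToyVP class, its fields,
and idle_poll() are hypothetical stand-ins, and the real idle loop is
certainly more involved.  It shows why a counter that says "ready" with an
empty queue keeps the loop spinning instead of sleeping.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class ToyVP:
    """Hypothetical stand-in for a VP struct (not the real layout)."""
    name: str
    num_ready_threads: int = 0
    ready_queue: deque = field(default_factory=deque)


def idle_poll(vp, max_polls):
    """Return (wasted_polls, went_to_sleep) for a simplified idle loop."""
    wasted = 0
    for _ in range(max_polls):
        if vp.num_ready_threads <= 0:
            return wasted, True          # nothing ready: the VP may sleep
        if vp.ready_queue:
            vp.ready_queue.popleft()     # run the ready thread
            vp.num_ready_threads -= 1
            continue
        wasted += 1                      # counter says ready, queue is empty
    return wasted, False                 # never slept: 100% cpu in practice


healthy = ToyVP("8cpu")
stuck = ToyVP("15cpu", num_ready_threads=1)   # stale counter, empty queue
print(idle_poll(healthy, 1000))
print(idle_poll(stuck, 1000))
```

In this model the healthy VP sleeps on its first check, while the VP with
the stale counter uses every poll without finding any work.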

To identify this on an idle system, you can first observe the 100% cpu
usage, but you can also look at "onstat -g sch" output.  The cpu vp that
is using up the cpu cycles will have a positive number in the Q-ln column
while "onstat -g rea" shows nothing in the ready queue.  For instance, in
the "onstat -g sch" output below you can see the value 1 in the Q-ln
column for 15cpu:

Thread Migration Statistics:
 vp    pid       class      steal-at steal-sc idlvp-at idlvp-sc inl-polls Q-ln
 1     9568      cpu        0        0        0        0        0         0
 2     8184      adm        0        0        0        0        0         0
 3     8156      lio        0        0        0        0        0         0
 4     7212      pio        0        0        0        0        0         0
 5     7156      aio        0        0        0        0        0         0
 6     11088     msc        0        0        0        0        0         0
 7     816       fifo       0        0        0        0        0         0
 8     7476      cpu        0        0        0        0        0         0
 9     10904     cpu        0        0        0        0        0         0
 10    10940     cpu        0        0        0        0        0         0
 11    10936     cpu        0        0        0        0        0         0
 12    11096     cpu        0        0        0        0        0         0
 13    8064      cpu        0        0        0        0        0         0
 14    6256      cpu        0        0        0        0        0         0
 15    5996      cpu        0        0        0        0        0         1
 16    5984      cpu        0        0        0        0        0         0
 17    6928      cpu        0        0        0        0        0         0
 18    8056      cpu        0        0        0        0        0         0
 19    924       soc        0        0        0        0        0         0
 20    920       soc        0        0        0        0        0         0
 21    10960     soc        0        0        0        0        0         0
 22    10932     soc        0        0        0        0        0         0
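
The check on the Q-ln column can be scripted.  The following is a minimal
sketch that assumes you have captured only the "Thread Migration
Statistics" section of "onstat -g sch" as text, with the nine columns shown
above.  The function name suspect_cpu_vps is hypothetical, and the column
layout may differ on other versions.  It tokenises on whitespace, so it
tolerates rows that have been wrapped across two lines.

```python
def suspect_cpu_vps(sch_text):
    """Return [(vp, q_len)] for cpu VPs whose Q-ln value is positive.

    Assumes sch_text is the Thread Migration Statistics section with the
    columns: vp pid class steal-at steal-sc idlvp-at idlvp-sc inl-polls Q-ln.
    """
    tokens = sch_text.split()
    start = tokens.index("Q-ln") + 1         # first token after the header
    suspects = []
    for i in range(start, len(tokens) - 8, 9):
        row = tokens[i:i + 9]
        if not row[0].isdigit():             # stop at anything not a VP row
            break
        vp, vp_class, q_len = int(row[0]), row[2], int(row[8])
        if vp_class == "cpu" and q_len > 0:
            suspects.append((vp, q_len))
    return suspects


sample = """Thread Migration Statistics:
 vp    pid       class      steal-at steal-sc idlvp-at idlvp-sc inl-polls Q-ln
 14    6256      cpu        0        0        0        0        0         0
 15    5996      cpu        0        0        0        0
0         1
 19    924       soc        0        0        0        0        0         0
"""
print(suspect_cpu_vps(sample))
```

A positive Q-ln alone is normal on a busy server.  It only points at this
problem when the system is idle and "onstat -g rea" shows no ready threads
at the same time.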

This defect is being entered for defensive purposes.  We should be able to
identify this case and address it, returning the idle cpu vp to normal
behavior.
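
One way to picture such a defensive check is sketched below.  This is purely
illustrative and is not the fix that was implemented: the function
reconcile_ready_count and its arguments are hypothetical, and a real server
would have to do this under the appropriate locking.  The idea is to compare
the counter with the actual length of the ready queue(s) and correct it when
they disagree.

```python
from collections import deque


def reconcile_ready_count(num_ready_threads, ready_queues):
    """Return (corrected_count, was_stale) by recounting the ready queues.

    Hypothetical helper: num_ready_threads is the cached counter and
    ready_queues is an iterable of queue-like objects supporting len().
    """
    actual = sum(len(q) for q in ready_queues)
    return actual, actual != num_ready_threads


# Stale counter: claims one ready thread, but every queue is empty.
print(reconcile_ready_count(1, [deque(), deque()]))
# Consistent counter: two queued threads, counter agrees.
print(reconcile_ready_count(2, [deque(["t1"]), deque(["t2"])]))
```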
Description
Problem Summary:
Local Fix:
Solution
Workaround
not known / see Local fix
Timestamps
Date  - problem reported    : 09.01.2020
Date  - problem closed      : 24.02.2020
Date  - last modified       : 24.02.2020
Problem solved at the following versions (IBM BugInfos)
Problem solved according to the fixlist(s) of the following version(s)