DB2 - Problem description
Problem IC99289 | Status: Closed |
POSSIBLE CACHE CORRUPTION(SEGV/CRASH): ENTRY CAN BE FREED BY ONE THREAD AND ADDED TO LRU BY ANOTHER AT THE SAME TIME | |
product: | |
DB2 FOR LUW / DB2FORLUW / A10 - DB2 | |
Problem description: | |
It is possible that one thread removes an entry from the LRU and cached entries list and then frees that entry while another thread is adding it to the LRU list again. This results in the use of a freed entry, which will most likely lead to a SEGV/crash.

Typical stacks look like the following (top part of stack):

== Example 1 ==
0x00007FA06864E2FB sqlrlc_remove_from_lru + 0x006b
0x00007FA0686439F4 sqlrlc_csm_lru_cache + 0x0144
0x00007FA068640EDC sqlrlc_csm_lru + 0x04ac
0x00007FA06863FC4D sqlrlc_csm_processing + 0x017d
0x00007FA06863F825 sqlrlc_check_available_memory + 0x0175
0x00007FA0686A2035 sqlrlcAuthidsUpdateEntries + 0x0565
0x00007FA0686A3DFA sqlrlcAuthidsFindRolelist + 0x076a

== Example 2 ==
0x00007F80B5E639E0 sqlo_xlatch11getIdentityEv + 0x0000
0x00007F80B8AF0074 sqlrlc_update_lru + 0x01c4
0x00007F80B8ADE81A sqlrlc_auth_find_insert + 0x035a
0x00007F80B8B189D8 sqlrlc_sysuserauth_get_authPDs + 0x00d8
0x00007F80B8B1874D sqlrlc_sysuserauth_request_auths + 0x023d
0x00007F80B8A1ACF1 sqlrlgua + 0x0391

The following annotated diagnostic entries show the sequence of events:

2013-12-04-11.06.56.120423-300
EDUID : 62 EDUNAME: db2agent (CAT)
FUNCTION: DB2 UDB, catcache support, sqlrlc_create_insert_entry, probe:2445
new entry:- p_entry = 0x00007f5fed53e600
[1] 0x00007F602C7774F5 sqlrlc_create_insert_entry + 0x105
[2] 0x00007F602C76C890 sqlrlc_auth_find_insert + 0x3D0
[3] 0x00007F602C7A6D78 sqlrlc_sysuserauth_get_authPDs + 0xD8
[4] 0x00007F602C7A6AED sqlrlc_sysuserauth_request_auths + 0x23D

2013-12-04-11.06.58.487657-300
EDUID : 62 EDUNAME: db2agent (CAT)
FUNCTION: DB2 UDB, catcache support, sqlrlc_auth_find_insert, probe:112
updating LRU:- p_entry = 0x00007f5fed7c9d00 p_anchor = 0x00007f5fec33cd70
0x00007F5FED7C9D00 : 70CD 33EC 5F7F 0000 00E6 53ED 5F7F 0000 p.3._.....S._...
0x00007F5FED7C9D10 : 0000 0000 0000 0000 0000 0000 0000 0000 ................
0x00007F5FED7C9D20 : 0000 0000 0000 0000 0000 0000 0000 0000 ................
0x00007F5FED7C9D30 : 0E00 0000 0200 0000 ........
[1] 0x00007F602C77E324 sqlrlc_update_lru + 0xD4
[2] 0x00007F602C76C81A sqlrlc_auth_find_insert + 0x35A
[3] 0x00007F602C7A6D78 sqlrlc_sysuserauth_get_authPDs + 0xD8
[4] 0x00007F602C7A6AED sqlrlc_sysuserauth_request_auths + 0x23D

2013-12-04-11.07.07.566089-300
EDUID : 51 EDUNAME: db2agent (CAT)
FUNCTION: DB2 UDB, catcache support, sqlrlc_auth_find_insert, probe:389
Removing from LRU:- p_entry = 0x00007f5fed53e600 p_anchor = 0x00007f5fec33cd70
-- Remove from LRU and put on DEFUNCT list.
0x00007F5FED53E600 : 70CD 33EC 5F7F 0000 E006 54ED 5F7F 0000 p.3._.....T._...
0x00007F5FED53E610 : 009D 7CED 5F7F 0000 0000 0000 0000 0000 ..|._...........
0x00007F5FED53E620 : 0000 0000 0000 0000 0000 0000 0000 0000 ................
0x00007F5FED53E630 : 0E00 0000 0200 0000 ........
[1] 0x00007F602C77E7A3 sqlrlc_remove_from_lru + 0xC3
[2] 0x00007F602C76D746 sqlrlc_authcache_update + 0x336
[3] 0x00007F602C775308 sqlrlc_execute_event + 0x188
[4] 0x00007F602C774666 sqlrlc_broadcast + 0x2B6

2013-12-04-11.07.07.567390-300
EDUID : 62 EDUNAME: db2agent (CAT)
FUNCTION: DB2 UDB, catcache support, sqlrlc_auth_find_insert, probe:112
updating LRU:- p_entry = 0x00007f5fed53e600 p_anchor = 0x00007f5fec33cd70
-- Here we put that entry on the LRU but it seems that we do not hold
-- the LRU latch nor the ANCHOR latch at this point.
0x00007F5FED53E600 : 70CD 33EC 5F7F 0000 E006 54ED 5F7F 0000 p.3._.....T._...
0x00007F5FED53E610 : 009D 7CED 5F7F 0000 0000 0000 0000 0000 ..|._...........
0x00007F5FED53E620 : 0000 0000 0000 0000 0000 0000 0000 0000 ................
0x00007F5FED53E630 : 0E00 0000 0200 0000 ........
[1] 0x00007F602C77E324 sqlrlc_update_lru + 0xD4
[2] 0x00007F602C76EE56 sqlrlc_auth_update_entries + 0x486
[3] 0x00007F602C7A6EFD sqlrlc_sysuserauth_get_authPDs + 0x25D
[4] 0x00007F602C7A6AED sqlrlc_sysuserauth_request_auths + 0x23D

2013-12-04-11.07.35.070324-300
EDUID : 58 EDUNAME: db2agent (CAT)
FUNCTION: DB2 UDB, catcache support, sqlrlc_auth_find_insert, probe:709
freeing entry:- p_entry = 0x00007f5fed53e600 p_anchor = 0x00007f5fec33cd70
-- CSM comes in, picks up the entry from the DEFUNCT list and frees it.
-- So entry is freed but added to the LRU by EDU 62 because
-- EDU 62 called sqlrlc_update_lru() with no latch on anchor or lru.
0x00007F5FED53E600 : 70CD 33EC 5F7F 0000 0000 0000 0000 0000 p.3._...........
0x00007F5FED53E610 : 0000 0000 0000 0000 009D 7CED 5F7F 0000 ..........|._...
0x00007F5FED53E620 : 0000 0000 0000 0000 0000 0000 0000 0000 ................
0x00007F5FED53E630 : 0E00 0000 0600 0000 ........
[1] 0x00007F602C7700C9 sqlrlc_csm_defunct + 0x329
[2] 0x00007F602C76FB16 sqlrlc_csm_processing + 0x46
[3] 0x00007F602C76F825 sqlrlc_check_available_memory + 0x175
[4] 0x00007F602C77F382 sqlrlc_find_request_usage_lock + 0x422
[5] 0x00007F602C7A1046 sqlrlc_systables_fetch + 0x816

2013-12-04-11.07.45.596906-300
EDUID : 208 EDUNAME: db2agent (CAT)
enter function with p_anchor = 0x00007f5fec33cd70
-- We follow the LRU list and find that entry that is now
-- freed by CSM.
FUNCTION: DB2 UDB, base sys utilities, sqleagnt_sigsegvh, probe:10
0x00007F6029AF19E0 sqlo_xlatch + 0x0000
0x00007F602C773B63 sqlrlc_csm_lru_cache + 0x00a3
0x00007F602C770FEC sqlrlc_csm_lru + 0x04ac
0x00007F602C76FC4D sqlrlc_csm_processing + 0x017d
0x00007F602C76F825 sqlrlc_check_available_memory + 0x0175
0x00007F602C7A2BE3 sqlrlc_systables_fetch_from_disk + 0x1333 | |
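The race described above can be illustrated with a small Python sketch. This is not DB2 code and does not claim to show how Fix Pack 4 resolves the problem. The names `Entry`, `LatchedCache`, `update_lru`, `invalidate` and `reap_defunct` are hypothetical, and a single `threading.Lock` stands in for both the LRU latch and the anchor latch. The sketch only shows the general idea: the LRU update takes the same latch as the remove and free paths and checks the entry state before relinking it.

```python
import threading
from collections import OrderedDict


class Entry:
    def __init__(self, key, value):
        self.key = key
        self.value = value
        self.state = "live"  # "live" -> "defunct" -> "freed"


class LatchedCache:
    """Toy cache: one lock guards the LRU list, the defunct list and entry state."""

    def __init__(self):
        self._latch = threading.Lock()  # stand-in for the LRU and anchor latches
        self._lru = OrderedDict()       # key -> Entry, most recently used last
        self._defunct = []

    def insert(self, key, value):
        with self._latch:
            old = self._lru.pop(key, None)
            if old is not None:         # a replaced entry becomes defunct
                old.state = "defunct"
                self._defunct.append(old)
            entry = Entry(key, value)
            self._lru[key] = entry
            return entry

    def update_lru(self, entry):
        """Move entry to the most recently used end; refuse if it is not live."""
        with self._latch:
            if entry.state != "live":
                return False            # defunct or freed: never relink it
            self._lru.move_to_end(entry.key)
            return True

    def invalidate(self, entry):
        """Unlink entry from the LRU list and park it on the defunct list."""
        with self._latch:
            if entry.state == "live":
                del self._lru[entry.key]
                entry.state = "defunct"
                self._defunct.append(entry)

    def reap_defunct(self):
        """Free everything on the defunct list; returns the number freed."""
        with self._latch:
            freed = len(self._defunct)
            for entry in self._defunct:
                entry.state = "freed"
                entry.value = None
            self._defunct.clear()
            return freed

    def lru_keys(self):
        with self._latch:
            return list(self._lru)
```

In the failing sequence from the diagnostic entries, EDU 62 would call `update_lru` after EDU 51 had already called `invalidate`. In this sketch that call returns `False` instead of relinking the entry, so a later `reap_defunct` cannot leave a freed entry on the LRU list.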
Problem Summary: | |
****************************************************************
* USERS AFFECTED:                                              *
* ALL                                                          *
****************************************************************
* PROBLEM DESCRIPTION:                                         *
* See Problem Description above.                               *
****************************************************************
* RECOMMENDATION:                                              *
* Upgrade to DB2 Version 10.1 Fix Pack 4                       *
**************************************************************** | |
Local Fix: | |
Increase CATALOGCACHE_SZ to a higher value so that the catalog cache does not overflow. | |
available fix packs: | |
DB2 Version 10.1 Fix Pack 4 for Linux, UNIX, and Windows | |
Solution | |
First fixed in DB2 Version 10.1 Fix Pack 4 | |
Workaround | |
not known / see Local fix | |
BUG-Tracking | |
forerunner : APAR is sysrouted TO one or more of the following: IC99642, IT10249, IT10298, IT10299, IT10300, IT10303
follow-up : | |
Timestamps | |
Date - problem reported : | 11.02.2014 |
Date - problem closed : | 29.07.2014 |
Date - last modified : | 29.07.2014 |
Problem solved at the following versions (IBM BugInfos) | |
Problem solved according to the fixlist(s) of the following version(s) | |
10.1.0.4 | |
10.5.0.5 |