DB2 - Problem description
Problem IT34261 | Status: Closed
IMPROPER LATCHING WHEN PROCESSING SQLEDIRCACHE DURING CONNECTING TO DATABASES CAN LEAD TO INSTANCE ABEND
product:
DB2 FOR LUW / DB2FORLUW / B10 - DB2
Problem description:
Improper latching occurs when a user runs federated stored procedures and the FMP thread attempts to connect to the remote data source: instead of the engine latch, a different application latch is taken. During the connection process the krcb->sqleDirCache is read and processed. If an operation such as an online backup subsequently requests a refresh of the sqleDirCache via a flag, the next connection attempt from a normal engine EDU, which runs through the same code path, takes the engine latch. Because that latch is not held by the federated FMP thread (it holds a different application latch), the engine thread proceeds to free and recreate the sqleDirCache. The FMP thread may then access freed or invalid memory, since the sqleDirCache may have changed underneath it; this leads to invalid writes or undefined behavior because the flags being checked are no longer set correctly. If the condition is encountered, FODCAppErr may record a stack trace such as the following:

0x0900000011837A14 sqlehdir__FP8SQLE_BWA + 0xC94
0x0900000010F01720 sqleUCappConnect + 0x3E40
0x090000001197F588 CLI_sqlConnect__FP15CLI_CONNECTINFOP5sqlcaP19CLI_ERRORHEADERINFO + 0x1368
0x0900000011C27938 SQLConnect2__FP15CLI_CONNECTINFOPUcsT2T3T2T3T2T3Uc + 0x7B8
0x0900000011C15548 SQLDriverConnect2__FP15CLI_CONNECTINFOPvPUcsT3T4PsUsUcT9P19CLI_ERRORHEADERINFO + 0x2288
0x0900000011C2D0A4 SQLDriverConnect + 0x324
0x090000003B462E14 connect_and_conattr__15DRDA_ConnectionFPUcN21b + 0x6F4
0x090000003B461FC0 connect__15DRDA_ConnectionFv + 0x260
0x090000003B349D08 connect__12FencedServerFP17FencedRemote_UserPP17Remote_Connectionb + 0x628
0x090000003B348D08 connect_current_user__12FencedServerFPP17Remote_Connection + 0x5A8
0x090000003B35A8F8 sqlqgConnect__FPUcPP17Remote_Connection + 0x178
0x090000003B36C7CC sqlqg_FMP_FedStart__FP10SQLQG_F2PC + 0x24C
0x090000003B2F7CE4 sqlqgRouter__FP17sqlqg_FMP_RequestPP15sqlqg_FMP_ReplyP10sqlri_ufob + 0xCE4
0x090000003B2FCD20 sqlqg_fedstp_hook + 0x1E0
0x0900000019E3ED5C sqlqgDyload__FP10sqlri_ufob + 0x29C
0x090000001A4C6CFC sqlriFedInvokerTrusted__FP10sqlri_ufobP21sqlriRoutineErrorIntf + 0x59C
0x090000001A4C7BE4 sqlriFedInvokeInvoker__FP10sqlri_ufobP14sqlqg_Fmp_Info + 0x224
0x090000001A45EA1C sqlqg_Call_FMP_Thread__FP17sqlqg_FMP_RequestPP15sqlqg_FMP_Reply + 0x3BC
0x090000001B94F9FC sqlqgFedStart__FP14UnfencedServercPb + 0xE1C
0x090000001A4419F0 sqlqgOpen__FP12sqlri_rquery + 0x170
0x090000001AD02EB0 sqlri_djx_rta__FP8sqlrr_cb + 0xA90
0x0900000019BC4138 sqlriSectInvoke__FP8sqlrr_cbP12sqlri_opparm + 0x618
0x0900000018C3A4C4 sqlrr_process_execute_request__FP8sqlrr_cbib + 0x3624
0x0900000018BD4620 sqlrr_execute__FP14db2UCinterfaceP9UCstpInfo + 0x4A0
0x090000001B3020EC executeSection__10pvmPackageFP5sqlcaUib + 0x98C
0x090000001B2FD0A8 executeQuery__3PVMFUib + 0x208
0x090000001B309D60 run__3PVMFv + 0xFE0
0x090000001B2F59DC pvm_entry + 0x45C
0x0900000016B15A8C sqloInvokeFnArgs + 0x5D6C
0x090000001A4C2598 IPRA.$sqlriInvokerTrusted__FP10sqlri_ufobP21sqlriRoutineErrorIntfb + 0x2AF8
0x090000001A4B82E8 sqlriInvokeInvoker__FP10sqlri_ufobb + 0x2008
0x090000001A4B91F0 sqlricall__FP8sqlrr_cb + 0x670
0x0900000019BC4138 sqlriSectInvoke__FP8sqlrr_cbP12sqlri_opparm + 0x618
0x0900000018C3A4C4 sqlrr_process_execute_request__FP8sqlrr_cbib + 0x3624
0x0900000018BD4620 sqlrr_execute__FP14db2UCinterfaceP9UCstpInfo + 0x4A0
0x090000001DFFF4BC sqljs_ddm_excsqlstt__FP14db2UCinterfaceP13sqljDDMObject + 0x5BC
0x090000001DF7BAE0 sqljsParseRdbAccessed__FP13sqljsDrdaAsCbP13sqljDDMObjectP14db2UCinterface

In this case a write attempt to the sqleDirCache failed because it had already been freed. The FMP thread was in a code path it should never have reached, but because the memory had been reset, all checks were bypassed until the code performing the null pointer access was hit. It is also possible, however, that the same memory is reused and the FMP thread continues without visible problems.
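At its core, the race described above is a reader taking a different lock than the writer that frees and rebuilds a shared structure. The following is a minimal conceptual sketch, not DB2 code: all names are hypothetical, and Python's threading.Lock stands in for the engine latch.

```python
import threading

class DirCache:
    """Toy stand-in for sqleDirCache: a shared structure that a
    refresh (e.g. triggered by an online backup) frees and recreates."""

    def __init__(self):
        self.latch = threading.Lock()  # the single "engine latch"
        self.entries = {"SAMPLE": "/home/db2inst1/SAMPLE"}

    def refresh(self):
        # Engine EDU: free and recreate the cache under the engine latch.
        with self.latch:
            self.entries = dict(self.entries)

    def lookup(self, name):
        # Correct behavior: the reader (the FMP thread in this APAR) takes
        # the SAME latch. Taking a different application latch here would
        # let refresh() swap the cache out from under the read, which is
        # the defect described above.
        with self.latch:
            return self.entries.get(name)

def hammer(cache, n=1000):
    # Concurrent readers: every lookup must see a consistent cache.
    for _ in range(n):
        assert cache.lookup("SAMPLE") is not None

cache = DirCache()
readers = [threading.Thread(target=hammer, args=(cache,)) for _ in range(4)]
writer = threading.Thread(target=lambda: [cache.refresh() for _ in range(1000)])
for t in readers + [writer]:
    t.start()
for t in readers + [writer]:
    t.join()
```

Because reader and writer serialize on the same lock, the lookups never observe a half-rebuilt cache; dropping the `with self.latch:` in `lookup()` reintroduces the window the APAR describes.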
Problem Summary:
****************************************************************
* USERS AFFECTED:                                              *
* 250                                                          *
****************************************************************
* PROBLEM DESCRIPTION:                                         *
* See Error Description                                        *
****************************************************************
* RECOMMENDATION:                                              *
* Upgrade to 11.1.4.6                                          *
****************************************************************
Local Fix:
The only workaround is to move the federated thread into its own process by changing the DRDA wrapper to fenced mode. By default the wrapper is unfenced, which leads to this sharing of memory structures:

ALTER WRAPPER drda OPTIONS (SET DB2_FENCED 'Y')

(The wrapper name, drda here, must match the wrapper defined in your federated configuration.)
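To confirm the fencing mode before and after the change, the wrapper options can be read from the standard Db2 federation catalog view SYSCAT.WRAPOPTIONS (the wrapper name DRDA below is an assumption; substitute your own):

```sql
-- List the DB2_FENCED setting for the DRDA wrapper; expect 'Y' after the ALTER.
SELECT WRAPNAME, OPTION, SETTING
FROM SYSCAT.WRAPOPTIONS
WHERE WRAPNAME = 'DRDA' AND OPTION = 'DB2_FENCED'
```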
Solution | |
Workaround | |
Comment | |
Upgrade to 11.1.4.6 | |
Timestamps | |
Date - problem reported : 17.09.2020
Date - problem closed   : 22.03.2021
Date - last modified    : 22.03.2021
Problem solved at the following versions (IBM BugInfos)
Problem solved according to the fixlist(s) of the following version(s)