DB2 - Problem description
Problem IT03187 | Status: Closed |
DB2 INSTANCE MIGHT CRASH OR HANG IF A QUERY IS INTERRUPTED DURING A UNION OPERATION WITH OTHER SPECIFIC ACCESS PLAN PROPERTIES | |
product: | |
DB2 FOR LUW / DB2FORLUW / 970 - DB2 | |
Problem description: | |
For this problem to happen, the following conditions need to be true:

1) An interrupt (or any other error) is received at a specific point during the execution of a query.
2) The query's access plan contains a UNION operator on the left (probe) side of an HSJN. For example:

              HSJN
             /    \
        UNION      table
    +-----+-----+
  leg0  leg1  leg2

3) Although there is a very small chance that the problem could occur in other configurations, it is much more likely to happen in DPF.

In the example access plan shown in item 2), the exact symptom depends on what access plan logic exists in the union legs (leg0, leg1, and leg2). In one case where the problem was seen, there was an HSJN in the union legs, and the problem caused an infinite loop. Another symptom might be a crash. For example, a stack from such a crash might be:

  sqldRowFetch
  sqlriFetch
  sqlrieoqp
  sqlriftr
  sqlriExecThread
  sqlriunn
  sqlriset
  sqlriExecThread
  sqlrihsjn
  sqlriExecThread
  sqlrihsjn
  sqlriExecThread
  sqlrihsjn
  sqlriExecThread
  sqlrihsjn
  sqlriSectInvoke
  sqlrr_dss_router
  sqlrr_subagent_router
  sqleSubRequestRouter
  sqleProcessSubRequest
  sqeAgent::RunEDU
  sqzEDUObj::EDUDriver
  sqloEDUEntry

How the problem manifests depends on what is inside the union legs. It could be a crash, a hang, or other unexpected errors.

The key functions in the above stack that identify the problem are:

  sqlriunn
  sqlriset
  sqlriExecThread
  sqlrihsjn

Different functions may appear above these functions in the stack of a crash, and it is not necessarily always sqldRowFetch that hits the problem.

The underlying issue is that an interrupt occurs at a specific point while an HSJN is being closed, which results in a failure to clean up the union structures below the HSJN in the plan. If the query is executed a second time, the dirty union structures can lead to error, crash, or hang scenarios.
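The stack check described above can be sketched in a few lines of Python. This is only an illustration, not an IBM tool: the helper name `matches_signature` is hypothetical, the input is assumed to be a list of bare function names already extracted from a trap file or stack dump (innermost frame first), and the key functions are assumed to appear as one contiguous run in the order listed, as they do in the sample stack.

```python
# Key frames named in the problem description, innermost first
# (the same order in which they appear in the sample stack).
KEY_FRAMES = ["sqlriunn", "sqlriset", "sqlriExecThread", "sqlrihsjn"]


def matches_signature(stack_frames):
    """Return True if KEY_FRAMES occur as a contiguous run in stack_frames.

    stack_frames is a list of function names, innermost frame first.
    Extracting those names from a real trap file is out of scope here.
    """
    n = len(KEY_FRAMES)
    return any(
        stack_frames[i:i + n] == KEY_FRAMES
        for i in range(len(stack_frames) - n + 1)
    )


# The sample stack from the problem description, as a list of names.
sample_stack = """
sqldRowFetch sqlriFetch sqlrieoqp sqlriftr sqlriExecThread
sqlriunn sqlriset sqlriExecThread sqlrihsjn sqlriExecThread
sqlrihsjn sqlriExecThread sqlrihsjn sqlriExecThread sqlrihsjn
sqlriSectInvoke sqlrr_dss_router sqlrr_subagent_router
sqleSubRequestRouter sqleProcessSubRequest sqeAgent::RunEDU
sqzEDUObj::EDUDriver sqloEDUEntry
""".split()

if __name__ == "__main__":
    print(matches_signature(sample_stack))
```

The check ignores whatever frames sit above the key run, since the description notes that the topmost function is not always sqldRowFetch. A match is only a hint that the stack resembles this problem; the access plan conditions listed above still need to be confirmed.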
Local Fix: | |
Solution | |
First fixed in DB2 Version 9.7 Fixpack 10. | |
Workaround | |
NA | |
Timestamps | |
Date - problem reported : | 15.07.2014 |
Date - problem closed : | 18.12.2014 |
Date - last modified : | 18.12.2014 |
Problem solved at the following versions (IBM BugInfos) | |
9.7.FP10 | |
Problem solved according to the fixlist(s) of the following version(s) | |
9.7.0.10 |