DB2 - Problem description
Problem IC97570 | Status: Closed
DB2 OPTIMIZER MAY CHOOSE TO DEFER PROCESSING DISTINCT CLAUSE OF A UNION-BRANCH
Product:
DB2 FOR LUW / DB2FORLUW / A10 - DB2
Problem description:
Even if a DISTINCT clause is explicitly specified for a UNION-branch, the DB2 optimizer may choose an access plan that defers processing of that DISTINCT clause; that is, duplicate values are not removed from the branch before the UNION is performed. If the UNION-branch contains many duplicate values, delaying DISTINCT may result in poor performance.

The optimizer may choose such an access plan under the following conditions:

1. The SQL query contains a DISTINCT clause in a full select.
2. There is a UNION operation in the sub-query that is an input to the above full select.
3. There is a DISTINCT clause in the full select of a UNION-branch.

For example, consider the following query:

    SELECT DISTINCT COALESCE(T1.COL1, T2.COL2)
    FROM T1
    RIGHT OUTER JOIN (
        SELECT DISTINCT COL2 FROM T0 WHERE COL2 IS NOT NULL
        UNION ALL
        SELECT '(NULL)' AS COL2 FROM SYSIBM.SYSDUMMY1
    ) AS T2
    ON (T1.COL2 = T2.COL2);

Note the following in the above SQL query:

1. The query's full select contains a DISTINCT clause: "DISTINCT COALESCE(T1.COL1, T2.COL2)"
2. One UNION-branch also contains a DISTINCT clause: "DISTINCT COL2"

The DB2 optimizer may choose an access plan that does not compute the distinct values of the T0.COL2 column before the UNION. This can result in sub-optimal performance if the T0.COL2 column contains many duplicate values.

In the exfmt output, the explain plan graph may look as shown below. Note that there is no SORT or UNIQUE operation applying the DISTINCT clause to the T0.COL2 column. The SORT operation (4) applies the "DISTINCT COALESCE(T1.COL1, T2.COL2)" clause.

    ...
                        SORT
                        (   4)
                          |
                        HSJOIN<
                        (   5)
          /---------------+---------------\
       TBSCAN                           UNION
       (   6)                           (   7)
         |                     /--------+---------\
    TABLE: SCHEMA1          TBSCAN              TBSCAN
         T1                 (   8)              (   9)
                              |                   |
                        TABFNC: SYSIBM      TABLE: SCHEMA1
                            GENROW                T0
Problem Summary:
USERS AFFECTED: ALL
PROBLEM DESCRIPTION: See Error Description
RECOMMENDATION: Upgrade to DB2 v10.1 Fixpack 4 or higher
Local Fix:
The above SQL query can be rewritten as follows:

    SELECT DISTINCT COALESCE(T1.COL1, T2.COL2)
    FROM T1
    RIGHT OUTER JOIN (
        SELECT MAX(COL2) AS COL2 FROM T0 WHERE COL2 IS NOT NULL GROUP BY COL2
        UNION ALL
        SELECT '(NULL)' AS COL2 FROM SYSIBM.SYSDUMMY1
    ) AS T2
    ON (T1.COL2 = T2.COL2);

Note that the "DISTINCT COL2" clause is replaced with "MAX(COL2) AS COL2 ... GROUP BY COL2".
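The rewrite above should return the same rows as the original query, since grouping by COL2 and taking MAX(COL2) yields one row per distinct value. The following is a minimal sketch that checks this on a toy data set. It makes several assumptions that are not part of this APAR: it runs on SQLite (Python's standard `sqlite3` module) rather than DB2, the sample rows are invented, `SYSIBM.SYSDUMMY1` is replaced by a FROM-less SELECT, and the RIGHT OUTER JOIN is written as the equivalent `T2 LEFT OUTER JOIN T1` for compatibility with older SQLite versions. It checks result equivalence only and says nothing about the access plan DB2 would choose.

```python
import sqlite3

# Branch as written in the problem description (DISTINCT).
DISTINCT_BRANCH = "SELECT DISTINCT COL2 FROM T0 WHERE COL2 IS NOT NULL"
# Branch as written in the local fix (MAX ... GROUP BY).
GROUP_BY_BRANCH = (
    "SELECT MAX(COL2) AS COL2 FROM T0 WHERE COL2 IS NOT NULL GROUP BY COL2"
)

# T2 LEFT OUTER JOIN T1 is equivalent to T1 RIGHT OUTER JOIN T2.
QUERY_TEMPLATE = """
    SELECT DISTINCT COALESCE(T1.COL1, T2.COL2)
    FROM ({branch}
          UNION ALL
          SELECT '(NULL)' AS COL2) AS T2
    LEFT OUTER JOIN T1 ON (T1.COL2 = T2.COL2)
"""


def run_queries():
    """Return (original_rows, rewritten_rows), each as a sorted list."""
    conn = sqlite3.connect(":memory:")
    cur = conn.cursor()
    cur.execute("CREATE TABLE T0 (COL2 TEXT)")
    cur.execute("CREATE TABLE T1 (COL1 TEXT, COL2 TEXT)")
    # Hypothetical sample data: T0.COL2 has duplicates and one NULL.
    cur.executemany(
        "INSERT INTO T0 VALUES (?)",
        [("a",), ("a",), ("a",), ("b",), ("b",), (None,)],
    )
    cur.executemany(
        "INSERT INTO T1 VALUES (?, ?)",
        [("x", "a"), ("y", "a"), ("z", "c")],
    )

    results = []
    for branch in (DISTINCT_BRANCH, GROUP_BY_BRANCH):
        rows = cur.execute(QUERY_TEMPLATE.format(branch=branch)).fetchall()
        results.append(sorted(row[0] for row in rows))
    conn.close()
    return results[0], results[1]


if __name__ == "__main__":
    original, rewritten = run_queries()
    print("original :", original)
    print("rewritten:", rewritten)
    print("same rows:", original == rewritten)
```

On DB2 itself, the effect of the rewrite on the access plan would still need to be confirmed from the exfmt output for the actual query.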
Available fix packs:
DB2 Version 10.1 Fix Pack 4 for Linux, UNIX, and Windows
Solution
First fixed in DB2 v10.1 Fixpack 4
Workaround
not known / see Local fix
Timestamps
Date - problem reported: 11.11.2013
Date - problem closed: 14.07.2014
Date - last modified: 14.07.2014
Problem solved at the following versions (IBM BugInfos)
Problem solved according to the fixlist(s) of the following version(s)
10.1.0.4