DB2 - Problem description
Problem IC73582 | Status: Closed |
NESTED-LOOP JOIN WITH EARLYOUT FOR GROUPBY CLAUSES, YIELDS INCORRECT RESULTS WHEN JOIN COLUMNS ARE OF DIFFERENT DATA TYPES | |
product: | |
DB2 FOR LUW / DB2FORLUW / 980 - DB2 | |
Problem description: | |
USING THE NESTED-LOOP JOIN METHOD, WITH EARLYOUT INDICATOR FOR GROUPBY CLAUSES, YIELDS INCORRECT RESULTS WHEN JOIN COLUMNS ARE OF DIFFERENT DATA TYPES The problem occurs when the nested-loop join method, along with EARLYOUT indicator for groupby clauses, is used with a join clause that joins two columns with different data types. The following script demonstrates the problem on 64-bit RISC machines using V9.5 FP6: -------------------- Begin repro.ksh -------------------- #!/bin/ksh dbName="wsdb" export DB2OPTIONS="-ctv" db2start db2 "create db $dbName" db2 "connect to $dbName" # Create two tables for the test. Create the tables such that the # columns being joined (t1.i1 and t2.i1) have different data types. db2 "create table t1 (i1 bigint not null, c2 char(250))" db2 "create index i1 on t1 (i1)" db2 "create table t2 (i1 smallint not null, i2 bigint)" # Import the following data into the table such that we can # generate a nested-loop with early out for group by plan. # # T1: I1 C2 # --- -----------... # -49 -49 # -48 -48 # -47 -47 # ... ... # 49 49 # # T2: I1 I2 # --- --- # -49 <value1> # -49 <value2> # -49 NULL (times 10) # -48 <value1> # -48 <value2> # -48 NULL (times 10) # -47 <value1> # -47 <value2> # -47 NULL (times 10) # ... ... # 49 <value1> # 49 <value2> # 49 NULL (times 10) dataFile="data.del" typeset -i maxValue=50 perl -e 'for($i=0;$i<$ARGV[0];$i++) { printf( "$i,$i\n" ) }' $maxValue > $dataFile perl -e 'for($i=1;$i<$ARGV[0];$i++) { printf( "-$i,-$i\n" ) }' $maxValue >> $dataFile db2 "import from $dataFile of del commitcount automatic messages import1.msgs.out insert into t1 (i1,c2)" perl -e 'for($i=0;$i<$ARGV[0]*10;$i++) { printf( "%d\n", $i % $ARGV[0] ) }' $maxValue > $dataFile perl -e 'for($i=1;$i<$ARGV[0]*10;$i++) { printf( "-%d\n", $i % $ARGV[0] ) }' $maxValue >> $dataFile db2 "import from $dataFile of del commitcount automatic messages import2.msgs.out insert into t2 (i1)" perl -e 'for($i=0;$i<2*$ARGV[0];$i++) { printf( "%d,%d\n", $i/2, ($i*2)+($i%2) ) }' $maxValue > $dataFile perl -e 'for($i=0;$i<2*$ARGV[0];$i++) { printf( "-%d,%d\n",$i/2, ($i*2)+($i%2) ) }' $maxValue >> $dataFile db2 "import from $dataFile of del commitcount automatic messages import3.msg.out insert into t2 (i1,i2)" rm -f $dataFile # Create an index on t2.i1,i2 to be used for the nested-loop # join early out for group by plan: db2 "create index i2 on t2 (i1,i2)" # Collect stats so that the nested-loop join early out for group # by plan will be used: db2 "runstats on table `whoami`.t1"\ "with distribution on columns (i1, c2 num_freqvalues 50 num_quantiles 200)"\ "and indexes all" db2 "runstats on table `whoami`.t2 and indexes all" # Perform the scan that reproduces the problem db2 "select t2.i1,max(t2.i2) from t1,t2 where t1.i1=t2.i1 group by t2.i1" # Clean up db2 "drop table t1" db2 "drop table t2" db2 "terminate" db2stop --------------------- End repro.ksh --------------------- | |
Problem Summary: | |
**************************************************************** * USERS AFFECTED: * * All users * **************************************************************** * PROBLEM DESCRIPTION: * * When a query that meets all the following conditions is * * executed, fewer results may be returned: * * - Several tables are joined using columns with different * * data types * * - Nested-loop join is selected as the join method * * - The output from the nested-loop join method is included * * into the Group by clause and the Early Out flag is marked as * * GROUPBY * * * * The following scenarios indicate the problem: * * * * 1. Joining two tables with different data types. To * * demonstrate this problem, create two tables for the test. * * Create the tables such that the columns being joined (t1.i1 * * and t2.i1) have different data types. * * create table t1 (i1 bigint not null, c2 char(250)) * * create index i1 on t1 (i1) * * create table t2 (i1 smallint not null, i2 bigint) * * create index i2 on t2 (i1,i2) * * 2. Run the query with different data types (bigint and * * smallint) in predicate and the result goes directly into the * * Group by clause. * * select t2.i1, max(t2.i2) from t1, t2 where t1.i1=t2.i1 group * * by t2.i1 * * 3. When the Nested-loop join is selected as the join method. * * * * Access Plan: * * ----------- * * Total Cost: 22.8806 * * Query Degree: 1 * * Rows * * RETURN * * ( 1) * * Cost * * I/O * * | * * 99 * * GRPBY * * ( 2) * * 22.8709 * * 3 * * | * * 99 * * ^NLJOIN * * ( 3) * * 22.8655 * * 3 * * /-----+------\ * * 99 12.1111 * * IXSCAN IXSCAN * * ( 4) ( 5) * * 0.0455361 7.57676 * * 0 1 * * | | * * 99 1199 * * INDEX: DB2INST1 INDEX: DB2INST1 * * I1 I2 * * Q2 Q1 * * 3) NLJOIN: (Nested Loop Join) * * Arguments: * * --------- * * EARLYOUT: (Early Out flag) * * GROUPBY * * Predicates: * * ---------- * * 3) Predicate used in Join * * Predicate Text: * * -------------- * * (Q2."I1" = Q1."I1") * **************************************************************** * RECOMMENDATION: * * Upgrade to DB2 UDB Version 9.8 fix pack 4. * **************************************************************** | |
Local Fix: | |
Execute the following and then recycle the instance: db2set DB2_REDUCED_OPTIMIZATION=NO_NLJN_EO_FOR_GB | |
available fix packs: | |
DB2 Version 9.8 Fix Pack 5 for AIX and Linux | |
Solution | |
This problem was fixed in DB2 UDB Version 9.8 fix pack 4. | |
Workaround | |
not known / see Local fix | |
Timestamps | |
Date - problem reported : Date - problem closed : Date - last modified : | 29.12.2010 22.09.2011 22.09.2011 |
Problem solved at the following versions (IBM BugInfos) | |
9.8.FP4 | |
Problem solved according to the fixlist(s) of the following version(s) | |
9.8.0.4 |