You are viewing a plain text version of this content. The canonical link for it is here.
Posted to derby-dev@db.apache.org by "Dag H. Wanvik (JIRA)" <ji...@apache.org> on 2010/07/15 03:36:49 UTC

[jira] Issue Comment Edited: (DERBY-4471) Left outer join reassociation rewrite gives wrong result

    [ https://issues.apache.org/jira/browse/DERBY-4471?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12888665#action_12888665 ] 

Dag H. Wanvik edited comment on DERBY-4471 at 7/14/10 9:35 PM:
---------------------------------------------------------------

Attaching a first patch for this issue. It tighens up the criterion
used for determining whether outer join reordering is allowed. I have
tocuhed most of HalfOuterJoinNode#LOJ_reorderable, so to review it may
be best to have a full copy of before/after since the diff is a bit
hard to read.

The new criterion looks inside the null-producing child of the OJ
(the LOJ's right side) and inspects its join predicate. Previously,
something like 1=1 would pass muster, but as we have seen this gives
wrong results.

As before, we require that the top join predicate is of the form
L.X=R.y and additionally that R.y not reference the child OJ's
null-preserving side.

Additionally, we now require that the child OJ's join predicate also
have the form L.x = R.y, discarding for example 1=1. See the code
comments for details,

I also made a new Javadoc to clarify what the method does.

I have factored out the check for join predicate of the form
L.x = R.y to a new private method, isLeftRightEquiJoin.

New tests have been added to OuterJoinTest that test for query
correctness and that the expected reorderings occur when they should.

Note also the current limitations I found in the code:

- Only left outer joins are considered
- Top left side must be a base table

I have not removed these.  I have filed DERBY-4742 for also
considering right outer joins.

Running regressions.


      was (Author: dagw):
    Attaching a first patch for this issue. It tighens up the criterion
used for determining whether outer join reordering is allowed. I have
tocuhed most of HalfOuterJoinNode#LOJ_reorderable, so to review it may
be best to have a full copy of before/after since the diff is a bit
hard to read.

The new criterion looks inside the null-producing child LOJ (the LOJ
right side) and inspects its join predicate. Previously, something
like 1=1 would pass muster, but as we have seen this gives wrong
results. 

As before, we require that the top join predicate is of the form
L.X=R.y and additionally that R.y not reference the child LOJ's
null-preserving side.

Additionally, we now require that the child LOJ's join predicate also
have the form L.x = R.y, discarding for example 1=1. See the code
comments for details,

I also made a new Javadoc to clarify what the method does.

I have factored out the check for join predicate of the form
L.x = R.y to a new private method, isLeftRightEquiJoin.

New tests have been added to OuterJoinTest that test for query
correctness and that the expected reorderings occur when they should.

Note also the current limitations I found in the code:

- Only left outer joins are considered
- Top left side must be a base table

I have not removed these.  I have filed DERBY-4742 for also
considering right outer joins.

Running regressions.

  
> Left outer join reassociation rewrite gives wrong result
> --------------------------------------------------------
>
>                 Key: DERBY-4471
>                 URL: https://issues.apache.org/jira/browse/DERBY-4471
>             Project: Derby
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 10.0.2.0, 10.0.2.1, 10.1.1.0, 10.1.2.1, 10.1.3.1, 10.2.1.6, 10.2.2.0, 10.3.1.4, 10.3.2.1, 10.3.3.0, 10.4.1.3, 10.4.2.0, 10.5.1.1, 10.5.2.0, 10.5.3.0
>            Reporter: Dag H. Wanvik
>            Assignee: Dag H. Wanvik
>         Attachments: derby-4471-1a.diff, derby-4471-1a.stat, derby-4471-junit-repro.diff, query_plan_derby_4471.pdf
>
>
> The following script and output shows the problem:
> > create table r(c1 char(1));
> > create table s(c1 char(1), c2 char(1));
> > create table t(c1 char(1));
> > insert into r values 'a';
> > insert into s values ('b', default);
> > insert into t values ('c');
> > select * from s left outer join t on s.c2=t.c1 or s.c2 is null;
> C1  |C2  |C1  
> --------------
> b   |NULL|c   
> > select * from r left outer join s on r.c1=s.c1;
> C1  |C1  |C2  
> --------------
> a   |NULL|NULL
> > select * from (r left outer join s on r.c1=s.c1) left outer join t on s.c2=t.c1 or s.c2 is null;
> C1  |C1  |C2  |C1  
> -------------------
> a   |NULL|NULL|c   
> > select * from r left outer join (s left outer join t on s.c2=t.c1 or s.c2 is null) on r.c1=s.c1;
> C1  |C1  |C2  |C1  
> -------------------
> a   |NULL|NULL|c   
> The last result is wrong. The correct answer should be:
> C1  |C1  |C2  |C1  
> -------------------
> a   |NULL|NULL|NULL   
> since in the last form, the left table r has the value 'a', which does
> not match any row in result of the compound inner given the join
> predicate ("r.c1=s.c1"), so all nulls should be appended to the 'a'
> from the outer table r.
> This happens because internally the last form is rewritten to the
> second but the last form (left-deep), but this rewrite is not
> justified here unless the join predicate on s rejects null, which the
> present one explicitly does not ("or s.c2 is null"). Cf. for example
> [1], page 52, which describes this transform and its prerequisite
> condition as indentity #7.
> [1] Galindo-Legaria, C. & Rosenthal, A.: "Outerjoin simplification and
> reordering for query optimization", ACM Transactions on Database
> Systems, Vol 22, No 1, March 1997.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.