You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@impala.apache.org by "Alex Behm (Code Review)" <ge...@cloudera.org> on 2016/03/01 03:36:28 UTC

[Impala-CR](cdh5-2.5.0_5.7.0) IMPALA-3065: Register !empty() predicates as On-clause conjuncts.

Alex Behm has uploaded a new change for review.

  http://gerrit.cloudera.org:8080/2365

Change subject: IMPALA-3065: Register !empty() predicates as On-clause conjuncts.
......................................................................

IMPALA-3065: Register !empty() predicates as On-clause conjuncts.

The bug was that we incorrectly registered NotEmptyPredicates with
analyzer.registerConjunct() which registers the predicates as
originating from the HAVING-clause. However, a generated !empty()
predicate should only affect the result of the join of its
corresponding rhs CollectionTableRef, and not the whole FROM clause.

Change-Id: I7236940057f6468b085ffb2c228c3146b102cea7
---
M fe/src/main/java/com/cloudera/impala/analysis/CollectionTableRef.java
M testdata/workloads/functional-planner/queries/PlannerTest/nested-collections.test
2 files changed, 44 insertions(+), 11 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala refs/changes/65/2365/1
-- 
To view, visit http://gerrit.cloudera.org:8080/2365
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newchange
Gerrit-Change-Id: I7236940057f6468b085ffb2c228c3146b102cea7
Gerrit-PatchSet: 1
Gerrit-Project: Impala
Gerrit-Branch: cdh5-2.5.0_5.7.0
Gerrit-Owner: Alex Behm <al...@cloudera.com>

[Impala-CR](cdh5-2.5.0_5.7.0) IMPALA-3065: Register !empty() predicates as On-clause conjuncts.

Posted by "Alex Behm (Code Review)" <ge...@cloudera.org>.
Alex Behm has posted comments on this change.

Change subject: IMPALA-3065: Register !empty() predicates as On-clause conjuncts.
......................................................................


Patch Set 3: Code-Review+2

-- 
To view, visit http://gerrit.cloudera.org:8080/2365
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I7236940057f6468b085ffb2c228c3146b102cea7
Gerrit-PatchSet: 3
Gerrit-Project: Impala
Gerrit-Branch: cdh5-2.5.0_5.7.0
Gerrit-Owner: Alex Behm <al...@cloudera.com>
Gerrit-Reviewer: Alex Behm <al...@cloudera.com>
Gerrit-Reviewer: Marcel Kornacker <ma...@cloudera.com>
Gerrit-HasComments: No

[Impala-CR](cdh5-2.5.0_5.7.0) IMPALA-3065: Register !empty() predicates as On-clause conjuncts.

Posted by "Alex Behm (Code Review)" <ge...@cloudera.org>.
Hello Marcel Kornacker, Internal Jenkins,

I'd like you to reexamine a change.  Please visit

    http://gerrit.cloudera.org:8080/2365

to look at the new patch set (#5).

Change subject: IMPALA-3065: Register !empty() predicates as On-clause conjuncts.
......................................................................

IMPALA-3065: Register !empty() predicates as On-clause conjuncts.

The bug was that we incorrectly registered NotEmptyPredicates with
analyzer.registerConjunct() which registers the predicates as
originating from the HAVING-clause. However, a generated !empty()
predicate should only affect the result of the join of its
corresponding rhs CollectionTableRef, and not the whole FROM clause.

This patch also disables generating !empty() predicates if the
parent tuple is outer joined to avoid an incorrect interaction with
our BE projection of collection-typed slots where rows are
incorrectly filtered if the predicate is assigned to a plan node that
comes after the unnest of the collection which also performs the
projection.

Change-Id: I7236940057f6468b085ffb2c228c3146b102cea7
---
M fe/src/main/java/com/cloudera/impala/analysis/CollectionTableRef.java
M testdata/workloads/functional-planner/queries/PlannerTest/nested-collections.test
2 files changed, 58 insertions(+), 24 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala refs/changes/65/2365/5
-- 
To view, visit http://gerrit.cloudera.org:8080/2365
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I7236940057f6468b085ffb2c228c3146b102cea7
Gerrit-PatchSet: 5
Gerrit-Project: Impala
Gerrit-Branch: cdh5-2.5.0_5.7.0
Gerrit-Owner: Alex Behm <al...@cloudera.com>
Gerrit-Reviewer: Alex Behm <al...@cloudera.com>
Gerrit-Reviewer: Internal Jenkins
Gerrit-Reviewer: Marcel Kornacker <ma...@cloudera.com>

[Impala-CR](cdh5-2.5.0_5.7.0) IMPALA-3065: Register !empty() predicates as On-clause conjuncts.

Posted by "Alex Behm (Code Review)" <ge...@cloudera.org>.
Hello Marcel Kornacker,

I'd like you to reexamine a change.  Please visit

    http://gerrit.cloudera.org:8080/2365

to look at the new patch set (#3).

Change subject: IMPALA-3065: Register !empty() predicates as On-clause conjuncts.
......................................................................

IMPALA-3065: Register !empty() predicates as On-clause conjuncts.

The bug was that we incorrectly registered NotEmptyPredicates with
analyzer.registerConjunct() which registers the predicates as
originating from the HAVING-clause. However, a generated !empty()
predicate should only affect the result of the join of its
corresponding rhs CollectionTableRef, and not the whole FROM clause.

Change-Id: I7236940057f6468b085ffb2c228c3146b102cea7
---
M fe/src/main/java/com/cloudera/impala/analysis/CollectionTableRef.java
M testdata/workloads/functional-planner/queries/PlannerTest/nested-collections.test
2 files changed, 48 insertions(+), 15 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala refs/changes/65/2365/3
-- 
To view, visit http://gerrit.cloudera.org:8080/2365
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I7236940057f6468b085ffb2c228c3146b102cea7
Gerrit-PatchSet: 3
Gerrit-Project: Impala
Gerrit-Branch: cdh5-2.5.0_5.7.0
Gerrit-Owner: Alex Behm <al...@cloudera.com>
Gerrit-Reviewer: Alex Behm <al...@cloudera.com>
Gerrit-Reviewer: Marcel Kornacker <ma...@cloudera.com>

[Impala-CR](cdh5-2.5.0_5.7.0) IMPALA-3065: Register !empty() predicates as On-clause conjuncts.

Posted by "Alex Behm (Code Review)" <ge...@cloudera.org>.
Alex Behm has posted comments on this change.

Change subject: IMPALA-3065: Register !empty() predicates as On-clause conjuncts.
......................................................................


Patch Set 2:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/2365/2/fe/src/main/java/com/cloudera/impala/analysis/CollectionTableRef.java
File fe/src/main/java/com/cloudera/impala/analysis/CollectionTableRef.java:

Line 108:         // affect the result of this join (e.g., and not the whole FROM clause).
> remove "(e.g., ", ")"
Done


-- 
To view, visit http://gerrit.cloudera.org:8080/2365
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I7236940057f6468b085ffb2c228c3146b102cea7
Gerrit-PatchSet: 2
Gerrit-Project: Impala
Gerrit-Branch: cdh5-2.5.0_5.7.0
Gerrit-Owner: Alex Behm <al...@cloudera.com>
Gerrit-Reviewer: Alex Behm <al...@cloudera.com>
Gerrit-Reviewer: Marcel Kornacker <ma...@cloudera.com>
Gerrit-HasComments: Yes

[Impala-CR](cdh5-2.5.0_5.7.0) IMPALA-3065: Register !empty() predicates as On-clause conjuncts.

Posted by "Alex Behm (Code Review)" <ge...@cloudera.org>.
Alex Behm has uploaded a new patch set (#2).

Change subject: IMPALA-3065: Register !empty() predicates as On-clause conjuncts.
......................................................................

IMPALA-3065: Register !empty() predicates as On-clause conjuncts.

The bug was that we incorrectly registered NotEmptyPredicates with
analyzer.registerConjunct() which registers the predicates as
originating from the HAVING-clause. However, a generated !empty()
predicate should only affect the result of the join of its
corresponding rhs CollectionTableRef, and not the whole FROM clause.

Change-Id: I7236940057f6468b085ffb2c228c3146b102cea7
---
M fe/src/main/java/com/cloudera/impala/analysis/CollectionTableRef.java
M testdata/workloads/functional-planner/queries/PlannerTest/nested-collections.test
2 files changed, 48 insertions(+), 15 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala refs/changes/65/2365/2
-- 
To view, visit http://gerrit.cloudera.org:8080/2365
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I7236940057f6468b085ffb2c228c3146b102cea7
Gerrit-PatchSet: 2
Gerrit-Project: Impala
Gerrit-Branch: cdh5-2.5.0_5.7.0
Gerrit-Owner: Alex Behm <al...@cloudera.com>
Gerrit-Reviewer: Alex Behm <al...@cloudera.com>
Gerrit-Reviewer: Marcel Kornacker <ma...@cloudera.com>

[Impala-CR](cdh5-2.5.0_5.7.0) IMPALA-3065: Register !empty() predicates as On-clause conjuncts.

Posted by "Alex Behm (Code Review)" <ge...@cloudera.org>.
Hello Marcel Kornacker, Internal Jenkins,

I'd like you to reexamine a change.  Please visit

    http://gerrit.cloudera.org:8080/2365

to look at the new patch set (#4).

Change subject: IMPALA-3065: Register !empty() predicates as On-clause conjuncts.
......................................................................

IMPALA-3065: Register !empty() predicates as On-clause conjuncts.

The bug was that we incorrectly registered NotEmptyPredicates with
analyzer.registerConjunct() which registers the predicates as
originating from the HAVING-clause. However, a generated !empty()
predicate should only affect the result of the join of its
corresponding rhs CollectionTableRef, and not the whole FROM clause.

Change-Id: I7236940057f6468b085ffb2c228c3146b102cea7
---
M fe/src/main/java/com/cloudera/impala/analysis/CollectionTableRef.java
M testdata/workloads/functional-planner/queries/PlannerTest/nested-collections.test
2 files changed, 57 insertions(+), 24 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala refs/changes/65/2365/4
-- 
To view, visit http://gerrit.cloudera.org:8080/2365
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I7236940057f6468b085ffb2c228c3146b102cea7
Gerrit-PatchSet: 4
Gerrit-Project: Impala
Gerrit-Branch: cdh5-2.5.0_5.7.0
Gerrit-Owner: Alex Behm <al...@cloudera.com>
Gerrit-Reviewer: Alex Behm <al...@cloudera.com>
Gerrit-Reviewer: Internal Jenkins
Gerrit-Reviewer: Marcel Kornacker <ma...@cloudera.com>

[Impala-CR](cdh5-2.5.0_5.7.0) IMPALA-3065: Register !empty() predicates as On-clause conjuncts.

Posted by "Alex Behm (Code Review)" <ge...@cloudera.org>.
Alex Behm has abandoned this change.

Change subject: IMPALA-3065: Register !empty() predicates as On-clause conjuncts.
......................................................................


Abandoned

Not ready for review yet, will reopen when ready.

-- 
To view, visit http://gerrit.cloudera.org:8080/2365
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: abandon
Gerrit-Change-Id: I7236940057f6468b085ffb2c228c3146b102cea7
Gerrit-PatchSet: 5
Gerrit-Project: Impala
Gerrit-Branch: cdh5-2.5.0_5.7.0
Gerrit-Owner: Alex Behm <al...@cloudera.com>
Gerrit-Reviewer: Alex Behm <al...@cloudera.com>
Gerrit-Reviewer: Internal Jenkins
Gerrit-Reviewer: Marcel Kornacker <ma...@cloudera.com>

[Impala-CR](cdh5-2.5.0_5.7.0) IMPALA-3065: Register !empty() predicates as On-clause conjuncts.

Posted by "Alex Behm (Code Review)" <ge...@cloudera.org>.
Alex Behm has posted comments on this change.

Change subject: IMPALA-3065: Register !empty() predicates as On-clause conjuncts.
......................................................................


Patch Set 1:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/2365/1/fe/src/main/java/com/cloudera/impala/analysis/CollectionTableRef.java
File fe/src/main/java/com/cloudera/impala/analysis/CollectionTableRef.java:

Line 103:         // an empty result set.
> comment doesn't apply anymore.
I think is still applies because we are hoping that the predicate is picked up in the parent scan. I changed the comment to clarify.


Line 108:         // affect the result of this join (e.g., and not the whole FROM clause).
> hm, that's a bummer, it would be nice to register it directly as a scan pre
I think it would be wrong to do that. We need to do the usual analysis to see if the predicate can be correctly assigned below outer and anti joins. The tests have such cases.


http://gerrit.cloudera.org:8080/#/c/2365/1/testdata/workloads/functional-planner/queries/PlannerTest/nested-collections.test
File testdata/workloads/functional-planner/queries/PlannerTest/nested-collections.test:

Line 685: |  |  predicates: !empty(a.struct_array_col)
> what's the point of having it here? would the nlj even return anything if i
You are correct. There is no point in having it here in this case. But there is also no harm (results are empty anyway).

Our code currently does not really distinguish whether 'a' is outer joined by this join, by a join below or above. So if this NLJ was an outer join, we could not remove the predicate. We'd need to introduce special casing to remove the predicate in this case, so I'm not sure it's worth it.


-- 
To view, visit http://gerrit.cloudera.org:8080/2365
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I7236940057f6468b085ffb2c228c3146b102cea7
Gerrit-PatchSet: 1
Gerrit-Project: Impala
Gerrit-Branch: cdh5-2.5.0_5.7.0
Gerrit-Owner: Alex Behm <al...@cloudera.com>
Gerrit-Reviewer: Alex Behm <al...@cloudera.com>
Gerrit-Reviewer: Marcel Kornacker <ma...@cloudera.com>
Gerrit-HasComments: Yes

[Impala-CR](cdh5-2.5.0_5.7.0) IMPALA-3065: Register !empty() predicates as On-clause conjuncts.

Posted by "Alex Behm (Code Review)" <ge...@cloudera.org>.
Alex Behm has posted comments on this change.

Change subject: IMPALA-3065: Register !empty() predicates as On-clause conjuncts.
......................................................................


Patch Set 1:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/2365/1/testdata/workloads/functional-planner/queries/PlannerTest/nested-collections.test
File testdata/workloads/functional-planner/queries/PlannerTest/nested-collections.test:

Line 685: |  |  predicates: !empty(a.struct_array_col)
> so what's the point of placing these !empty() predicates?
It's also placed at the scan node.


-- 
To view, visit http://gerrit.cloudera.org:8080/2365
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I7236940057f6468b085ffb2c228c3146b102cea7
Gerrit-PatchSet: 1
Gerrit-Project: Impala
Gerrit-Branch: cdh5-2.5.0_5.7.0
Gerrit-Owner: Alex Behm <al...@cloudera.com>
Gerrit-Reviewer: Alex Behm <al...@cloudera.com>
Gerrit-Reviewer: Marcel Kornacker <ma...@cloudera.com>
Gerrit-HasComments: Yes

[Impala-CR](cdh5-2.5.0_5.7.0) IMPALA-3065: Register !empty() predicates as On-clause conjuncts.

Posted by "Marcel Kornacker (Code Review)" <ge...@cloudera.org>.
Marcel Kornacker has posted comments on this change.

Change subject: IMPALA-3065: Register !empty() predicates as On-clause conjuncts.
......................................................................


Patch Set 1:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/2365/1/testdata/workloads/functional-planner/queries/PlannerTest/nested-collections.test
File testdata/workloads/functional-planner/queries/PlannerTest/nested-collections.test:

Line 685: |  |  predicates: !empty(a.struct_array_col)
> You are correct. There is no point in having it here in this case. But ther
so what's the point of placing these !empty() predicates?


-- 
To view, visit http://gerrit.cloudera.org:8080/2365
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I7236940057f6468b085ffb2c228c3146b102cea7
Gerrit-PatchSet: 1
Gerrit-Project: Impala
Gerrit-Branch: cdh5-2.5.0_5.7.0
Gerrit-Owner: Alex Behm <al...@cloudera.com>
Gerrit-Reviewer: Alex Behm <al...@cloudera.com>
Gerrit-Reviewer: Marcel Kornacker <ma...@cloudera.com>
Gerrit-HasComments: Yes

[Impala-CR](cdh5-2.5.0_5.7.0) IMPALA-3065: Register !empty() predicates as On-clause conjuncts.

Posted by "Marcel Kornacker (Code Review)" <ge...@cloudera.org>.
Marcel Kornacker has posted comments on this change.

Change subject: IMPALA-3065: Register !empty() predicates as On-clause conjuncts.
......................................................................


Patch Set 2: Code-Review+2

(1 comment)

http://gerrit.cloudera.org:8080/#/c/2365/2/fe/src/main/java/com/cloudera/impala/analysis/CollectionTableRef.java
File fe/src/main/java/com/cloudera/impala/analysis/CollectionTableRef.java:

Line 108:         // affect the result of this join (e.g., and not the whole FROM clause).
remove "(e.g., ", ")"


-- 
To view, visit http://gerrit.cloudera.org:8080/2365
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I7236940057f6468b085ffb2c228c3146b102cea7
Gerrit-PatchSet: 2
Gerrit-Project: Impala
Gerrit-Branch: cdh5-2.5.0_5.7.0
Gerrit-Owner: Alex Behm <al...@cloudera.com>
Gerrit-Reviewer: Alex Behm <al...@cloudera.com>
Gerrit-Reviewer: Marcel Kornacker <ma...@cloudera.com>
Gerrit-HasComments: Yes

[Impala-CR](cdh5-2.5.0_5.7.0) IMPALA-3065: Register !empty() predicates as On-clause conjuncts.

Posted by "Marcel Kornacker (Code Review)" <ge...@cloudera.org>.
Marcel Kornacker has posted comments on this change.

Change subject: IMPALA-3065: Register !empty() predicates as On-clause conjuncts.
......................................................................


Patch Set 1:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/2365/1/fe/src/main/java/com/cloudera/impala/analysis/CollectionTableRef.java
File fe/src/main/java/com/cloudera/impala/analysis/CollectionTableRef.java:

Line 103:         // an empty result set.
comment doesn't apply anymore.


Line 108:         // affect the result of this join (e.g., and not the whole FROM clause).
hm, that's a bummer, it would be nice to register it directly as a scan predicate.


http://gerrit.cloudera.org:8080/#/c/2365/1/testdata/workloads/functional-planner/queries/PlannerTest/nested-collections.test
File testdata/workloads/functional-planner/queries/PlannerTest/nested-collections.test:

Line 685: |  |  predicates: !empty(a.struct_array_col)
what's the point of having it here? would the nlj even return anything if it's empty?


-- 
To view, visit http://gerrit.cloudera.org:8080/2365
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I7236940057f6468b085ffb2c228c3146b102cea7
Gerrit-PatchSet: 1
Gerrit-Project: Impala
Gerrit-Branch: cdh5-2.5.0_5.7.0
Gerrit-Owner: Alex Behm <al...@cloudera.com>
Gerrit-Reviewer: Alex Behm <al...@cloudera.com>
Gerrit-Reviewer: Marcel Kornacker <ma...@cloudera.com>
Gerrit-HasComments: Yes

[Impala-CR](cdh5-2.5.0_5.7.0) IMPALA-3065: Register !empty() predicates as On-clause conjuncts.

Posted by "Alex Behm (Code Review)" <ge...@cloudera.org>.
Alex Behm has posted comments on this change.

Change subject: IMPALA-3065: Register !empty() predicates as On-clause conjuncts.
......................................................................


Patch Set 1:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/2365/1/testdata/workloads/functional-planner/queries/PlannerTest/nested-collections.test
File testdata/workloads/functional-planner/queries/PlannerTest/nested-collections.test:

Line 771: |  |  predicates: !empty(a.int_map_col)
Most of these plan changes are due to the combination of this patch and the fix for IMPALA-3071, causing these !empty() predicates to have a fixed assignment based on which On-clause they originated from.


-- 
To view, visit http://gerrit.cloudera.org:8080/2365
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I7236940057f6468b085ffb2c228c3146b102cea7
Gerrit-PatchSet: 1
Gerrit-Project: Impala
Gerrit-Branch: cdh5-2.5.0_5.7.0
Gerrit-Owner: Alex Behm <al...@cloudera.com>
Gerrit-Reviewer: Alex Behm <al...@cloudera.com>
Gerrit-HasComments: Yes