You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@impala.apache.org by "Thomas Tauber-Marshall (Code Review)" <ge...@cloudera.org> on 2017/08/31 18:34:30 UTC

[Impala-ASF-CR] IMPALA-5871: KuduPartitionExpr incorrectly handles its child types

Thomas Tauber-Marshall has uploaded a new change for review.

  http://gerrit.cloudera.org:8080/7922

Change subject: IMPALA-5871: KuduPartitionExpr incorrectly handles its child types
......................................................................

IMPALA-5871: KuduPartitionExpr incorrectly handles its child types

KuduPartitionExpr takes input rows and sends them to Kudu to determine
the partition the rows correspond to in a particular table's
partitioning scheme. This is then used to partition and sort rows
before sending them to Kudu when performing an INSERT.

If the input types are not the same as (but are compatible with) the
types of the columns in the table, we need to cast the input rows.
KuduPartitionExpr.analyze() actually already does this, but the casts
are dropped for the sort step during the INSERT in most cases.

As a result, attempting to insert a string value into a Kudu timestamp
column causes a crash.

Inserting a numeric value into a different but compatibly types col
(eg. tinyint into an int col) will cause the sort during a Kudu INSERT
to operate on garbage values, potentially degrading performance and
causing INSERTs to fail due to Kudu timeouts (see IMPALA-3742).

Testing:
- Added an e2e test in kudu_insert.test

Change-Id: I44cf31e46a77f3e7c92cf6b9112653808a001705
---
M be/src/exprs/kudu-partition-expr.cc
M fe/src/main/java/org/apache/impala/analysis/KuduPartitionExpr.java
M testdata/workloads/functional-query/queries/QueryTest/kudu_insert.test
3 files changed, 23 insertions(+), 9 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/22/7922/1
-- 
To view, visit http://gerrit.cloudera.org:8080/7922
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newchange
Gerrit-Change-Id: I44cf31e46a77f3e7c92cf6b9112653808a001705
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Thomas Tauber-Marshall <tm...@cloudera.com>

[Impala-ASF-CR] IMPALA-5871: KuduPartitionExpr incorrectly handles its child types

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change.

Change subject: IMPALA-5871: KuduPartitionExpr incorrectly handles its child types
......................................................................


Patch Set 2: Verified+1

-- 
To view, visit http://gerrit.cloudera.org:8080/7922
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I44cf31e46a77f3e7c92cf6b9112653808a001705
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Thomas Tauber-Marshall <tm...@cloudera.com>
Gerrit-Reviewer: Alex Behm <al...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Lars Volker <lv...@cloudera.com>
Gerrit-Reviewer: Matthew Jacobs <mj...@cloudera.com>
Gerrit-Reviewer: Thomas Tauber-Marshall <tm...@cloudera.com>
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-5871: KuduPartitionExpr incorrectly handles its child types

Posted by "Thomas Tauber-Marshall (Code Review)" <ge...@cloudera.org>.
Hello Matthew Jacobs,

I'd like you to reexamine a change.  Please visit

    http://gerrit.cloudera.org:8080/7922

to look at the new patch set (#2).

Change subject: IMPALA-5871: KuduPartitionExpr incorrectly handles its child types
......................................................................

IMPALA-5871: KuduPartitionExpr incorrectly handles its child types

KuduPartitionExpr takes input rows and sends them to Kudu to determine
the partition the rows correspond to in a particular table's
partitioning scheme. This is then used to partition and sort rows
before sending them to Kudu when performing an INSERT.

If the input types are not the same as (but are compatible with) the
types of the columns in the table, we need to cast the input rows.
KuduPartitionExpr.analyze() actually already does this, but the casts
are dropped for the sort step during the INSERT in most cases.

As a result, attempting to insert a string value into a Kudu timestamp
column causes a crash.

Inserting a numeric value into a different but compatibly typed col
(eg. tinyint into an int col) will cause the sort during a Kudu INSERT
to operate on garbage values, potentially degrading performance and
causing INSERTs to fail due to Kudu timeouts (see IMPALA-3742).

Testing:
- Added an e2e test in kudu_insert.test

Change-Id: I44cf31e46a77f3e7c92cf6b9112653808a001705
---
M be/src/exprs/kudu-partition-expr.cc
M fe/src/main/java/org/apache/impala/analysis/KuduPartitionExpr.java
M testdata/workloads/functional-query/queries/QueryTest/kudu_insert.test
3 files changed, 23 insertions(+), 9 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/22/7922/2
-- 
To view, visit http://gerrit.cloudera.org:8080/7922
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I44cf31e46a77f3e7c92cf6b9112653808a001705
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Thomas Tauber-Marshall <tm...@cloudera.com>
Gerrit-Reviewer: Alex Behm <al...@cloudera.com>
Gerrit-Reviewer: Lars Volker <lv...@cloudera.com>
Gerrit-Reviewer: Matthew Jacobs <mj...@cloudera.com>

[Impala-ASF-CR] IMPALA-5871: KuduPartitionExpr incorrectly handles its child types

Posted by "Thomas Tauber-Marshall (Code Review)" <ge...@cloudera.org>.
Thomas Tauber-Marshall has posted comments on this change.

Change subject: IMPALA-5871: KuduPartitionExpr incorrectly handles its child types
......................................................................


Patch Set 2:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/7922/1//COMMIT_MSG
Commit Message:

PS1, Line 22: compatibly typed col
> compatible type?
Done


http://gerrit.cloudera.org:8080/#/c/7922/1/fe/src/main/java/org/apache/impala/analysis/KuduPartitionExpr.java
File fe/src/main/java/org/apache/impala/analysis/KuduPartitionExpr.java:

PS1, Line 76: children_.set(
> wow, whoops...
So in defense of the people who originally wrote and reviewed this code (i.e. you and me) the previous version works for the particular case I was thinking about of NullLiterals.

The reason is that calling castTo on a NullLiteral doesn't create a new Expr, it just updates the NullLiteral's type field, so the 'set' is unnecessary. For most other Exprs, though, castTo creates a CastExpr to wrap the old expr.

Somewhat poor interface design, but its of course still my fault for not catching it. We need an equivalent to WARN_UNUSED_RESULT for Java.


-- 
To view, visit http://gerrit.cloudera.org:8080/7922
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I44cf31e46a77f3e7c92cf6b9112653808a001705
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Thomas Tauber-Marshall <tm...@cloudera.com>
Gerrit-Reviewer: Alex Behm <al...@cloudera.com>
Gerrit-Reviewer: Lars Volker <lv...@cloudera.com>
Gerrit-Reviewer: Matthew Jacobs <mj...@cloudera.com>
Gerrit-Reviewer: Thomas Tauber-Marshall <tm...@cloudera.com>
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-5871: KuduPartitionExpr incorrectly handles its child types

Posted by "Alex Behm (Code Review)" <ge...@cloudera.org>.
Alex Behm has posted comments on this change.

Change subject: IMPALA-5871: KuduPartitionExpr incorrectly handles its child types
......................................................................


Patch Set 2: Code-Review+2

(1 comment)

http://gerrit.cloudera.org:8080/#/c/7922/1/fe/src/main/java/org/apache/impala/analysis/KuduPartitionExpr.java
File fe/src/main/java/org/apache/impala/analysis/KuduPartitionExpr.java:

PS1, Line 76: children_.set(
> So in defense of the people who originally wrote and reviewed this code (i.
Easy to miss this. Good insights.


-- 
To view, visit http://gerrit.cloudera.org:8080/7922
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I44cf31e46a77f3e7c92cf6b9112653808a001705
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Thomas Tauber-Marshall <tm...@cloudera.com>
Gerrit-Reviewer: Alex Behm <al...@cloudera.com>
Gerrit-Reviewer: Lars Volker <lv...@cloudera.com>
Gerrit-Reviewer: Matthew Jacobs <mj...@cloudera.com>
Gerrit-Reviewer: Thomas Tauber-Marshall <tm...@cloudera.com>
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-5871: KuduPartitionExpr incorrectly handles its child types

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change.

Change subject: IMPALA-5871: KuduPartitionExpr incorrectly handles its child types
......................................................................


Patch Set 2:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/1188/

-- 
To view, visit http://gerrit.cloudera.org:8080/7922
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I44cf31e46a77f3e7c92cf6b9112653808a001705
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Thomas Tauber-Marshall <tm...@cloudera.com>
Gerrit-Reviewer: Alex Behm <al...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Lars Volker <lv...@cloudera.com>
Gerrit-Reviewer: Matthew Jacobs <mj...@cloudera.com>
Gerrit-Reviewer: Thomas Tauber-Marshall <tm...@cloudera.com>
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-5871: KuduPartitionExpr incorrectly handles its child types

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has submitted this change and it was merged.

Change subject: IMPALA-5871: KuduPartitionExpr incorrectly handles its child types
......................................................................


IMPALA-5871: KuduPartitionExpr incorrectly handles its child types

KuduPartitionExpr takes input rows and sends them to Kudu to determine
the partition the rows correspond to in a particular table's
partitioning scheme. This is then used to partition and sort rows
before sending them to Kudu when performing an INSERT.

If the input types are not the same as (but are compatible with) the
types of the columns in the table, we need to cast the input rows.
KuduPartitionExpr.analyze() actually already does this, but the casts
are dropped for the sort step during the INSERT in most cases.

As a result, attempting to insert a string value into a Kudu timestamp
column causes a crash.

Inserting a numeric value into a different but compatibly typed col
(eg. tinyint into an int col) will cause the sort during a Kudu INSERT
to operate on garbage values, potentially degrading performance and
causing INSERTs to fail due to Kudu timeouts (see IMPALA-3742).

Testing:
- Added an e2e test in kudu_insert.test

Change-Id: I44cf31e46a77f3e7c92cf6b9112653808a001705
Reviewed-on: http://gerrit.cloudera.org:8080/7922
Reviewed-by: Alex Behm <al...@cloudera.com>
Tested-by: Impala Public Jenkins
---
M be/src/exprs/kudu-partition-expr.cc
M fe/src/main/java/org/apache/impala/analysis/KuduPartitionExpr.java
M testdata/workloads/functional-query/queries/QueryTest/kudu_insert.test
3 files changed, 23 insertions(+), 9 deletions(-)

Approvals:
  Impala Public Jenkins: Verified
  Alex Behm: Looks good to me, approved



-- 
To view, visit http://gerrit.cloudera.org:8080/7922
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: merged
Gerrit-Change-Id: I44cf31e46a77f3e7c92cf6b9112653808a001705
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Thomas Tauber-Marshall <tm...@cloudera.com>
Gerrit-Reviewer: Alex Behm <al...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Lars Volker <lv...@cloudera.com>
Gerrit-Reviewer: Matthew Jacobs <mj...@cloudera.com>
Gerrit-Reviewer: Thomas Tauber-Marshall <tm...@cloudera.com>

[Impala-ASF-CR] IMPALA-5871: KuduPartitionExpr incorrectly handles its child types

Posted by "Matthew Jacobs (Code Review)" <ge...@cloudera.org>.
Matthew Jacobs has posted comments on this change.

Change subject: IMPALA-5871: KuduPartitionExpr incorrectly handles its child types
......................................................................


Patch Set 1: Code-Review+1

(2 comments)

http://gerrit.cloudera.org:8080/#/c/7922/1//COMMIT_MSG
Commit Message:

PS1, Line 22: compatibly types col
compatible type?


http://gerrit.cloudera.org:8080/#/c/7922/1/fe/src/main/java/org/apache/impala/analysis/KuduPartitionExpr.java
File fe/src/main/java/org/apache/impala/analysis/KuduPartitionExpr.java:

PS1, Line 76: children_.set(
wow, whoops...


-- 
To view, visit http://gerrit.cloudera.org:8080/7922
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I44cf31e46a77f3e7c92cf6b9112653808a001705
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Thomas Tauber-Marshall <tm...@cloudera.com>
Gerrit-Reviewer: Alex Behm <al...@cloudera.com>
Gerrit-Reviewer: Lars Volker <lv...@cloudera.com>
Gerrit-Reviewer: Matthew Jacobs <mj...@cloudera.com>
Gerrit-HasComments: Yes