Posted to dev@impala.apache.org by "Huaisi Xu (Code Review)" <ge...@cloudera.org> on 2016/06/21 01:08:04 UTC

[Impala-CR](cdh5-2.2.0 5.4.x) CDH-41243: Parquet scanner regression on wide tables (part 2)

Huaisi Xu has uploaded a new change for review.

  http://gerrit.cloudera.org:8080/3417

Change subject: CDH-41243: Parquet scanner regression on wide tables (part 2)
......................................................................

CDH-41243: Parquet scanner regression on wide tables (part 2)

IMPALA-2473 introduced a check that prevents row batches from growing
beyond 8MB, but it has a corner case: when an empty row batch is
already larger than 8MB, the scanner returns the row batch immediately
after it materializes one row, essentially setting batch_size=1.
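
The corner case can be illustrated with a minimal sketch. This is not
the Impala code: AtCapacity, RowsPerBatch, fixed_bytes and row_bytes are
hypothetical names, and the check is only assumed to have roughly this
shape (row limit reached, or batch memory above the 8MB cap).

```cpp
#include <cassert>
#include <cstdint>

// The 8MB cap named in the commit message.
constexpr int64_t kMaxBatchBytes = 8LL * 1024 * 1024;

// Hypothetical capacity check: the batch is full at the row limit or
// once its memory exceeds the cap.
bool AtCapacity(int num_rows, int capacity, int64_t total_bytes) {
  return num_rows >= capacity || total_bytes > kMaxBatchBytes;
}

// Counts how many rows are materialized before the batch is returned.
// fixed_bytes: memory the batch already holds while empty (assumed input).
// row_bytes:   memory added per materialized row (assumed input).
int RowsPerBatch(int64_t fixed_bytes, int64_t row_bytes, int capacity) {
  assert(row_bytes >= 0 && capacity > 0);
  int num_rows = 0;
  do {
    ++num_rows;  // materialize one row, then check capacity
  } while (!AtCapacity(num_rows, capacity, fixed_bytes + num_rows * row_bytes));
  return num_rows;
}
```

Under these assumptions, RowsPerBatch(9 * 1024 * 1024, 100, 1024)
returns after a single row because the empty batch is already over the
cap, while RowsPerBatch(0, 100, 1024) fills all 1024 rows.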

Revert "IMPALA-2473: reduce scanner memory usage"

This reverts commit cecb4cf4c5bfe4d21afc2f650880e5bdda14b024.

Change-Id: Id21c26771cd9f5239da4e07a6c59c5126b4d8a0b
---
M be/src/exec/data-source-scan-node.cc
M be/src/exec/hdfs-parquet-scanner.cc
M be/src/exec/hdfs-scanner.cc
M be/src/exec/hdfs-table-sink.cc
M be/src/exec/hdfs-table-sink.h
M be/src/runtime/row-batch.h
M testdata/workloads/functional-query/queries/QueryTest/scanners.test
7 files changed, 26 insertions(+), 64 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala refs/changes/17/3417/1
-- 
To view, visit http://gerrit.cloudera.org:8080/3417
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newchange
Gerrit-Change-Id: Id21c26771cd9f5239da4e07a6c59c5126b4d8a0b
Gerrit-PatchSet: 1
Gerrit-Project: Impala
Gerrit-Branch: cdh5-2.2.0_5.4.x
Gerrit-Owner: Huaisi Xu <hx...@cloudera.com>

[Impala-CR](cdh5-2.2.0 5.4.x) CDH-41243: Parquet scanner regression on wide tables (part 2)

Posted by "Huaisi Xu (Code Review)" <ge...@cloudera.org>.
Huaisi Xu has submitted this change and it was merged.

Change subject: CDH-41243: Parquet scanner regression on wide tables (part 2)
......................................................................


CDH-41243: Parquet scanner regression on wide tables (part 2)

IMPALA-2473 introduced a check that prevents row batches from growing
beyond 8MB, but it has a corner case: when an empty row batch is
already larger than 8MB, the scanner returns the row batch immediately
after it materializes one row, essentially setting batch_size=1.

Revert "IMPALA-2473: reduce scanner memory usage"

This reverts commit cecb4cf4c5bfe4d21afc2f650880e5bdda14b024.

Change-Id: Id21c26771cd9f5239da4e07a6c59c5126b4d8a0b
Reviewed-on: http://gerrit.cloudera.org:8080/3417
Reviewed-by: Tim Armstrong <ta...@cloudera.com>
Tested-by: Huaisi Xu <hx...@cloudera.com>
---
M be/src/exec/data-source-scan-node.cc
M be/src/exec/hdfs-parquet-scanner.cc
M be/src/exec/hdfs-scanner.cc
M be/src/exec/hdfs-table-sink.cc
M be/src/exec/hdfs-table-sink.h
M be/src/runtime/row-batch.h
M testdata/workloads/functional-query/queries/QueryTest/scanners.test
7 files changed, 26 insertions(+), 64 deletions(-)

Approvals:
  Huaisi Xu: Verified
  Tim Armstrong: Looks good to me, approved




Gerrit-MessageType: merged
Gerrit-Change-Id: Id21c26771cd9f5239da4e07a6c59c5126b4d8a0b
Gerrit-PatchSet: 3
Gerrit-Project: Impala
Gerrit-Branch: cdh5-2.2.0_5.4.x
Gerrit-Owner: Huaisi Xu <hx...@cloudera.com>
Gerrit-Reviewer: Huaisi Xu <hx...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>

[Impala-CR](cdh5-2.2.0 5.4.x) CDH-41243: Parquet scanner regression on wide tables (part 2)

Posted by "Huaisi Xu (Code Review)" <ge...@cloudera.org>.
Huaisi Xu has posted comments on this change.

Change subject: CDH-41243: Parquet scanner regression on wide tables (part 2)
......................................................................


Patch Set 2: Verified+1


Gerrit-MessageType: comment
Gerrit-Change-Id: Id21c26771cd9f5239da4e07a6c59c5126b4d8a0b
Gerrit-PatchSet: 2
Gerrit-Project: Impala
Gerrit-Branch: cdh5-2.2.0_5.4.x
Gerrit-Owner: Huaisi Xu <hx...@cloudera.com>
Gerrit-Reviewer: Huaisi Xu <hx...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-HasComments: No

[Impala-CR](cdh5-2.2.0 5.4.x) CDH-41243: Parquet scanner regression on wide tables (part 2)

Posted by "Tim Armstrong (Code Review)" <ge...@cloudera.org>.
Tim Armstrong has posted comments on this change.

Change subject: CDH-41243: Parquet scanner regression on wide tables (part 2)
......................................................................


Patch Set 2: Code-Review+2


Gerrit-MessageType: comment
Gerrit-Change-Id: Id21c26771cd9f5239da4e07a6c59c5126b4d8a0b
Gerrit-PatchSet: 2
Gerrit-Project: Impala
Gerrit-Branch: cdh5-2.2.0_5.4.x
Gerrit-Owner: Huaisi Xu <hx...@cloudera.com>
Gerrit-Reviewer: Huaisi Xu <hx...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-HasComments: No

[Impala-CR](cdh5-2.2.0 5.4.x) CDH-41243: Parquet scanner regression on wide tables (part 2)

Posted by "Huaisi Xu (Code Review)" <ge...@cloudera.org>.
Huaisi Xu has posted comments on this change.

Change subject: CDH-41243: Parquet scanner regression on wide tables (part 2)
......................................................................


Patch Set 2:

Fixed many things for 5.4.x. This passed the core tests but failed exhaustive (flaky tests; 5.4.x exhaustive has been red since October last year). I will merge this to unblock the CI job, since (part 1) was already merged and broke the tests as expected. I will make sure it is green.


Gerrit-MessageType: comment
Gerrit-Change-Id: Id21c26771cd9f5239da4e07a6c59c5126b4d8a0b
Gerrit-PatchSet: 2
Gerrit-Project: Impala
Gerrit-Branch: cdh5-2.2.0_5.4.x
Gerrit-Owner: Huaisi Xu <hx...@cloudera.com>
Gerrit-Reviewer: Huaisi Xu <hx...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-HasComments: No

[Impala-CR](cdh5-2.2.0 5.4.x) CDH-41243: Parquet scanner regression on wide tables (part 2)

Posted by "Huaisi Xu (Code Review)" <ge...@cloudera.org>.
Huaisi Xu has posted comments on this change.

Change subject: CDH-41243: Parquet scanner regression on wide tables (part 2)
......................................................................


Patch Set 1:

It is a clean revert. The conflict is from part 1.


Gerrit-MessageType: comment
Gerrit-Change-Id: Id21c26771cd9f5239da4e07a6c59c5126b4d8a0b
Gerrit-PatchSet: 1
Gerrit-Project: Impala
Gerrit-Branch: cdh5-2.2.0_5.4.x
Gerrit-Owner: Huaisi Xu <hx...@cloudera.com>
Gerrit-Reviewer: Huaisi Xu <hx...@cloudera.com>
Gerrit-HasComments: No