Posted to reviews@impala.apache.org by "Lars Volker (Code Review)" <ge...@cloudera.org> on 2017/02/23 22:03:00 UTC

[Impala-ASF-CR] IMPALA-4982: Add parquet stats test

Lars Volker has uploaded a new change for review.

  http://gerrit.cloudera.org:8080/6130

Change subject: IMPALA-4982: Add parquet stats test
......................................................................

IMPALA-4982: Add parquet stats test

IMPALA-2328 added support for skipping row groups based on
parquet::Statistics. This change adds a test for root-level
scalar columns of parquet files with nested types.
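
For readers unfamiliar with the technique, here is a minimal sketch of
skipping row groups based on min/max statistics. It is illustrative
Python, not Impala's implementation: RowGroupStats, can_skip and
row_groups_to_read are hypothetical names, and the columns are assumed
to hold integers.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class RowGroupStats:
    """Hypothetical stand-in for one column's min/max stats in a row group."""
    min_value: Optional[int]
    max_value: Optional[int]


def can_skip(stats: RowGroupStats, op: str, literal: int) -> bool:
    """Return True if no row in the group can satisfy `col <op> literal`."""
    if stats.min_value is None or stats.max_value is None:
        return False  # No statistics available: the group must be read.
    if op == "<":
        return stats.min_value >= literal
    if op == "<=":
        return stats.min_value > literal
    if op == ">":
        return stats.max_value <= literal
    if op == ">=":
        return stats.max_value < literal
    if op == "=":
        return literal < stats.min_value or literal > stats.max_value
    return False  # Unknown operator: be conservative and read the group.


def row_groups_to_read(groups: List[RowGroupStats], op: str, literal: int) -> List[int]:
    """Return the indexes of the row groups that cannot be skipped."""
    return [i for i, g in enumerate(groups) if not can_skip(g, op, literal)]
```

A group is skipped only when its statistics prove that the predicate
cannot match; missing statistics always force a read.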

Change-Id: If81c8a1ecea937794885d4e5e7bf765bd238f5fb
---
M testdata/workloads/functional-query/queries/QueryTest/parquet_stats.test
1 file changed, 8 insertions(+), 0 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/30/6130/1
-- 
To view, visit http://gerrit.cloudera.org:8080/6130
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newchange
Gerrit-Change-Id: If81c8a1ecea937794885d4e5e7bf765bd238f5fb
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Lars Volker <lv...@cloudera.com>

[Impala-ASF-CR] IMPALA-4982: Add parquet stats test

Posted by "Lars Volker (Code Review)" <ge...@cloudera.org>.
Lars Volker has uploaded a new patch set (#3).

Change subject: IMPALA-4982: Add parquet stats test
......................................................................

IMPALA-4982: Add parquet stats test

IMPALA-2328 added support for skipping row groups based on
parquet::Statistics. This change adds a test for root-level
scalar columns of parquet files with nested types.

Change-Id: If81c8a1ecea937794885d4e5e7bf765bd238f5fb
---
M testdata/workloads/functional-query/queries/QueryTest/parquet_stats.test
1 file changed, 29 insertions(+), 0 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/30/6130/3
-- 

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: If81c8a1ecea937794885d4e5e7bf765bd238f5fb
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Lars Volker <lv...@cloudera.com>

[Impala-ASF-CR] IMPALA-4982: Add parquet stats test

Posted by "Lars Volker (Code Review)" <ge...@cloudera.org>.
Hello Impala Public Jenkins, Alex Behm,

I'd like you to reexamine a change.  Please visit

    http://gerrit.cloudera.org:8080/6130

to look at the new patch set (#6).

Change subject: IMPALA-4982: Add parquet stats test
......................................................................

IMPALA-4982: Add parquet stats test

IMPALA-2328 added support for skipping row groups based on
parquet::Statistics. This change adds a test for root-level
scalar columns of parquet files with nested types.

Change-Id: If81c8a1ecea937794885d4e5e7bf765bd238f5fb
---
A testdata/workloads/functional-query/queries/QueryTest/nested-types-parquet-stats.test
M tests/query_test/test_nested_types.py
2 files changed, 37 insertions(+), 0 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/30/6130/6
-- 

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: If81c8a1ecea937794885d4e5e7bf765bd238f5fb
Gerrit-PatchSet: 6
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Lars Volker <lv...@cloudera.com>
Gerrit-Reviewer: Alex Behm <al...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Lars Volker <lv...@cloudera.com>

[Impala-ASF-CR] IMPALA-4982: Add parquet stats test

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change.

Change subject: IMPALA-4982: Add parquet stats test
......................................................................


Patch Set 5: Verified-1

Build failed: http://jenkins.impala.io:8080/job/gerrit-verify-dryrun/298/

-- 

Gerrit-MessageType: comment
Gerrit-Change-Id: If81c8a1ecea937794885d4e5e7bf765bd238f5fb
Gerrit-PatchSet: 5
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Lars Volker <lv...@cloudera.com>
Gerrit-Reviewer: Alex Behm <al...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Lars Volker <lv...@cloudera.com>
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-4982: Add parquet stats test

Posted by "Lars Volker (Code Review)" <ge...@cloudera.org>.
Lars Volker has uploaded a new patch set (#2).

Change subject: IMPALA-4982: Add parquet stats test
......................................................................

IMPALA-4982: Add parquet stats test

IMPALA-2328 added support for skipping row groups based on
parquet::Statistics. This change adds a test for root-level
scalar columns of parquet files with nested types.

Change-Id: If81c8a1ecea937794885d4e5e7bf765bd238f5fb
---
M testdata/workloads/functional-query/queries/QueryTest/parquet_stats.test
1 file changed, 8 insertions(+), 0 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/30/6130/2
-- 

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: If81c8a1ecea937794885d4e5e7bf765bd238f5fb
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Lars Volker <lv...@cloudera.com>

[Impala-ASF-CR] IMPALA-4982: Add parquet stats test

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has submitted this change and it was merged.

Change subject: IMPALA-4982: Add parquet stats test
......................................................................


IMPALA-4982: Add parquet stats test

IMPALA-2328 added support for skipping row groups based on
parquet::Statistics. This change adds a test for root-level
scalar columns of parquet files with nested types.

Change-Id: If81c8a1ecea937794885d4e5e7bf765bd238f5fb
Reviewed-on: http://gerrit.cloudera.org:8080/6130
Reviewed-by: Lars Volker <lv...@cloudera.com>
Tested-by: Impala Public Jenkins
---
A testdata/workloads/functional-query/queries/QueryTest/nested-types-parquet-stats.test
M tests/query_test/test_nested_types.py
2 files changed, 37 insertions(+), 0 deletions(-)

Approvals:
  Impala Public Jenkins: Verified
  Lars Volker: Looks good to me, approved



-- 

Gerrit-MessageType: merged
Gerrit-Change-Id: If81c8a1ecea937794885d4e5e7bf765bd238f5fb
Gerrit-PatchSet: 7
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Lars Volker <lv...@cloudera.com>
Gerrit-Reviewer: Alex Behm <al...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Lars Volker <lv...@cloudera.com>

[Impala-ASF-CR] IMPALA-4982: Add parquet stats test

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change.

Change subject: IMPALA-4982: Add parquet stats test
......................................................................


Patch Set 6:

Build started: http://jenkins.impala.io:8080/job/gerrit-verify-dryrun/308/

-- 

Gerrit-MessageType: comment
Gerrit-Change-Id: If81c8a1ecea937794885d4e5e7bf765bd238f5fb
Gerrit-PatchSet: 6
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Lars Volker <lv...@cloudera.com>
Gerrit-Reviewer: Alex Behm <al...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Lars Volker <lv...@cloudera.com>
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-4982: Add parquet stats test

Posted by "Lars Volker (Code Review)" <ge...@cloudera.org>.
Lars Volker has uploaded a new patch set (#5).

Change subject: IMPALA-4982: Add parquet stats test
......................................................................

IMPALA-4982: Add parquet stats test

IMPALA-2328 added support for skipping row groups based on
parquet::Statistics. This change adds a test for root-level
scalar columns of parquet files with nested types.

Change-Id: If81c8a1ecea937794885d4e5e7bf765bd238f5fb
---
A testdata/workloads/functional-query/queries/QueryTest/nested-types-parquet-stats.test
M tests/query_test/test_nested_types.py
2 files changed, 34 insertions(+), 0 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/30/6130/5
-- 

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: If81c8a1ecea937794885d4e5e7bf765bd238f5fb
Gerrit-PatchSet: 5
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Lars Volker <lv...@cloudera.com>
Gerrit-Reviewer: Alex Behm <al...@cloudera.com>
Gerrit-Reviewer: Lars Volker <lv...@cloudera.com>

[Impala-ASF-CR] IMPALA-4982: Add parquet stats test

Posted by "Alex Behm (Code Review)" <ge...@cloudera.org>.
Alex Behm has posted comments on this change.

Change subject: IMPALA-4982: Add parquet stats test
......................................................................


Patch Set 5: Code-Review+2

-- 

Gerrit-MessageType: comment
Gerrit-Change-Id: If81c8a1ecea937794885d4e5e7bf765bd238f5fb
Gerrit-PatchSet: 5
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Lars Volker <lv...@cloudera.com>
Gerrit-Reviewer: Alex Behm <al...@cloudera.com>
Gerrit-Reviewer: Lars Volker <lv...@cloudera.com>
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-4982: Add parquet stats test

Posted by "Lars Volker (Code Review)" <ge...@cloudera.org>.
Lars Volker has uploaded a new patch set (#4).

Change subject: IMPALA-4982: Add parquet stats test
......................................................................

IMPALA-4982: Add parquet stats test

IMPALA-2328 added support for skipping row groups based on
parquet::Statistics. This change adds a test for root-level
scalar columns of parquet files with nested types.

Change-Id: If81c8a1ecea937794885d4e5e7bf765bd238f5fb
---
A testdata/workloads/functional-query/queries/QueryTest/nested-types-scanner-stats.test
M testdata/workloads/functional-query/queries/QueryTest/parquet_stats.test
M tests/query_test/test_nested_types.py
3 files changed, 63 insertions(+), 0 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/30/6130/4
-- 

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: If81c8a1ecea937794885d4e5e7bf765bd238f5fb
Gerrit-PatchSet: 4
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Lars Volker <lv...@cloudera.com>
Gerrit-Reviewer: Alex Behm <al...@cloudera.com>

[Impala-ASF-CR] IMPALA-4982: Add parquet stats test

Posted by "Alex Behm (Code Review)" <ge...@cloudera.org>.
Alex Behm has posted comments on this change.

Change subject: IMPALA-4982: Add parquet stats test
......................................................................


Patch Set 3:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/6130/3/testdata/workloads/functional-query/queries/QueryTest/parquet_stats.test
File testdata/workloads/functional-query/queries/QueryTest/parquet_stats.test:

Line 232: ---- QUERY
Move these into a separate .test file or find another suitable test/file (e.g. test_nested_types.py). I'm requesting this change because these tests will fail when running with the legacy joins/aggs where referencing nested types is not supported (will break a build).


Line 252: # Nested columns do not support stats based filtering.
Do we have a JIRA for this yet? We can definitely fix this.


-- 

Gerrit-MessageType: comment
Gerrit-Change-Id: If81c8a1ecea937794885d4e5e7bf765bd238f5fb
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Lars Volker <lv...@cloudera.com>
Gerrit-Reviewer: Alex Behm <al...@cloudera.com>
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-4982: Add parquet stats test

Posted by "Lars Volker (Code Review)" <ge...@cloudera.org>.
Lars Volker has posted comments on this change.

Change subject: IMPALA-4982: Add parquet stats test
......................................................................


Patch Set 5:

(4 comments)

Thank you for the review. Please see PS5.

http://gerrit.cloudera.org:8080/#/c/6130/4/testdata/workloads/functional-query/queries/QueryTest/nested-types-scanner-stats.test
File testdata/workloads/functional-query/queries/QueryTest/nested-types-scanner-stats.test:

Line 3
> call the file nested-types-parquet-stats.test to be consistent with your ot
Done


http://gerrit.cloudera.org:8080/#/c/6130/4/testdata/workloads/functional-query/queries/QueryTest/parquet_stats.test
File testdata/workloads/functional-query/queries/QueryTest/parquet_stats.test:

Line 233
> remove tests from here
:( Sorry for missing this.


http://gerrit.cloudera.org:8080/#/c/6130/4/tests/query_test/test_nested_types.py
File tests/query_test/test_nested_types.py:

Line 78:   def test_parquet_stats(self, vector):
> test_parquet_stats:
Done


Line 79:     """Queries that test evaluation of Parquet row group statistics."""
> Parquet row group statistics
Done


-- 

Gerrit-MessageType: comment
Gerrit-Change-Id: If81c8a1ecea937794885d4e5e7bf765bd238f5fb
Gerrit-PatchSet: 5
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Lars Volker <lv...@cloudera.com>
Gerrit-Reviewer: Alex Behm <al...@cloudera.com>
Gerrit-Reviewer: Lars Volker <lv...@cloudera.com>
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-4982: Add parquet stats test

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change.

Change subject: IMPALA-4982: Add parquet stats test
......................................................................


Patch Set 5:

Build started: http://jenkins.impala.io:8080/job/gerrit-verify-dryrun/298/

-- 

Gerrit-MessageType: comment
Gerrit-Change-Id: If81c8a1ecea937794885d4e5e7bf765bd238f5fb
Gerrit-PatchSet: 5
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Lars Volker <lv...@cloudera.com>
Gerrit-Reviewer: Alex Behm <al...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Lars Volker <lv...@cloudera.com>
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-4982: Add parquet stats test

Posted by "Lars Volker (Code Review)" <ge...@cloudera.org>.
Lars Volker has posted comments on this change.

Change subject: IMPALA-4982: Add parquet stats test
......................................................................


Patch Set 3:

(2 comments)

Thank you for the review! Please see my comments and PS4.

http://gerrit.cloudera.org:8080/#/c/6130/3/testdata/workloads/functional-query/queries/QueryTest/parquet_stats.test
File testdata/workloads/functional-query/queries/QueryTest/parquet_stats.test:

Line 232: ---- QUERY
> Move these into a separate .test file or find another suitable test/file (e
Thank you for catching this. I moved the tests into a new file, anticipating we will add more tests there as we add support for filtering nested data.


Line 252: # Nested columns do not support stats based filtering.
> Do we have a JIRA for this yet? We can definitely fix this.
I opened IMPALA-4985 to track this.


-- 

Gerrit-MessageType: comment
Gerrit-Change-Id: If81c8a1ecea937794885d4e5e7bf765bd238f5fb
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Lars Volker <lv...@cloudera.com>
Gerrit-Reviewer: Alex Behm <al...@cloudera.com>
Gerrit-Reviewer: Lars Volker <lv...@cloudera.com>
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-4982: Add parquet stats test

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change.

Change subject: IMPALA-4982: Add parquet stats test
......................................................................


Patch Set 6: Verified+1

-- 

Gerrit-MessageType: comment
Gerrit-Change-Id: If81c8a1ecea937794885d4e5e7bf765bd238f5fb
Gerrit-PatchSet: 6
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Lars Volker <lv...@cloudera.com>
Gerrit-Reviewer: Alex Behm <al...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Lars Volker <lv...@cloudera.com>
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-4982: Add parquet stats test

Posted by "Alex Behm (Code Review)" <ge...@cloudera.org>.
Alex Behm has posted comments on this change.

Change subject: IMPALA-4982: Add parquet stats test
......................................................................


Patch Set 4:

(4 comments)

http://gerrit.cloudera.org:8080/#/c/6130/4/testdata/workloads/functional-query/queries/QueryTest/nested-types-scanner-stats.test
File testdata/workloads/functional-query/queries/QueryTest/nested-types-scanner-stats.test:

Line 3: # Filter root-level scalar column in file with nested types.
call the file nested-types-parquet-stats.test to be consistent with your other test file


http://gerrit.cloudera.org:8080/#/c/6130/4/testdata/workloads/functional-query/queries/QueryTest/parquet_stats.test
File testdata/workloads/functional-query/queries/QueryTest/parquet_stats.test:

Line 233: # Filter root-level scalar column in file with nested types.
remove tests from here


http://gerrit.cloudera.org:8080/#/c/6130/4/tests/query_test/test_nested_types.py
File tests/query_test/test_nested_types.py:

Line 78:   def test_scanner_stats(self, vector):
test_parquet_stats:


Line 79:     """Queries that test evaluation of row group statistics."""
Parquet row group statistics


-- 

Gerrit-MessageType: comment
Gerrit-Change-Id: If81c8a1ecea937794885d4e5e7bf765bd238f5fb
Gerrit-PatchSet: 4
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Lars Volker <lv...@cloudera.com>
Gerrit-Reviewer: Alex Behm <al...@cloudera.com>
Gerrit-Reviewer: Lars Volker <lv...@cloudera.com>
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-4982: Add parquet stats test

Posted by "Lars Volker (Code Review)" <ge...@cloudera.org>.
Lars Volker has posted comments on this change.

Change subject: IMPALA-4982: Add parquet stats test
......................................................................


Patch Set 6: Code-Review+2

Added num_nodes=1 to the exec_option of the test. Carrying Alex's +2.
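
A rough sketch of what setting that option could look like. FakeVector
and the run_test_case callback are hypothetical stand-ins, not the
Impala test framework. Only the num_nodes exec option and the .test
file name are taken from this thread.

```python
class FakeVector:
    """Hypothetical stand-in for the test parameter vector."""

    def __init__(self, **values):
        self._values = values

    def get_value(self, name):
        return self._values[name]


def run_parquet_stats_test(vector, run_test_case):
    """Set num_nodes=1 in the exec options, then run the .test file."""
    # Mutates the exec_option dict held by the vector in place.
    vector.get_value('exec_option')['num_nodes'] = 1
    return run_test_case('QueryTest/nested-types-parquet-stats', vector)
```

The option is set on the vector before the test case runs, so every
query in the .test file sees it.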

-- 

Gerrit-MessageType: comment
Gerrit-Change-Id: If81c8a1ecea937794885d4e5e7bf765bd238f5fb
Gerrit-PatchSet: 6
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Lars Volker <lv...@cloudera.com>
Gerrit-Reviewer: Alex Behm <al...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Lars Volker <lv...@cloudera.com>
Gerrit-HasComments: No