Posted to reviews@impala.apache.org by "Alex Behm (Code Review)" <ge...@cloudera.org> on 2018/02/04 18:34:23 UTC
[Impala-ASF-CR] IMPALA-5037: Default PARQUET_ARRAY_RESOLUTION=THREE_LEVEL
Alex Behm has uploaded this change for review. ( http://gerrit.cloudera.org:8080/9210 )
Change subject: IMPALA-5037: Default PARQUET_ARRAY_RESOLUTION=THREE_LEVEL
......................................................................
IMPALA-5037: Default PARQUET_ARRAY_RESOLUTION=THREE_LEVEL
Changes the default value for the PARQUET_ARRAY_RESOLUTION
query option to conform to the Parquet standard.
Before: TWO_LEVEL_THEN_THREE_LEVEL
After: THREE_LEVEL
For more information see:
* IMPALA-4725
* https://github.com/apache/parquet-format/blob/master/LogicalTypes.md
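For readers unfamiliar with the two encodings, here is a toy Python sketch of how the policies differ. It is not Impala's resolver: the Node tuple, the function names, and the simple type-matching rule are assumptions made for illustration. The schema shapes follow LogicalTypes.md, and TWO_LEVEL is a third accepted value that this commit message does not mention.

```python
from collections import namedtuple

# Toy schema node (hypothetical): children is a tuple of Nodes, empty for leaves.
Node = namedtuple("Node", ["name", "type", "children"])


def candidates(list_node, policy):
    """Return element candidates in the order the given policy tries them."""
    repeated = list_node.children[0]
    # Legacy two-level encoding: the repeated node itself is the element.
    two_level = repeated
    # Standard three-level encoding: the element is the single child of the
    # repeated group.
    three_level = repeated.children[0] if len(repeated.children) == 1 else None
    order = {
        "THREE_LEVEL": [three_level],
        "TWO_LEVEL": [two_level],
        "TWO_LEVEL_THEN_THREE_LEVEL": [two_level, three_level],
    }[policy]
    return [node for node in order if node is not None]


def resolve_element(list_node, policy, expected_type):
    """Pick the first candidate whose type matches what the table expects."""
    for node in candidates(list_node, policy):
        if node.type == expected_type:
            return node
    return None


# Standard: my_list (LIST) -> repeated group list -> int32 element
standard = Node("my_list", "group",
                (Node("list", "group", (Node("element", "int32", ()),)),))
# Legacy: my_list (LIST) -> repeated int32 element
legacy = Node("my_list", "group", (Node("element", "int32", ()),))

for policy in ("THREE_LEVEL", "TWO_LEVEL", "TWO_LEVEL_THEN_THREE_LEVEL"):
    print(policy,
          resolve_element(standard, policy, "int32") is not None,
          resolve_element(legacy, policy, "int32") is not None)
```

In this toy model, THREE_LEVEL resolves only the standard layout, TWO_LEVEL only the legacy one, and TWO_LEVEL_THEN_THREE_LEVEL resolves both by trying the legacy interpretation first.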
Testing:
- expands and cleans up the existing tests for more coverage
over the different resolution policies
- private core/hdfs run passed
Cherry-picks: not for 2.x.
Change-Id: Ib8f7e9010c4354d667305d9df7b78862efb23fe1
---
M common/thrift/ImpalaInternalService.thrift
M common/thrift/ImpalaService.thrift
M tests/query_test/test_nested_types.py
3 files changed, 138 insertions(+), 183 deletions(-)
git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/10/9210/1
--
To view, visit http://gerrit.cloudera.org:8080/9210
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: Ib8f7e9010c4354d667305d9df7b78862efb23fe1
Gerrit-Change-Number: 9210
Gerrit-PatchSet: 1
Gerrit-Owner: Alex Behm <al...@cloudera.com>
[Impala-ASF-CR] IMPALA-5037: Default PARQUET_ARRAY_RESOLUTION=THREE_LEVEL
Posted by "Alex Behm (Code Review)" <ge...@cloudera.org>.
Hello Tim Armstrong,
I'd like you to reexamine a change. Please visit
http://gerrit.cloudera.org:8080/9210
to look at the new patch set (#3).
Change subject: IMPALA-5037: Default PARQUET_ARRAY_RESOLUTION=THREE_LEVEL
......................................................................
IMPALA-5037: Default PARQUET_ARRAY_RESOLUTION=THREE_LEVEL
Changes the default value for the PARQUET_ARRAY_RESOLUTION
query option to conform to the Parquet standard.
Before: TWO_LEVEL_THEN_THREE_LEVEL
After: THREE_LEVEL
For more information see:
* IMPALA-4725
* https://github.com/apache/parquet-format/blob/master/LogicalTypes.md
Testing:
- expands and cleans up the existing tests for more coverage
over the different resolution policies
- private core/hdfs run passed
Cherry-picks: not for 2.x.
Change-Id: Ib8f7e9010c4354d667305d9df7b78862efb23fe1
---
M common/thrift/ImpalaInternalService.thrift
M common/thrift/ImpalaService.thrift
M tests/query_test/test_nested_types.py
3 files changed, 153 insertions(+), 193 deletions(-)
git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/10/9210/3
--
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ib8f7e9010c4354d667305d9df7b78862efb23fe1
Gerrit-Change-Number: 9210
Gerrit-PatchSet: 3
Gerrit-Owner: Alex Behm <al...@cloudera.com>
Gerrit-Reviewer: Alex Behm <al...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
[Impala-ASF-CR] IMPALA-5037: Default PARQUET_ARRAY_RESOLUTION=THREE_LEVEL
Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/9210 )
Change subject: IMPALA-5037: Default PARQUET_ARRAY_RESOLUTION=THREE_LEVEL
......................................................................
Patch Set 3: Verified+1
--
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ib8f7e9010c4354d667305d9df7b78862efb23fe1
Gerrit-Change-Number: 9210
Gerrit-PatchSet: 3
Gerrit-Owner: Alex Behm <al...@cloudera.com>
Gerrit-Reviewer: Alex Behm <al...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-Comment-Date: Tue, 06 Feb 2018 22:58:24 +0000
Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-5037: Default PARQUET_ARRAY_RESOLUTION=THREE_LEVEL
Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/9210 )
Change subject: IMPALA-5037: Default PARQUET_ARRAY_RESOLUTION=THREE_LEVEL
......................................................................
Patch Set 3:
Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/1889/
--
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ib8f7e9010c4354d667305d9df7b78862efb23fe1
Gerrit-Change-Number: 9210
Gerrit-PatchSet: 3
Gerrit-Owner: Alex Behm <al...@cloudera.com>
Gerrit-Reviewer: Alex Behm <al...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-Comment-Date: Tue, 06 Feb 2018 19:18:00 +0000
Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-5037: Default PARQUET_ARRAY_RESOLUTION=THREE_LEVEL
Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/9210 )
Change subject: IMPALA-5037: Default PARQUET_ARRAY_RESOLUTION=THREE_LEVEL
......................................................................
IMPALA-5037: Default PARQUET_ARRAY_RESOLUTION=THREE_LEVEL
Changes the default value for the PARQUET_ARRAY_RESOLUTION
query option to conform to the Parquet standard.
Before: TWO_LEVEL_THEN_THREE_LEVEL
After: THREE_LEVEL
For more information see:
* IMPALA-4725
* https://github.com/apache/parquet-format/blob/master/LogicalTypes.md
Testing:
- expands and cleans up the existing tests for more coverage
over the different resolution policies
- private core/hdfs run passed
Cherry-picks: not for 2.x.
Change-Id: Ib8f7e9010c4354d667305d9df7b78862efb23fe1
Reviewed-on: http://gerrit.cloudera.org:8080/9210
Reviewed-by: Alex Behm <al...@cloudera.com>
Tested-by: Impala Public Jenkins
---
M common/thrift/ImpalaInternalService.thrift
M common/thrift/ImpalaService.thrift
M tests/query_test/test_nested_types.py
3 files changed, 153 insertions(+), 193 deletions(-)
Approvals:
Alex Behm: Looks good to me, approved
Impala Public Jenkins: Verified
--
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: Ib8f7e9010c4354d667305d9df7b78862efb23fe1
Gerrit-Change-Number: 9210
Gerrit-PatchSet: 4
Gerrit-Owner: Alex Behm <al...@cloudera.com>
Gerrit-Reviewer: Alex Behm <al...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
[Impala-ASF-CR] IMPALA-5037: Default PARQUET_ARRAY_RESOLUTION=THREE_LEVEL
Posted by "Alex Behm (Code Review)" <ge...@cloudera.org>.
Alex Behm has posted comments on this change. ( http://gerrit.cloudera.org:8080/9210 )
Change subject: IMPALA-5037: Default PARQUET_ARRAY_RESOLUTION=THREE_LEVEL
......................................................................
Patch Set 1:
(2 comments)
http://gerrit.cloudera.org:8080/#/c/9210/1/tests/query_test/test_nested_types.py
File tests/query_test/test_nested_types.py:
http://gerrit.cloudera.org:8080/#/c/9210/1/tests/query_test/test_nested_types.py@134
PS1, Line 134: ARRAY_RESOUTION_POLICIES
> Typo: RESOLUTION
Done
http://gerrit.cloudera.org:8080/#/c/9210/1/tests/query_test/test_nested_types.py@208
PS1, Line 208: self.client.execute("set parquet_array_resolution=%s" % arr_res)
> Will this setting stick around when the client is recycled? Should we be re
Good point. Switched to self.execute_query(), which internally uses self.client.set_configuration().
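The concern behind this exchange can be illustrated with a toy client. This is a minimal sketch, not Impala's test framework: ToyClient, its single options dict, and the assumption that set_configuration() replaces all previously set options are hypothetical stand-ins for the real classes.

```python
class ToyClient:
    """Hypothetical pooled client; only option handling is modeled."""

    def __init__(self):
        self.options = {}

    def execute(self, stmt):
        # A "set k=v" statement mutates session state and stays in effect.
        if stmt.lower().startswith("set "):
            key, value = stmt[4:].split("=", 1)
            self.options[key.strip().lower()] = value.strip()
            return None
        # Any other statement reports the options it "ran" with.
        return dict(self.options)

    def set_configuration(self, options):
        # Assumption: replaces the whole configuration, dropping stale values.
        self.options = dict(options)


def execute_query(client, query, query_options=None):
    """Apply the options for this query only, then run it."""
    client.set_configuration(query_options or {})
    return client.execute(query)


client = ToyClient()
client.execute("set parquet_array_resolution=TWO_LEVEL")
leaked = client.execute("select 1")        # still sees the earlier SET
clean = execute_query(client, "select 1")  # configuration was replaced
scoped = execute_query(client, "select 1",
                       {"parquet_array_resolution": "THREE_LEVEL"})
print(leaked, clean, scoped)
```

Under these assumptions, a value set through a raw statement is visible to whichever test reuses the client next, while options passed with each query are not.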
--
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ib8f7e9010c4354d667305d9df7b78862efb23fe1
Gerrit-Change-Number: 9210
Gerrit-PatchSet: 1
Gerrit-Owner: Alex Behm <al...@cloudera.com>
Gerrit-Reviewer: Alex Behm <al...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-Comment-Date: Tue, 06 Feb 2018 02:18:46 +0000
Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-5037: Default PARQUET_ARRAY_RESOLUTION=THREE_LEVEL
Posted by "Alex Behm (Code Review)" <ge...@cloudera.org>.
Alex Behm has posted comments on this change. ( http://gerrit.cloudera.org:8080/9210 )
Change subject: IMPALA-5037: Default PARQUET_ARRAY_RESOLUTION=THREE_LEVEL
......................................................................
Patch Set 3: Code-Review+2
--
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ib8f7e9010c4354d667305d9df7b78862efb23fe1
Gerrit-Change-Number: 9210
Gerrit-PatchSet: 3
Gerrit-Owner: Alex Behm <al...@cloudera.com>
Gerrit-Reviewer: Alex Behm <al...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-Comment-Date: Tue, 06 Feb 2018 19:17:22 +0000
Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-5037: Default PARQUET_ARRAY_RESOLUTION=THREE_LEVEL
Posted by "Tim Armstrong (Code Review)" <ge...@cloudera.org>.
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/9210 )
Change subject: IMPALA-5037: Default PARQUET_ARRAY_RESOLUTION=THREE_LEVEL
......................................................................
Patch Set 1:
(2 comments)
http://gerrit.cloudera.org:8080/#/c/9210/1/tests/query_test/test_nested_types.py
File tests/query_test/test_nested_types.py:
http://gerrit.cloudera.org:8080/#/c/9210/1/tests/query_test/test_nested_types.py@134
PS1, Line 134: ARRAY_RESOUTION_POLICIES
Typo: RESOLUTION
http://gerrit.cloudera.org:8080/#/c/9210/1/tests/query_test/test_nested_types.py@208
PS1, Line 208: self.client.execute("set parquet_array_resolution=%s" % arr_res)
Will this setting stick around when the client is recycled? Should we be resetting it?
Should we also be using client.set_configuration() instead of executing the server-side statement?
--
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ib8f7e9010c4354d667305d9df7b78862efb23fe1
Gerrit-Change-Number: 9210
Gerrit-PatchSet: 1
Gerrit-Owner: Alex Behm <al...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-Comment-Date: Tue, 06 Feb 2018 01:25:24 +0000
Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-5037: Default PARQUET_ARRAY_RESOLUTION=THREE_LEVEL
Posted by "Tim Armstrong (Code Review)" <ge...@cloudera.org>.
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/9210 )
Change subject: IMPALA-5037: Default PARQUET_ARRAY_RESOLUTION=THREE_LEVEL
......................................................................
Patch Set 2: Code-Review+2
(1 comment)
http://gerrit.cloudera.org:8080/#/c/9210/2/tests/query_test/test_nested_types.py
File tests/query_test/test_nested_types.py:
http://gerrit.cloudera.org:8080/#/c/9210/2/tests/query_test/test_nested_types.py@355
PS2, Line 355: arr_res = vector.get_value('parquet_array_resolution')
Seems ok, but we could probably factor these three lines out into a function.
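A sketch of what such a helper might look like. The three lines are not quoted in this thread beyond the vector.get_value() call, so the helper name, its return shape, and ToyVector are all hypothetical.

```python
class ToyVector:
    """Hypothetical stand-in for the test vector; only get_value() is modeled."""

    def __init__(self, values):
        self._values = dict(values)

    def get_value(self, name):
        return self._values[name]


def array_resolution_options(vector):
    """Build per-query options from the vector's resolution policy (assumed shape)."""
    arr_res = vector.get_value('parquet_array_resolution')
    return {'parquet_array_resolution': arr_res}


vector = ToyVector({'parquet_array_resolution': 'THREE_LEVEL'})
print(array_resolution_options(vector))
```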
--
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ib8f7e9010c4354d667305d9df7b78862efb23fe1
Gerrit-Change-Number: 9210
Gerrit-PatchSet: 2
Gerrit-Owner: Alex Behm <al...@cloudera.com>
Gerrit-Reviewer: Alex Behm <al...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-Comment-Date: Tue, 06 Feb 2018 16:37:11 +0000
Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-5037: Default PARQUET_ARRAY_RESOLUTION=THREE_LEVEL
Posted by "Alex Behm (Code Review)" <ge...@cloudera.org>.
Alex Behm has posted comments on this change. ( http://gerrit.cloudera.org:8080/9210 )
Change subject: IMPALA-5037: Default PARQUET_ARRAY_RESOLUTION=THREE_LEVEL
......................................................................
Patch Set 2:
(1 comment)
http://gerrit.cloudera.org:8080/#/c/9210/2/tests/query_test/test_nested_types.py
File tests/query_test/test_nested_types.py:
http://gerrit.cloudera.org:8080/#/c/9210/2/tests/query_test/test_nested_types.py@355
PS2, Line 355: arr_res = vector.get_value('parquet_array_resolution')
> Seems ok, but we could probably factor these three lines out into a functio
Done
--
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ib8f7e9010c4354d667305d9df7b78862efb23fe1
Gerrit-Change-Number: 9210
Gerrit-PatchSet: 2
Gerrit-Owner: Alex Behm <al...@cloudera.com>
Gerrit-Reviewer: Alex Behm <al...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-Comment-Date: Tue, 06 Feb 2018 19:16:44 +0000
Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-5037: Default PARQUET_ARRAY_RESOLUTION=THREE_LEVEL
Posted by "Alex Behm (Code Review)" <ge...@cloudera.org>.
Hello Tim Armstrong,
I'd like you to reexamine a change. Please visit
http://gerrit.cloudera.org:8080/9210
to look at the new patch set (#2).
Change subject: IMPALA-5037: Default PARQUET_ARRAY_RESOLUTION=THREE_LEVEL
......................................................................
IMPALA-5037: Default PARQUET_ARRAY_RESOLUTION=THREE_LEVEL
Changes the default value for the PARQUET_ARRAY_RESOLUTION
query option to conform to the Parquet standard.
Before: TWO_LEVEL_THEN_THREE_LEVEL
After: THREE_LEVEL
For more information see:
* IMPALA-4725
* https://github.com/apache/parquet-format/blob/master/LogicalTypes.md
Testing:
- expands and cleans up the existing tests for more coverage
over the different resolution policies
- private core/hdfs run passed
Cherry-picks: not for 2.x.
Change-Id: Ib8f7e9010c4354d667305d9df7b78862efb23fe1
---
M common/thrift/ImpalaInternalService.thrift
M common/thrift/ImpalaService.thrift
M tests/query_test/test_nested_types.py
3 files changed, 155 insertions(+), 193 deletions(-)
git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/10/9210/2
--
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ib8f7e9010c4354d667305d9df7b78862efb23fe1
Gerrit-Change-Number: 9210
Gerrit-PatchSet: 2
Gerrit-Owner: Alex Behm <al...@cloudera.com>
Gerrit-Reviewer: Alex Behm <al...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>