You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@impala.apache.org by "Csaba Ringhofer (Code Review)" <ge...@cloudera.org> on 2018/11/28 20:14:42 UTC

[Impala-ASF-CR] IMPALA-7889: Write new logical types in Parquet

Csaba Ringhofer has uploaded this change for review. ( http://gerrit.cloudera.org:8080/12004


Change subject: IMPALA-7889: Write new logical types in Parquet
......................................................................

IMPALA-7889: Write new logical types in Parquet

Fill the LogicalType field in Parquet schemas for columns
that have an associated logical type. ConvertedType still
has to be filled to remain compatible with older readers.

Testing:
- added new tests to check both logical and converted types
  to test_insert_parquet.py

Change-Id: I6f377950845683ab9c6dea79f4c54db0359d0b91
---
M be/src/exec/parquet/hdfs-parquet-table-writer.cc
M be/src/exec/parquet/parquet-metadata-utils.cc
M be/src/exec/parquet/parquet-metadata-utils.h
M tests/query_test/test_insert_parquet.py
M tests/util/get_parquet_metadata.py
5 files changed, 161 insertions(+), 30 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/04/12004/2
-- 
To view, visit http://gerrit.cloudera.org:8080/12004
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I6f377950845683ab9c6dea79f4c54db0359d0b91
Gerrit-Change-Number: 12004
Gerrit-PatchSet: 2
Gerrit-Owner: Csaba Ringhofer <cs...@cloudera.com>

[Impala-ASF-CR] IMPALA-7889: Write new logical types in Parquet

Posted by "Csaba Ringhofer (Code Review)" <ge...@cloudera.org>.
Csaba Ringhofer has posted comments on this change. ( http://gerrit.cloudera.org:8080/12004 )

Change subject: IMPALA-7889: Write new logical types in Parquet
......................................................................


Patch Set 6:

(14 comments)

http://gerrit.cloudera.org:8080/#/c/12004/5/be/src/exec/parquet/hdfs-parquet-table-writer.cc
File be/src/exec/parquet/hdfs-parquet-table-writer.cc:

http://gerrit.cloudera.org:8080/#/c/12004/5/be/src/exec/parquet/hdfs-parquet-table-writer.cc@973
PS5, Line 973:     parquet::SchemaElement& col_schema = file_metadata_.schema[i + 1];
> Maybe choose a better name here?
I choose 'col_schema' -  I can look for a new name if others do not like it. 'schema_element' would be more exact for someone who know Parquet well, but for other people 'col_schema' is more informative in my opinion (+ it is shorter).


http://gerrit.cloudera.org:8080/#/c/12004/5/be/src/exec/parquet/parquet-metadata-utils.h
File be/src/exec/parquet/parquet-metadata-utils.h:

http://gerrit.cloudera.org:8080/#/c/12004/5/be/src/exec/parquet/parquet-metadata-utils.h@62
PS5, Line 62:   static void FillSchemaElement(const ColumnType& col_type,
> We usually put input parameters first, then output, i.e. move node to the e
Done


http://gerrit.cloudera.org:8080/#/c/12004/5/be/src/exec/parquet/parquet-metadata-utils.cc
File be/src/exec/parquet/parquet-metadata-utils.cc:

http://gerrit.cloudera.org:8080/#/c/12004/5/be/src/exec/parquet/parquet-metadata-utils.cc@153
PS5, Line 153:   return Status::OK();;
> This can be in the anonymous namespace, too
Done


http://gerrit.cloudera.org:8080/#/c/12004/5/be/src/exec/parquet/parquet-metadata-utils.cc@283
PS5, Line 283:     ErrorMsg msg(TErrorCode::PARQUET_INCOMPATIBLE_DECIMAL, filename, schema_element.name,
> Move to the anonymous namespace. Pass output parameters by pointer and move
Done


http://gerrit.cloudera.org:8080/#/c/12004/5/be/src/exec/parquet/parquet-metadata-utils.cc@292
PS5, Line 292: th' 
> Can you think of a better name for the schema element than "node"?
Done


http://gerrit.cloudera.org:8080/#/c/12004/5/be/src/exec/parquet/parquet-metadata-utils.cc@297
PS5, Line 297:   parquet::LogicalType logical_type;
> Should this be a case statement instead?
I think that the special handling needed for strings causes switch/case to loose its benefits. I can do it in the next change if you still prefer it.


http://gerrit.cloudera.org:8080/#/c/12004/5/be/src/exec/parquet/parquet-metadata-utils.cc@310
PS5, Line 310:     col_schema->__set_type_length(ParquetPlainEncoder::DecimalSize(col_type));
> Move else to previous line.
Done


http://gerrit.cloudera.org:8080/#/c/12004/5/be/src/exec/parquet/parquet-metadata-utils.cc@311
PS5, Line 311:     col_schema->__set_scale(col_type.scale);
> Please add a brief comment that VARCHAR and CHAR are always UTF8 annotated,
Done


http://gerrit.cloudera.org:8080/#/c/12004/5/tests/query_test/test_insert_parquet.py
File tests/query_test/test_insert_parquet.py:

http://gerrit.cloudera.org:8080/#/c/12004/5/tests/query_test/test_insert_parquet.py@362
PS5, Line 362: 
> Add function comment (here and for all other new functions
Done


http://gerrit.cloudera.org:8080/#/c/12004/5/tests/query_test/test_insert_parquet.py@376
PS5, Line 376:   @staticmethod
> Maybe you can clean up some of the other uses of os.walk() in this file?
Done - test_def_level_encoding uses os.walk a bit differently (call be/util/parquet-reader instead of python functions), so I did not change it.


http://gerrit.cloudera.org:8080/#/c/12004/5/tests/query_test/test_insert_parquet.py@394
PS5, Line 394:         assert val is None
> Can you replace the values with the names from parquet.thrift?
Done


http://gerrit.cloudera.org:8080/#/c/12004/5/tests/query_test/test_insert_parquet.py@397
PS5, Line 397:   def _check_no_logical_type(self, schemas, column_name):
> The other types have to be None, right? If so, maybe create a helper _check
Done


http://gerrit.cloudera.org:8080/#/c/12004/5/tests/query_test/test_insert_parquet.py@432
PS5, Line 432: lf
> nit: once
Done


http://gerrit.cloudera.org:8080/#/c/12004/5/tests/util/get_parquet_metadata.py
File tests/util/get_parquet_metadata.py:

http://gerrit.cloudera.org:8080/#/c/12004/5/tests/util/get_parquet_metadata.py@181
PS5, Line 181:     for f in files:
> Add (as some uses of this in test_insert_parquet.py do):
Done



-- 
To view, visit http://gerrit.cloudera.org:8080/12004
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I6f377950845683ab9c6dea79f4c54db0359d0b91
Gerrit-Change-Number: 12004
Gerrit-PatchSet: 6
Gerrit-Owner: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Lars Volker <lv...@cloudera.com>
Gerrit-Comment-Date: Wed, 05 Dec 2018 15:26:20 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-7889: Write new logical types in Parquet

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/12004 )

Change subject: IMPALA-7889: Write new logical types in Parquet
......................................................................

IMPALA-7889: Write new logical types in Parquet

Fill the LogicalType field in Parquet schemas for columns
that have an associated logical type. ConvertedType still
has to be filled to remain compatible with older readers.

Testing:
- added new tests to check both logical and converted types
  to test_insert_parquet.py

Change-Id: I6f377950845683ab9c6dea79f4c54db0359d0b91
Reviewed-on: http://gerrit.cloudera.org:8080/12004
Reviewed-by: Impala Public Jenkins <im...@cloudera.com>
Tested-by: Impala Public Jenkins <im...@cloudera.com>
---
M be/src/exec/parquet/hdfs-parquet-table-writer.cc
M be/src/exec/parquet/parquet-metadata-utils.cc
M be/src/exec/parquet/parquet-metadata-utils.h
M testdata/workloads/functional-query/queries/QueryTest/stats-extrapolation.test
M tests/query_test/test_insert_parquet.py
M tests/util/get_parquet_metadata.py
6 files changed, 272 insertions(+), 83 deletions(-)

Approvals:
  Impala Public Jenkins: Looks good to me, approved; Verified

-- 
To view, visit http://gerrit.cloudera.org:8080/12004
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: I6f377950845683ab9c6dea79f4c54db0359d0b91
Gerrit-Change-Number: 12004
Gerrit-PatchSet: 14
Gerrit-Owner: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Anonymous Coward (359)
Gerrit-Reviewer: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Lars Volker <lv...@cloudera.com>
Gerrit-Reviewer: Zoltan Ivanfi <zi...@cloudera.com>

[Impala-ASF-CR] IMPALA-7889: Write new logical types in Parquet

Posted by "Csaba Ringhofer (Code Review)" <ge...@cloudera.org>.
Hello Lars Volker, Anonymous Coward (359), Zoltan Ivanfi, Impala Public Jenkins, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/12004

to look at the new patch set (#10).

Change subject: IMPALA-7889: Write new logical types in Parquet
......................................................................

IMPALA-7889: Write new logical types in Parquet

Fill the LogicalType field in Parquet schemas for columns
that have an associated logical type. ConvertedType still
has to be filled to remain compatible with older readers.

Testing:
- added new tests to check both logical and converted types
  to test_insert_parquet.py

Change-Id: I6f377950845683ab9c6dea79f4c54db0359d0b91
---
M be/src/exec/parquet/hdfs-parquet-table-writer.cc
M be/src/exec/parquet/parquet-metadata-utils.cc
M be/src/exec/parquet/parquet-metadata-utils.h
M tests/query_test/test_insert_parquet.py
M tests/util/get_parquet_metadata.py
5 files changed, 264 insertions(+), 82 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/04/12004/10
-- 
To view, visit http://gerrit.cloudera.org:8080/12004
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I6f377950845683ab9c6dea79f4c54db0359d0b91
Gerrit-Change-Number: 12004
Gerrit-PatchSet: 10
Gerrit-Owner: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Anonymous Coward (359)
Gerrit-Reviewer: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Lars Volker <lv...@cloudera.com>
Gerrit-Reviewer: Zoltan Ivanfi <zi...@cloudera.com>

[Impala-ASF-CR] IMPALA-7889: Write new logical types in Parquet

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/12004 )

Change subject: IMPALA-7889: Write new logical types in Parquet
......................................................................


Patch Set 13: Verified+1


-- 
To view, visit http://gerrit.cloudera.org:8080/12004
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I6f377950845683ab9c6dea79f4c54db0359d0b91
Gerrit-Change-Number: 12004
Gerrit-PatchSet: 13
Gerrit-Owner: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Anonymous Coward (359)
Gerrit-Reviewer: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Lars Volker <lv...@cloudera.com>
Gerrit-Reviewer: Zoltan Ivanfi <zi...@cloudera.com>
Gerrit-Comment-Date: Thu, 17 Jan 2019 02:48:37 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-7889: Write new logical types in Parquet

Posted by "Csaba Ringhofer (Code Review)" <ge...@cloudera.org>.
Hello Lars Volker, Impala Public Jenkins, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/12004

to look at the new patch set (#7).

Change subject: IMPALA-7889: Write new logical types in Parquet
......................................................................

IMPALA-7889: Write new logical types in Parquet

Fill the LogicalType field in Parquet schemas for columns
that have an associated logical type. ConvertedType still
has to be filled to remain compatible with older readers.

Testing:
- added new tests to check both logical and converted types
  to test_insert_parquet.py

Change-Id: I6f377950845683ab9c6dea79f4c54db0359d0b91
---
M be/src/exec/parquet/hdfs-parquet-table-writer.cc
M be/src/exec/parquet/parquet-metadata-utils.cc
M be/src/exec/parquet/parquet-metadata-utils.h
M tests/query_test/test_insert_parquet.py
M tests/util/get_parquet_metadata.py
5 files changed, 228 insertions(+), 58 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/04/12004/7
-- 
To view, visit http://gerrit.cloudera.org:8080/12004
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I6f377950845683ab9c6dea79f4c54db0359d0b91
Gerrit-Change-Number: 12004
Gerrit-PatchSet: 7
Gerrit-Owner: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Lars Volker <lv...@cloudera.com>

[Impala-ASF-CR] IMPALA-7889: Write new logical types in Parquet

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/12004 )

Change subject: IMPALA-7889: Write new logical types in Parquet
......................................................................


Patch Set 2:

Build Successful 

https://jenkins.impala.io/job/gerrit-code-review-checks/1455/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests.


-- 
To view, visit http://gerrit.cloudera.org:8080/12004
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I6f377950845683ab9c6dea79f4c54db0359d0b91
Gerrit-Change-Number: 12004
Gerrit-PatchSet: 2
Gerrit-Owner: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Comment-Date: Wed, 28 Nov 2018 20:58:08 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-7889: Write new logical types in Parquet

Posted by "Lars Volker (Code Review)" <ge...@cloudera.org>.
Lars Volker has posted comments on this change. ( http://gerrit.cloudera.org:8080/12004 )

Change subject: IMPALA-7889: Write new logical types in Parquet
......................................................................


Patch Set 7:

(7 comments)

http://gerrit.cloudera.org:8080/#/c/12004/7/be/src/exec/parquet/parquet-metadata-utils.cc
File be/src/exec/parquet/parquet-metadata-utils.cc:

http://gerrit.cloudera.org:8080/#/c/12004/7/be/src/exec/parquet/parquet-metadata-utils.cc@158
PS7, Line 158: static bool IsEncodingSupported(parquet::Encoding::type e) {
There's already a "IsSupportedType()" in the anonymous namespace above, I think we can move this there, and add a comment to explain what it does.


http://gerrit.cloudera.org:8080/#/c/12004/7/be/src/exec/parquet/parquet-metadata-utils.cc@290
PS7, Line 290: namespace {
I'd consider moving all anonymous helpers up into one anonymous namespace


http://gerrit.cloudera.org:8080/#/c/12004/5/tests/query_test/test_insert_parquet.py
File tests/query_test/test_insert_parquet.py:

http://gerrit.cloudera.org:8080/#/c/12004/5/tests/query_test/test_insert_parquet.py@388
PS5, Line 388:     found = False
> Sorry, I forgot this one in patch set 6.
You could rename _find_schema to _get_schema() if you feel that that would express more clearly that it actually has to exist, and then add an assert there instead of here (since there doesn't seem to be a case where it doesn't exist). I don't feel strongly about this.


http://gerrit.cloudera.org:8080/#/c/12004/7/tests/query_test/test_insert_parquet.py
File tests/query_test/test_insert_parquet.py:

http://gerrit.cloudera.org:8080/#/c/12004/7/tests/query_test/test_insert_parquet.py@37
PS7, Line 37: from tests.util.get_parquet_metadata import decode_stats_value, \
nit: Wrap these in parentheses, and while you're here the ones above, too. (see https://www.python.org/dev/peps/pep-0328/)


http://gerrit.cloudera.org:8080/#/c/12004/7/tests/query_test/test_insert_parquet.py@388
PS7, Line 388:     found = False
I think you can shorten this to something like:
 
 keys = [k for k, v in obj_dict.iteritems() if v is not None]
 assert keys == [var_name]

Or make it one line if you prefer


http://gerrit.cloudera.org:8080/#/c/12004/7/tests/query_test/test_insert_parquet.py@410
PS7, Line 410:       8: ConvertedType.INT_8,
nit: I think we indent 4 spaces here


http://gerrit.cloudera.org:8080/#/c/12004/7/tests/query_test/test_insert_parquet.py@456
PS7, Line 456:     # This test will break once INT64 becomes the default Parquet type for TIMESTAMP
Is there actually a Jira we can add here? If not that's ok, too



-- 
To view, visit http://gerrit.cloudera.org:8080/12004
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I6f377950845683ab9c6dea79f4c54db0359d0b91
Gerrit-Change-Number: 12004
Gerrit-PatchSet: 7
Gerrit-Owner: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Lars Volker <lv...@cloudera.com>
Gerrit-Comment-Date: Wed, 12 Dec 2018 05:31:50 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-7889: Write new logical types in Parquet

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/12004 )

Change subject: IMPALA-7889: Write new logical types in Parquet
......................................................................


Patch Set 13:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/3645/ DRY_RUN=false


-- 
To view, visit http://gerrit.cloudera.org:8080/12004
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I6f377950845683ab9c6dea79f4c54db0359d0b91
Gerrit-Change-Number: 12004
Gerrit-PatchSet: 13
Gerrit-Owner: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Anonymous Coward (359)
Gerrit-Reviewer: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Lars Volker <lv...@cloudera.com>
Gerrit-Reviewer: Zoltan Ivanfi <zi...@cloudera.com>
Gerrit-Comment-Date: Wed, 16 Jan 2019 22:29:37 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-7889: Write new logical types in Parquet

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/12004 )

Change subject: IMPALA-7889: Write new logical types in Parquet
......................................................................


Patch Set 10:

Build Successful 

https://jenkins.impala.io/job/gerrit-code-review-checks/1646/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests.


-- 
To view, visit http://gerrit.cloudera.org:8080/12004
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I6f377950845683ab9c6dea79f4c54db0359d0b91
Gerrit-Change-Number: 12004
Gerrit-PatchSet: 10
Gerrit-Owner: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Anonymous Coward (359)
Gerrit-Reviewer: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Lars Volker <lv...@cloudera.com>
Gerrit-Reviewer: Zoltan Ivanfi <zi...@cloudera.com>
Gerrit-Comment-Date: Wed, 19 Dec 2018 16:46:57 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-7889: Write new logical types in Parquet

Posted by "Anonymous Coward (Code Review)" <ge...@cloudera.org>.
Anonymous Coward (359) has posted comments on this change. ( http://gerrit.cloudera.org:8080/12004 )

Change subject: IMPALA-7889: Write new logical types in Parquet
......................................................................


Patch Set 8: Code-Review+1


-- 
To view, visit http://gerrit.cloudera.org:8080/12004
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I6f377950845683ab9c6dea79f4c54db0359d0b91
Gerrit-Change-Number: 12004
Gerrit-PatchSet: 8
Gerrit-Owner: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Anonymous Coward (359)
Gerrit-Reviewer: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Lars Volker <lv...@cloudera.com>
Gerrit-Comment-Date: Wed, 19 Dec 2018 13:32:35 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-7889: Write new logical types in Parquet

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/12004 )

Change subject: IMPALA-7889: Write new logical types in Parquet
......................................................................


Patch Set 4:

Build Successful 

https://jenkins.impala.io/job/gerrit-code-review-checks/1468/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests.


-- 
To view, visit http://gerrit.cloudera.org:8080/12004
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I6f377950845683ab9c6dea79f4c54db0359d0b91
Gerrit-Change-Number: 12004
Gerrit-PatchSet: 4
Gerrit-Owner: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Comment-Date: Thu, 29 Nov 2018 12:28:59 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-7889: Write new logical types in Parquet

Posted by "Lars Volker (Code Review)" <ge...@cloudera.org>.
Lars Volker has posted comments on this change. ( http://gerrit.cloudera.org:8080/12004 )

Change subject: IMPALA-7889: Write new logical types in Parquet
......................................................................


Patch Set 8: Code-Review+1

(1 comment)

http://gerrit.cloudera.org:8080/#/c/12004/7/tests/query_test/test_insert_parquet.py
File tests/query_test/test_insert_parquet.py:

http://gerrit.cloudera.org:8080/#/c/12004/7/tests/query_test/test_insert_parquet.py@456
PS7, Line 456:     file_metadata = \
> IMPALA-5049 is mentioned below - is that not enough?
Sry, I don't know how I missed that.



-- 
To view, visit http://gerrit.cloudera.org:8080/12004
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I6f377950845683ab9c6dea79f4c54db0359d0b91
Gerrit-Change-Number: 12004
Gerrit-PatchSet: 8
Gerrit-Owner: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Lars Volker <lv...@cloudera.com>
Gerrit-Comment-Date: Wed, 19 Dec 2018 01:26:20 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-7889: Write new logical types in Parquet

Posted by "Lars Volker (Code Review)" <ge...@cloudera.org>.
Lars Volker has posted comments on this change. ( http://gerrit.cloudera.org:8080/12004 )

Change subject: IMPALA-7889: Write new logical types in Parquet
......................................................................


Patch Set 12: Code-Review+2


-- 
To view, visit http://gerrit.cloudera.org:8080/12004
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I6f377950845683ab9c6dea79f4c54db0359d0b91
Gerrit-Change-Number: 12004
Gerrit-PatchSet: 12
Gerrit-Owner: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Anonymous Coward (359)
Gerrit-Reviewer: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Lars Volker <lv...@cloudera.com>
Gerrit-Reviewer: Zoltan Ivanfi <zi...@cloudera.com>
Gerrit-Comment-Date: Wed, 09 Jan 2019 18:30:25 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-7889: Write new logical types in Parquet

Posted by "Csaba Ringhofer (Code Review)" <ge...@cloudera.org>.
Hello Impala Public Jenkins, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/12004

to look at the new patch set (#4).

Change subject: IMPALA-7889: Write new logical types in Parquet
......................................................................

IMPALA-7889: Write new logical types in Parquet

Fill the LogicalType field in Parquet schemas for columns
that have an associated logical type. ConvertedType still
has to be filled to remain compatible with older readers.

Testing:
- added new tests to check both logical and converted types
  to test_insert_parquet.py

Change-Id: I6f377950845683ab9c6dea79f4c54db0359d0b91
---
M be/src/exec/parquet/hdfs-parquet-table-writer.cc
M be/src/exec/parquet/parquet-metadata-utils.cc
M be/src/exec/parquet/parquet-metadata-utils.h
M tests/query_test/test_insert_parquet.py
M tests/util/get_parquet_metadata.py
5 files changed, 159 insertions(+), 29 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/04/12004/4
-- 
To view, visit http://gerrit.cloudera.org:8080/12004
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I6f377950845683ab9c6dea79f4c54db0359d0b91
Gerrit-Change-Number: 12004
Gerrit-PatchSet: 4
Gerrit-Owner: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>

[Impala-ASF-CR] IMPALA-7889: Write new logical types in Parquet

Posted by "Lars Volker (Code Review)" <ge...@cloudera.org>.
Lars Volker has removed a vote on this change.

Change subject: IMPALA-7889: Write new logical types in Parquet
......................................................................


Removed Code-Review+2 by Impala Public Jenkins <im...@cloudera.com>
-- 
To view, visit http://gerrit.cloudera.org:8080/12004
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: deleteVote
Gerrit-Change-Id: I6f377950845683ab9c6dea79f4c54db0359d0b91
Gerrit-Change-Number: 12004
Gerrit-PatchSet: 11
Gerrit-Owner: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Anonymous Coward (359)
Gerrit-Reviewer: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Lars Volker <lv...@cloudera.com>
Gerrit-Reviewer: Zoltan Ivanfi <zi...@cloudera.com>

[Impala-ASF-CR] IMPALA-7889: Write new logical types in Parquet

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/12004 )

Change subject: IMPALA-7889: Write new logical types in Parquet
......................................................................


Patch Set 13:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/3642/ DRY_RUN=false


-- 
To view, visit http://gerrit.cloudera.org:8080/12004
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I6f377950845683ab9c6dea79f4c54db0359d0b91
Gerrit-Change-Number: 12004
Gerrit-PatchSet: 13
Gerrit-Owner: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Anonymous Coward (359)
Gerrit-Reviewer: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Lars Volker <lv...@cloudera.com>
Gerrit-Reviewer: Zoltan Ivanfi <zi...@cloudera.com>
Gerrit-Comment-Date: Wed, 16 Jan 2019 16:42:41 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-7889: Write new logical types in Parquet

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/12004 )

Change subject: IMPALA-7889: Write new logical types in Parquet
......................................................................


Patch Set 8:

Build Successful 

https://jenkins.impala.io/job/gerrit-code-review-checks/1598/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests.


-- 
To view, visit http://gerrit.cloudera.org:8080/12004
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I6f377950845683ab9c6dea79f4c54db0359d0b91
Gerrit-Change-Number: 12004
Gerrit-PatchSet: 8
Gerrit-Owner: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Lars Volker <lv...@cloudera.com>
Gerrit-Comment-Date: Thu, 13 Dec 2018 18:22:45 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-7889: Write new logical types in Parquet

Posted by "Csaba Ringhofer (Code Review)" <ge...@cloudera.org>.
Csaba Ringhofer has posted comments on this change. ( http://gerrit.cloudera.org:8080/12004 )

Change subject: IMPALA-7889: Write new logical types in Parquet
......................................................................


Patch Set 5:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/12004/4/tests/query_test/test_insert_parquet.py
File tests/query_test/test_insert_parquet.py:

http://gerrit.cloudera.org:8080/#/c/12004/4/tests/query_test/test_insert_parquet.py@407
PS4, Line 407:  
Sorry, I thought that I have run this test, but this was completely wrong and should have failed. Patch set 5 fixed the test.



-- 
To view, visit http://gerrit.cloudera.org:8080/12004
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I6f377950845683ab9c6dea79f4c54db0359d0b91
Gerrit-Change-Number: 12004
Gerrit-PatchSet: 5
Gerrit-Owner: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Comment-Date: Thu, 29 Nov 2018 19:47:06 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-7889: Write new logical types in Parquet

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/12004 )

Change subject: IMPALA-7889: Write new logical types in Parquet
......................................................................


Patch Set 12:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/3593/ DRY_RUN=true


-- 
To view, visit http://gerrit.cloudera.org:8080/12004
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I6f377950845683ab9c6dea79f4c54db0359d0b91
Gerrit-Change-Number: 12004
Gerrit-PatchSet: 12
Gerrit-Owner: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Anonymous Coward (359)
Gerrit-Reviewer: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Lars Volker <lv...@cloudera.com>
Gerrit-Reviewer: Zoltan Ivanfi <zi...@cloudera.com>
Gerrit-Comment-Date: Thu, 20 Dec 2018 18:31:37 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-7889: Write new logical types in Parquet

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/12004 )

Change subject: IMPALA-7889: Write new logical types in Parquet
......................................................................


Patch Set 7:

Build Successful 

https://jenkins.impala.io/job/gerrit-code-review-checks/1542/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests.


-- 
To view, visit http://gerrit.cloudera.org:8080/12004
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I6f377950845683ab9c6dea79f4c54db0359d0b91
Gerrit-Change-Number: 12004
Gerrit-PatchSet: 7
Gerrit-Owner: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Lars Volker <lv...@cloudera.com>
Gerrit-Comment-Date: Wed, 05 Dec 2018 16:07:03 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-7889: Write new logical types in Parquet

Posted by "Csaba Ringhofer (Code Review)" <ge...@cloudera.org>.
Csaba Ringhofer has posted comments on this change. ( http://gerrit.cloudera.org:8080/12004 )

Change subject: IMPALA-7889: Write new logical types in Parquet
......................................................................


Patch Set 8:

(1 comment)

Patch set 9 is only rebase + conflict resolution while patch set replaces the "if" chain with a "case" statement.

http://gerrit.cloudera.org:8080/#/c/12004/8/be/src/exec/parquet/parquet-metadata-utils.cc
File be/src/exec/parquet/parquet-metadata-utils.cc:

http://gerrit.cloudera.org:8080/#/c/12004/8/be/src/exec/parquet/parquet-metadata-utils.cc@303
PS8, Line 303:   if (col_type.type == TYPE_DECIMAL) {
> (optional) I would prefer a switch-case here.
Done - I moved most logic to separate functions to avoid issues with variable scopes in "case:" blocks.



-- 
To view, visit http://gerrit.cloudera.org:8080/12004
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I6f377950845683ab9c6dea79f4c54db0359d0b91
Gerrit-Change-Number: 12004
Gerrit-PatchSet: 8
Gerrit-Owner: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Anonymous Coward (359)
Gerrit-Reviewer: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Lars Volker <lv...@cloudera.com>
Gerrit-Reviewer: Zoltan Ivanfi <zi...@cloudera.com>
Gerrit-Comment-Date: Wed, 19 Dec 2018 16:15:12 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-7889: Write new logical types in Parquet

Posted by "Lars Volker (Code Review)" <ge...@cloudera.org>.
Lars Volker has posted comments on this change. ( http://gerrit.cloudera.org:8080/12004 )

Change subject: IMPALA-7889: Write new logical types in Parquet
......................................................................


Patch Set 5:

(17 comments)

http://gerrit.cloudera.org:8080/#/c/12004/5/be/src/exec/parquet/hdfs-parquet-table-writer.cc
File be/src/exec/parquet/hdfs-parquet-table-writer.cc:

http://gerrit.cloudera.org:8080/#/c/12004/5/be/src/exec/parquet/hdfs-parquet-table-writer.cc@973
PS5, Line 973:     parquet::SchemaElement& node = file_metadata_.schema[i + 1];
Maybe choose a better name here?


http://gerrit.cloudera.org:8080/#/c/12004/5/be/src/exec/parquet/parquet-metadata-utils.h
File be/src/exec/parquet/parquet-metadata-utils.h:

http://gerrit.cloudera.org:8080/#/c/12004/5/be/src/exec/parquet/parquet-metadata-utils.h@62
PS5, Line 62:   static void FillSchemaElement(parquet::SchemaElement& node, const ColumnType& type,
We usually put input parameters first, then output, i.e. move node to the end. By convention we also pass output parameters by pointer.


http://gerrit.cloudera.org:8080/#/c/12004/5/be/src/exec/parquet/parquet-metadata-utils.cc
File be/src/exec/parquet/parquet-metadata-utils.cc:

http://gerrit.cloudera.org:8080/#/c/12004/5/be/src/exec/parquet/parquet-metadata-utils.cc@73
PS5, Line 73: };
This is actually the end of the anonymous namespace and confused me. Please remove the semicolon and add a comment  // anonymous namespace.


http://gerrit.cloudera.org:8080/#/c/12004/5/be/src/exec/parquet/parquet-metadata-utils.cc@78
PS5, Line 78: bool IsSupportedType(PrimitiveType impala_type,
This method should be in the anonymous namespace


http://gerrit.cloudera.org:8080/#/c/12004/5/be/src/exec/parquet/parquet-metadata-utils.cc@153
PS5, Line 153: static bool IsEncodingSupported(parquet::Encoding::type e) {
This can be in the anonymous namespace, too


http://gerrit.cloudera.org:8080/#/c/12004/5/be/src/exec/parquet/parquet-metadata-utils.cc@283
PS5, Line 283: void SetIntLogicalType(parquet::SchemaElement& node, int bitwidth) {
Move to the anonymous namespace. Pass output parameters by pointer and move to the end of the parameter list. Also add a brief function comment.


http://gerrit.cloudera.org:8080/#/c/12004/5/be/src/exec/parquet/parquet-metadata-utils.cc@292
PS5, Line 292: node
Can you think of a better name for the schema element than "node"?


http://gerrit.cloudera.org:8080/#/c/12004/5/be/src/exec/parquet/parquet-metadata-utils.cc@297
PS5, Line 297:   if (type.type == TYPE_DECIMAL) {
Should this be a case statement instead?


http://gerrit.cloudera.org:8080/#/c/12004/5/be/src/exec/parquet/parquet-metadata-utils.cc@310
PS5, Line 310:   else if (type.type == TYPE_VARCHAR || type.type == TYPE_CHAR ||
Move else to previous line.


http://gerrit.cloudera.org:8080/#/c/12004/5/be/src/exec/parquet/parquet-metadata-utils.cc@311
PS5, Line 311:       (type.type == TYPE_STRING && query_options.parquet_annotate_strings_utf8)) {
Please add a brief comment that VARCHAR and CHAR are always UTF8 annotated, but STRING is only when set by the query option (I had to look it up).


http://gerrit.cloudera.org:8080/#/c/12004/5/tests/query_test/test_insert_parquet.py
File tests/query_test/test_insert_parquet.py:

http://gerrit.cloudera.org:8080/#/c/12004/5/tests/query_test/test_insert_parquet.py@362
PS5, Line 362:   def _ctas_and_get_metadata(self, vector, unique_database, tmp_dir, source_table):
Add function comment (here and for all other new functions


http://gerrit.cloudera.org:8080/#/c/12004/5/tests/query_test/test_insert_parquet.py@376
PS5, Line 376:     file_metadata_list = get_parquet_metadata_from_hdfs_folder(hdfs_path, tmp_dir)
Maybe you can clean up some of the other uses of os.walk() in this file?


http://gerrit.cloudera.org:8080/#/c/12004/5/tests/query_test/test_insert_parquet.py@388
PS5, Line 388:     schema = self._find_schema(schemas, column_name)
You could add

  assert schema is not None

You could also add that in _find_schema, since it seems required that the schema exists.


http://gerrit.cloudera.org:8080/#/c/12004/5/tests/query_test/test_insert_parquet.py@394
PS5, Line 394:     bit_width_to_converted_type_map = {8: 15, 16: 16, 32: 17, 64: 18}
Can you replace the values with the names from parquet.thrift?


http://gerrit.cloudera.org:8080/#/c/12004/5/tests/query_test/test_insert_parquet.py@397
PS5, Line 397:     assert schema.logicalType.INTEGER is not None
The other types have to be None, right? If so, maybe create a helper _check_only_one_type_is_set(the_type) and then make sure that all others are None.


http://gerrit.cloudera.org:8080/#/c/12004/5/tests/query_test/test_insert_parquet.py@432
PS5, Line 432: if
nit: once


http://gerrit.cloudera.org:8080/#/c/12004/5/tests/util/get_parquet_metadata.py
File tests/util/get_parquet_metadata.py:

http://gerrit.cloudera.org:8080/#/c/12004/5/tests/util/get_parquet_metadata.py@181
PS5, Line 181:     for f in files:
Add (as some uses of this in test_insert_parquet.py do):

 if not f.endswith('parq'):
  continue



-- 
To view, visit http://gerrit.cloudera.org:8080/12004
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I6f377950845683ab9c6dea79f4c54db0359d0b91
Gerrit-Change-Number: 12004
Gerrit-PatchSet: 5
Gerrit-Owner: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Lars Volker <lv...@cloudera.com>
Gerrit-Comment-Date: Fri, 30 Nov 2018 05:10:39 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-7889: Write new logical types in Parquet

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/12004 )

Change subject: IMPALA-7889: Write new logical types in Parquet
......................................................................


Patch Set 12: Verified+1


-- 
To view, visit http://gerrit.cloudera.org:8080/12004
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I6f377950845683ab9c6dea79f4c54db0359d0b91
Gerrit-Change-Number: 12004
Gerrit-PatchSet: 12
Gerrit-Owner: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Anonymous Coward (359)
Gerrit-Reviewer: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Lars Volker <lv...@cloudera.com>
Gerrit-Reviewer: Zoltan Ivanfi <zi...@cloudera.com>
Gerrit-Comment-Date: Thu, 20 Dec 2018 22:41:17 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-7889: Write new logical types in Parquet

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/12004 )

Change subject: IMPALA-7889: Write new logical types in Parquet
......................................................................


Patch Set 11:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/3586/ DRY_RUN=false


-- 
To view, visit http://gerrit.cloudera.org:8080/12004
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I6f377950845683ab9c6dea79f4c54db0359d0b91
Gerrit-Change-Number: 12004
Gerrit-PatchSet: 11
Gerrit-Owner: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Anonymous Coward (359)
Gerrit-Reviewer: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Lars Volker <lv...@cloudera.com>
Gerrit-Reviewer: Zoltan Ivanfi <zi...@cloudera.com>
Gerrit-Comment-Date: Wed, 19 Dec 2018 16:35:25 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-7889: Write new logical types in Parquet

Posted by "Lars Volker (Code Review)" <ge...@cloudera.org>.
Lars Volker has posted comments on this change. ( http://gerrit.cloudera.org:8080/12004 )

Change subject: IMPALA-7889: Write new logical types in Parquet
......................................................................


Patch Set 11: Code-Review+1

(2 comments)

Looks like a test failed, too.

http://gerrit.cloudera.org:8080/#/c/12004/11/be/src/exec/parquet/parquet-metadata-utils.cc
File be/src/exec/parquet/parquet-metadata-utils.cc:

http://gerrit.cloudera.org:8080/#/c/12004/11/be/src/exec/parquet/parquet-metadata-utils.cc@323
PS11, Line 323:   case TYPE_DECIMAL:
nit: indent case statements by 2 spaces (see rest of file).


http://gerrit.cloudera.org:8080/#/c/12004/11/be/src/exec/parquet/parquet-metadata-utils.cc@354
PS11, Line 354:     // Leave logical and converted types empty for other types.
Have you considered enumerating the other types here? Then you can DCHECK on an unknown type. That way we make future mistakes less likely by forgetting to update this statement.



-- 
To view, visit http://gerrit.cloudera.org:8080/12004
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I6f377950845683ab9c6dea79f4c54db0359d0b91
Gerrit-Change-Number: 12004
Gerrit-PatchSet: 11
Gerrit-Owner: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Anonymous Coward (359)
Gerrit-Reviewer: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Lars Volker <lv...@cloudera.com>
Gerrit-Reviewer: Zoltan Ivanfi <zi...@cloudera.com>
Gerrit-Comment-Date: Wed, 19 Dec 2018 20:51:47 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-7889: Write new logical types in Parquet

Posted by "Csaba Ringhofer (Code Review)" <ge...@cloudera.org>.
Csaba Ringhofer has posted comments on this change. ( http://gerrit.cloudera.org:8080/12004 )

Change subject: IMPALA-7889: Write new logical types in Parquet
......................................................................


Patch Set 7:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/12004/5/tests/query_test/test_insert_parquet.py
File tests/query_test/test_insert_parquet.py:

http://gerrit.cloudera.org:8080/#/c/12004/5/tests/query_test/test_insert_parquet.py@388
PS5, Line 388:     found = False
> You could add
Sorry, I forgot this one in patch set 6.



-- 
To view, visit http://gerrit.cloudera.org:8080/12004
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I6f377950845683ab9c6dea79f4c54db0359d0b91
Gerrit-Change-Number: 12004
Gerrit-PatchSet: 7
Gerrit-Owner: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Lars Volker <lv...@cloudera.com>
Gerrit-Comment-Date: Wed, 05 Dec 2018 15:33:34 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-7889: Write new logical types in Parquet

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/12004 )

Change subject: IMPALA-7889: Write new logical types in Parquet
......................................................................


Patch Set 11: Code-Review+2


-- 
To view, visit http://gerrit.cloudera.org:8080/12004
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I6f377950845683ab9c6dea79f4c54db0359d0b91
Gerrit-Change-Number: 12004
Gerrit-PatchSet: 11
Gerrit-Owner: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Anonymous Coward (359)
Gerrit-Reviewer: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Lars Volker <lv...@cloudera.com>
Gerrit-Reviewer: Zoltan Ivanfi <zi...@cloudera.com>
Gerrit-Comment-Date: Wed, 19 Dec 2018 16:35:24 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-7889: Write new logical types in Parquet

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/12004 )

Change subject: IMPALA-7889: Write new logical types in Parquet
......................................................................


Patch Set 5:

Build Successful 

https://jenkins.impala.io/job/gerrit-code-review-checks/1469/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests.


-- 
To view, visit http://gerrit.cloudera.org:8080/12004
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I6f377950845683ab9c6dea79f4c54db0359d0b91
Gerrit-Change-Number: 12004
Gerrit-PatchSet: 5
Gerrit-Owner: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Comment-Date: Thu, 29 Nov 2018 20:15:19 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-7889: Write new logical types in Parquet

Posted by "Zoltan Ivanfi (Code Review)" <ge...@cloudera.org>.
Zoltan Ivanfi has posted comments on this change. ( http://gerrit.cloudera.org:8080/12004 )

Change subject: IMPALA-7889: Write new logical types in Parquet
......................................................................


Patch Set 10: Code-Review+1

LGTM, thanks!


-- 
To view, visit http://gerrit.cloudera.org:8080/12004
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I6f377950845683ab9c6dea79f4c54db0359d0b91
Gerrit-Change-Number: 12004
Gerrit-PatchSet: 10
Gerrit-Owner: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Anonymous Coward (359)
Gerrit-Reviewer: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Lars Volker <lv...@cloudera.com>
Gerrit-Reviewer: Zoltan Ivanfi <zi...@cloudera.com>
Gerrit-Comment-Date: Wed, 19 Dec 2018 16:18:49 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-7889: Write new logical types in Parquet

Posted by "Csaba Ringhofer (Code Review)" <ge...@cloudera.org>.
Csaba Ringhofer has posted comments on this change. ( http://gerrit.cloudera.org:8080/12004 )

Change subject: IMPALA-7889: Write new logical types in Parquet
......................................................................


Patch Set 5:

(7 comments)

http://gerrit.cloudera.org:8080/#/c/12004/7/be/src/exec/parquet/parquet-metadata-utils.cc
File be/src/exec/parquet/parquet-metadata-utils.cc:

http://gerrit.cloudera.org:8080/#/c/12004/7/be/src/exec/parquet/parquet-metadata-utils.cc@158
PS7, Line 158:     case parquet::Encoding::RLE:
> There's already a "IsSupportedType()" in the anonymous namespace above, I t
Done


http://gerrit.cloudera.org:8080/#/c/12004/7/be/src/exec/parquet/parquet-metadata-utils.cc@290
PS7, Line 290: }
> I'd consider moving all anonymous helpers up into one anonymous namespace
Done


http://gerrit.cloudera.org:8080/#/c/12004/5/tests/query_test/test_insert_parquet.py
File tests/query_test/test_insert_parquet.py:

http://gerrit.cloudera.org:8080/#/c/12004/5/tests/query_test/test_insert_parquet.py@388
PS5, Line 388:     schema = self._find_schema(schemas, column_name)
> You could rename _find_schema to _get_schema() if you feel that that would 
Done


http://gerrit.cloudera.org:8080/#/c/12004/7/tests/query_test/test_insert_parquet.py
File tests/query_test/test_insert_parquet.py:

http://gerrit.cloudera.org:8080/#/c/12004/7/tests/query_test/test_insert_parquet.py@37
PS7, Line 37: from tests.util.get_parquet_metadata import get_parquet_metadata, decode_stats_value, \
> nit: Wrap these in parentheses, and while you're here the ones above, too. 
Done


http://gerrit.cloudera.org:8080/#/c/12004/7/tests/query_test/test_insert_parquet.py@388
PS7, Line 388:     schema = self._find_schema(schemas, column_name)
> I think you can shorten this to something like:
Thank, it is nicer this way.


http://gerrit.cloudera.org:8080/#/c/12004/7/tests/query_test/test_insert_parquet.py@410
PS7, Line 410: 
> nit: I think we indent 4 spaces here
You are right, I also changed indentation at some other places.


http://gerrit.cloudera.org:8080/#/c/12004/7/tests/query_test/test_insert_parquet.py@456
PS7, Line 456: 
> Is there actually a Jira we can add here? If not that's ok, too
IMPALA-5049 is mentioned below - is that not enough?



-- 
To view, visit http://gerrit.cloudera.org:8080/12004
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I6f377950845683ab9c6dea79f4c54db0359d0b91
Gerrit-Change-Number: 12004
Gerrit-PatchSet: 5
Gerrit-Owner: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Lars Volker <lv...@cloudera.com>
Gerrit-Comment-Date: Thu, 13 Dec 2018 17:53:19 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-7889: Write new logical types in Parquet

Posted by "Csaba Ringhofer (Code Review)" <ge...@cloudera.org>.
Hello Impala Public Jenkins, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/12004

to look at the new patch set (#5).

Change subject: IMPALA-7889: Write new logical types in Parquet
......................................................................

IMPALA-7889: Write new logical types in Parquet

Fill the LogicalType field in Parquet schemas for columns
that have an associated logical type. ConvertedType still
has to be filled to remain compatible with older readers.

Testing:
- added new tests to check both logical and converted types
  to test_insert_parquet.py

Change-Id: I6f377950845683ab9c6dea79f4c54db0359d0b91
---
M be/src/exec/parquet/hdfs-parquet-table-writer.cc
M be/src/exec/parquet/parquet-metadata-utils.cc
M be/src/exec/parquet/parquet-metadata-utils.h
M tests/query_test/test_insert_parquet.py
M tests/util/get_parquet_metadata.py
5 files changed, 162 insertions(+), 29 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/04/12004/5
-- 
To view, visit http://gerrit.cloudera.org:8080/12004
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I6f377950845683ab9c6dea79f4c54db0359d0b91
Gerrit-Change-Number: 12004
Gerrit-PatchSet: 5
Gerrit-Owner: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>

[Impala-ASF-CR] IMPALA-7889: Write new logical types in Parquet

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/12004 )

Change subject: IMPALA-7889: Write new logical types in Parquet
......................................................................


Patch Set 2:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/12004/2/tests/query_test/test_insert_parquet.py
File tests/query_test/test_insert_parquet.py:

http://gerrit.cloudera.org:8080/#/c/12004/2/tests/query_test/test_insert_parquet.py@776
PS2, Line 776: 
flake8: W292 no newline at end of file


http://gerrit.cloudera.org:8080/#/c/12004/2/tests/util/get_parquet_metadata.py
File tests/util/get_parquet_metadata.py:

http://gerrit.cloudera.org:8080/#/c/12004/2/tests/util/get_parquet_metadata.py@186
PS2, Line 186: 
flake8: W292 no newline at end of file



-- 
To view, visit http://gerrit.cloudera.org:8080/12004
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I6f377950845683ab9c6dea79f4c54db0359d0b91
Gerrit-Change-Number: 12004
Gerrit-PatchSet: 2
Gerrit-Owner: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Comment-Date: Wed, 28 Nov 2018 20:15:30 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-7889: Write new logical types in Parquet

Posted by "Csaba Ringhofer (Code Review)" <ge...@cloudera.org>.
Hello Lars Volker, Impala Public Jenkins, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/12004

to look at the new patch set (#6).

Change subject: IMPALA-7889: Write new logical types in Parquet
......................................................................

IMPALA-7889: Write new logical types in Parquet

Fill the LogicalType field in Parquet schemas for columns
that have an associated logical type. ConvertedType still
has to be filled to remain compatible with older readers.

Testing:
- added new tests to check both logical and converted types
  to test_insert_parquet.py

Change-Id: I6f377950845683ab9c6dea79f4c54db0359d0b91
---
M be/src/exec/parquet/hdfs-parquet-table-writer.cc
M be/src/exec/parquet/parquet-metadata-utils.cc
M be/src/exec/parquet/parquet-metadata-utils.h
M tests/query_test/test_insert_parquet.py
M tests/util/get_parquet_metadata.py
5 files changed, 225 insertions(+), 58 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/04/12004/6
-- 
To view, visit http://gerrit.cloudera.org:8080/12004
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I6f377950845683ab9c6dea79f4c54db0359d0b91
Gerrit-Change-Number: 12004
Gerrit-PatchSet: 6
Gerrit-Owner: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Lars Volker <lv...@cloudera.com>

[Impala-ASF-CR] IMPALA-7889: Write new logical types in Parquet

Posted by "Csaba Ringhofer (Code Review)" <ge...@cloudera.org>.
Csaba Ringhofer has posted comments on this change. ( http://gerrit.cloudera.org:8080/12004 )

Change subject: IMPALA-7889: Write new logical types in Parquet
......................................................................


Patch Set 6:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/12004/5/be/src/exec/parquet/parquet-metadata-utils.cc
File be/src/exec/parquet/parquet-metadata-utils.cc:

http://gerrit.cloudera.org:8080/#/c/12004/5/be/src/exec/parquet/parquet-metadata-utils.cc@73
PS5, Line 73: };
> This is actually the end of the anonymous namespace and confused me. Please
Done


http://gerrit.cloudera.org:8080/#/c/12004/5/be/src/exec/parquet/parquet-metadata-utils.cc@78
PS5, Line 78: bool IsSupportedType(PrimitiveType impala_type,
> This method should be in the anonymous namespace
Done



-- 
To view, visit http://gerrit.cloudera.org:8080/12004
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I6f377950845683ab9c6dea79f4c54db0359d0b91
Gerrit-Change-Number: 12004
Gerrit-PatchSet: 6
Gerrit-Owner: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Lars Volker <lv...@cloudera.com>
Gerrit-Comment-Date: Wed, 05 Dec 2018 15:27:33 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-7889: Write new logical types in Parquet

Posted by "Csaba Ringhofer (Code Review)" <ge...@cloudera.org>.
Csaba Ringhofer has posted comments on this change. ( http://gerrit.cloudera.org:8080/12004 )

Change subject: IMPALA-7889: Write new logical types in Parquet
......................................................................


Patch Set 2:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/12004/2/tests/query_test/test_insert_parquet.py
File tests/query_test/test_insert_parquet.py:

http://gerrit.cloudera.org:8080/#/c/12004/2/tests/query_test/test_insert_parquet.py@776
PS2, Line 776: 
> flake8: W292 no newline at end of file
Done


http://gerrit.cloudera.org:8080/#/c/12004/2/tests/util/get_parquet_metadata.py
File tests/util/get_parquet_metadata.py:

http://gerrit.cloudera.org:8080/#/c/12004/2/tests/util/get_parquet_metadata.py@186
PS2, Line 186: 
> flake8: W292 no newline at end of file
Done



-- 
To view, visit http://gerrit.cloudera.org:8080/12004
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I6f377950845683ab9c6dea79f4c54db0359d0b91
Gerrit-Change-Number: 12004
Gerrit-PatchSet: 2
Gerrit-Owner: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Comment-Date: Thu, 29 Nov 2018 11:54:44 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-7889: Write new logical types in Parquet

Posted by "Anonymous Coward (Code Review)" <ge...@cloudera.org>.
Anonymous Coward (359) has posted comments on this change. ( http://gerrit.cloudera.org:8080/12004 )

Change subject: IMPALA-7889: Write new logical types in Parquet
......................................................................


Patch Set 10: Code-Review+1


-- 
To view, visit http://gerrit.cloudera.org:8080/12004
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I6f377950845683ab9c6dea79f4c54db0359d0b91
Gerrit-Change-Number: 12004
Gerrit-PatchSet: 10
Gerrit-Owner: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Anonymous Coward (359)
Gerrit-Reviewer: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Lars Volker <lv...@cloudera.com>
Gerrit-Reviewer: Zoltan Ivanfi <zi...@cloudera.com>
Gerrit-Comment-Date: Wed, 19 Dec 2018 16:34:14 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-7889: Write new logical types in Parquet

Posted by "Csaba Ringhofer (Code Review)" <ge...@cloudera.org>.
Hello Lars Volker, Anonymous Coward (359), Zoltan Ivanfi, Impala Public Jenkins, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/12004

to look at the new patch set (#12).

Change subject: IMPALA-7889: Write new logical types in Parquet
......................................................................

IMPALA-7889: Write new logical types in Parquet

Fill the LogicalType field in Parquet schemas for columns
that have an associated logical type. ConvertedType still
has to be filled to remain compatible with older readers.

Testing:
- added new tests to check both logical and converted types
  to test_insert_parquet.py

Change-Id: I6f377950845683ab9c6dea79f4c54db0359d0b91
---
M be/src/exec/parquet/hdfs-parquet-table-writer.cc
M be/src/exec/parquet/parquet-metadata-utils.cc
M be/src/exec/parquet/parquet-metadata-utils.h
M testdata/workloads/functional-query/queries/QueryTest/stats-extrapolation.test
M tests/query_test/test_insert_parquet.py
M tests/util/get_parquet_metadata.py
6 files changed, 272 insertions(+), 83 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/04/12004/12
-- 
To view, visit http://gerrit.cloudera.org:8080/12004
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I6f377950845683ab9c6dea79f4c54db0359d0b91
Gerrit-Change-Number: 12004
Gerrit-PatchSet: 12
Gerrit-Owner: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Anonymous Coward (359)
Gerrit-Reviewer: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Lars Volker <lv...@cloudera.com>
Gerrit-Reviewer: Zoltan Ivanfi <zi...@cloudera.com>

[Impala-ASF-CR] IMPALA-7889: Write new logical types in Parquet

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/12004 )

Change subject: IMPALA-7889: Write new logical types in Parquet
......................................................................


Patch Set 9:

Build Successful 

https://jenkins.impala.io/job/gerrit-code-review-checks/1645/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests.


-- 
To view, visit http://gerrit.cloudera.org:8080/12004
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I6f377950845683ab9c6dea79f4c54db0359d0b91
Gerrit-Change-Number: 12004
Gerrit-PatchSet: 9
Gerrit-Owner: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Anonymous Coward (359)
Gerrit-Reviewer: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Lars Volker <lv...@cloudera.com>
Gerrit-Reviewer: Zoltan Ivanfi <zi...@cloudera.com>
Gerrit-Comment-Date: Wed, 19 Dec 2018 16:06:19 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-7889: Write new logical types in Parquet

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/12004 )

Change subject: IMPALA-7889: Write new logical types in Parquet
......................................................................


Patch Set 12:

Build Successful 

https://jenkins.impala.io/job/gerrit-code-review-checks/1655/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests.


-- 
To view, visit http://gerrit.cloudera.org:8080/12004
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I6f377950845683ab9c6dea79f4c54db0359d0b91
Gerrit-Change-Number: 12004
Gerrit-PatchSet: 12
Gerrit-Owner: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Anonymous Coward (359)
Gerrit-Reviewer: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Lars Volker <lv...@cloudera.com>
Gerrit-Reviewer: Zoltan Ivanfi <zi...@cloudera.com>
Gerrit-Comment-Date: Thu, 20 Dec 2018 19:22:34 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-7889: Write new logical types in Parquet

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/12004 )

Change subject: IMPALA-7889: Write new logical types in Parquet
......................................................................


Patch Set 13: Code-Review+2


-- 
To view, visit http://gerrit.cloudera.org:8080/12004
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I6f377950845683ab9c6dea79f4c54db0359d0b91
Gerrit-Change-Number: 12004
Gerrit-PatchSet: 13
Gerrit-Owner: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Anonymous Coward (359)
Gerrit-Reviewer: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Lars Volker <lv...@cloudera.com>
Gerrit-Reviewer: Zoltan Ivanfi <zi...@cloudera.com>
Gerrit-Comment-Date: Wed, 16 Jan 2019 16:42:40 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-7889: Write new logical types in Parquet

Posted by "Csaba Ringhofer (Code Review)" <ge...@cloudera.org>.
Hello Lars Volker, Anonymous Coward (359), Zoltan Ivanfi, Impala Public Jenkins, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/12004

to look at the new patch set (#9).

Change subject: IMPALA-7889: Write new logical types in Parquet
......................................................................

IMPALA-7889: Write new logical types in Parquet

Fill the LogicalType field in Parquet schemas for columns
that have an associated logical type. ConvertedType still
has to be filled to remain compatible with older readers.

Testing:
- added new tests to check both logical and converted types
  to test_insert_parquet.py

Change-Id: I6f377950845683ab9c6dea79f4c54db0359d0b91
---
M be/src/exec/parquet/hdfs-parquet-table-writer.cc
M be/src/exec/parquet/parquet-metadata-utils.cc
M be/src/exec/parquet/parquet-metadata-utils.h
M tests/query_test/test_insert_parquet.py
M tests/util/get_parquet_metadata.py
5 files changed, 235 insertions(+), 82 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/04/12004/9
-- 
To view, visit http://gerrit.cloudera.org:8080/12004
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I6f377950845683ab9c6dea79f4c54db0359d0b91
Gerrit-Change-Number: 12004
Gerrit-PatchSet: 9
Gerrit-Owner: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Anonymous Coward (359)
Gerrit-Reviewer: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Lars Volker <lv...@cloudera.com>
Gerrit-Reviewer: Zoltan Ivanfi <zi...@cloudera.com>

[Impala-ASF-CR] IMPALA-7889: Write new logical types in Parquet

Posted by "Zoltan Ivanfi (Code Review)" <ge...@cloudera.org>.
Zoltan Ivanfi has posted comments on this change. ( http://gerrit.cloudera.org:8080/12004 )

Change subject: IMPALA-7889: Write new logical types in Parquet
......................................................................


Patch Set 8: Code-Review+1

(1 comment)

http://gerrit.cloudera.org:8080/#/c/12004/8/be/src/exec/parquet/parquet-metadata-utils.cc
File be/src/exec/parquet/parquet-metadata-utils.cc:

http://gerrit.cloudera.org:8080/#/c/12004/8/be/src/exec/parquet/parquet-metadata-utils.cc@303
PS8, Line 303:   if (col_type.type == TYPE_DECIMAL) {
(optional) I would prefer a switch-case here.



-- 
To view, visit http://gerrit.cloudera.org:8080/12004
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I6f377950845683ab9c6dea79f4c54db0359d0b91
Gerrit-Change-Number: 12004
Gerrit-PatchSet: 8
Gerrit-Owner: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Anonymous Coward (359)
Gerrit-Reviewer: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Lars Volker <lv...@cloudera.com>
Gerrit-Reviewer: Zoltan Ivanfi <zi...@cloudera.com>
Gerrit-Comment-Date: Wed, 19 Dec 2018 14:51:19 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-7889: Write new logical types in Parquet

Posted by "Csaba Ringhofer (Code Review)" <ge...@cloudera.org>.
Hello Lars Volker, Impala Public Jenkins, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/12004

to look at the new patch set (#8).

Change subject: IMPALA-7889: Write new logical types in Parquet
......................................................................

IMPALA-7889: Write new logical types in Parquet

Fill the LogicalType field in Parquet schemas for columns
that have an associated logical type. ConvertedType still
has to be filled to remain compatible with older readers.

Testing:
- added new tests to check both logical and converted types
  to test_insert_parquet.py

Change-Id: I6f377950845683ab9c6dea79f4c54db0359d0b91
---
M be/src/exec/parquet/hdfs-parquet-table-writer.cc
M be/src/exec/parquet/parquet-metadata-utils.cc
M be/src/exec/parquet/parquet-metadata-utils.h
M tests/query_test/test_insert_parquet.py
M tests/util/get_parquet_metadata.py
5 files changed, 239 insertions(+), 83 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/04/12004/8
-- 
To view, visit http://gerrit.cloudera.org:8080/12004
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I6f377950845683ab9c6dea79f4c54db0359d0b91
Gerrit-Change-Number: 12004
Gerrit-PatchSet: 8
Gerrit-Owner: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Lars Volker <lv...@cloudera.com>

[Impala-ASF-CR] IMPALA-7889: Write new logical types in Parquet

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/12004 )

Change subject: IMPALA-7889: Write new logical types in Parquet
......................................................................


Patch Set 11: Verified-1

Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/3586/


-- 
To view, visit http://gerrit.cloudera.org:8080/12004
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I6f377950845683ab9c6dea79f4c54db0359d0b91
Gerrit-Change-Number: 12004
Gerrit-PatchSet: 11
Gerrit-Owner: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Anonymous Coward (359)
Gerrit-Reviewer: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Lars Volker <lv...@cloudera.com>
Gerrit-Reviewer: Zoltan Ivanfi <zi...@cloudera.com>
Gerrit-Comment-Date: Wed, 19 Dec 2018 20:37:19 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-7889: Write new logical types in Parquet

Posted by "Csaba Ringhofer (Code Review)" <ge...@cloudera.org>.
Csaba Ringhofer has posted comments on this change. ( http://gerrit.cloudera.org:8080/12004 )

Change subject: IMPALA-7889: Write new logical types in Parquet
......................................................................


Patch Set 11:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/12004/11/be/src/exec/parquet/parquet-metadata-utils.cc
File be/src/exec/parquet/parquet-metadata-utils.cc:

http://gerrit.cloudera.org:8080/#/c/12004/11/be/src/exec/parquet/parquet-metadata-utils.cc@323
PS11, Line 323:   case TYPE_DECIMAL:
> nit: indent case statements by 2 spaces (see rest of file).
Done


http://gerrit.cloudera.org:8080/#/c/12004/11/be/src/exec/parquet/parquet-metadata-utils.cc@354
PS11, Line 354:     // Leave logical and converted types empty for other types.
> Have you considered enumerating the other types here? Then you can DCHECK o
Done



-- 
To view, visit http://gerrit.cloudera.org:8080/12004
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I6f377950845683ab9c6dea79f4c54db0359d0b91
Gerrit-Change-Number: 12004
Gerrit-PatchSet: 11
Gerrit-Owner: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Anonymous Coward (359)
Gerrit-Reviewer: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Lars Volker <lv...@cloudera.com>
Gerrit-Reviewer: Zoltan Ivanfi <zi...@cloudera.com>
Gerrit-Comment-Date: Thu, 20 Dec 2018 18:33:12 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-7889: Write new logical types in Parquet

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/12004 )

Change subject: IMPALA-7889: Write new logical types in Parquet
......................................................................


Patch Set 6:

Build Successful 

https://jenkins.impala.io/job/gerrit-code-review-checks/1541/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests.


-- 
To view, visit http://gerrit.cloudera.org:8080/12004
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I6f377950845683ab9c6dea79f4c54db0359d0b91
Gerrit-Change-Number: 12004
Gerrit-PatchSet: 6
Gerrit-Owner: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Lars Volker <lv...@cloudera.com>
Gerrit-Comment-Date: Wed, 05 Dec 2018 15:44:38 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-7889: Write new logical types in Parquet

Posted by "Csaba Ringhofer (Code Review)" <ge...@cloudera.org>.
Csaba Ringhofer has posted comments on this change. ( http://gerrit.cloudera.org:8080/12004 )

Change subject: IMPALA-7889: Write new logical types in Parquet
......................................................................


Patch Set 10: Code-Review+2


-- 
To view, visit http://gerrit.cloudera.org:8080/12004
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I6f377950845683ab9c6dea79f4c54db0359d0b91
Gerrit-Change-Number: 12004
Gerrit-PatchSet: 10
Gerrit-Owner: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Anonymous Coward (359)
Gerrit-Reviewer: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Lars Volker <lv...@cloudera.com>
Gerrit-Reviewer: Zoltan Ivanfi <zi...@cloudera.com>
Gerrit-Comment-Date: Wed, 19 Dec 2018 16:34:48 +0000
Gerrit-HasComments: No