You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@impala.apache.org by "Daniel Becker (Code Review)" <ge...@cloudera.org> on 2022/12/16 15:34:31 UTC

[Impala-ASF-CR] IMPALA-11778: Printing maps may produce invalid json

Daniel Becker has uploaded this change for review. ( http://gerrit.cloudera.org:8080/19364


Change subject: IMPALA-11778: Printing maps may produce invalid json
......................................................................

IMPALA-11778: Printing maps may produce invalid json

Impala allows non-string types, for example numbers, to be keys in maps.
We print maps as json objects, but json objects only allow string keys.
If the Impala map has for example an INT key, the printed json is
invalid.

For example, in Impala the following two maps are not the same:
{1: "a", 2: "b"}
{"1": "a", "2": "b"}

The first map has INT keys, the second has STRING keys. Only the second
one is valid json.

Hive has the same behaviour.

This change introduces the STRINGIFY_MAP_KEYS query option that, when
set to true, converts non-string keys to strings. The default value of
the new query option is false because
  - conversion to string causes loss of information and
  - setting it to true would be a breaking change.

Testing:
  - Added tests in nested-map-in-select-list.test and map_null_keys.test
    that check the behaviour when STRINGIFY_MAP_KEYS is set to true.

Change-Id: I1820036a1c614c34ae5d70ac4fe79a992c9bce3a
---
M be/src/runtime/complex-value-writer.h
M be/src/runtime/complex-value-writer.inline.h
M be/src/service/hs2-util.cc
M be/src/service/hs2-util.h
M be/src/service/impala-beeswax-server.cc
M be/src/service/impala-hs2-server.cc
M be/src/service/query-options.cc
M be/src/service/query-options.h
M be/src/service/query-result-set.cc
M be/src/service/query-result-set.h
M common/thrift/ImpalaService.thrift
M common/thrift/Query.thrift
M testdata/workloads/functional-query/queries/QueryTest/map_null_keys.test
M testdata/workloads/functional-query/queries/QueryTest/nested-map-in-select-list.test
14 files changed, 209 insertions(+), 98 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/64/19364/2
-- 
To view, visit http://gerrit.cloudera.org:8080/19364
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I1820036a1c614c34ae5d70ac4fe79a992c9bce3a
Gerrit-Change-Number: 19364
Gerrit-PatchSet: 2
Gerrit-Owner: Daniel Becker <da...@cloudera.com>

[Impala-ASF-CR] IMPALA-11778: Printing maps may produce invalid json

Posted by "Csaba Ringhofer (Code Review)" <ge...@cloudera.org>.
Csaba Ringhofer has posted comments on this change. ( http://gerrit.cloudera.org:8080/19364 )

Change subject: IMPALA-11778: Printing maps may produce invalid json
......................................................................


Patch Set 2: Code-Review+2

(2 comments)

http://gerrit.cloudera.org:8080/#/c/19364/2//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/19364/2//COMMIT_MSG@20
PS2, Line 20: 
            : Hive has the same behaviour.
Can you make it clearer that Hive also produces invalid JSON?


http://gerrit.cloudera.org:8080/#/c/19364/2/be/src/service/query-options.h
File be/src/service/query-options.h:

http://gerrit.cloudera.org:8080/#/c/19364/2/be/src/service/query-options.h@284
PS2, Line 284: REGULAR
I would consider this advanced (should affect very few users)



-- 
To view, visit http://gerrit.cloudera.org:8080/19364
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I1820036a1c614c34ae5d70ac4fe79a992c9bce3a
Gerrit-Change-Number: 19364
Gerrit-PatchSet: 2
Gerrit-Owner: Daniel Becker <da...@cloudera.com>
Gerrit-Reviewer: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Peter Rozsa <pr...@cloudera.com>
Gerrit-Comment-Date: Tue, 10 Jan 2023 15:03:02 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-11778: Printing maps may produce invalid json

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/19364 )

Change subject: IMPALA-11778: Printing maps may produce invalid json
......................................................................


Patch Set 3:

Build Successful 

https://jenkins.impala.io/job/gerrit-code-review-checks/12138/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests.


-- 
To view, visit http://gerrit.cloudera.org:8080/19364
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I1820036a1c614c34ae5d70ac4fe79a992c9bce3a
Gerrit-Change-Number: 19364
Gerrit-PatchSet: 3
Gerrit-Owner: Daniel Becker <da...@cloudera.com>
Gerrit-Reviewer: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Daniel Becker <da...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Peter Rozsa <pr...@cloudera.com>
Gerrit-Comment-Date: Tue, 10 Jan 2023 16:00:20 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-11778: Printing maps may produce invalid json

Posted by "Daniel Becker (Code Review)" <ge...@cloudera.org>.
Daniel Becker has uploaded a new patch set (#3). ( http://gerrit.cloudera.org:8080/19364 )

Change subject: IMPALA-11778: Printing maps may produce invalid json
......................................................................

IMPALA-11778: Printing maps may produce invalid json

Impala allows non-string types, for example numbers, to be keys in maps.
We print maps as json objects, but json objects only allow string keys.
If the Impala map has for example an INT key, the printed json is
invalid.

For example, in Impala the following two maps are not the same:
{1: "a", 2: "b"}
{"1": "a", "2": "b"}

The first map has INT keys, the second has STRING keys. Only the second
one is valid json.

Hive has the same behaviour as Impala, i.e. it produces invalid json if
the map keys have a non-string type.

This change introduces the STRINGIFY_MAP_KEYS query option that, when
set to true, converts non-string keys to strings. The default value of
the new query option is false because
  - conversion to string causes loss of information and
  - setting it to true would be a breaking change.

Testing:
  - Added tests in nested-map-in-select-list.test and map_null_keys.test
    that check the behaviour when STRINGIFY_MAP_KEYS is set to true.

Change-Id: I1820036a1c614c34ae5d70ac4fe79a992c9bce3a
---
M be/src/runtime/complex-value-writer.h
M be/src/runtime/complex-value-writer.inline.h
M be/src/service/hs2-util.cc
M be/src/service/hs2-util.h
M be/src/service/impala-beeswax-server.cc
M be/src/service/impala-hs2-server.cc
M be/src/service/query-options.cc
M be/src/service/query-options.h
M be/src/service/query-result-set.cc
M be/src/service/query-result-set.h
M common/thrift/ImpalaService.thrift
M common/thrift/Query.thrift
M testdata/workloads/functional-query/queries/QueryTest/map_null_keys.test
M testdata/workloads/functional-query/queries/QueryTest/nested-map-in-select-list.test
14 files changed, 209 insertions(+), 98 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/64/19364/3
-- 
To view, visit http://gerrit.cloudera.org:8080/19364
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I1820036a1c614c34ae5d70ac4fe79a992c9bce3a
Gerrit-Change-Number: 19364
Gerrit-PatchSet: 3
Gerrit-Owner: Daniel Becker <da...@cloudera.com>
Gerrit-Reviewer: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Daniel Becker <da...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Peter Rozsa <pr...@cloudera.com>

[Impala-ASF-CR] IMPALA-11778: Printing maps may produce invalid json

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/19364 )

Change subject: IMPALA-11778: Printing maps may produce invalid json
......................................................................

IMPALA-11778: Printing maps may produce invalid json

Impala allows non-string types, for example numbers, to be keys in maps.
We print maps as json objects, but json objects only allow string keys.
If the Impala map has for example an INT key, the printed json is
invalid.

For example, in Impala the following two maps are not the same:
{1: "a", 2: "b"}
{"1": "a", "2": "b"}

The first map has INT keys, the second has STRING keys. Only the second
one is valid json.

Hive has the same behaviour as Impala, i.e. it produces invalid json if
the map keys have a non-string type.

This change introduces the STRINGIFY_MAP_KEYS query option that, when
set to true, converts non-string keys to strings. The default value of
the new query option is false because
  - conversion to string causes loss of information and
  - setting it to true would be a breaking change.

Testing:
  - Added tests in nested-map-in-select-list.test and map_null_keys.test
    that check the behaviour when STRINGIFY_MAP_KEYS is set to true.

Change-Id: I1820036a1c614c34ae5d70ac4fe79a992c9bce3a
Reviewed-on: http://gerrit.cloudera.org:8080/19364
Reviewed-by: Impala Public Jenkins <im...@cloudera.com>
Tested-by: Impala Public Jenkins <im...@cloudera.com>
---
M be/src/runtime/complex-value-writer.h
M be/src/runtime/complex-value-writer.inline.h
M be/src/service/hs2-util.cc
M be/src/service/hs2-util.h
M be/src/service/impala-beeswax-server.cc
M be/src/service/impala-hs2-server.cc
M be/src/service/query-options.cc
M be/src/service/query-options.h
M be/src/service/query-result-set.cc
M be/src/service/query-result-set.h
M common/thrift/ImpalaService.thrift
M common/thrift/Query.thrift
M testdata/workloads/functional-query/queries/QueryTest/map_null_keys.test
M testdata/workloads/functional-query/queries/QueryTest/nested-map-in-select-list.test
14 files changed, 209 insertions(+), 98 deletions(-)

Approvals:
  Impala Public Jenkins: Looks good to me, approved; Verified

-- 
To view, visit http://gerrit.cloudera.org:8080/19364
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: I1820036a1c614c34ae5d70ac4fe79a992c9bce3a
Gerrit-Change-Number: 19364
Gerrit-PatchSet: 5
Gerrit-Owner: Daniel Becker <da...@cloudera.com>
Gerrit-Reviewer: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Daniel Becker <da...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Peter Rozsa <pr...@cloudera.com>

[Impala-ASF-CR] IMPALA-11778: Printing maps may produce invalid json

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/19364 )

Change subject: IMPALA-11778: Printing maps may produce invalid json
......................................................................


Patch Set 4: Code-Review+2


-- 
To view, visit http://gerrit.cloudera.org:8080/19364
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I1820036a1c614c34ae5d70ac4fe79a992c9bce3a
Gerrit-Change-Number: 19364
Gerrit-PatchSet: 4
Gerrit-Owner: Daniel Becker <da...@cloudera.com>
Gerrit-Reviewer: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Daniel Becker <da...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Peter Rozsa <pr...@cloudera.com>
Gerrit-Comment-Date: Tue, 10 Jan 2023 15:40:07 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-11778: Printing maps may produce invalid json

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/19364 )

Change subject: IMPALA-11778: Printing maps may produce invalid json
......................................................................


Patch Set 2:

Build Successful 

https://jenkins.impala.io/job/gerrit-code-review-checks/12038/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests.


-- 
To view, visit http://gerrit.cloudera.org:8080/19364
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I1820036a1c614c34ae5d70ac4fe79a992c9bce3a
Gerrit-Change-Number: 19364
Gerrit-PatchSet: 2
Gerrit-Owner: Daniel Becker <da...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Comment-Date: Fri, 16 Dec 2022 15:54:08 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-11778: Printing maps may produce invalid json

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/19364 )

Change subject: IMPALA-11778: Printing maps may produce invalid json
......................................................................


Patch Set 4: Verified+1


-- 
To view, visit http://gerrit.cloudera.org:8080/19364
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I1820036a1c614c34ae5d70ac4fe79a992c9bce3a
Gerrit-Change-Number: 19364
Gerrit-PatchSet: 4
Gerrit-Owner: Daniel Becker <da...@cloudera.com>
Gerrit-Reviewer: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Daniel Becker <da...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Peter Rozsa <pr...@cloudera.com>
Gerrit-Comment-Date: Tue, 10 Jan 2023 20:50:49 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-11778: Printing maps may produce invalid json

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/19364 )

Change subject: IMPALA-11778: Printing maps may produce invalid json
......................................................................


Patch Set 4:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/8957/ DRY_RUN=false


-- 
To view, visit http://gerrit.cloudera.org:8080/19364
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I1820036a1c614c34ae5d70ac4fe79a992c9bce3a
Gerrit-Change-Number: 19364
Gerrit-PatchSet: 4
Gerrit-Owner: Daniel Becker <da...@cloudera.com>
Gerrit-Reviewer: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Daniel Becker <da...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Peter Rozsa <pr...@cloudera.com>
Gerrit-Comment-Date: Tue, 10 Jan 2023 15:40:08 +0000
Gerrit-HasComments: No