Posted to dev@impala.apache.org by "anujphadke (Code Review)" <ge...@cloudera.org> on 2016/06/29 20:50:20 UTC

[Impala-CR](cdh5-2.5.0_5.7.x) IMPALA-3732: handle string length overflow in avro files

Hello Internal Jenkins, Tim Armstrong,

I'd like you to do a code review.  Please visit

    http://gerrit.cloudera.org:8080/3538

to review the following change.

Change subject: IMPALA-3732: handle string length overflow in avro files
......................................................................

IMPALA-3732: handle string length overflow in avro files

Avro string lengths are encoded as 64-bit integers, but Impala can only
handle string lengths that fit in a 32-bit integer, so out-of-range
lengths need careful handling. Negative lengths were already handled by
a previous patch, but if a positive 64-bit length is truncated to a
32-bit integer, the result can be negative.
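
To make the overflow concrete, here is a minimal C++ sketch of what an
unchecked 64-bit to 32-bit narrowing does to a length. The helper names
(NarrowUnchecked, FitsInInt32Length, NarrowingExamples) are hypothetical
and are not taken from the Impala sources; the sketch only illustrates
the arithmetic described above.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <limits>

// Hypothetical helper (not Impala's code): keeps only the low 32 bits of a
// 64-bit length, which is what an unchecked narrowing amounts to.
inline int32_t NarrowUnchecked(int64_t len) {
  const uint32_t low = static_cast<uint32_t>(len);  // well-defined: modulo 2^32
  int32_t out;
  std::memcpy(&out, &low, sizeof(out));  // reinterpret the same bits as signed
  return out;
}

// Hypothetical range check: true only if len is usable as a 32-bit length.
inline bool FitsInInt32Length(int64_t len) {
  return len >= 0 && len <= std::numeric_limits<int32_t>::max();
}

// Worked values for the cases discussed in the commit message.
inline void NarrowingExamples() {
  // A small length survives narrowing unchanged.
  assert(NarrowUnchecked(5) == 5);
  // 2^31 is positive as a 64-bit value but becomes INT32_MIN.
  assert(NarrowUnchecked(int64_t{1} << 31) ==
         std::numeric_limits<int32_t>::min());
  // 2^32 - 1 becomes -1.
  assert(NarrowUnchecked((int64_t{1} << 32) - 1) == -1);
  // 2^32 + 5 becomes 5: positive, but still the wrong length.
  assert(NarrowUnchecked((int64_t{1} << 32) + 5) == 5);
  // Checking the range before narrowing catches all of these.
  assert(!FitsInInt32Length(int64_t{1} << 31));
  assert(!FitsInInt32Length((int64_t{1} << 32) + 5));
}
```

The last example shows why checking only for a negative result after
narrowing is not enough: the range check has to be done on the 64-bit
value.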

This patch fixes the behaviour for CHAR/VARCHAR, where we can just
truncate the string, and for STRING, where we can't truncate the string
and so must return an error.
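
The two policies can be sketched as follows. This is an illustration
under assumptions, not the code from the patch: LenStatus, LenResult,
ResolveBoundedLen and ResolveStringLen are hypothetical names, and a
real scanner would also have to skip the bytes it does not copy, which
the sketch leaves out.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <limits>

// All names here are hypothetical; they are not taken from the patch.
enum class LenStatus { kOk, kInvalidLength };

struct LenResult {
  LenStatus status;
  int32_t len;  // Number of bytes to copy into the slot; 0 on error.
};

// CHAR(n)/VARCHAR(n): the slot holds at most max_len bytes, so an oversized
// (but non-negative) length can be clamped before narrowing to 32 bits.
inline LenResult ResolveBoundedLen(int64_t avro_len, int32_t max_len) {
  assert(max_len >= 0);
  if (avro_len < 0) return {LenStatus::kInvalidLength, 0};
  const int64_t clamped = std::min<int64_t>(avro_len, max_len);
  return {LenStatus::kOk, static_cast<int32_t>(clamped)};
}

// STRING: there is no upper bound to clamp to, so anything outside
// [0, INT32_MAX] is reported as an error instead of being narrowed.
inline LenResult ResolveStringLen(int64_t avro_len) {
  if (avro_len < 0 || avro_len > std::numeric_limits<int32_t>::max()) {
    return {LenStatus::kInvalidLength, 0};
  }
  return {LenStatus::kOk, static_cast<int32_t>(avro_len)};
}
```

Clamping on the 64-bit value first means the narrowing cast in
ResolveBoundedLen can never see a value larger than max_len.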

Testing:
Added unit tests for STRING, CHAR, and VARCHAR that exercise the string
overflow handling.
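
Tests of this kind need input whose length prefix exceeds INT32_MAX.
The Avro specification writes a long as a variable-length zig-zag
integer, so such a prefix takes only a few bytes. The sketch below shows
one way to build and read back such prefixes; EncodeAvroLong and
DecodeAvroLong are hypothetical helpers and are not the ones used in
hdfs-avro-scanner-test.cc.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical helper: encodes v as an Avro long (zig-zag, then base-128
// varint with the low 7 bits first and a continuation bit on each byte
// except the last).
inline std::vector<uint8_t> EncodeAvroLong(int64_t v) {
  const uint64_t sign_mask = v < 0 ? ~uint64_t{0} : uint64_t{0};
  uint64_t z = (static_cast<uint64_t>(v) << 1) ^ sign_mask;  // zig-zag
  std::vector<uint8_t> out;
  while (z >= 0x80) {
    out.push_back(static_cast<uint8_t>(z | 0x80));  // low 7 bits + continue
    z >>= 7;
  }
  out.push_back(static_cast<uint8_t>(z));
  return out;
}

// Hypothetical helper: decodes an Avro long from buf. Returns false if the
// buffer ends mid-value or the value needs more than 10 bytes.
inline bool DecodeAvroLong(const std::vector<uint8_t>& buf, int64_t* out) {
  assert(out != nullptr);
  uint64_t z = 0;
  int shift = 0;
  for (uint8_t b : buf) {
    if (shift > 63) return false;
    z |= static_cast<uint64_t>(b & 0x7F) << shift;
    if ((b & 0x80) == 0) {
      // Undo the zig-zag mapping.
      *out = static_cast<int64_t>(z >> 1) ^ -static_cast<int64_t>(z & 1);
      return true;
    }
    shift += 7;
  }
  return false;
}
```

For example, EncodeAvroLong(int64_t{1} << 31) yields a five-byte prefix
that decodes back to a length one past INT32_MAX, which is the smallest
positive value that overflows a 32-bit length.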

Change-Id: If6541e7c68255bf599b26386a55057c93e62af51
Reviewed-on: http://gerrit.cloudera.org:8080/3383
Reviewed-by: Tim Armstrong <ta...@cloudera.com>
Tested-by: Internal Jenkins
(cherry picked from commit e78b6db7e2b334ff88dd5678290f5b932a6a715f)
Reviewed-on: http://gerrit.sjc.cloudera.com:8080/13636
Tested-by: Anuj Phadke <ap...@cloudera.com>
(cherry picked from commit ea0200e1ea6ff0f6e54b22b00b4e25ce95a7c2d9)
---
M be/src/exec/hdfs-avro-scanner-ir.cc
M be/src/exec/hdfs-avro-scanner-test.cc
M be/src/exec/hdfs-avro-scanner.cc
M be/src/exec/hdfs-avro-scanner.h
M common/thrift/generate_error_codes.py
5 files changed, 109 insertions(+), 16 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala refs/changes/38/3538/1
-- 
To view, visit http://gerrit.cloudera.org:8080/3538
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newchange
Gerrit-Change-Id: If6541e7c68255bf599b26386a55057c93e62af51
Gerrit-PatchSet: 1
Gerrit-Project: Impala
Gerrit-Branch: cdh5-2.5.0_5.7.x
Gerrit-Owner: anujphadke <ap...@cloudera.com>
Gerrit-Reviewer: Internal Jenkins
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>

[Impala-CR](cdh5-2.5.0_5.7.x) IMPALA-3732: handle string length overflow in avro files

Posted by "anujphadke (Code Review)" <ge...@cloudera.org>.
anujphadke has submitted this change and it was merged.

Change subject: IMPALA-3732: handle string length overflow in avro files
......................................................................


IMPALA-3732: handle string length overflow in avro files

Avro string lengths are encoded as 64-bit integers, but Impala can only
handle string lengths that fit in a 32-bit integer, so out-of-range
lengths need careful handling. Negative lengths were already handled by
a previous patch, but if a positive 64-bit length is truncated to a
32-bit integer, the result can be negative.

This patch fixes the behaviour for CHAR/VARCHAR, where we can just
truncate the string, and for STRING, where we can't truncate the string
and so must return an error.

Testing:
Added unit tests for STRING, CHAR, and VARCHAR that exercise the string
overflow handling.

Change-Id: If6541e7c68255bf599b26386a55057c93e62af51
Reviewed-on: http://gerrit.cloudera.org:8080/3383
Reviewed-by: Tim Armstrong <ta...@cloudera.com>
Tested-by: Internal Jenkins
(cherry picked from commit e78b6db7e2b334ff88dd5678290f5b932a6a715f)
Reviewed-on: http://gerrit.sjc.cloudera.com:8080/13636
Tested-by: Anuj Phadke <ap...@cloudera.com>
(cherry picked from commit ea0200e1ea6ff0f6e54b22b00b4e25ce95a7c2d9)
Reviewed-on: http://gerrit.cloudera.org:8080/3538
Reviewed-by: anujphadke <ap...@cloudera.com>
Tested-by: anujphadke <ap...@cloudera.com>
---
M be/src/exec/hdfs-avro-scanner-ir.cc
M be/src/exec/hdfs-avro-scanner-test.cc
M be/src/exec/hdfs-avro-scanner.cc
M be/src/exec/hdfs-avro-scanner.h
M common/thrift/generate_error_codes.py
5 files changed, 109 insertions(+), 16 deletions(-)

Approvals:
  anujphadke: Looks good to me, approved; Verified




Gerrit-MessageType: merged
Gerrit-Change-Id: If6541e7c68255bf599b26386a55057c93e62af51
Gerrit-PatchSet: 2
Gerrit-Project: Impala
Gerrit-Branch: cdh5-2.5.0_5.7.x
Gerrit-Owner: anujphadke <ap...@cloudera.com>
Gerrit-Reviewer: Internal Jenkins
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-Reviewer: anujphadke <ap...@cloudera.com>

[Impala-CR](cdh5-2.5.0_5.7.x) IMPALA-3732: handle string length overflow in avro files

Posted by "anujphadke (Code Review)" <ge...@cloudera.org>.
anujphadke has posted comments on this change.

Change subject: IMPALA-3732: handle string length overflow in avro files
......................................................................


Patch Set 1: Code-Review+2 Verified+1

http://sandbox.jenkins.cloudera.com/view/Impala/view/Private-Utility/job/impala-private-build-and-test/3551/


Gerrit-MessageType: comment
Gerrit-Change-Id: If6541e7c68255bf599b26386a55057c93e62af51
Gerrit-PatchSet: 1
Gerrit-Project: Impala
Gerrit-Branch: cdh5-2.5.0_5.7.x
Gerrit-Owner: anujphadke <ap...@cloudera.com>
Gerrit-Reviewer: Internal Jenkins
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-Reviewer: anujphadke <ap...@cloudera.com>
Gerrit-HasComments: No