You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@impala.apache.org by "Jim Apple (Code Review)" <ge...@cloudera.org> on 2018/11/11 01:46:11 UTC

[Impala-ASF-CR] IMPALA-5031: signed overflow in TimestampValue

Jim Apple has uploaded this change for review. ( http://gerrit.cloudera.org:8080/11919


Change subject: IMPALA-5031: signed overflow in TimestampValue
......................................................................

IMPALA-5031: signed overflow in TimestampValue

The standard says that overflow for signed arithmetic operations is
undefined behavior; see [expr]:

    If during the evaluation of an expression, the result is not
    mathematically defined or not in the range of representable values
    for its type, the behavior is undefined.

This patch fixes a signed overflow with the folowing backtrace
(uninteresting parts elided):

runtime/timestamp-value.inline.h:67:13: runtime error: signed integer overflow: -9223372036854775808 + -9223372037 cannot be represented in type 'long'
    #0 TimestampValue::FromUnixTimeNanos(long, long, cctz::time_zone const&) runtime/timestamp-value.inline.h:67:13
    #1 TimestampValue::FromSubsecondUnixTime(double, cctz::time_zone const&) runtime/timestamp-value.inline.h:62:10
    #2 CastFunctions::CastToTimestampVal(impala_udf::FunctionContext*, impala_udf::FloatVal const&) exprs/cast-functions-ir.cc:248:172
    #3 impala_udf::TimestampVal ScalarFnCall::InterpretEval<impala_udf::TimestampVal>(ScalarExprEvaluator*, TupleRow const*) const exprs/scalar-fn-call.cc:485:208
    #4 ScalarFnCall::GetTimestampVal(ScalarExprEvaluator*, TupleRow const*) const exprs/scalar-fn-call.cc:608:44
    #5 ScalarExprEvaluator::GetValue(ScalarExpr const&, TupleRow const*) exprs/scalar-expr-evaluator.cc:314:41
    #6 ScalarExprEvaluator::GetValue(TupleRow const*) exprs/scalar-expr-evaluator.cc:250:10
    #7 void Tuple::MaterializeExprs<false, false>(TupleRow*, TupleDescriptor const&, ScalarExprEvaluator* const*, MemPool*, StringValue**, int*, int*) runtime/tuple.cc:222:27
    #8 void Tuple::MaterializeExprs<false, false>(TupleRow*, TupleDescriptor const&, vector<ScalarExprEvaluator*> const&, MemPool*, vector<StringValue*>*, int*) runtime/tuple.h:174:5
    #9 UnionNode::MaterializeExprs(vector<ScalarExprEvaluator*> const&, TupleRow*, unsigned char*, RowBatch*) exec/union-node-ir.cc:29:14
    #10 UnionNode::GetNextConst(RuntimeState*, RowBatch*) exec/union-node.cc:263:5
    #11 UnionNode::GetNext(RuntimeState*, RowBatch*, bool*) exec/union-node.cc:296:45

This was seen in the backend test ExprTest.CastExprs.

Change-Id: Iaad158e6634314a5690a43a0cc04426c1aba8f41
---
M be/src/runtime/timestamp-value.inline.h
1 file changed, 11 insertions(+), 1 deletion(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/19/11919/1
-- 
To view, visit http://gerrit.cloudera.org:8080/11919
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: Iaad158e6634314a5690a43a0cc04426c1aba8f41
Gerrit-Change-Number: 11919
Gerrit-PatchSet: 1
Gerrit-Owner: Jim Apple <jb...@apache.org>

[Impala-ASF-CR] IMPALA-5031: signed overflow in TimestampValue

Posted by "Jim Apple (Code Review)" <ge...@cloudera.org>.
Hello Csaba Ringhofer, Impala Public Jenkins, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/11919

to look at the new patch set (#3).

Change subject: IMPALA-5031: signed overflow in TimestampValue
......................................................................

IMPALA-5031: signed overflow in TimestampValue

The standard says that overflow for signed arithmetic operations is
undefined behavior; see [expr]:

    If during the evaluation of an expression, the result is not
    mathematically defined or not in the range of representable values
    for its type, the behavior is undefined.

This patch fixes a signed overflow with the folowing backtrace
(uninteresting parts elided):

runtime/timestamp-value.inline.h:67:13: runtime error: signed integer overflow: -9223372036854775808 + -9223372037 cannot be represented in type 'long'
    #0 TimestampValue::FromUnixTimeNanos(long, long, cctz::time_zone const&) runtime/timestamp-value.inline.h:67:13
    #1 TimestampValue::FromSubsecondUnixTime(double, cctz::time_zone const&) runtime/timestamp-value.inline.h:62:10
    #2 CastFunctions::CastToTimestampVal(impala_udf::FunctionContext*, impala_udf::FloatVal const&) exprs/cast-functions-ir.cc:248:172
    #3 impala_udf::TimestampVal ScalarFnCall::InterpretEval<impala_udf::TimestampVal>(ScalarExprEvaluator*, TupleRow const*) const exprs/scalar-fn-call.cc:485:208
    #4 ScalarFnCall::GetTimestampVal(ScalarExprEvaluator*, TupleRow const*) const exprs/scalar-fn-call.cc:608:44
    #5 ScalarExprEvaluator::GetValue(ScalarExpr const&, TupleRow const*) exprs/scalar-expr-evaluator.cc:314:41
    #6 ScalarExprEvaluator::GetValue(TupleRow const*) exprs/scalar-expr-evaluator.cc:250:10
    #7 void Tuple::MaterializeExprs<false, false>(TupleRow*, TupleDescriptor const&, ScalarExprEvaluator* const*, MemPool*, StringValue**, int*, int*) runtime/tuple.cc:222:27
    #8 void Tuple::MaterializeExprs<false, false>(TupleRow*, TupleDescriptor const&, vector<ScalarExprEvaluator*> const&, MemPool*, vector<StringValue*>*, int*) runtime/tuple.h:174:5
    #9 UnionNode::MaterializeExprs(vector<ScalarExprEvaluator*> const&, TupleRow*, unsigned char*, RowBatch*) exec/union-node-ir.cc:29:14
    #10 UnionNode::GetNextConst(RuntimeState*, RowBatch*) exec/union-node.cc:263:5
    #11 UnionNode::GetNext(RuntimeState*, RowBatch*, bool*) exec/union-node.cc:296:45

This was seen in the backend test ExprTest.CastExprs.

Change-Id: Iaad158e6634314a5690a43a0cc04426c1aba8f41
---
M be/src/runtime/timestamp-value.inline.h
1 file changed, 7 insertions(+), 0 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/19/11919/3
-- 
To view, visit http://gerrit.cloudera.org:8080/11919
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Iaad158e6634314a5690a43a0cc04426c1aba8f41
Gerrit-Change-Number: 11919
Gerrit-PatchSet: 3
Gerrit-Owner: Jim Apple <jb...@apache.org>
Gerrit-Reviewer: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Jim Apple <jb...@apache.org>

[Impala-ASF-CR] IMPALA-5031: signed overflow in TimestampValue

Posted by "Csaba Ringhofer (Code Review)" <ge...@cloudera.org>.
Csaba Ringhofer has posted comments on this change. ( http://gerrit.cloudera.org:8080/11919 )

Change subject: IMPALA-5031: signed overflow in TimestampValue
......................................................................


Patch Set 1:

(1 comment)

I have two concerns with the solution:
1. performance - convert-timestamp-benchmark.cc has a test for this function, can you run it in release mode? I would call this function medium performance critical - it is called in decimal->timestamp conversion, so it is not used in most queries, but has the potential to dominate some specific queries.
2. readability - the change adds some complexity that I would prefer to avoid if possible, see my comment in line 66 for ideas.

http://gerrit.cloudera.org:8080/#/c/11919/1/be/src/runtime/timestamp-value.inline.h
File be/src/runtime/timestamp-value.inline.h:

http://gerrit.cloudera.org:8080/#/c/11919/1/be/src/runtime/timestamp-value.inline.h@66
PS1, Line 66: int64_t nanos
Some explanation for the reason why I did not care about overflow when I last touched this code: nanos makes sense in the +-10^9 range, and all (non-test) callers pass int32/uint32, so the range we can get here is not much larger than that (~ -2 to 4 sec). This means that overflow of unix_time is only possible near the min/max value representable on 64 bits, which is far outside the valid range of timestamps, so the overflow can only change an invalid timestamp to another invalid timestamp.

I would prefer to avoid the overflow by changing the interface to handle nanos only in the -999'999'999 .. + 999'999'999 range, and treat other values as invalid timestamps. This would mean that unix_time would be affected only in the negative case. The only caller code that would need to be changed are tests and timestamp-test.cc.

Another way to do this is to change the valid range to 0 .. 999'999'999, which would make this function even simpler, but would need some changes in DecimalOperators::ConvertToTimestampVal.



-- 
To view, visit http://gerrit.cloudera.org:8080/11919
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Iaad158e6634314a5690a43a0cc04426c1aba8f41
Gerrit-Change-Number: 11919
Gerrit-PatchSet: 1
Gerrit-Owner: Jim Apple <jb...@apache.org>
Gerrit-Reviewer: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Comment-Date: Mon, 19 Nov 2018 13:34:26 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-5031: signed overflow in TimestampValue

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/11919 )

Change subject: IMPALA-5031: signed overflow in TimestampValue
......................................................................


Patch Set 1:

Build Successful 

https://jenkins.impala.io/job/gerrit-code-review-checks/1350/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests.


-- 
To view, visit http://gerrit.cloudera.org:8080/11919
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Iaad158e6634314a5690a43a0cc04426c1aba8f41
Gerrit-Change-Number: 11919
Gerrit-PatchSet: 1
Gerrit-Owner: Jim Apple <jb...@apache.org>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Comment-Date: Sun, 11 Nov 2018 02:17:48 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-5031: signed overflow in TimestampValue

Posted by "Csaba Ringhofer (Code Review)" <ge...@cloudera.org>.
Csaba Ringhofer has posted comments on this change. ( http://gerrit.cloudera.org:8080/11919 )

Change subject: IMPALA-5031: signed overflow in TimestampValue
......................................................................


Patch Set 1:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/11919/1/be/src/runtime/timestamp-value.inline.h
File be/src/runtime/timestamp-value.inline.h:

http://gerrit.cloudera.org:8080/#/c/11919/1/be/src/runtime/timestamp-value.inline.h@66
PS1, Line 66: int64_t nanos
> > I prefer the comment + adding the validation to this function. Some of th
Yes, I would add it in a non-DCHECK way. Callers that expect the function to be successful could add a DCHECK on the validity of TimestampValue.



-- 
To view, visit http://gerrit.cloudera.org:8080/11919
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Iaad158e6634314a5690a43a0cc04426c1aba8f41
Gerrit-Change-Number: 11919
Gerrit-PatchSet: 1
Gerrit-Owner: Jim Apple <jb...@apache.org>
Gerrit-Reviewer: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Jim Apple <jb...@apache.org>
Gerrit-Comment-Date: Wed, 13 Feb 2019 12:36:50 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-5031: signed overflow in TimestampValue

Posted by "Jim Apple (Code Review)" <ge...@cloudera.org>.
Jim Apple has posted comments on this change. ( http://gerrit.cloudera.org:8080/11919 )

Change subject: IMPALA-5031: signed overflow in TimestampValue
......................................................................


Patch Set 1:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/11919/1/be/src/runtime/timestamp-value.inline.h
File be/src/runtime/timestamp-value.inline.h:

http://gerrit.cloudera.org:8080/#/c/11919/1/be/src/runtime/timestamp-value.inline.h@66
PS1, Line 66: int64_t nanos
> Some explanation for the reason why I did not care about overflow when I la
A couple of clarifying questions:

When you say "all (non-test) callers pass int32/uint32", so you mean as the value of nanos or as the value of unix time?

When you say "overflow of unix_time is only possible near the min/max value representable on 64 bits", are the "64 bits" you're referring to the ones in unix_time?

When you say "changing the interface to handle nanos only in the -999'999'999 .. + 999'999'999 range", you mean adding a comment and a DCHECK and changing callers if necessary, or do you mean something else about the interface?

When you say "This would mean that unix_time would be affected only in the negative case", can you explain why?



-- 
To view, visit http://gerrit.cloudera.org:8080/11919
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Iaad158e6634314a5690a43a0cc04426c1aba8f41
Gerrit-Change-Number: 11919
Gerrit-PatchSet: 1
Gerrit-Owner: Jim Apple <jb...@apache.org>
Gerrit-Reviewer: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Jim Apple <jb...@apache.org>
Gerrit-Comment-Date: Sat, 09 Feb 2019 19:31:38 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-5031: signed overflow in TimestampValue

Posted by "Jim Apple (Code Review)" <ge...@cloudera.org>.
Jim Apple has posted comments on this change. ( http://gerrit.cloudera.org:8080/11919 )

Change subject: IMPALA-5031: signed overflow in TimestampValue
......................................................................


Patch Set 3:

> > I like the current solution, but I think that it should break
 > > several tests in timestamp-test.cc, for example https://github.com/apache/impala/blob/ecf12bec42e11262b88dc0993e375fe4d8acaafb/be/src/runtime/timestamp-test.cc#L290
 > 
 > Earlier, you said, "I would prefer to avoid the overflow by
 > changing the interface to handle nanos only in the -999'999'999 ..
 > + 999'999'999 range, and treat other values as invalid timestamps."
 > 
 > I gather from this that you would prefer to alter the test to meet
 > this restriction than you would to widen the allowed values in
 > FromUnixTimeNanos?

As another comment: if changing the semantics has subtleties, I'd MUCH rather keep the semantics the same, altering only the path where there is undefined behavior, even for a marginal slowdown.


-- 
To view, visit http://gerrit.cloudera.org:8080/11919
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Iaad158e6634314a5690a43a0cc04426c1aba8f41
Gerrit-Change-Number: 11919
Gerrit-PatchSet: 3
Gerrit-Owner: Jim Apple <jb...@apache.org>
Gerrit-Reviewer: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Jim Apple <jb...@apache.org>
Gerrit-Comment-Date: Mon, 11 Mar 2019 22:42:34 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-5031: signed overflow in TimestampValue

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/11919 )

Change subject: IMPALA-5031: signed overflow in TimestampValue
......................................................................


Patch Set 5: Verified+1


-- 
To view, visit http://gerrit.cloudera.org:8080/11919
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Iaad158e6634314a5690a43a0cc04426c1aba8f41
Gerrit-Change-Number: 11919
Gerrit-PatchSet: 5
Gerrit-Owner: Jim Apple <jb...@apache.org>
Gerrit-Reviewer: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Jim Apple <jb...@apache.org>
Gerrit-Reviewer: Jim Apple <jb...@apache.org>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-Comment-Date: Wed, 01 May 2019 18:32:47 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-5031: signed overflow in TimestampValue

Posted by "Csaba Ringhofer (Code Review)" <ge...@cloudera.org>.
Csaba Ringhofer has posted comments on this change. ( http://gerrit.cloudera.org:8080/11919 )

Change subject: IMPALA-5031: signed overflow in TimestampValue
......................................................................


Patch Set 3:

I like the current solution, but I think that it should break several tests in timestamp-test.cc, for example https://github.com/apache/impala/blob/ecf12bec42e11262b88dc0993e375fe4d8acaafb/be/src/runtime/timestamp-test.cc#L290


-- 
To view, visit http://gerrit.cloudera.org:8080/11919
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Iaad158e6634314a5690a43a0cc04426c1aba8f41
Gerrit-Change-Number: 11919
Gerrit-PatchSet: 3
Gerrit-Owner: Jim Apple <jb...@apache.org>
Gerrit-Reviewer: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Jim Apple <jb...@apache.org>
Gerrit-Comment-Date: Mon, 11 Mar 2019 22:29:01 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-5031: signed overflow in TimestampValue

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/11919 )

Change subject: IMPALA-5031: signed overflow in TimestampValue
......................................................................


Patch Set 5:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/4130/ DRY_RUN=false


-- 
To view, visit http://gerrit.cloudera.org:8080/11919
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Iaad158e6634314a5690a43a0cc04426c1aba8f41
Gerrit-Change-Number: 11919
Gerrit-PatchSet: 5
Gerrit-Owner: Jim Apple <jb...@apache.org>
Gerrit-Reviewer: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Jim Apple <jb...@apache.org>
Gerrit-Reviewer: Jim Apple <jb...@apache.org>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-Comment-Date: Wed, 01 May 2019 13:04:51 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-5031: signed overflow in TimestampValue

Posted by "Jim Apple (Code Review)" <ge...@cloudera.org>.
Jim Apple has posted comments on this change. ( http://gerrit.cloudera.org:8080/11919 )

Change subject: IMPALA-5031: signed overflow in TimestampValue
......................................................................


Patch Set 1:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/11919/1/be/src/runtime/timestamp-value.inline.h
File be/src/runtime/timestamp-value.inline.h:

http://gerrit.cloudera.org:8080/#/c/11919/1/be/src/runtime/timestamp-value.inline.h@66
PS1, Line 66: int64_t nanos
> I prefer the comment + adding the validation to this function. Some of the callers are scanners, so if the data in the file is corrupt, then it is possible to get out of range values here. So if we would add a DCHECK, then some files could crash a debug build.

If we add the validation to this function, do you want it added in a non-DCHECK way? Do you just want to return an invalid TimestampValue, rather than crashing?



-- 
To view, visit http://gerrit.cloudera.org:8080/11919
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Iaad158e6634314a5690a43a0cc04426c1aba8f41
Gerrit-Change-Number: 11919
Gerrit-PatchSet: 1
Gerrit-Owner: Jim Apple <jb...@apache.org>
Gerrit-Reviewer: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Jim Apple <jb...@apache.org>
Gerrit-Comment-Date: Mon, 11 Feb 2019 16:59:29 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-5031: signed overflow in TimestampValue

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/11919 )

Change subject: IMPALA-5031: signed overflow in TimestampValue
......................................................................


Patch Set 4:

Build Successful 

https://jenkins.impala.io/job/gerrit-code-review-checks/2950/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests.


-- 
To view, visit http://gerrit.cloudera.org:8080/11919
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Iaad158e6634314a5690a43a0cc04426c1aba8f41
Gerrit-Change-Number: 11919
Gerrit-PatchSet: 4
Gerrit-Owner: Jim Apple <jb...@apache.org>
Gerrit-Reviewer: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Jim Apple <jb...@apache.org>
Gerrit-Reviewer: Jim Apple <jb...@apache.org>
Gerrit-Comment-Date: Sun, 28 Apr 2019 00:05:16 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-5031: signed overflow in TimestampValue

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/11919 )

Change subject: IMPALA-5031: signed overflow in TimestampValue
......................................................................


Patch Set 3:

Build Successful 

https://jenkins.impala.io/job/gerrit-code-review-checks/2405/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests.


-- 
To view, visit http://gerrit.cloudera.org:8080/11919
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Iaad158e6634314a5690a43a0cc04426c1aba8f41
Gerrit-Change-Number: 11919
Gerrit-PatchSet: 3
Gerrit-Owner: Jim Apple <jb...@apache.org>
Gerrit-Reviewer: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Jim Apple <jb...@apache.org>
Gerrit-Comment-Date: Mon, 11 Mar 2019 18:15:57 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-5031: signed overflow in TimestampValue

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/11919 )

Change subject: IMPALA-5031: signed overflow in TimestampValue
......................................................................


Patch Set 5: Code-Review+2


-- 
To view, visit http://gerrit.cloudera.org:8080/11919
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Iaad158e6634314a5690a43a0cc04426c1aba8f41
Gerrit-Change-Number: 11919
Gerrit-PatchSet: 5
Gerrit-Owner: Jim Apple <jb...@apache.org>
Gerrit-Reviewer: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Jim Apple <jb...@apache.org>
Gerrit-Reviewer: Jim Apple <jb...@apache.org>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-Comment-Date: Wed, 01 May 2019 13:04:50 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-5031: signed overflow in TimestampValue

Posted by "Tim Armstrong (Code Review)" <ge...@cloudera.org>.
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/11919 )

Change subject: IMPALA-5031: signed overflow in TimestampValue
......................................................................


Patch Set 4: Code-Review+2


-- 
To view, visit http://gerrit.cloudera.org:8080/11919
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Iaad158e6634314a5690a43a0cc04426c1aba8f41
Gerrit-Change-Number: 11919
Gerrit-PatchSet: 4
Gerrit-Owner: Jim Apple <jb...@apache.org>
Gerrit-Reviewer: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Jim Apple <jb...@apache.org>
Gerrit-Reviewer: Jim Apple <jb...@apache.org>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-Comment-Date: Wed, 01 May 2019 05:14:05 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-5031: signed overflow in TimestampValue

Posted by "Jim Apple (Code Review)" <ge...@cloudera.org>.
Jim Apple has uploaded a new patch set (#4) to the change originally created by Jim Apple. ( http://gerrit.cloudera.org:8080/11919 )

Change subject: IMPALA-5031: signed overflow in TimestampValue
......................................................................

IMPALA-5031: signed overflow in TimestampValue

The standard says that overflow for signed arithmetic operations is
undefined behavior; see [expr]:

    If during the evaluation of an expression, the result is not
    mathematically defined or not in the range of representable values
    for its type, the behavior is undefined.

This patch fixes a signed overflow with the folowing backtrace
(uninteresting parts elided):

runtime/timestamp-value.inline.h:67:13: runtime error: signed integer overflow: -9223372036854775808 + -9223372037 cannot be represented in type 'long'
    #0 TimestampValue::FromUnixTimeNanos(long, long, cctz::time_zone const&) runtime/timestamp-value.inline.h:67:13
    #1 TimestampValue::FromSubsecondUnixTime(double, cctz::time_zone const&) runtime/timestamp-value.inline.h:62:10
    #2 CastFunctions::CastToTimestampVal(impala_udf::FunctionContext*, impala_udf::FloatVal const&) exprs/cast-functions-ir.cc:248:172
    #3 impala_udf::TimestampVal ScalarFnCall::InterpretEval<impala_udf::TimestampVal>(ScalarExprEvaluator*, TupleRow const*) const exprs/scalar-fn-call.cc:485:208
    #4 ScalarFnCall::GetTimestampVal(ScalarExprEvaluator*, TupleRow const*) const exprs/scalar-fn-call.cc:608:44
    #5 ScalarExprEvaluator::GetValue(ScalarExpr const&, TupleRow const*) exprs/scalar-expr-evaluator.cc:314:41
    #6 ScalarExprEvaluator::GetValue(TupleRow const*) exprs/scalar-expr-evaluator.cc:250:10
    #7 void Tuple::MaterializeExprs<false, false>(TupleRow*, TupleDescriptor const&, ScalarExprEvaluator* const*, MemPool*, StringValue**, int*, int*) runtime/tuple.cc:222:27
    #8 void Tuple::MaterializeExprs<false, false>(TupleRow*, TupleDescriptor const&, vector<ScalarExprEvaluator*> const&, MemPool*, vector<StringValue*>*, int*) runtime/tuple.h:174:5
    #9 UnionNode::MaterializeExprs(vector<ScalarExprEvaluator*> const&, TupleRow*, unsigned char*, RowBatch*) exec/union-node-ir.cc:29:14
    #10 UnionNode::GetNextConst(RuntimeState*, RowBatch*) exec/union-node.cc:263:5
    #11 UnionNode::GetNext(RuntimeState*, RowBatch*, bool*) exec/union-node.cc:296:45

This was seen in the backend test ExprTest.CastExprs.

Change-Id: Iaad158e6634314a5690a43a0cc04426c1aba8f41
---
M be/src/runtime/timestamp-value.inline.h
1 file changed, 3 insertions(+), 1 deletion(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/19/11919/4
-- 
To view, visit http://gerrit.cloudera.org:8080/11919
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Iaad158e6634314a5690a43a0cc04426c1aba8f41
Gerrit-Change-Number: 11919
Gerrit-PatchSet: 4
Gerrit-Owner: Jim Apple <jb...@apache.org>
Gerrit-Reviewer: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Jim Apple <jb...@apache.org>
Gerrit-Reviewer: Jim Apple <jb...@apache.org>

[Impala-ASF-CR] IMPALA-5031: signed overflow in TimestampValue

Posted by "Csaba Ringhofer (Code Review)" <ge...@cloudera.org>.
Csaba Ringhofer has posted comments on this change. ( http://gerrit.cloudera.org:8080/11919 )

Change subject: IMPALA-5031: signed overflow in TimestampValue
......................................................................


Patch Set 1:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/11919/1/be/src/runtime/timestamp-value.inline.h
File be/src/runtime/timestamp-value.inline.h:

http://gerrit.cloudera.org:8080/#/c/11919/1/be/src/runtime/timestamp-value.inline.h@66
PS1, Line 66: int64_t nanos
> A couple of clarifying questions:
"When you say "all (non-test) callers pass int32/uint32", so you mean as the value of nanos or as the value of unix time?"

I meant the value of nanos.

"When you say "overflow of unix_time is only possible near the min/max value representable on 64 bits", are the "64 bits" you're referring to the ones in unix_time?"

Yes, I meant unix_time.

"When you say "changing the interface to handle nanos only in the -999'999'999 .. + 999'999'999 range", you mean adding a comment and a DCHECK and changing callers if necessary, or do you mean something else about the interface?"

I prefer the comment + adding the validation to this function. Some of the callers are scanners, so if the data in the file is corrupt, then it is possible to get out of range values here. So if we would add a DCHECK, then some files could crash a debug build.

"When you say "This would mean that unix_time would be affected only in the negative case", can you explain why?"

For nanos in 0..999'999'999 range SplitTime<NANOS_PER_SEC> will return 0, so unix_time time will not be affected, and nanos can be simply added in line 74 of the original code.



-- 
To view, visit http://gerrit.cloudera.org:8080/11919
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Iaad158e6634314a5690a43a0cc04426c1aba8f41
Gerrit-Change-Number: 11919
Gerrit-PatchSet: 1
Gerrit-Owner: Jim Apple <jb...@apache.org>
Gerrit-Reviewer: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Jim Apple <jb...@apache.org>
Gerrit-Comment-Date: Mon, 11 Feb 2019 16:44:02 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-5031: signed overflow in TimestampValue

Posted by "Jim Apple (Code Review)" <ge...@cloudera.org>.
Jim Apple has posted comments on this change. ( http://gerrit.cloudera.org:8080/11919 )

Change subject: IMPALA-5031: signed overflow in TimestampValue
......................................................................


Patch Set 3:

> I like the current solution, but I think that it should break
 > several tests in timestamp-test.cc, for example https://github.com/apache/impala/blob/ecf12bec42e11262b88dc0993e375fe4d8acaafb/be/src/runtime/timestamp-test.cc#L290

Earlier, you said, "I would prefer to avoid the overflow by changing the interface to handle nanos only in the -999'999'999 .. + 999'999'999 range, and treat other values as invalid timestamps."

I gather from this that you would prefer to alter the test to meet this restriction than you would to widen the allowed values in FromUnixTimeNanos?


-- 
To view, visit http://gerrit.cloudera.org:8080/11919
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Iaad158e6634314a5690a43a0cc04426c1aba8f41
Gerrit-Change-Number: 11919
Gerrit-PatchSet: 3
Gerrit-Owner: Jim Apple <jb...@apache.org>
Gerrit-Reviewer: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Jim Apple <jb...@apache.org>
Gerrit-Comment-Date: Mon, 11 Mar 2019 22:37:31 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-5031: signed overflow in TimestampValue

Posted by "Csaba Ringhofer (Code Review)" <ge...@cloudera.org>.
Csaba Ringhofer has posted comments on this change. ( http://gerrit.cloudera.org:8080/11919 )

Change subject: IMPALA-5031: signed overflow in TimestampValue
......................................................................


Patch Set 4: Code-Review+1

Ouch, sorry for neglecting this for so long.

I would still prefer to change the interface to accept nanos only in the (-1,1) range (the only possible source of out of range nanos are corrupt files) but I am also ok with your original solution.


-- 
To view, visit http://gerrit.cloudera.org:8080/11919
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Iaad158e6634314a5690a43a0cc04426c1aba8f41
Gerrit-Change-Number: 11919
Gerrit-PatchSet: 4
Gerrit-Owner: Jim Apple <jb...@apache.org>
Gerrit-Reviewer: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Jim Apple <jb...@apache.org>
Gerrit-Reviewer: Jim Apple <jb...@apache.org>
Gerrit-Comment-Date: Mon, 29 Apr 2019 19:32:23 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-5031: signed overflow in TimestampValue

Posted by "Jim Apple (Code Review)" <ge...@cloudera.org>.
Jim Apple has posted comments on this change. ( http://gerrit.cloudera.org:8080/11919 )

Change subject: IMPALA-5031: signed overflow in TimestampValue
......................................................................


Patch Set 4:

> Uploaded patch set 4.

This has no benchmark implications


-- 
To view, visit http://gerrit.cloudera.org:8080/11919
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Iaad158e6634314a5690a43a0cc04426c1aba8f41
Gerrit-Change-Number: 11919
Gerrit-PatchSet: 4
Gerrit-Owner: Jim Apple <jb...@apache.org>
Gerrit-Reviewer: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Jim Apple <jb...@apache.org>
Gerrit-Reviewer: Jim Apple <jb...@apache.org>
Gerrit-Comment-Date: Sat, 27 Apr 2019 23:21:24 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-5031: signed overflow in TimestampValue

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/11919 )

Change subject: IMPALA-5031: signed overflow in TimestampValue
......................................................................

IMPALA-5031: signed overflow in TimestampValue

The standard says that overflow for signed arithmetic operations is
undefined behavior; see [expr]:

    If during the evaluation of an expression, the result is not
    mathematically defined or not in the range of representable values
    for its type, the behavior is undefined.

This patch fixes a signed overflow with the folowing backtrace
(uninteresting parts elided):

runtime/timestamp-value.inline.h:67:13: runtime error: signed integer overflow: -9223372036854775808 + -9223372037 cannot be represented in type 'long'
    #0 TimestampValue::FromUnixTimeNanos(long, long, cctz::time_zone const&) runtime/timestamp-value.inline.h:67:13
    #1 TimestampValue::FromSubsecondUnixTime(double, cctz::time_zone const&) runtime/timestamp-value.inline.h:62:10
    #2 CastFunctions::CastToTimestampVal(impala_udf::FunctionContext*, impala_udf::FloatVal const&) exprs/cast-functions-ir.cc:248:172
    #3 impala_udf::TimestampVal ScalarFnCall::InterpretEval<impala_udf::TimestampVal>(ScalarExprEvaluator*, TupleRow const*) const exprs/scalar-fn-call.cc:485:208
    #4 ScalarFnCall::GetTimestampVal(ScalarExprEvaluator*, TupleRow const*) const exprs/scalar-fn-call.cc:608:44
    #5 ScalarExprEvaluator::GetValue(ScalarExpr const&, TupleRow const*) exprs/scalar-expr-evaluator.cc:314:41
    #6 ScalarExprEvaluator::GetValue(TupleRow const*) exprs/scalar-expr-evaluator.cc:250:10
    #7 void Tuple::MaterializeExprs<false, false>(TupleRow*, TupleDescriptor const&, ScalarExprEvaluator* const*, MemPool*, StringValue**, int*, int*) runtime/tuple.cc:222:27
    #8 void Tuple::MaterializeExprs<false, false>(TupleRow*, TupleDescriptor const&, vector<ScalarExprEvaluator*> const&, MemPool*, vector<StringValue*>*, int*) runtime/tuple.h:174:5
    #9 UnionNode::MaterializeExprs(vector<ScalarExprEvaluator*> const&, TupleRow*, unsigned char*, RowBatch*) exec/union-node-ir.cc:29:14
    #10 UnionNode::GetNextConst(RuntimeState*, RowBatch*) exec/union-node.cc:263:5
    #11 UnionNode::GetNext(RuntimeState*, RowBatch*, bool*) exec/union-node.cc:296:45

This was seen in the backend test ExprTest.CastExprs.

Change-Id: Iaad158e6634314a5690a43a0cc04426c1aba8f41
Reviewed-on: http://gerrit.cloudera.org:8080/11919
Reviewed-by: Impala Public Jenkins <im...@cloudera.com>
Tested-by: Impala Public Jenkins <im...@cloudera.com>
---
M be/src/runtime/timestamp-value.inline.h
1 file changed, 3 insertions(+), 1 deletion(-)

Approvals:
  Impala Public Jenkins: Looks good to me, approved; Verified

-- 
To view, visit http://gerrit.cloudera.org:8080/11919
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: Iaad158e6634314a5690a43a0cc04426c1aba8f41
Gerrit-Change-Number: 11919
Gerrit-PatchSet: 6
Gerrit-Owner: Jim Apple <jb...@apache.org>
Gerrit-Reviewer: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Jim Apple <jb...@apache.org>
Gerrit-Reviewer: Jim Apple <jb...@apache.org>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>