You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@impala.apache.org by "Csaba Ringhofer (Code Review)" <ge...@cloudera.org> on 2017/09/04 15:21:33 UTC

[Impala-ASF-CR] IMPALA-5664: Unix time to timestamp conversions may crash impala (boost exception)

Csaba Ringhofer has uploaded a new change for review.

  http://gerrit.cloudera.org:8080/7954

Change subject: IMPALA-5664: Unix time to timestamp conversions may crash impala (boost exception)
......................................................................

IMPALA-5664: Unix time to timestamp conversions may crash impala (boost exception)

TimestampValue::FromSubsecondUnixTime and UtcFromUnixTimeMicros
are incorrect only in case of the last second of 1399, because these
sub-second values are rounded first towards 1400.01.01 00.00.00, which
is accepted as a valid date, and the sub-second part is subtracted
afterwards, leading to a date outside the valid interval. The maximum
case, 9999.12.31 59.59.59 is a bit different, because as I understand,
with nanosecond precision posix times, the maximum value is actually
10000.01.01. 00.00.00 minus 1 nanosec.

TimestampValue::FromUnixTimeNanos can create problematic TimestampValues
both <1400 and 10000<=.

These timestamps can cause problems, because most code assumes that if
HasDate/HasTime is true, then it really is a valid timestamp.

Change-Id: I77b2f6284d3a597f57e61c17a67c959eff9e38ff
---
M be/src/runtime/timestamp-test.cc
M be/src/runtime/timestamp-value.h
2 files changed, 10 insertions(+), 1 deletion(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/54/7954/1
-- 
To view, visit http://gerrit.cloudera.org:8080/7954
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newchange
Gerrit-Change-Id: I77b2f6284d3a597f57e61c17a67c959eff9e38ff
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Csaba Ringhofer <cs...@cloudera.com>

[Impala-ASF-CR] IMPALA-5664: Unix time to timestamp conversions may crash Impala

Posted by "Csaba Ringhofer (Code Review)" <ge...@cloudera.org>.
Hello Lars Volker, Dan Hecht, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/7954

to look at the new patch set (#10).

Change subject: IMPALA-5664: Unix time to timestamp conversions may crash Impala
......................................................................

IMPALA-5664: Unix time to timestamp conversions may crash Impala

TimestampValue::FromSubsecondUnixTime() and UtcFromUnixTimeMicros()
are incorrect only in case of the last second of 1399, because these
sub-second values are rounded first towards 1400-01-01 00:00:00, which
is accepted as a valid date, and the sub-second part is subtracted
afterwards, leading to a date outside the valid interval. The maximum
case, 9999-12-31 59:59:59 is a bit different, because as I understand,
with nanosecond precision posix times, the maximum value is actually
10000-01-01. 00:00:00 minus 1 nanosec.

TimestampValue::FromUnixTimeNanos() can create problematic TimestampValues
both <1400 and 10000<=.

These timestamps can cause problems, because most code assumes that if
HasDate/HasTime is true, then it really is a valid timestamp.

To fix this, the posix times are checked in the constructor of
TimestampValue, and if it is outside the valid interval,both time_
and date_ are set to not_a_date_time.

Also added logging to "cast to timestamp" functions.

Test:
select cast(-17987443200-0.1 as timestamp);
This query no longer crashes, but returns NULL, similarly to other
< 1400 timestamps.

Change-Id: I77b2f6284d3a597f57e61c17a67c959eff9e38ff
---
M be/src/exec/parquet-column-readers.cc
M be/src/exec/parquet-column-stats.inline.h
M be/src/exprs/cast-functions-ir.cc
M be/src/exprs/timestamp-functions-ir.cc
M be/src/runtime/timestamp-test.cc
M be/src/runtime/timestamp-value.cc
M be/src/runtime/timestamp-value.h
M fe/pom.xml
M fe/src/main/java/org/apache/impala/analysis/ColumnLineageGraph.java
9 files changed, 76 insertions(+), 21 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/54/7954/10
-- 
To view, visit http://gerrit.cloudera.org:8080/7954
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I77b2f6284d3a597f57e61c17a67c959eff9e38ff
Gerrit-Change-Number: 7954
Gerrit-PatchSet: 10
Gerrit-Owner: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Dan Hecht <dh...@cloudera.com>
Gerrit-Reviewer: Lars Volker <lv...@cloudera.com>

[Impala-ASF-CR] IMPALA-5664: Unix time to timestamp conversions may crash Impala

Posted by "Csaba Ringhofer (Code Review)" <ge...@cloudera.org>.
Hello Lars Volker, Dan Hecht, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/7954

to look at the new patch set (#4).

Change subject: IMPALA-5664: Unix time to timestamp conversions may crash Impala
......................................................................

IMPALA-5664: Unix time to timestamp conversions may crash Impala

TimestampValue::FromSubsecondUnixTime() and UtcFromUnixTimeMicros()
are incorrect only in case of the last second of 1399, because these
sub-second values are rounded first towards 1400-01-01 00:00:00, which
is accepted as a valid date, and the sub-second part is subtracted
afterwards, leading to a date outside the valid interval. The maximum
case, 9999-12-31 59:59:59 is a bit different, because as I understand,
with nanosecond precision posix times, the maximum value is actually
10000-01-01. 00:00:00 minus 1 nanosec.

TimestampValue::FromUnixTimeNanos() can create problematic TimestampValues
both <1400 and 10000<=.

These timestamps can cause problems, because most code assumes that if
HasDate/HasTime is true, then it really is a valid timestamp.

To fix this, the posix times are checked in the constructor of
TimestampValue, and if it is outside the valid interval,both time_
and date_ are set to not_a_date_time.

Test:
select cast(-17987443200-0.1 as timestamp);
This query no longer crashes, but returns NULL, similarly to other
< 1400 timestamps.

Change-Id: I77b2f6284d3a597f57e61c17a67c959eff9e38ff
---
M be/src/exec/parquet-column-readers.cc
M be/src/exec/parquet-column-stats.inline.h
M be/src/exprs/timestamp-functions-ir.cc
M be/src/runtime/timestamp-test.cc
M be/src/runtime/timestamp-value.h
5 files changed, 47 insertions(+), 10 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/54/7954/4
-- 
To view, visit http://gerrit.cloudera.org:8080/7954
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I77b2f6284d3a597f57e61c17a67c959eff9e38ff
Gerrit-Change-Number: 7954
Gerrit-PatchSet: 4
Gerrit-Owner: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Dan Hecht <dh...@cloudera.com>
Gerrit-Reviewer: Lars Volker <lv...@cloudera.com>

[Impala-ASF-CR] IMPALA-5664: Unix time to timestamp conversions may crash Impala

Posted by "Csaba Ringhofer (Code Review)" <ge...@cloudera.org>.
Csaba Ringhofer has posted comments on this change. ( http://gerrit.cloudera.org:8080/7954 )

Change subject: IMPALA-5664: Unix time to timestamp conversions may crash Impala
......................................................................


Patch Set 9:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/7954/9/be/src/exprs/timestamp-functions-ir.cc
File be/src/exprs/timestamp-functions-ir.cc:

http://gerrit.cloudera.org:8080/#/c/7954/9/be/src/exprs/timestamp-functions-ir.cc@139
PS9, Line 139:   }
> Literal::Literal(ColumnType type, double v) also uses TimestampValue::FromS
I have found the cause of this issue, see https://issues.apache.org/jira/browse/IMPALA-5664?focusedCommentId=16203482&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-16203482



-- 
To view, visit http://gerrit.cloudera.org:8080/7954
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I77b2f6284d3a597f57e61c17a67c959eff9e38ff
Gerrit-Change-Number: 7954
Gerrit-PatchSet: 9
Gerrit-Owner: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Dan Hecht <dh...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Lars Volker <lv...@cloudera.com>
Gerrit-Comment-Date: Fri, 13 Oct 2017 13:05:36 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-5664: Unix time to timestamp conversions may crash Impala

Posted by "Csaba Ringhofer (Code Review)" <ge...@cloudera.org>.
Csaba Ringhofer has posted comments on this change.

Change subject: IMPALA-5664: Unix time to timestamp conversions may crash Impala
......................................................................


Patch Set 2:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/7954/2/be/src/runtime/timestamp-value.h
File be/src/runtime/timestamp-value.h:

PS2, Line 81: ptime
> Have you considered catching the error when the invalid ptime object is cre
There was no exception thrown at the time of creation, but much later, when the TimestampValue::ToString() function was called. It would be possible to throw an exception here, and add try+catch to every function where this constructor is called, but it would be a bit more work.


-- 
To view, visit http://gerrit.cloudera.org:8080/7954
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I77b2f6284d3a597f57e61c17a67c959eff9e38ff
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Lars Volker <lv...@cloudera.com>
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-5664: Unix time to timestamp conversions may crash impala (boost exception)

Posted by "Lars Volker (Code Review)" <ge...@cloudera.org>.
Lars Volker has posted comments on this change.

Change subject: IMPALA-5664: Unix time to timestamp conversions may crash impala (boost exception)
......................................................................


Patch Set 1:

(13 comments)

http://gerrit.cloudera.org:8080/#/c/7954/1//COMMIT_MSG
Commit Message:

PS1, Line 7: impala
nit: Capitalize Impala (it's a name)


PS1, Line 9: FromSubsecondUnixTime
We usually refer to functions as Function().


PS1, Line 11: 1400.01.01 00.00.00
You could write dates in the more standard format (1400-01-01 00:00:00).


PS1, Line 13: The maximum
            : case, 9999.12.31 59.59.59 is a bit different, because as I understand,
            : with nanosecond precision posix times, the maximum value is actually
            : 10000.01.01. 00.00.00 minus 1 nanosec.
Can you add tests with sub-second deltas around the upper bound, too?


PS1, Line 18: TimestampValue::FromUnixTimeNanos can create problematic TimestampValues
            : both <1400 and 10000<=.
Doens't this change fix this?


Line 22: HasDate/HasTime is true, then it really is a valid timestamp.
Can you include in the commit message how you fixed it and how you test it? Generally speaking, it's usually good to structure commit messages for bug fixes as "Problem, Solution, Tests".


http://gerrit.cloudera.org:8080/#/c/7954/1/be/src/runtime/timestamp-test.cc
File be/src/runtime/timestamp-test.cc:

Line 637:   // last second of 1399 need can be special, see https://issues.apache.org/jira/browse/IMPALA-5664
Please wrap lines at 90 characters. Please follow the surrounding code for indents and capitalization of comments.


PS1, Line 637: need
Extra word?


PS1, Line 637: 1399
Please state of what.


PS1, Line 637: https://issues.apache.org/jira/browse/IMPALA-5664
Please outline in the comment briefly why the last second needs special attention.


Line 640:   EXPECT_FALSE(TimestampValue::FromUnixTimeNanos(MIN_DATE_AS_UNIX_TIME, -NANOS_PER_MICRO*100).HasDate());
While you are here, can you also add tests for the exact beginning of the last second?

We also should have tests that add a subsecond interval for symmetry.

Then please also add similar tests for the maximum date.


http://gerrit.cloudera.org:8080/#/c/7954/1/be/src/runtime/timestamp-value.h
File be/src/runtime/timestamp-value.h:

PS1, Line 80: !date_.is_special()
Can you add a comment explaining why this check is necessary?


Line 80: 	  if(UNLIKELY(!date_.is_special() && !IsValidDate())) *this = TimestampValue();
Please use spaces instead of tabs.


-- 
To view, visit http://gerrit.cloudera.org:8080/7954
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I77b2f6284d3a597f57e61c17a67c959eff9e38ff
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Lars Volker <lv...@cloudera.com>
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-5664: Unix time to timestamp conversions may crash Impala

Posted by "Csaba Ringhofer (Code Review)" <ge...@cloudera.org>.
Csaba Ringhofer has posted comments on this change. ( http://gerrit.cloudera.org:8080/7954 )

Change subject: IMPALA-5664: Unix time to timestamp conversions may crash Impala
......................................................................


Patch Set 15:

I have checked the output of jenkins. There are 3 problematic front end tests:
AnalyzeDDLTest.TestCreateManagedKuduTable error, (null pointer exception), side effect of adding warning to invalid string->timestamp casts

PlannerTest.testKuduSelectivity: failure, not 100% sure that it is related to the patch

PlannerTest.testPartitionPruning: failure, side effect of adding warning to invalid string->timestamp casts

As a short term solution, the warning can be removed from the string->timestamp conversion, as it is not related to the original issue.


-- 
To view, visit http://gerrit.cloudera.org:8080/7954
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I77b2f6284d3a597f57e61c17a67c959eff9e38ff
Gerrit-Change-Number: 7954
Gerrit-PatchSet: 15
Gerrit-Owner: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Dan Hecht <dh...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Lars Volker <lv...@cloudera.com>
Gerrit-Comment-Date: Thu, 12 Oct 2017 15:59:40 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-5664: Unix time to timestamp conversions may crash Impala

Posted by "Lars Volker (Code Review)" <ge...@cloudera.org>.
Lars Volker has posted comments on this change. ( http://gerrit.cloudera.org:8080/7954 )

Change subject: IMPALA-5664: Unix time to timestamp conversions may crash Impala
......................................................................


Patch Set 17: Code-Review+2


-- 
To view, visit http://gerrit.cloudera.org:8080/7954
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I77b2f6284d3a597f57e61c17a67c959eff9e38ff
Gerrit-Change-Number: 7954
Gerrit-PatchSet: 17
Gerrit-Owner: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Alex Behm <al...@cloudera.com>
Gerrit-Reviewer: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Dan Hecht <dh...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Lars Volker <lv...@cloudera.com>
Gerrit-Comment-Date: Fri, 20 Oct 2017 23:13:02 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-5664: Unix time to timestamp conversions may crash Impala

Posted by "Csaba Ringhofer (Code Review)" <ge...@cloudera.org>.
Hello Lars Volker, Dan Hecht, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/7954

to look at the new patch set (#11).

Change subject: IMPALA-5664: Unix time to timestamp conversions may crash Impala
......................................................................

IMPALA-5664: Unix time to timestamp conversions may crash Impala

TimestampValue::FromSubsecondUnixTime() and UtcFromUnixTimeMicros()
are incorrect only in case of the last second of 1399, because these
sub-second values are rounded first towards 1400-01-01 00:00:00, which
is accepted as a valid date, and the sub-second part is subtracted
afterwards, leading to a date outside the valid interval. The maximum
case, 9999-12-31 59:59:59 is a bit different, because as I understand,
with nanosecond precision posix times, the maximum value is actually
10000-01-01. 00:00:00 minus 1 nanosec.

TimestampValue::FromUnixTimeNanos() can create problematic TimestampValues
both <1400 and 10000<=.

These timestamps can cause problems, because most code assumes that if
HasDate/HasTime is true, then it really is a valid timestamp.

To fix this, the posix times are checked in the constructor of
TimestampValue, and if it is outside the valid interval,both time_
and date_ are set to not_a_date_time.

Also added logging to "cast to timestamp" functions.

Test:
select cast(-17987443200-0.1 as timestamp);
This query no longer crashes, but returns NULL, similarly to other
< 1400 timestamps.

Change-Id: I77b2f6284d3a597f57e61c17a67c959eff9e38ff
---
M be/src/exec/parquet-column-readers.cc
M be/src/exec/parquet-column-stats.inline.h
M be/src/exprs/cast-functions-ir.cc
M be/src/exprs/timestamp-functions-ir.cc
M be/src/runtime/timestamp-test.cc
M be/src/runtime/timestamp-value.cc
M be/src/runtime/timestamp-value.h
7 files changed, 66 insertions(+), 10 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/54/7954/11
-- 
To view, visit http://gerrit.cloudera.org:8080/7954
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I77b2f6284d3a597f57e61c17a67c959eff9e38ff
Gerrit-Change-Number: 7954
Gerrit-PatchSet: 11
Gerrit-Owner: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Dan Hecht <dh...@cloudera.com>
Gerrit-Reviewer: Lars Volker <lv...@cloudera.com>

[Impala-ASF-CR] IMPALA-5664: Unix time to timestamp conversions may crash Impala

Posted by "Alex Behm (Code Review)" <ge...@cloudera.org>.
Alex Behm has posted comments on this change. ( http://gerrit.cloudera.org:8080/7954 )

Change subject: IMPALA-5664: Unix time to timestamp conversions may crash Impala
......................................................................


Patch Set 15:

Csaba, are those failing tests specifically require a timestamp that issues such a warning? If not, I suggest we just change the timestamps of those tests in a way that preserves test coverage and avoids the warnings problem.


-- 
To view, visit http://gerrit.cloudera.org:8080/7954
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I77b2f6284d3a597f57e61c17a67c959eff9e38ff
Gerrit-Change-Number: 7954
Gerrit-PatchSet: 15
Gerrit-Owner: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Alex Behm <al...@cloudera.com>
Gerrit-Reviewer: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Dan Hecht <dh...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Lars Volker <lv...@cloudera.com>
Gerrit-Comment-Date: Fri, 13 Oct 2017 18:16:10 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-5664: Unix time to timestamp conversions may crash Impala

Posted by "Dan Hecht (Code Review)" <ge...@cloudera.org>.
Dan Hecht has posted comments on this change. ( http://gerrit.cloudera.org:8080/7954 )

Change subject: IMPALA-5664: Unix time to timestamp conversions may crash Impala
......................................................................


Patch Set 9:

(4 comments)

http://gerrit.cloudera.org:8080/#/c/7954/9/be/src/exprs/timestamp-functions-ir.cc
File be/src/exprs/timestamp-functions-ir.cc:

http://gerrit.cloudera.org:8080/#/c/7954/9/be/src/exprs/timestamp-functions-ir.cc@136
PS9, Line 136: if(!tv.HasDateAndTime()){
here an elsewhere - please follow the style used by the rest of impala.


http://gerrit.cloudera.org:8080/#/c/7954/9/be/src/exprs/timestamp-functions-ir.cc@137
PS9, Line 137: UnixMicrosToUtcTimestamp
is that meaningful to the user?


http://gerrit.cloudera.org:8080/#/c/7954/9/be/src/exprs/timestamp-functions-ir.cc@139
PS9, Line 139:   }
are the ones you added the only places where your new validation might produce a !HasDateAndTime() TimestampValue?


http://gerrit.cloudera.org:8080/#/c/7954/9/be/src/exprs/timestamp-functions-ir.cc@751
PS9, Line 751:     datetime.date().year();
it still seems like this is doing the same validation. should we remove this an instead do the result_value.HasDateAndTime() check here? Or is this really a different case?



-- 
To view, visit http://gerrit.cloudera.org:8080/7954
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I77b2f6284d3a597f57e61c17a67c959eff9e38ff
Gerrit-Change-Number: 7954
Gerrit-PatchSet: 9
Gerrit-Owner: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Dan Hecht <dh...@cloudera.com>
Gerrit-Reviewer: Lars Volker <lv...@cloudera.com>
Gerrit-Comment-Date: Mon, 02 Oct 2017 21:51:05 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-5664: Unix time to timestamp conversions may crash Impala

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/7954 )

Change subject: IMPALA-5664: Unix time to timestamp conversions may crash Impala
......................................................................


Patch Set 15:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/1323/


-- 
To view, visit http://gerrit.cloudera.org:8080/7954
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I77b2f6284d3a597f57e61c17a67c959eff9e38ff
Gerrit-Change-Number: 7954
Gerrit-PatchSet: 15
Gerrit-Owner: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Dan Hecht <dh...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Lars Volker <lv...@cloudera.com>
Gerrit-Comment-Date: Tue, 10 Oct 2017 17:48:10 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-5664: Unix time to timestamp conversions may crash Impala

Posted by "Csaba Ringhofer (Code Review)" <ge...@cloudera.org>.
Csaba Ringhofer has uploaded a new patch set (#2).

Change subject: IMPALA-5664: Unix time to timestamp conversions may crash Impala
......................................................................

IMPALA-5664: Unix time to timestamp conversions may crash Impala

TimestampValue::FromSubsecondUnixTime() and UtcFromUnixTimeMicros()
are incorrect only in case of the last second of 1399, because these
sub-second values are rounded first towards 1400-01-01 00:00:00, which
is accepted as a valid date, and the sub-second part is subtracted
afterwards, leading to a date outside the valid interval. The maximum
case, 9999-12-31 59:59:59 is a bit different, because as I understand,
with nanosecond precision posix times, the maximum value is actually
10000-01-01. 00:00:00 minus 1 nanosec.

TimestampValue::FromUnixTimeNanos() can create problematic TimestampValues
both <1400 and 10000<=.

These timestamps can cause problems, because most code assumes that if
HasDate/HasTime is true, then it really is a valid timestamp.

To fix this, the posix times are checked in the constructor of
TimestampValue, and if it is outside the valid interval,both time_
and date_ are set to not_a_date_time.

Test:
select cast(-17987443200-0.1 as timestamp);
This query no longer crashes, but returns NULL, similarly to other
< 1400 timestamps.

Change-Id: I77b2f6284d3a597f57e61c17a67c959eff9e38ff
---
M be/src/runtime/timestamp-test.cc
M be/src/runtime/timestamp-value.h
2 files changed, 34 insertions(+), 1 deletion(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/54/7954/2
-- 
To view, visit http://gerrit.cloudera.org:8080/7954
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I77b2f6284d3a597f57e61c17a67c959eff9e38ff
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Lars Volker <lv...@cloudera.com>

[Impala-ASF-CR] IMPALA-5664: Unix time to timestamp conversions may crash Impala

Posted by "Lars Volker (Code Review)" <ge...@cloudera.org>.
Lars Volker has posted comments on this change.

Change subject: IMPALA-5664: Unix time to timestamp conversions may crash Impala
......................................................................


Patch Set 2:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/7954/2/be/src/runtime/timestamp-value.h
File be/src/runtime/timestamp-value.h:

PS2, Line 81: ptime
Have you considered catching the error when the invalid ptime object is created? What is the benefit of doing it here?


-- 
To view, visit http://gerrit.cloudera.org:8080/7954
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I77b2f6284d3a597f57e61c17a67c959eff9e38ff
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Lars Volker <lv...@cloudera.com>
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-5664: Unix time to timestamp conversions may crash Impala

Posted by "Dan Hecht (Code Review)" <ge...@cloudera.org>.
Dan Hecht has posted comments on this change. ( http://gerrit.cloudera.org:8080/7954 )

Change subject: IMPALA-5664: Unix time to timestamp conversions may crash Impala
......................................................................


Patch Set 11:

(4 comments)

http://gerrit.cloudera.org:8080/#/c/7954/11/be/src/exprs/cast-functions-ir.cc
File be/src/exprs/cast-functions-ir.cc:

http://gerrit.cloudera.org:8080/#/c/7954/11/be/src/exprs/cast-functions-ir.cc@277
PS11, Line 277:   }
> They are exercised by different parts of be/exprs/expr-test.cc .I have sear
I mean the warning messages that you're adding here have no verification.


http://gerrit.cloudera.org:8080/#/c/7954/11/be/src/runtime/timestamp-test.cc
File be/src/runtime/timestamp-test.cc:

http://gerrit.cloudera.org:8080/#/c/7954/11/be/src/runtime/timestamp-test.cc@670
PS11, Line 670: *
> I can change this, but I prefer it this way actually. What is the rule, sho
Yes, we should follow the style guide so we have a consistent style, which overall makes the code more readable.  I'm sure everyone has something that they dislike about the style, but we need consistency.


http://gerrit.cloudera.org:8080/#/c/7954/11/be/src/runtime/timestamp-value.h
File be/src/runtime/timestamp-value.h:

http://gerrit.cloudera.org:8080/#/c/7954/11/be/src/runtime/timestamp-value.h@43
PS11, Line 43: 9999
> This comment was just wrong, we state <= 9999-12-31 or <10000-01-01 everywh
Okay, thanks for confirming.


http://gerrit.cloudera.org:8080/#/c/7954/11/be/src/runtime/timestamp-value.cc
File be/src/runtime/timestamp-value.cc:

http://gerrit.cloudera.org:8080/#/c/7954/11/be/src/runtime/timestamp-value.cc@116
PS11, Line 116: (
> Done
still not fixed:
if (HasDate()...



-- 
To view, visit http://gerrit.cloudera.org:8080/7954
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I77b2f6284d3a597f57e61c17a67c959eff9e38ff
Gerrit-Change-Number: 7954
Gerrit-PatchSet: 11
Gerrit-Owner: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Dan Hecht <dh...@cloudera.com>
Gerrit-Reviewer: Lars Volker <lv...@cloudera.com>
Gerrit-Comment-Date: Fri, 06 Oct 2017 17:27:05 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-5664: Unix time to timestamp conversions may crash Impala

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/7954 )

Change subject: IMPALA-5664: Unix time to timestamp conversions may crash Impala
......................................................................

IMPALA-5664: Unix time to timestamp conversions may crash Impala

TimestampValue::FromSubsecondUnixTime() and UtcFromUnixTimeMicros()
are incorrect only in case of the last second of 1399, because these
sub-second values are rounded first towards 1400-01-01 00:00:00, which
is accepted as a valid date, and the sub-second part is subtracted
afterwards, leading to a date outside the valid interval. The maximum
case, 9999-12-31 59:59:59 is a bit different, because as I understand,
with nanosecond precision posix times, the maximum value is actually
10000-01-01. 00:00:00 minus 1 nanosec.

TimestampValue::FromUnixTimeNanos() can create problematic TimestampValues
both <1400 and 10000<=.

These timestamps can cause problems, because most code assumes that if
HasDate/HasTime is true, then it really is a valid timestamp.

To fix this, the posix times are checked in the constructor of
TimestampValue, and if it is outside the valid interval,both time_
and date_ are set to not_a_date_time.

Test:
select cast(-17987443200-0.1 as timestamp);
This query no longer crashes, but returns NULL, similarly to other
< 1400 timestamps.

Change-Id: I77b2f6284d3a597f57e61c17a67c959eff9e38ff
Reviewed-on: http://gerrit.cloudera.org:8080/7954
Reviewed-by: Lars Volker <lv...@cloudera.com>
Tested-by: Impala Public Jenkins
---
M be/src/exec/parquet-column-readers.cc
M be/src/exec/parquet-column-stats.inline.h
M be/src/exprs/expr-test.cc
M be/src/exprs/timestamp-functions-ir.cc
M be/src/runtime/timestamp-test.cc
M be/src/runtime/timestamp-value.cc
M be/src/runtime/timestamp-value.h
M testdata/workloads/functional-query/queries/QueryTest/exprs.test
8 files changed, 146 insertions(+), 10 deletions(-)

Approvals:
  Lars Volker: Looks good to me, approved
  Impala Public Jenkins: Verified

-- 
To view, visit http://gerrit.cloudera.org:8080/7954
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: I77b2f6284d3a597f57e61c17a67c959eff9e38ff
Gerrit-Change-Number: 7954
Gerrit-PatchSet: 18
Gerrit-Owner: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Alex Behm <al...@cloudera.com>
Gerrit-Reviewer: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Dan Hecht <dh...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Lars Volker <lv...@cloudera.com>

[Impala-ASF-CR] IMPALA-5664: Unix time to timestamp conversions may crash Impala

Posted by "Csaba Ringhofer (Code Review)" <ge...@cloudera.org>.
Hello Lars Volker, Dan Hecht, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/7954

to look at the new patch set (#3).

Change subject: IMPALA-5664: Unix time to timestamp conversions may crash Impala
......................................................................

IMPALA-5664: Unix time to timestamp conversions may crash Impala

TimestampValue::FromSubsecondUnixTime() and UtcFromUnixTimeMicros()
are incorrect only in case of the last second of 1399, because these
sub-second values are rounded first towards 1400-01-01 00:00:00, which
is accepted as a valid date, and the sub-second part is subtracted
afterwards, leading to a date outside the valid interval. The maximum
case, 9999-12-31 59:59:59 is a bit different, because as I understand,
with nanosecond precision posix times, the maximum value is actually
10000-01-01. 00:00:00 minus 1 nanosec.

TimestampValue::FromUnixTimeNanos() can create problematic TimestampValues
both <1400 and 10000<=.

These timestamps can cause problems, because most code assumes that if
HasDate/HasTime is true, then it really is a valid timestamp.

To fix this, the posix times are checked in the constructor of
TimestampValue, and if it is outside the valid interval,both time_
and date_ are set to not_a_date_time.

Test:
select cast(-17987443200-0.1 as timestamp);
This query no longer crashes, but returns NULL, similarly to other
< 1400 timestamps.

Change-Id: I77b2f6284d3a597f57e61c17a67c959eff9e38ff
---
M be/src/exec/parquet-column-readers.cc
M be/src/exec/parquet-column-stats.inline.h
M be/src/exprs/timestamp-functions-ir.cc
M be/src/runtime/timestamp-parse-util.cc
M be/src/runtime/timestamp-test.cc
M be/src/runtime/timestamp-value.h
6 files changed, 50 insertions(+), 22 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/54/7954/3
-- 
To view, visit http://gerrit.cloudera.org:8080/7954
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I77b2f6284d3a597f57e61c17a67c959eff9e38ff
Gerrit-Change-Number: 7954
Gerrit-PatchSet: 3
Gerrit-Owner: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Dan Hecht <dh...@cloudera.com>
Gerrit-Reviewer: Lars Volker <lv...@cloudera.com>

[Impala-ASF-CR] IMPALA-5664: Unix time to timestamp conversions may crash Impala

Posted by "Csaba Ringhofer (Code Review)" <ge...@cloudera.org>.
Hello Lars Volker, Dan Hecht, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/7954

to look at the new patch set (#9).

Change subject: IMPALA-5664: Unix time to timestamp conversions may crash Impala
......................................................................

IMPALA-5664: Unix time to timestamp conversions may crash Impala

TimestampValue::FromSubsecondUnixTime() and UtcFromUnixTimeMicros()
are incorrect only in case of the last second of 1399, because these
sub-second values are rounded first towards 1400-01-01 00:00:00, which
is accepted as a valid date, and the sub-second part is subtracted
afterwards, leading to a date outside the valid interval. The maximum
case, 9999-12-31 59:59:59 is a bit different, because as I understand,
with nanosecond precision posix times, the maximum value is actually
10000-01-01. 00:00:00 minus 1 nanosec.

TimestampValue::FromUnixTimeNanos() can create problematic TimestampValues
both <1400 and 10000<=.

These timestamps can cause problems, because most code assumes that if
HasDate/HasTime is true, then it really is a valid timestamp.

To fix this, the posix times are checked in the constructor of
TimestampValue, and if it is outside the valid interval,both time_
and date_ are set to not_a_date_time.

Also added logging to "cast to timestamp" functions.

Test:
select cast(-17987443200-0.1 as timestamp);
This query no longer crashes, but returns NULL, similarly to other
< 1400 timestamps.

Change-Id: I77b2f6284d3a597f57e61c17a67c959eff9e38ff
---
M be/src/exec/parquet-column-readers.cc
M be/src/exec/parquet-column-stats.inline.h
M be/src/exprs/cast-functions-ir.cc
M be/src/exprs/timestamp-functions-ir.cc
M be/src/runtime/timestamp-test.cc
M be/src/runtime/timestamp-value.cc
M be/src/runtime/timestamp-value.h
7 files changed, 66 insertions(+), 10 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/54/7954/9
-- 
To view, visit http://gerrit.cloudera.org:8080/7954
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I77b2f6284d3a597f57e61c17a67c959eff9e38ff
Gerrit-Change-Number: 7954
Gerrit-PatchSet: 9
Gerrit-Owner: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Dan Hecht <dh...@cloudera.com>
Gerrit-Reviewer: Lars Volker <lv...@cloudera.com>

[Impala-ASF-CR] IMPALA-5664: Unix time to timestamp conversions may crash Impala

Posted by "Csaba Ringhofer (Code Review)" <ge...@cloudera.org>.
Csaba Ringhofer has posted comments on this change.

Change subject: IMPALA-5664: Unix time to timestamp conversions may crash Impala
......................................................................


Patch Set 2:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/7954/2/be/src/runtime/timestamp-value.h
File be/src/runtime/timestamp-value.h:

PS2, Line 80: !date_.is_special()
> how about HasDate()?  But also, should IsValidDate() be checking HasDate()?
I looked at ScalarColumnReader<TimestampValue, true>::ValidateSlot(), and now I see that TimestampValue is created via reinterpret_cast on buffer pointers - this means that its constructor is skipped, so it is not possible to force validity in the public functions.

>That code looks bogus if date_.is_special(), 
Special values lie outside the valid time range, so IsValidDate can only return true if is_special() is false.

TimestampValue can have not_a_date_time in date_ or time_, it is only invalid, if both are not_a_date_time. I am not sure, what will/should happen in these cases.


Line 86:     }
> why is that that the year() calls in timestamp-functions-ir.cc aren't suffi
About giving a warning to the user: is it ok to log in low level classes like TimestampValue? AddSub calls FunctionContext::AddWarning, while TimestampValue`s functions do not receive FunctionContext arguments.

If TimestampValue functions cannot log, then I will have to look for their callers and add checks + warnings there.

Note that I am not familiar with Impala`s logging system yet.


-- 
To view, visit http://gerrit.cloudera.org:8080/7954
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I77b2f6284d3a597f57e61c17a67c959eff9e38ff
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Dan Hecht <dh...@cloudera.com>
Gerrit-Reviewer: Lars Volker <lv...@cloudera.com>
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-5664: Unix time to timestamp conversions may crash Impala

Posted by "Csaba Ringhofer (Code Review)" <ge...@cloudera.org>.
Csaba Ringhofer has posted comments on this change.

Change subject: IMPALA-5664: Unix time to timestamp conversions may crash Impala
......................................................................


Patch Set 2:

(11 comments)

http://gerrit.cloudera.org:8080/#/c/7954/1//COMMIT_MSG
Commit Message:

PS1, Line 7: Impala
> nit: Capitalize Impala (it's a name)
Done


PS1, Line 9: FromSubsecondUnixTime
> We usually refer to functions as Function().
Done


PS1, Line 11: 1400-01-01 00:00:00
> You could write dates in the more standard format (1400-01-01 00:00:00).
Done


PS1, Line 13: The maximum
            : case, 9999-12-31 59:59:59 is a bit different, because as I understand,
            : with nanosecond precision posix times, the maximum value is actually
            : 10000-01-01. 00:00:00 minus 1 nanosec.
> Can you add tests with sub-second deltas around the upper bound, too?
Done


http://gerrit.cloudera.org:8080/#/c/7954/1/be/src/runtime/timestamp-test.cc
File be/src/runtime/timestamp-test.cc:

Line 637:   // FromUnixTime functions incorrectly accepted the last second of 1399 as valid,
> Please wrap lines at 90 characters. Please follow the surrounding code for 
Done


PS1, Line 637: he last second of 1399 as valid,
> Please outline in the comment briefly why the last second needs special att
Done


PS1, Line 637: ns i
> Extra word?
Done


Line 640:       MIN_DATE_AS_UNIX_TIME - 0.1).HasDate());
> While you are here, can you also add tests for the exact beginning of the l
Done


http://gerrit.cloudera.org:8080/#/c/7954/1/be/src/runtime/timestamp-value.h
File be/src/runtime/timestamp-value.h:

PS1, Line 80: s_special() && UNLI
> Can you add a comment explaining why this check is necessary?
Done


Line 80:     if(!date_.is_special() && UNLIKELY(!IsValidDate())){
> Please use spaces instead of tabs.
Done


http://gerrit.cloudera.org:8080/#/c/7954/2/be/src/runtime/timestamp-value.h
File be/src/runtime/timestamp-value.h:

Line 84:       time_ = boost::posix_time::time_duration(boost::posix_time::not_a_date_time);
The first patch was incorrect - it did not set time_ to not_a_date_time, but 00:00:00 instead. This caused the last second of year 1399 to be inconsistent with other dates before year 1400.


-- 
To view, visit http://gerrit.cloudera.org:8080/7954
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I77b2f6284d3a597f57e61c17a67c959eff9e38ff
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Lars Volker <lv...@cloudera.com>
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-5664: Unix time to timestamp conversions may crash Impala

Posted by "Csaba Ringhofer (Code Review)" <ge...@cloudera.org>.
Csaba Ringhofer has posted comments on this change. ( http://gerrit.cloudera.org:8080/7954 )

Change subject: IMPALA-5664: Unix time to timestamp conversions may crash Impala
......................................................................


Patch Set 3:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/7954/2/be/src/runtime/timestamp-test.cc
File be/src/runtime/timestamp-test.cc:

http://gerrit.cloudera.org:8080/#/c/7954/2/be/src/runtime/timestamp-test.cc@637
PS2, Line 637: Sub-second F
> The problem is only with the functions that take sub-second resolution, rig
Done


http://gerrit.cloudera.org:8080/#/c/7954/2/be/src/runtime/timestamp-test.cc@685
PS2, Line 685: 
> nit: typo
Done



-- 
To view, visit http://gerrit.cloudera.org:8080/7954
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I77b2f6284d3a597f57e61c17a67c959eff9e38ff
Gerrit-Change-Number: 7954
Gerrit-PatchSet: 3
Gerrit-Owner: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Dan Hecht <dh...@cloudera.com>
Gerrit-Reviewer: Lars Volker <lv...@cloudera.com>
Gerrit-Comment-Date: Thu, 21 Sep 2017 15:43:35 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-5664: Unix time to timestamp conversions may crash Impala

Posted by "Csaba Ringhofer (Code Review)" <ge...@cloudera.org>.
Csaba Ringhofer has posted comments on this change. ( http://gerrit.cloudera.org:8080/7954 )

Change subject: IMPALA-5664: Unix time to timestamp conversions may crash Impala
......................................................................


Patch Set 11:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/7954/11/be/src/exprs/cast-functions-ir.cc
File be/src/exprs/cast-functions-ir.cc:

http://gerrit.cloudera.org:8080/#/c/7954/11/be/src/exprs/cast-functions-ir.cc@277
PS11, Line 277:   }
> I mean the warning messages that you're adding here have no verification.
I have added tests to exprs.test. 
There are files called "out-of-range-timestamp-abort-on-error.test" and "out-of-range-timestamp-continue-on-error.test", but they seem to be Parquet specific.


http://gerrit.cloudera.org:8080/#/c/7954/11/be/src/runtime/timestamp-test.cc
File be/src/runtime/timestamp-test.cc:

http://gerrit.cloudera.org:8080/#/c/7954/11/be/src/runtime/timestamp-test.cc@670
PS11, Line 670: *
> Yes, we should follow the style guide so we have a consistent style, which 
Done


http://gerrit.cloudera.org:8080/#/c/7954/11/be/src/runtime/timestamp-value.cc
File be/src/runtime/timestamp-value.cc:

http://gerrit.cloudera.org:8080/#/c/7954/11/be/src/runtime/timestamp-value.cc@116
PS11, Line 116: (
> still not fixed:
Done



-- 
To view, visit http://gerrit.cloudera.org:8080/7954
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I77b2f6284d3a597f57e61c17a67c959eff9e38ff
Gerrit-Change-Number: 7954
Gerrit-PatchSet: 11
Gerrit-Owner: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Dan Hecht <dh...@cloudera.com>
Gerrit-Reviewer: Lars Volker <lv...@cloudera.com>
Gerrit-Comment-Date: Tue, 10 Oct 2017 13:55:42 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-5664: Unix time to timestamp conversions may crash Impala

Posted by "Csaba Ringhofer (Code Review)" <ge...@cloudera.org>.
Csaba Ringhofer has posted comments on this change. ( http://gerrit.cloudera.org:8080/7954 )

Change subject: IMPALA-5664: Unix time to timestamp conversions may crash Impala
......................................................................


Patch Set 2:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/7954/2/be/src/exprs/timestamp-functions-ir.cc
File be/src/exprs/timestamp-functions-ir.cc:

http://gerrit.cloudera.org:8080/#/c/7954/2/be/src/exprs/timestamp-functions-ir.cc@332
PS2, Line 332:   DCHECK(ts_value.IsValidDate());
> why does this DCHECK no longer make sense?  are we saying it's trivially tr
Yes, my idea was that range check must be done at construction, and it is enough to check HasDate() in other timestamp related functions.



-- 
To view, visit http://gerrit.cloudera.org:8080/7954
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I77b2f6284d3a597f57e61c17a67c959eff9e38ff
Gerrit-Change-Number: 7954
Gerrit-PatchSet: 2
Gerrit-Owner: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Dan Hecht <dh...@cloudera.com>
Gerrit-Reviewer: Lars Volker <lv...@cloudera.com>
Gerrit-Comment-Date: Fri, 22 Sep 2017 13:35:26 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-5664: Unix time to timestamp conversions may crash Impala

Posted by "Dan Hecht (Code Review)" <ge...@cloudera.org>.
Dan Hecht has posted comments on this change. ( http://gerrit.cloudera.org:8080/7954 )

Change subject: IMPALA-5664: Unix time to timestamp conversions may crash Impala
......................................................................


Patch Set 11:

(5 comments)

http://gerrit.cloudera.org:8080/#/c/7954/11/be/src/exprs/cast-functions-ir.cc
File be/src/exprs/cast-functions-ir.cc:

http://gerrit.cloudera.org:8080/#/c/7954/11/be/src/exprs/cast-functions-ir.cc@277
PS11, Line 277:   }
all of these new if-stmts need end-to-end tests to cover. they aren't exercised by the unit test since that exercises the layer below.


http://gerrit.cloudera.org:8080/#/c/7954/11/be/src/exprs/timestamp-functions-ir.cc
File be/src/exprs/timestamp-functions-ir.cc:

http://gerrit.cloudera.org:8080/#/c/7954/11/be/src/exprs/timestamp-functions-ir.cc@139
PS11, Line 139:   }
also needs test


http://gerrit.cloudera.org:8080/#/c/7954/11/be/src/runtime/timestamp-test.cc
File be/src/runtime/timestamp-test.cc:

http://gerrit.cloudera.org:8080/#/c/7954/11/be/src/runtime/timestamp-test.cc@670
PS11, Line 670: *
nit: missing space here and below


http://gerrit.cloudera.org:8080/#/c/7954/11/be/src/runtime/timestamp-value.h
File be/src/runtime/timestamp-value.h:

http://gerrit.cloudera.org:8080/#/c/7954/11/be/src/runtime/timestamp-value.h@43
PS11, Line 43: 9999
is that what the documentation says? and did this code change affect the upper bound or was this comment just wrong?


http://gerrit.cloudera.org:8080/#/c/7954/11/be/src/runtime/timestamp-value.cc
File be/src/runtime/timestamp-value.cc:

http://gerrit.cloudera.org:8080/#/c/7954/11/be/src/runtime/timestamp-value.cc@116
PS11, Line 116: (
style



-- 
To view, visit http://gerrit.cloudera.org:8080/7954
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I77b2f6284d3a597f57e61c17a67c959eff9e38ff
Gerrit-Change-Number: 7954
Gerrit-PatchSet: 11
Gerrit-Owner: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Dan Hecht <dh...@cloudera.com>
Gerrit-Reviewer: Lars Volker <lv...@cloudera.com>
Gerrit-Comment-Date: Thu, 05 Oct 2017 17:34:12 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-5664: Unix time to timestamp conversions may crash Impala

Posted by "Dan Hecht (Code Review)" <ge...@cloudera.org>.
Dan Hecht has posted comments on this change. ( http://gerrit.cloudera.org:8080/7954 )

Change subject: IMPALA-5664: Unix time to timestamp conversions may crash Impala
......................................................................


Patch Set 14: Code-Review+2


-- 
To view, visit http://gerrit.cloudera.org:8080/7954
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I77b2f6284d3a597f57e61c17a67c959eff9e38ff
Gerrit-Change-Number: 7954
Gerrit-PatchSet: 14
Gerrit-Owner: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Dan Hecht <dh...@cloudera.com>
Gerrit-Reviewer: Lars Volker <lv...@cloudera.com>
Gerrit-Comment-Date: Tue, 10 Oct 2017 17:47:36 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-5664: Unix time to timestamp conversions may crash Impala

Posted by "Dan Hecht (Code Review)" <ge...@cloudera.org>.
Dan Hecht has posted comments on this change. ( http://gerrit.cloudera.org:8080/7954 )

Change subject: IMPALA-5664: Unix time to timestamp conversions may crash Impala
......................................................................


Patch Set 5:

Ping?


-- 
To view, visit http://gerrit.cloudera.org:8080/7954
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I77b2f6284d3a597f57e61c17a67c959eff9e38ff
Gerrit-Change-Number: 7954
Gerrit-PatchSet: 5
Gerrit-Owner: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Dan Hecht <dh...@cloudera.com>
Gerrit-Reviewer: Lars Volker <lv...@cloudera.com>
Gerrit-Comment-Date: Wed, 27 Sep 2017 17:20:47 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-5664: Unix time to timestamp conversions may crash Impala

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/7954 )

Change subject: IMPALA-5664: Unix time to timestamp conversions may crash Impala
......................................................................


Patch Set 17:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/1360/


-- 
To view, visit http://gerrit.cloudera.org:8080/7954
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I77b2f6284d3a597f57e61c17a67c959eff9e38ff
Gerrit-Change-Number: 7954
Gerrit-PatchSet: 17
Gerrit-Owner: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Alex Behm <al...@cloudera.com>
Gerrit-Reviewer: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Dan Hecht <dh...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Lars Volker <lv...@cloudera.com>
Gerrit-Comment-Date: Fri, 20 Oct 2017 23:13:08 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-5664: Unix time to timestamp conversions may crash Impala

Posted by "Csaba Ringhofer (Code Review)" <ge...@cloudera.org>.
Csaba Ringhofer has posted comments on this change.

Change subject: IMPALA-5664: Unix time to timestamp conversions may crash Impala
......................................................................


Patch Set 2:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/7954/2/be/src/runtime/timestamp-value.h
File be/src/runtime/timestamp-value.h:

PS2, Line 80: !date_.is_special()
> I looked at ScalarColumnReader<TimestampValue, true>::ValidateSlot(), and n
I think that ScalarColumnReader<TimestampValue, true>::ValidateSlot() is actually a bit too strict, which lead to inconsistency between text and parquet files. I have created a ticket for this:
https://issues.apache.org/jira/browse/IMPALA-5942

We should decide the correct behavior in these cases, before continuing with this ticket.


-- 
To view, visit http://gerrit.cloudera.org:8080/7954
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I77b2f6284d3a597f57e61c17a67c959eff9e38ff
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Dan Hecht <dh...@cloudera.com>
Gerrit-Reviewer: Lars Volker <lv...@cloudera.com>
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-5664: Unix time to timestamp conversions may crash Impala

Posted by "Csaba Ringhofer (Code Review)" <ge...@cloudera.org>.
Hello Lars Volker, Dan Hecht, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/7954

to look at the new patch set (#14).

Change subject: IMPALA-5664: Unix time to timestamp conversions may crash Impala
......................................................................

IMPALA-5664: Unix time to timestamp conversions may crash Impala

TimestampValue::FromSubsecondUnixTime() and UtcFromUnixTimeMicros()
are incorrect only in case of the last second of 1399, because these
sub-second values are rounded first towards 1400-01-01 00:00:00, which
is accepted as a valid date, and the sub-second part is subtracted
afterwards, leading to a date outside the valid interval. The maximum
case, 9999-12-31 59:59:59 is a bit different, because as I understand,
with nanosecond precision posix times, the maximum value is actually
10000-01-01. 00:00:00 minus 1 nanosec.

TimestampValue::FromUnixTimeNanos() can create problematic TimestampValues
both <1400 and 10000<=.

These timestamps can cause problems, because most code assumes that if
HasDate/HasTime is true, then it really is a valid timestamp.

To fix this, the posix times are checked in the constructor of
TimestampValue, and if it is outside the valid interval,both time_
and date_ are set to not_a_date_time.

Also added logging to "cast to timestamp" functions.

Test:
select cast(-17987443200-0.1 as timestamp);
This query no longer crashes, but returns NULL, similarly to other
< 1400 timestamps.

Change-Id: I77b2f6284d3a597f57e61c17a67c959eff9e38ff
---
M be/src/exec/parquet-column-readers.cc
M be/src/exec/parquet-column-stats.inline.h
M be/src/exprs/cast-functions-ir.cc
M be/src/exprs/expr-test.cc
M be/src/exprs/timestamp-functions-ir.cc
M be/src/runtime/timestamp-test.cc
M be/src/runtime/timestamp-value.cc
M be/src/runtime/timestamp-value.h
M testdata/workloads/functional-query/queries/QueryTest/exprs.test
9 files changed, 183 insertions(+), 10 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/54/7954/14
-- 
To view, visit http://gerrit.cloudera.org:8080/7954
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I77b2f6284d3a597f57e61c17a67c959eff9e38ff
Gerrit-Change-Number: 7954
Gerrit-PatchSet: 14
Gerrit-Owner: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Dan Hecht <dh...@cloudera.com>
Gerrit-Reviewer: Lars Volker <lv...@cloudera.com>

[Impala-ASF-CR] IMPALA-5664: Unix time to timestamp conversions may crash Impala

Posted by "Csaba Ringhofer (Code Review)" <ge...@cloudera.org>.
Hello Lars Volker, Dan Hecht, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/7954

to look at the new patch set (#12).

Change subject: IMPALA-5664: Unix time to timestamp conversions may crash Impala
......................................................................

IMPALA-5664: Unix time to timestamp conversions may crash Impala

TimestampValue::FromSubsecondUnixTime() and UtcFromUnixTimeMicros()
are incorrect only in case of the last second of 1399, because these
sub-second values are rounded first towards 1400-01-01 00:00:00, which
is accepted as a valid date, and the sub-second part is subtracted
afterwards, leading to a date outside the valid interval. The maximum
case, 9999-12-31 59:59:59 is a bit different, because as I understand,
with nanosecond precision posix times, the maximum value is actually
10000-01-01. 00:00:00 minus 1 nanosec.

TimestampValue::FromUnixTimeNanos() can create problematic TimestampValues
both <1400 and 10000<=.

These timestamps can cause problems, because most code assumes that if
HasDate/HasTime is true, then it really is a valid timestamp.

To fix this, the posix times are checked in the constructor of
TimestampValue, and if it is outside the valid interval,both time_
and date_ are set to not_a_date_time.

Also added logging to "cast to timestamp" functions.

Test:
select cast(-17987443200-0.1 as timestamp);
This query no longer crashes, but returns NULL, similarly to other
< 1400 timestamps.

Change-Id: I77b2f6284d3a597f57e61c17a67c959eff9e38ff
---
M be/src/exec/parquet-column-readers.cc
M be/src/exec/parquet-column-stats.inline.h
M be/src/exprs/cast-functions-ir.cc
M be/src/exprs/expr-test.cc
M be/src/exprs/timestamp-functions-ir.cc
M be/src/runtime/timestamp-test.cc
M be/src/runtime/timestamp-value.cc
M be/src/runtime/timestamp-value.h
8 files changed, 81 insertions(+), 10 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/54/7954/12
-- 
To view, visit http://gerrit.cloudera.org:8080/7954
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I77b2f6284d3a597f57e61c17a67c959eff9e38ff
Gerrit-Change-Number: 7954
Gerrit-PatchSet: 12
Gerrit-Owner: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Dan Hecht <dh...@cloudera.com>
Gerrit-Reviewer: Lars Volker <lv...@cloudera.com>

[Impala-ASF-CR] IMPALA-5664: Unix time to timestamp conversions may crash Impala

Posted by "Dan Hecht (Code Review)" <ge...@cloudera.org>.
Dan Hecht has posted comments on this change.

Change subject: IMPALA-5664: Unix time to timestamp conversions may crash Impala
......................................................................


Patch Set 2:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/7954/2/be/src/runtime/timestamp-value.h
File be/src/runtime/timestamp-value.h:

PS2, Line 80: !date_.is_special()
> I looked at ScalarColumnReader<TimestampValue, true>::ValidateSlot(), and n
My comment was just that HasDate() == !date_.is_special(), and using HasDate() would make it easier to grep all the places we check for this condition.

I'm not sure I follow your comment about public functions.  Since the scanner does a cast, it explicitly checks IsValidDate(). It is kinda gross, though.


PS2, Line 80: !date_.is_special()
> I think that ScalarColumnReader<TimestampValue, true>::ValidateSlot() is ac
Thanks for finding that. Let's discuss on the jira.


Line 86:     }
> About giving a warning to the user: is it ok to log in low level classes li
One purpose of FunctionContext is to be the interface back into impala's runtime for UDFs (and expressions in general). It'd be best to leak it outside of exprs, udf, and codegen.

So, I think it'd make sense to have a static wrapper around the constructor where we do this validation that lives in the exprs code, and that can have access to FunctionContext to do the logging, and construct the TimestampValue only for well formed values.  (Then this constructor could do a dcheck for that).


-- 
To view, visit http://gerrit.cloudera.org:8080/7954
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I77b2f6284d3a597f57e61c17a67c959eff9e38ff
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Dan Hecht <dh...@cloudera.com>
Gerrit-Reviewer: Lars Volker <lv...@cloudera.com>
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-5664: Unix time to timestamp conversions may crash Impala

Posted by "Lars Volker (Code Review)" <ge...@cloudera.org>.
Lars Volker has posted comments on this change.

Change subject: IMPALA-5664: Unix time to timestamp conversions may crash Impala
......................................................................


Patch Set 2: Code-Review+1

Thanks for fixing this. LGTM, let's see what others say.

-- 
To view, visit http://gerrit.cloudera.org:8080/7954
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I77b2f6284d3a597f57e61c17a67c959eff9e38ff
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Lars Volker <lv...@cloudera.com>
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-5664: Unix time to timestamp conversions may crash Impala

Posted by "Csaba Ringhofer (Code Review)" <ge...@cloudera.org>.
Csaba Ringhofer has posted comments on this change. ( http://gerrit.cloudera.org:8080/7954 )

Change subject: IMPALA-5664: Unix time to timestamp conversions may crash Impala
......................................................................


Patch Set 17:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/7954/17/be/src/exprs/expr-test.cc
File be/src/exprs/expr-test.cc:

http://gerrit.cloudera.org:8080/#/c/7954/17/be/src/exprs/expr-test.cc@5005
PS17, Line 5005:   // Timestamp conversions of "dateless" times should return null (and not crash,
This part is from different commit that was applied because of the rebase.



-- 
To view, visit http://gerrit.cloudera.org:8080/7954
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I77b2f6284d3a597f57e61c17a67c959eff9e38ff
Gerrit-Change-Number: 7954
Gerrit-PatchSet: 17
Gerrit-Owner: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Alex Behm <al...@cloudera.com>
Gerrit-Reviewer: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Dan Hecht <dh...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Lars Volker <lv...@cloudera.com>
Gerrit-Comment-Date: Fri, 20 Oct 2017 12:40:57 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-5664: Unix time to timestamp conversions may crash Impala

Posted by "Csaba Ringhofer (Code Review)" <ge...@cloudera.org>.
Csaba Ringhofer has posted comments on this change. ( http://gerrit.cloudera.org:8080/7954 )

Change subject: IMPALA-5664: Unix time to timestamp conversions may crash Impala
......................................................................


Patch Set 9:

(4 comments)

http://gerrit.cloudera.org:8080/#/c/7954/9/be/src/exprs/timestamp-functions-ir.cc
File be/src/exprs/timestamp-functions-ir.cc:

http://gerrit.cloudera.org:8080/#/c/7954/9/be/src/exprs/timestamp-functions-ir.cc@136
PS9, Line 136: if(!tv.HasDateAndTime()){
> here an elsewhere - please follow the style used by the rest of impala.
Done


http://gerrit.cloudera.org:8080/#/c/7954/9/be/src/exprs/timestamp-functions-ir.cc@137
PS9, Line 137: UnixMicrosToUtcTimestamp
> is that meaningful to the user?
I have changed it to its query name.


http://gerrit.cloudera.org:8080/#/c/7954/9/be/src/exprs/timestamp-functions-ir.cc@139
PS9, Line 139:   }
> are the ones you added the only places where your new validation might prod
Literal::Literal(ColumnType type, double v) also uses TimestampValue::FromSubsecondUnixTime, but there is no function context there, and I do not know how to call that constructor. If I put a double to a place where a timestamp is expected, I get an analyses exception.

I have also noticed someting strange:
select timestamp_cmp(timstamp_col, cast(cast(-17987443200.1 as double) as timestamp)) from table;
The table has 3 rows and 4 "UDF WARNING: Could not convert -17987443200.1 to timestamp" are printed.

It would make more sense to me to have 1 call to cast (if the casts with constant arguments would be optimized away) or 3 (if no optimization takes place, and the cast is called for every row).

Does Impala try to optimize functions calls with constant parameters?


http://gerrit.cloudera.org:8080/#/c/7954/9/be/src/exprs/timestamp-functions-ir.cc@751
PS9, Line 751:     datetime.date().year();
> it still seems like this is doing the same validation. should we remove thi
It is the same check, but I would prefer to leave it as it is for now - it leads to a nice warning message, and the try-catch block is needed anyway because of AddInterval. It could be done in another commit + jira like clean up/speed up timestamp functions.



-- 
To view, visit http://gerrit.cloudera.org:8080/7954
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I77b2f6284d3a597f57e61c17a67c959eff9e38ff
Gerrit-Change-Number: 7954
Gerrit-PatchSet: 9
Gerrit-Owner: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Dan Hecht <dh...@cloudera.com>
Gerrit-Reviewer: Lars Volker <lv...@cloudera.com>
Gerrit-Comment-Date: Wed, 04 Oct 2017 15:09:06 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-5664: Unix time to timestamp conversions may crash Impala

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/7954 )

Change subject: IMPALA-5664: Unix time to timestamp conversions may crash Impala
......................................................................


Patch Set 17: Verified+1


-- 
To view, visit http://gerrit.cloudera.org:8080/7954
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I77b2f6284d3a597f57e61c17a67c959eff9e38ff
Gerrit-Change-Number: 7954
Gerrit-PatchSet: 17
Gerrit-Owner: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Alex Behm <al...@cloudera.com>
Gerrit-Reviewer: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Dan Hecht <dh...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Lars Volker <lv...@cloudera.com>
Gerrit-Comment-Date: Sat, 21 Oct 2017 03:13:13 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-5664: Unix time to timestamp conversions may crash Impala

Posted by "Csaba Ringhofer (Code Review)" <ge...@cloudera.org>.
Csaba Ringhofer has posted comments on this change. ( http://gerrit.cloudera.org:8080/7954 )

Change subject: IMPALA-5664: Unix time to timestamp conversions may crash Impala
......................................................................


Patch Set 15:

> Csaba, are those failing tests specifically require a timestamp
 > that issues such a warning? If not, I suggest we just change the
 > timestamps of those tests in a way that preserves test coverage and
 > avoids the warnings problem.
Yes, those tests work with "non timestamp" strings by design.


-- 
To view, visit http://gerrit.cloudera.org:8080/7954
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I77b2f6284d3a597f57e61c17a67c959eff9e38ff
Gerrit-Change-Number: 7954
Gerrit-PatchSet: 15
Gerrit-Owner: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Alex Behm <al...@cloudera.com>
Gerrit-Reviewer: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Dan Hecht <dh...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Lars Volker <lv...@cloudera.com>
Gerrit-Comment-Date: Wed, 18 Oct 2017 14:28:21 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-5664: Unix time to timestamp conversions may crash Impala

Posted by "Csaba Ringhofer (Code Review)" <ge...@cloudera.org>.
Hello Lars Volker, Dan Hecht, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/7954

to look at the new patch set (#13).

Change subject: IMPALA-5664: Unix time to timestamp conversions may crash Impala
......................................................................

IMPALA-5664: Unix time to timestamp conversions may crash Impala

TimestampValue::FromSubsecondUnixTime() and UtcFromUnixTimeMicros()
are incorrect only in case of the last second of 1399, because these
sub-second values are rounded first towards 1400-01-01 00:00:00, which
is accepted as a valid date, and the sub-second part is subtracted
afterwards, leading to a date outside the valid interval. The maximum
case, 9999-12-31 59:59:59 is a bit different, because as I understand,
with nanosecond precision posix times, the maximum value is actually
10000-01-01. 00:00:00 minus 1 nanosec.

TimestampValue::FromUnixTimeNanos() can create problematic TimestampValues
both <1400 and 10000<=.

These timestamps can cause problems, because most code assumes that if
HasDate/HasTime is true, then it really is a valid timestamp.

To fix this, the posix times are checked in the constructor of
TimestampValue, and if it is outside the valid interval,both time_
and date_ are set to not_a_date_time.

Also added logging to "cast to timestamp" functions.

Test:
select cast(-17987443200-0.1 as timestamp);
This query no longer crashes, but returns NULL, similarly to other
< 1400 timestamps.

Change-Id: I77b2f6284d3a597f57e61c17a67c959eff9e38ff
---
M be/src/exec/parquet-column-readers.cc
M be/src/exec/parquet-column-stats.inline.h
M be/src/exprs/cast-functions-ir.cc
M be/src/exprs/expr-test.cc
M be/src/exprs/timestamp-functions-ir.cc
M be/src/runtime/timestamp-test.cc
M be/src/runtime/timestamp-value.cc
M be/src/runtime/timestamp-value.h
8 files changed, 81 insertions(+), 10 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/54/7954/13
-- 
To view, visit http://gerrit.cloudera.org:8080/7954
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I77b2f6284d3a597f57e61c17a67c959eff9e38ff
Gerrit-Change-Number: 7954
Gerrit-PatchSet: 13
Gerrit-Owner: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Dan Hecht <dh...@cloudera.com>
Gerrit-Reviewer: Lars Volker <lv...@cloudera.com>

[Impala-ASF-CR] IMPALA-5664: Unix time to timestamp conversions may crash Impala

Posted by "Csaba Ringhofer (Code Review)" <ge...@cloudera.org>.
Csaba Ringhofer has posted comments on this change. ( http://gerrit.cloudera.org:8080/7954 )

Change subject: IMPALA-5664: Unix time to timestamp conversions may crash Impala
......................................................................


Patch Set 2:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/7954/2/be/src/runtime/timestamp-value.h
File be/src/runtime/timestamp-value.h:

http://gerrit.cloudera.org:8080/#/c/7954/2/be/src/runtime/timestamp-value.h@86
PS2, Line 86:     }
> But what we need is to log to the warning log, like the code in timestamp-f
I have added some logging to warning. The string -> timestamp cast is not related to this issue, but I think it is very useful to warn the user about the invalid formatting.

I have left the logging in TimestampValue::Validate, maybe it will be useful for postmortem analysis one day.


http://gerrit.cloudera.org:8080/#/c/7954/8/be/src/runtime/timestamp-value.cc
File be/src/runtime/timestamp-value.cc:

http://gerrit.cloudera.org:8080/#/c/7954/8/be/src/runtime/timestamp-value.cc@116
PS8, Line 116: /// The time zone of the resulting ptime is local time. This is called by
> nit: for style consistency combine lines 115 & 116.
Done



-- 
To view, visit http://gerrit.cloudera.org:8080/7954
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I77b2f6284d3a597f57e61c17a67c959eff9e38ff
Gerrit-Change-Number: 7954
Gerrit-PatchSet: 2
Gerrit-Owner: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Dan Hecht <dh...@cloudera.com>
Gerrit-Reviewer: Lars Volker <lv...@cloudera.com>
Gerrit-Comment-Date: Mon, 02 Oct 2017 18:00:10 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-5664: Unix time to timestamp conversions may crash Impala

Posted by "Dan Hecht (Code Review)" <ge...@cloudera.org>.
Dan Hecht has posted comments on this change. ( http://gerrit.cloudera.org:8080/7954 )

Change subject: IMPALA-5664: Unix time to timestamp conversions may crash Impala
......................................................................


Patch Set 5:

(4 comments)

http://gerrit.cloudera.org:8080/#/c/7954/2/be/src/exprs/timestamp-functions-ir.cc
File be/src/exprs/timestamp-functions-ir.cc:

http://gerrit.cloudera.org:8080/#/c/7954/2/be/src/exprs/timestamp-functions-ir.cc@332
PS2, Line 332:   StringVal result(context, 10);
why does this DCHECK no longer make sense?  are we saying it's trivially true now after check HasDate() and with the checks you've added in this commit?


http://gerrit.cloudera.org:8080/#/c/7954/4/be/src/runtime/timestamp-value.h
File be/src/runtime/timestamp-value.h:

http://gerrit.cloudera.org:8080/#/c/7954/4/be/src/runtime/timestamp-value.h@173
PS4, Line 173: boost::gregorian::date
i think it would be best to avoid running the constructor given these things can throw exceptions, so let's use 'const &' here.


http://gerrit.cloudera.org:8080/#/c/7954/4/be/src/runtime/timestamp-value.h@287
PS4, Line 287:  
nit: extraneous space


http://gerrit.cloudera.org:8080/#/c/7954/2/be/src/runtime/timestamp-value.h
File be/src/runtime/timestamp-value.h:

http://gerrit.cloudera.org:8080/#/c/7954/2/be/src/runtime/timestamp-value.h@86
PS2, Line 86:   /// Constructors that parse from a date/time string. See TimestampParser for details
> I think it would be better to do the logging on the caller side, because de
Logging in the caller is fine but I don't see that added to your diff.



-- 
To view, visit http://gerrit.cloudera.org:8080/7954
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I77b2f6284d3a597f57e61c17a67c959eff9e38ff
Gerrit-Change-Number: 7954
Gerrit-PatchSet: 5
Gerrit-Owner: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Dan Hecht <dh...@cloudera.com>
Gerrit-Reviewer: Lars Volker <lv...@cloudera.com>
Gerrit-Comment-Date: Thu, 21 Sep 2017 20:50:20 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-5664: Unix time to timestamp conversions may crash Impala

Posted by "Csaba Ringhofer (Code Review)" <ge...@cloudera.org>.
Csaba Ringhofer has posted comments on this change. ( http://gerrit.cloudera.org:8080/7954 )

Change subject: IMPALA-5664: Unix time to timestamp conversions may crash Impala
......................................................................


Patch Set 10:

The fe/ files should not be in the patch, I just needed them to fix a compile error.


-- 
To view, visit http://gerrit.cloudera.org:8080/7954
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I77b2f6284d3a597f57e61c17a67c959eff9e38ff
Gerrit-Change-Number: 7954
Gerrit-PatchSet: 10
Gerrit-Owner: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Dan Hecht <dh...@cloudera.com>
Gerrit-Reviewer: Lars Volker <lv...@cloudera.com>
Gerrit-Comment-Date: Wed, 04 Oct 2017 15:15:46 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-5664: Unix time to timestamp conversions may crash Impala

Posted by "Csaba Ringhofer (Code Review)" <ge...@cloudera.org>.
Csaba Ringhofer has posted comments on this change. ( http://gerrit.cloudera.org:8080/7954 )

Change subject: IMPALA-5664: Unix time to timestamp conversions may crash Impala
......................................................................


Patch Set 11:

(5 comments)

http://gerrit.cloudera.org:8080/#/c/7954/11/be/src/exprs/cast-functions-ir.cc
File be/src/exprs/cast-functions-ir.cc:

http://gerrit.cloudera.org:8080/#/c/7954/11/be/src/exprs/cast-functions-ir.cc@277
PS11, Line 277:   }
> all of these new if-stmts need end-to-end tests to cover. they aren't exerc
They are exercised by different parts of be/exprs/expr-test.cc .I have searched for "TestIsNull.*as timestamp", and the existing tests cover every code line, but not every type instantiation for CAST_TO_SUBSECOND_TIMESTAMP / CAST_TO_TIMESTAMP, only BIGINT/DECIMAL/DOUBLE. Actually most integer types can not represent values that are outside the valid range, because it needs >32 bit.

This design (BE unit tests call FE to parse expressions) seems strange to me, but there is probably a valid reason why it was not done in FE or E2E.


http://gerrit.cloudera.org:8080/#/c/7954/11/be/src/exprs/timestamp-functions-ir.cc
File be/src/exprs/timestamp-functions-ir.cc:

http://gerrit.cloudera.org:8080/#/c/7954/11/be/src/exprs/timestamp-functions-ir.cc@139
PS11, Line 139:   }
> also needs test
Done


http://gerrit.cloudera.org:8080/#/c/7954/11/be/src/runtime/timestamp-test.cc
File be/src/runtime/timestamp-test.cc:

http://gerrit.cloudera.org:8080/#/c/7954/11/be/src/runtime/timestamp-test.cc@670
PS11, Line 670: *
> nit: missing space here and below
I can change this, but I prefer it this way actually. What is the rule, should we follow the style guide even if the contributor consciously ignored it?  It doesn`t matter in this case, but it would matter with short variable names, e.g. I find a*a + 2*a*b much more readable than a * a + 2 * a * b.


http://gerrit.cloudera.org:8080/#/c/7954/11/be/src/runtime/timestamp-value.h
File be/src/runtime/timestamp-value.h:

http://gerrit.cloudera.org:8080/#/c/7954/11/be/src/runtime/timestamp-value.h@43
PS11, Line 43: 9999
> is that what the documentation says? and did this code change affect the up
This comment was just wrong, we state <= 9999-12-31 or <10000-01-01 everywhere else. This limitation comes from boost::gregorian, and this patch has no effect on it.


http://gerrit.cloudera.org:8080/#/c/7954/11/be/src/runtime/timestamp-value.cc
File be/src/runtime/timestamp-value.cc:

http://gerrit.cloudera.org:8080/#/c/7954/11/be/src/runtime/timestamp-value.cc@116
PS11, Line 116: (
> style
Done



-- 
To view, visit http://gerrit.cloudera.org:8080/7954
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I77b2f6284d3a597f57e61c17a67c959eff9e38ff
Gerrit-Change-Number: 7954
Gerrit-PatchSet: 11
Gerrit-Owner: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Dan Hecht <dh...@cloudera.com>
Gerrit-Reviewer: Lars Volker <lv...@cloudera.com>
Gerrit-Comment-Date: Fri, 06 Oct 2017 13:00:10 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-5664: Unix time to timestamp conversions may crash Impala

Posted by "Dan Hecht (Code Review)" <ge...@cloudera.org>.
Dan Hecht has posted comments on this change.

Change subject: IMPALA-5664: Unix time to timestamp conversions may crash Impala
......................................................................


Patch Set 2:

(5 comments)

http://gerrit.cloudera.org:8080/#/c/7954/2/be/src/runtime/timestamp-test.cc
File be/src/runtime/timestamp-test.cc:

PS2, Line 637: FromUnixTime
The problem is only with the functions that take sub-second resolution, right? I think that's important to clarify -- otherwise it's not clear what we mean by rounding.


PS2, Line 685: n
nit: typo


http://gerrit.cloudera.org:8080/#/c/7954/2/be/src/runtime/timestamp-value.h
File be/src/runtime/timestamp-value.h:

PS2, Line 80: !date_.is_special()
how about HasDate()?  But also, should IsValidDate() be checking HasDate()? That code looks bogus if date_.is_special(), so the call from ScalarColumnReader<TimestampValue, true>::ValidateSlot() seems dangerous (if the parquet file contained a corrupted value that corresponded to not_a_date_time).


PS2, Line 81: ptime
> There was no exception thrown at the time of creation, but much later, when
We don't want to use exceptions in impala. We limit their use to when interacting with other libraries. So not throwing an exception here is the right thing.


Line 86:     }
why is that that the year() calls in timestamp-functions-ir.cc aren't sufficient, e.g.:

    // Validate that the ptime is not "special" (ie not_a_date_time) and has a valid year.
    // If validation fails, an exception is thrown.
    datetime.date().year();

Centralizing where we do this validation seems like a good idea, but unlike the other code that attempts to do this kind of thing, e.g. TimestampFunctions::AddSub(), we don't get a warning. The user should probably get a warning in this case.
2) If we're going to do the validation in a central place, we should try to clean up the other code that does the year() calls (assuming it's trying to validate the same thing).


-- 
To view, visit http://gerrit.cloudera.org:8080/7954
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I77b2f6284d3a597f57e61c17a67c959eff9e38ff
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Dan Hecht <dh...@cloudera.com>
Gerrit-Reviewer: Lars Volker <lv...@cloudera.com>
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-5664: Unix time to timestamp conversions may crash Impala

Posted by "Csaba Ringhofer (Code Review)" <ge...@cloudera.org>.
Csaba Ringhofer has posted comments on this change. ( http://gerrit.cloudera.org:8080/7954 )

Change subject: IMPALA-5664: Unix time to timestamp conversions may crash Impala
......................................................................


Patch Set 2:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/7954/4/be/src/runtime/timestamp-value.h
File be/src/runtime/timestamp-value.h:

http://gerrit.cloudera.org:8080/#/c/7954/4/be/src/runtime/timestamp-value.h@173
PS4, Line 173: 
> i think it would be best to avoid running the constructor given these thing
Done


http://gerrit.cloudera.org:8080/#/c/7954/4/be/src/runtime/timestamp-value.h@287
PS4, Line 287:  
> nit: extraneous space
Done


http://gerrit.cloudera.org:8080/#/c/7954/2/be/src/runtime/timestamp-value.h
File be/src/runtime/timestamp-value.h:

http://gerrit.cloudera.org:8080/#/c/7954/2/be/src/runtime/timestamp-value.h@86
PS2, Line 86:     }
> Logging in the caller is fine but I don't see that added to your diff.
I have added some logging, but not on the caller side.



-- 
To view, visit http://gerrit.cloudera.org:8080/7954
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I77b2f6284d3a597f57e61c17a67c959eff9e38ff
Gerrit-Change-Number: 7954
Gerrit-PatchSet: 2
Gerrit-Owner: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Dan Hecht <dh...@cloudera.com>
Gerrit-Reviewer: Lars Volker <lv...@cloudera.com>
Gerrit-Comment-Date: Fri, 29 Sep 2017 16:40:38 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-5664: Unix time to timestamp conversions may crash Impala

Posted by "Dan Hecht (Code Review)" <ge...@cloudera.org>.
Dan Hecht has posted comments on this change. ( http://gerrit.cloudera.org:8080/7954 )

Change subject: IMPALA-5664: Unix time to timestamp conversions may crash Impala
......................................................................


Patch Set 15: Code-Review+2


-- 
To view, visit http://gerrit.cloudera.org:8080/7954
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I77b2f6284d3a597f57e61c17a67c959eff9e38ff
Gerrit-Change-Number: 7954
Gerrit-PatchSet: 15
Gerrit-Owner: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Dan Hecht <dh...@cloudera.com>
Gerrit-Reviewer: Lars Volker <lv...@cloudera.com>
Gerrit-Comment-Date: Tue, 10 Oct 2017 17:48:04 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-5664: Unix time to timestamp conversions may crash Impala

Posted by "Csaba Ringhofer (Code Review)" <ge...@cloudera.org>.
Hello Lars Volker, Alex Behm, Dan Hecht, Impala Public Jenkins, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/7954

to look at the new patch set (#16).

Change subject: IMPALA-5664: Unix time to timestamp conversions may crash Impala
......................................................................

IMPALA-5664: Unix time to timestamp conversions may crash Impala

TimestampValue::FromSubsecondUnixTime() and UtcFromUnixTimeMicros()
are incorrect only in case of the last second of 1399, because these
sub-second values are rounded first towards 1400-01-01 00:00:00, which
is accepted as a valid date, and the sub-second part is subtracted
afterwards, leading to a date outside the valid interval. The maximum
case, 9999-12-31 59:59:59 is a bit different, because as I understand,
with nanosecond precision posix times, the maximum value is actually
10000-01-01. 00:00:00 minus 1 nanosec.

TimestampValue::FromUnixTimeNanos() can create problematic TimestampValues
both <1400 and 10000<=.

These timestamps can cause problems, because most code assumes that if
HasDate/HasTime is true, then it really is a valid timestamp.

To fix this, the posix times are checked in the constructor of
TimestampValue, and if it is outside the valid interval,both time_
and date_ are set to not_a_date_time.

Test:
select cast(-17987443200-0.1 as timestamp);
This query no longer crashes, but returns NULL, similarly to other
< 1400 timestamps.

Change-Id: I77b2f6284d3a597f57e61c17a67c959eff9e38ff
---
M be/src/exec/parquet-column-readers.cc
M be/src/exec/parquet-column-stats.inline.h
M be/src/exprs/expr-test.cc
M be/src/exprs/timestamp-functions-ir.cc
M be/src/runtime/timestamp-test.cc
M be/src/runtime/timestamp-value.cc
M be/src/runtime/timestamp-value.h
M testdata/workloads/functional-query/queries/QueryTest/exprs.test
8 files changed, 146 insertions(+), 10 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/54/7954/16
-- 
To view, visit http://gerrit.cloudera.org:8080/7954
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I77b2f6284d3a597f57e61c17a67c959eff9e38ff
Gerrit-Change-Number: 7954
Gerrit-PatchSet: 16
Gerrit-Owner: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Alex Behm <al...@cloudera.com>
Gerrit-Reviewer: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Dan Hecht <dh...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Lars Volker <lv...@cloudera.com>

[Impala-ASF-CR] IMPALA-5664: Unix time to timestamp conversions may crash Impala

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/7954 )

Change subject: IMPALA-5664: Unix time to timestamp conversions may crash Impala
......................................................................


Patch Set 15: Verified-1

Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/1323/


-- 
To view, visit http://gerrit.cloudera.org:8080/7954
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I77b2f6284d3a597f57e61c17a67c959eff9e38ff
Gerrit-Change-Number: 7954
Gerrit-PatchSet: 15
Gerrit-Owner: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Dan Hecht <dh...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Lars Volker <lv...@cloudera.com>
Gerrit-Comment-Date: Tue, 10 Oct 2017 21:46:09 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-5664: Unix time to timestamp conversions may crash Impala

Posted by "Csaba Ringhofer (Code Review)" <ge...@cloudera.org>.
Csaba Ringhofer has posted comments on this change. ( http://gerrit.cloudera.org:8080/7954 )

Change subject: IMPALA-5664: Unix time to timestamp conversions may crash Impala
......................................................................


Patch Set 2:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/7954/5/be/src/runtime/timestamp-value.h
File be/src/runtime/timestamp-value.h:

http://gerrit.cloudera.org:8080/#/c/7954/5/be/src/runtime/timestamp-value.h@173
PS5, Line 173: 
This became static to enable other functions to check whether their date is valid or not without actually creating a TimestampValue or calling year() to generate an exception. I think it would be better if there was a single function that does this check, so if 1400 will be changed for some reason ( IMPALA-2009 ), there will be no need to change many parts of the code.


http://gerrit.cloudera.org:8080/#/c/7954/2/be/src/runtime/timestamp-value.h
File be/src/runtime/timestamp-value.h:

http://gerrit.cloudera.org:8080/#/c/7954/2/be/src/runtime/timestamp-value.h@86
PS2, Line 86:     }
> One purpose of FunctionContext is to be the interface back into impala's ru
I think it would be better to do the logging on the caller side, because depending on the caller, the format for the log can be different, e.g. unix timestamp vs YYYY-MM-DD string.



-- 
To view, visit http://gerrit.cloudera.org:8080/7954
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I77b2f6284d3a597f57e61c17a67c959eff9e38ff
Gerrit-Change-Number: 7954
Gerrit-PatchSet: 2
Gerrit-Owner: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Dan Hecht <dh...@cloudera.com>
Gerrit-Reviewer: Lars Volker <lv...@cloudera.com>
Gerrit-Comment-Date: Thu, 21 Sep 2017 17:29:05 +0000
Gerrit-HasComments: Yes