You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@impala.apache.org by "Adam Tamas (Code Review)" <ge...@cloudera.org> on 2020/05/25 09:00:11 UTC

[Impala-ASF-CR] IMPALA-9531: Dropped support for dateless timestamps

Adam Tamas has uploaded this change for review. ( http://gerrit.cloudera.org:8080/15866


Change subject: IMPALA-9531: Dropped support for dateless timestamps
......................................................................

IMPALA-9531: Dropped support for dateless timestamps

Removed the support for dateless timestamps.
During dateless timestamp casts if the format doesn't contain
date part we get an error during tokenization of the format.
If the input str doesn't contain a date part then we get null result.

Examples:
select to_timestamp('01:01:01', 'HH:mm:ss');
select cast('01:02:59' as timestamp);
This will come back as NULL values

select cast(string_col as timestamp) from table_name;
Casting from a table with a similar method like this
will also give back NULL values similar to the above example.

select cast('01:02:59' as timestamp format 'HH12:MI:SS');
select cast('12 AM' as timestamp FORMAT 'AM.HH12');
These will come back with a parsing error which is:
ERROR: PARSE ERROR: No date tokens provided.

Testing:
modified the previous test related to dateless timestamps and
added new tests to check if the conversions backwards(timestamp to
string) is still working

Change-Id: I48c49bf027cc4b917849b3d58518facba372b322
---
M be/src/exec/text-converter.inline.h
M be/src/exprs/cast-functions-ir.cc
M be/src/exprs/expr-test.cc
M be/src/exprs/timestamp-functions-ir.cc
M be/src/runtime/date-parse-util.cc
M be/src/runtime/datetime-iso-sql-format-tokenizer.cc
M be/src/runtime/datetime-parser-common.cc
M be/src/runtime/datetime-parser-common.h
M be/src/runtime/datetime-simple-date-format-parser.cc
M be/src/runtime/datetime-simple-date-format-parser.h
M be/src/runtime/timestamp-parse-util.cc
M be/src/runtime/timestamp-test.cc
M be/src/runtime/timestamp-value.h
M fe/src/test/java/org/apache/impala/analysis/AnalyzeKuduDDLTest.java
M testdata/data/lazy_timestamp.csv
M testdata/workloads/functional-query/queries/QueryTest/date.test
M testdata/workloads/functional-query/queries/QueryTest/exprs.test
M testdata/workloads/functional-query/queries/QueryTest/select-lazy-timestamp.test
M tests/data_errors/test_data_errors.py
M tests/query_test/test_cast_with_format.py
20 files changed, 131 insertions(+), 198 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/66/15866/2
-- 
To view, visit http://gerrit.cloudera.org:8080/15866
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I48c49bf027cc4b917849b3d58518facba372b322
Gerrit-Change-Number: 15866
Gerrit-PatchSet: 2
Gerrit-Owner: Adam Tamas <ta...@cloudera.com>
Gerrit-Reviewer: Gabor Kaszab <ga...@cloudera.com>

[Impala-ASF-CR] IMPALA-9531: Dropped support for dateless timestamps

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/15866 )

Change subject: IMPALA-9531: Dropped support for dateless timestamps
......................................................................


Patch Set 3:

Build Successful 

https://jenkins.impala.io/job/gerrit-code-review-checks/6136/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests.


-- 
To view, visit http://gerrit.cloudera.org:8080/15866
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I48c49bf027cc4b917849b3d58518facba372b322
Gerrit-Change-Number: 15866
Gerrit-PatchSet: 3
Gerrit-Owner: Adam Tamas <ta...@cloudera.com>
Gerrit-Reviewer: Adam Tamas <ta...@cloudera.com>
Gerrit-Reviewer: Gabor Kaszab <ga...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Comment-Date: Mon, 25 May 2020 09:53:31 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-9531: Dropped support for dateless timestamps

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/15866 )

Change subject: IMPALA-9531: Dropped support for dateless timestamps
......................................................................


Patch Set 13: Verified+1


-- 
To view, visit http://gerrit.cloudera.org:8080/15866
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I48c49bf027cc4b917849b3d58518facba372b322
Gerrit-Change-Number: 15866
Gerrit-PatchSet: 13
Gerrit-Owner: Adam Tamas <ta...@cloudera.com>
Gerrit-Reviewer: Adam Tamas <ta...@cloudera.com>
Gerrit-Reviewer: Gabor Kaszab <ga...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Comment-Date: Wed, 08 Jul 2020 17:09:10 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-9531: Dropped support for dateless timestamps

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/15866 )

Change subject: IMPALA-9531: Dropped support for dateless timestamps
......................................................................


Patch Set 5:

Build Failed 

https://jenkins.impala.io/job/gerrit-code-review-checks/6385/ : Initial code review checks failed. See linked job for details on the failure.


-- 
To view, visit http://gerrit.cloudera.org:8080/15866
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I48c49bf027cc4b917849b3d58518facba372b322
Gerrit-Change-Number: 15866
Gerrit-PatchSet: 5
Gerrit-Owner: Adam Tamas <ta...@cloudera.com>
Gerrit-Reviewer: Adam Tamas <ta...@cloudera.com>
Gerrit-Reviewer: Gabor Kaszab <ga...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Comment-Date: Sun, 21 Jun 2020 11:44:46 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-9531: Dropped support for dateless timestamps

Posted by "Adam Tamas (Code Review)" <ge...@cloudera.org>.
Adam Tamas has uploaded a new patch set (#8). ( http://gerrit.cloudera.org:8080/15866 )

Change subject: IMPALA-9531: Dropped support for dateless timestamps
......................................................................

IMPALA-9531: Dropped support for dateless timestamps

Removed the support for dateless timestamps.
During dateless timestamp casts if the format doesn't contain
date part we get an error during tokenization of the format.
If the input str doesn't contain a date part then we get null result.

Examples:
select cast('01:02:59' as timestamp);
This will come back as NULL value.

select to_timestamp('01:01:01', 'HH:mm:ss');
select cast('01:02:59' as timestamp format 'HH12:MI:SS');
select cast('12 AM' as timestamp FORMAT 'AM.HH12');
These will come back with a parsing errors.

Casting from a table will generate similar results.

Testing:
Modified the previous tests related to dateless timestamps.
Added test to read fromtables which are still containing dateless
timestamps and covered timestamp to string path when no date tokens
are requested in the output string.

Change-Id: I48c49bf027cc4b917849b3d58518facba372b322
---
M be/src/benchmarks/convert-timestamp-benchmark.cc
M be/src/benchmarks/parse-timestamp-benchmark.cc
M be/src/exec/text-converter.inline.h
M be/src/exprs/cast-functions-ir.cc
M be/src/exprs/expr-test.cc
M be/src/exprs/scalar-expr-evaluator.cc
M be/src/exprs/timestamp-functions-ir.cc
M be/src/exprs/timestamp-functions.cc
M be/src/exprs/timestamp-functions.h
M be/src/runtime/date-parse-util.cc
M be/src/runtime/date-test.cc
M be/src/runtime/datetime-iso-sql-format-tokenizer.cc
M be/src/runtime/datetime-parser-common.cc
M be/src/runtime/datetime-parser-common.h
M be/src/runtime/datetime-simple-date-format-parser.cc
M be/src/runtime/datetime-simple-date-format-parser.h
M be/src/runtime/timestamp-parse-util.cc
M be/src/runtime/timestamp-test.cc
M be/src/runtime/timestamp-value.h
M bin/rat_exclude_files.txt
M common/function-registry/impala_functions.py
M fe/src/test/java/org/apache/impala/analysis/AnalyzeKuduDDLTest.java
M testdata/data/README
A testdata/data/dateless_timestamps.parq
A testdata/data/dateless_timestamps.txt
M testdata/data/lazy_timestamp.csv
M testdata/workloads/functional-query/queries/QueryTest/date.test
A testdata/workloads/functional-query/queries/QueryTest/dateless_timestamp_parquet.test
A testdata/workloads/functional-query/queries/QueryTest/dateless_timestamp_text.test
M testdata/workloads/functional-query/queries/QueryTest/exprs.test
M testdata/workloads/functional-query/queries/QueryTest/select-lazy-timestamp.test
M tests/data_errors/test_data_errors.py
M tests/query_test/test_cast_with_format.py
M tests/query_test/test_scanners.py
34 files changed, 278 insertions(+), 231 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/66/15866/8
-- 
To view, visit http://gerrit.cloudera.org:8080/15866
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I48c49bf027cc4b917849b3d58518facba372b322
Gerrit-Change-Number: 15866
Gerrit-PatchSet: 8
Gerrit-Owner: Adam Tamas <ta...@cloudera.com>
Gerrit-Reviewer: Adam Tamas <ta...@cloudera.com>
Gerrit-Reviewer: Gabor Kaszab <ga...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>

[Impala-ASF-CR] IMPALA-9531: Dropped support for dateless timestamps

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/15866 )

Change subject: IMPALA-9531: Dropped support for dateless timestamps
......................................................................


Patch Set 6:

Build Successful 

https://jenkins.impala.io/job/gerrit-code-review-checks/6386/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests.


-- 
To view, visit http://gerrit.cloudera.org:8080/15866
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I48c49bf027cc4b917849b3d58518facba372b322
Gerrit-Change-Number: 15866
Gerrit-PatchSet: 6
Gerrit-Owner: Adam Tamas <ta...@cloudera.com>
Gerrit-Reviewer: Adam Tamas <ta...@cloudera.com>
Gerrit-Reviewer: Gabor Kaszab <ga...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Comment-Date: Mon, 22 Jun 2020 09:02:52 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-9531: Dropped support for dateless timestamps

Posted by "Adam Tamas (Code Review)" <ge...@cloudera.org>.
Adam Tamas has posted comments on this change. ( http://gerrit.cloudera.org:8080/15866 )

Change subject: IMPALA-9531: Dropped support for dateless timestamps
......................................................................


Patch Set 6:

There was a rebase between patch set 5 and 6. The only thing that is changed is bin/rat_exclude_files.txt.


-- 
To view, visit http://gerrit.cloudera.org:8080/15866
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I48c49bf027cc4b917849b3d58518facba372b322
Gerrit-Change-Number: 15866
Gerrit-PatchSet: 6
Gerrit-Owner: Adam Tamas <ta...@cloudera.com>
Gerrit-Reviewer: Adam Tamas <ta...@cloudera.com>
Gerrit-Reviewer: Gabor Kaszab <ga...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Comment-Date: Mon, 22 Jun 2020 08:35:14 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-9531: Dropped support for dateless timestamps

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/15866 )

Change subject: IMPALA-9531: Dropped support for dateless timestamps
......................................................................


Patch Set 4:

Build Failed 

https://jenkins.impala.io/job/gerrit-code-review-checks/6336/ : Initial code review checks failed. See linked job for details on the failure.


-- 
To view, visit http://gerrit.cloudera.org:8080/15866
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I48c49bf027cc4b917849b3d58518facba372b322
Gerrit-Change-Number: 15866
Gerrit-PatchSet: 4
Gerrit-Owner: Adam Tamas <ta...@cloudera.com>
Gerrit-Reviewer: Adam Tamas <ta...@cloudera.com>
Gerrit-Reviewer: Gabor Kaszab <ga...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Comment-Date: Tue, 16 Jun 2020 14:40:50 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-9531: Dropped support for dateless timestamps

Posted by "Adam Tamas (Code Review)" <ge...@cloudera.org>.
Adam Tamas has posted comments on this change. ( http://gerrit.cloudera.org:8080/15866 )

Change subject: IMPALA-9531: Dropped support for dateless timestamps
......................................................................


Patch Set 4:

(18 comments)

http://gerrit.cloudera.org:8080/#/c/15866/3//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/15866/3//COMMIT_MSG@15
PS3, Line 15: select cast('01:02:59' as timestamp);
> This should return parse error because the format itself doesn't contain da
Done


http://gerrit.cloudera.org:8080/#/c/15866/3//COMMIT_MSG@17
PS3, Line 17: 
> nit: dot at the end of a sentence.
Done


http://gerrit.cloudera.org:8080/#/c/15866/3//COMMIT_MSG@17
PS3, Line 17: 
> nit: This should be plural in my opinion as it stands for 2 separate exampl
Done


http://gerrit.cloudera.org:8080/#/c/15866/3//COMMIT_MSG@28
PS3, Line 28: timesta
> I still don't see tests where you use a file populated by dateless timestam
Done


http://gerrit.cloudera.org:8080/#/c/15866/3//COMMIT_MSG@29
PS3, Line 29: are requ
> nit: start a sentence with capital letter.
Done


http://gerrit.cloudera.org:8080/#/c/15866/3//COMMIT_MSG@29
PS3, Line 29: utpu
> nit: tests
Done


http://gerrit.cloudera.org:8080/#/c/15866/3//COMMIT_MSG@30
PS3, Line 30: 
> It's just a matter of your point of view which direction you call backwards
Done


http://gerrit.cloudera.org:8080/#/c/15866/3/be/src/exprs/expr-test.cc
File be/src/exprs/expr-test.cc:

http://gerrit.cloudera.org:8080/#/c/15866/3/be/src/exprs/expr-test.cc@4507
PS3, Line 4507: 1.00
> Is this related to your change or does it come with a rebase you've made in
Sorry for this one, I really did a rebase between the commits.


http://gerrit.cloudera.org:8080/#/c/15866/3/be/src/exprs/expr-test.cc@7536
PS3, Line 7536: H:mm:ss'
> Actually, we should get an error when parsing the format and see that there
Done


http://gerrit.cloudera.org:8080/#/c/15866/3/be/src/runtime/datetime-simple-date-format-parser.h
File be/src/runtime/datetime-simple-date-format-parser.h:

http://gerrit.cloudera.org:8080/#/c/15866/3/be/src/runtime/datetime-simple-date-format-parser.h@88
PS3, Line 88: 
> Can you re-use the CastDirection type similarly as it is used in e.g. ISO S
Done


http://gerrit.cloudera.org:8080/#/c/15866/3/be/src/runtime/datetime-simple-date-format-parser.cc
File be/src/runtime/datetime-simple-date-format-parser.cc:

http://gerrit.cloudera.org:8080/#/c/15866/3/be/src/runtime/datetime-simple-date-format-parser.cc@179
PS3, Line 179:   if (cast_mode == PARSE) return (dt_ctx->has_date_toks);
> Please don't leave commented code
Done


http://gerrit.cloudera.org:8080/#/c/15866/3/be/src/runtime/datetime-simple-date-format-parser.cc@180
PS3, Line 180:   return (dt_ctx->has_date_toks || dt_ctx->has_time_toks);
> This seems logically correct but not that readable. Additionally, this is n
Done


http://gerrit.cloudera.org:8080/#/c/15866/3/be/src/runtime/timestamp-test.cc
File be/src/runtime/timestamp-test.cc:

http://gerrit.cloudera.org:8080/#/c/15866/3/be/src/runtime/timestamp-test.cc@610
PS3, Line 610:     (TimestampTC("yyyy-MM-dd HH:m:ss", "2020-05-11 1:24:34", false, true))
> Here the intention was to also cover the case when the minute part of the i
Here we are checking the format tokens not the given string themselves which has a short minute part("yyyy-MM-dd HH:m:ss")

We use the short form here because we expect it to fail.


http://gerrit.cloudera.org:8080/#/c/15866/3/be/src/runtime/timestamp-test.cc@612
PS3, Line 612: (TimestampTC("yyyy-MM-dd HH:mm:s", "2020-05-11 14:24:34", false, true, true, 2020,
             :         05, 11, 14, 24, 34))
> Similarly to the other tests, please add another one where the second part 
Done


http://gerrit.cloudera.org:8080/#/c/15866/3/be/src/runtime/timestamp-test.cc@679
PS3, Line 679: mestampFormatTC(1382337792, "yyy
> nit: This sentence sounds a bit weird for me.
Done


http://gerrit.cloudera.org:8080/#/c/15866/3/be/src/runtime/timestamp-test.cc@695
PS3, Line 695: Test padding on double di
> nit: same as above
Done


http://gerrit.cloudera.org:8080/#/c/15866/3/testdata/workloads/functional-query/queries/QueryTest/exprs.test
File testdata/workloads/functional-query/queries/QueryTest/exprs.test:

http://gerrit.cloudera.org:8080/#/c/15866/3/testdata/workloads/functional-query/queries/QueryTest/exprs.test@2976
PS3, Line 2976: select to_timestamp('01:01:01', 'HH:mm:ss');
> As mentioned elsewhere, this should result a parse error as the format does
Done


http://gerrit.cloudera.org:8080/#/c/15866/3/testdata/workloads/functional-query/queries/QueryTest/exprs.test@2981
PS3, Line 2981: select to_timestamp('01:01:01.123', 'HH:mm:ss.SSS');
> same as above
Done



-- 
To view, visit http://gerrit.cloudera.org:8080/15866
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I48c49bf027cc4b917849b3d58518facba372b322
Gerrit-Change-Number: 15866
Gerrit-PatchSet: 4
Gerrit-Owner: Adam Tamas <ta...@cloudera.com>
Gerrit-Reviewer: Adam Tamas <ta...@cloudera.com>
Gerrit-Reviewer: Gabor Kaszab <ga...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Comment-Date: Tue, 16 Jun 2020 13:56:06 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-9531: Dropped support for dateless timestamps

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/15866 )

Change subject: IMPALA-9531: Dropped support for dateless timestamps
......................................................................


Patch Set 11:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/6100/ DRY_RUN=false


-- 
To view, visit http://gerrit.cloudera.org:8080/15866
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I48c49bf027cc4b917849b3d58518facba372b322
Gerrit-Change-Number: 15866
Gerrit-PatchSet: 11
Gerrit-Owner: Adam Tamas <ta...@cloudera.com>
Gerrit-Reviewer: Adam Tamas <ta...@cloudera.com>
Gerrit-Reviewer: Gabor Kaszab <ga...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Comment-Date: Tue, 07 Jul 2020 13:08:22 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-9531: Dropped support for dateless timestamps

Posted by "Gabor Kaszab (Code Review)" <ge...@cloudera.org>.
Gabor Kaszab has posted comments on this change. ( http://gerrit.cloudera.org:8080/15866 )

Change subject: IMPALA-9531: Dropped support for dateless timestamps
......................................................................


Patch Set 4:

(8 comments)

Nice work! I feel that this patch is getting into shape. I have some follow-up comments but none of the seems super difficult to address.

http://gerrit.cloudera.org:8080/#/c/15866/3/be/src/exprs/timestamp-functions-ir.cc
File be/src/exprs/timestamp-functions-ir.cc:

http://gerrit.cloudera.org:8080/#/c/15866/3/be/src/exprs/timestamp-functions-ir.cc@167
PS3, Line 167: dt_ctx->Reset(reinterpret_cast<co
Did you find out what this check is for? Don't you break anything with simply getting rig of this check?


http://gerrit.cloudera.org:8080/#/c/15866/4/be/src/exprs/timestamp-functions-ir.cc
File be/src/exprs/timestamp-functions-ir.cc:

http://gerrit.cloudera.org:8080/#/c/15866/4/be/src/exprs/timestamp-functions-ir.cc@168
PS4, Line 168: if (!SimpleDateFormatTokenizer::Tokenize(dt_ctx, true, PARSE)) {
Anyway, if you tokenize the format here then you do the tokenization for each row of the table even though it would be enough to do once before Impala starts reading rows.


http://gerrit.cloudera.org:8080/#/c/15866/4/be/src/exprs/timestamp-functions.h
File be/src/exprs/timestamp-functions.h:

http://gerrit.cloudera.org:8080/#/c/15866/4/be/src/exprs/timestamp-functions.h@104
PS4, Line 104: bool format
Please use datetime_parse_util::CastDirection


http://gerrit.cloudera.org:8080/#/c/15866/4/be/src/exprs/timestamp-functions.cc
File be/src/exprs/timestamp-functions.cc:

http://gerrit.cloudera.org:8080/#/c/15866/4/be/src/exprs/timestamp-functions.cc@155
PS4, Line 155: bool format = true
Why don't you use the parameter type that could be directly passed to Tokenize()?


http://gerrit.cloudera.org:8080/#/c/15866/4/be/src/exprs/timestamp-functions.cc@167
PS4, Line 167: if (format)
             :       parse_result = SimpleDateFormatTokenizer::Tokenize(dt_ctx, true, FORMAT);
             :     else
             :       parse_result = SimpleDateFormatTokenizer::Tokenize(dt_ctx, true, PARSE);
Please check how to format if-else in Impala


http://gerrit.cloudera.org:8080/#/c/15866/4/be/src/runtime/datetime-simple-date-format-parser.h
File be/src/runtime/datetime-simple-date-format-parser.h:

http://gerrit.cloudera.org:8080/#/c/15866/4/be/src/runtime/datetime-simple-date-format-parser.h@88
PS4, Line 88: CastDirection cast_mode = FORMAT
I wouldn't give this a default value unless you have any specific reason making it this way.


http://gerrit.cloudera.org:8080/#/c/15866/4/testdata/data/README
File testdata/data/README:

http://gerrit.cloudera.org:8080/#/c/15866/4/testdata/data/README@503
PS4, Line 503: Timestamp_parquet_test
The file name is not that verbose. After reading it it doesn't give any hints what timestamp tests it is used for. "dateless_timestamps.parq" would be better for this purpose.

nit: No need to include "parquet" into the file name as the extension already expresses the format.
nit: The file name itself starts with a lowercase letter.


http://gerrit.cloudera.org:8080/#/c/15866/4/testdata/data/timestamp_text_test.txt
File testdata/data/timestamp_text_test.txt:

http://gerrit.cloudera.org:8080/#/c/15866/4/testdata/data/timestamp_text_test.txt@1
PS4, Line 1: 1996-04-22 10:00:00.432100000
Same comment for the file name as for the parquet file: "dateless_timestamps.txt" would be more verbose.



-- 
To view, visit http://gerrit.cloudera.org:8080/15866
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I48c49bf027cc4b917849b3d58518facba372b322
Gerrit-Change-Number: 15866
Gerrit-PatchSet: 4
Gerrit-Owner: Adam Tamas <ta...@cloudera.com>
Gerrit-Reviewer: Adam Tamas <ta...@cloudera.com>
Gerrit-Reviewer: Gabor Kaszab <ga...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Comment-Date: Tue, 16 Jun 2020 15:12:25 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-9531: Dropped support for dateless timestamps

Posted by "Adam Tamas (Code Review)" <ge...@cloudera.org>.
Adam Tamas has uploaded a new patch set (#10). ( http://gerrit.cloudera.org:8080/15866 )

Change subject: IMPALA-9531: Dropped support for dateless timestamps
......................................................................

IMPALA-9531: Dropped support for dateless timestamps

Removed the support for dateless timestamps.
During dateless timestamp casts if the format doesn't contain
date part we get an error during tokenization of the format.
If the input str doesn't contain a date part then we get null result.

Examples:
select cast('01:02:59' as timestamp);
This will come back as NULL value.

select to_timestamp('01:01:01', 'HH:mm:ss');
select cast('01:02:59' as timestamp format 'HH12:MI:SS');
select cast('12 AM' as timestamp FORMAT 'AM.HH12');
These will come back with a parsing errors.

Casting from a table will generate similar results.

Testing:
Modified the previous tests related to dateless timestamps.
Added test to read fromtables which are still containing dateless
timestamps and covered timestamp to string path when no date tokens
are requested in the output string.

Change-Id: I48c49bf027cc4b917849b3d58518facba372b322
---
M be/src/benchmarks/convert-timestamp-benchmark.cc
M be/src/benchmarks/parse-timestamp-benchmark.cc
M be/src/exec/text-converter.inline.h
M be/src/exprs/cast-functions-ir.cc
M be/src/exprs/expr-test.cc
M be/src/exprs/scalar-expr-evaluator.cc
M be/src/exprs/timestamp-functions-ir.cc
M be/src/exprs/timestamp-functions.cc
M be/src/exprs/timestamp-functions.h
M be/src/runtime/date-parse-util.cc
M be/src/runtime/date-test.cc
M be/src/runtime/datetime-iso-sql-format-tokenizer.cc
M be/src/runtime/datetime-parser-common.cc
M be/src/runtime/datetime-parser-common.h
M be/src/runtime/datetime-simple-date-format-parser.cc
M be/src/runtime/datetime-simple-date-format-parser.h
M be/src/runtime/timestamp-parse-util.cc
M be/src/runtime/timestamp-test.cc
M be/src/runtime/timestamp-value.h
M bin/rat_exclude_files.txt
M common/function-registry/impala_functions.py
M fe/src/test/java/org/apache/impala/analysis/AnalyzeKuduDDLTest.java
M testdata/data/README
A testdata/data/dateless_timestamps.parq
A testdata/data/dateless_timestamps.txt
M testdata/data/lazy_timestamp.csv
M testdata/workloads/functional-query/queries/QueryTest/date.test
A testdata/workloads/functional-query/queries/QueryTest/dateless_timestamp_parquet.test
A testdata/workloads/functional-query/queries/QueryTest/dateless_timestamp_text.test
M testdata/workloads/functional-query/queries/QueryTest/exprs.test
M testdata/workloads/functional-query/queries/QueryTest/select-lazy-timestamp.test
M tests/data_errors/test_data_errors.py
M tests/query_test/test_cast_with_format.py
M tests/query_test/test_scanners.py
34 files changed, 277 insertions(+), 231 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/66/15866/10
-- 
To view, visit http://gerrit.cloudera.org:8080/15866
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I48c49bf027cc4b917849b3d58518facba372b322
Gerrit-Change-Number: 15866
Gerrit-PatchSet: 10
Gerrit-Owner: Adam Tamas <ta...@cloudera.com>
Gerrit-Reviewer: Adam Tamas <ta...@cloudera.com>
Gerrit-Reviewer: Gabor Kaszab <ga...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>

[Impala-ASF-CR] IMPALA-9531: Dropped support for dateless timestamps

Posted by "Adam Tamas (Code Review)" <ge...@cloudera.org>.
Adam Tamas has uploaded a new patch set (#4). ( http://gerrit.cloudera.org:8080/15866 )

Change subject: IMPALA-9531: Dropped support for dateless timestamps
......................................................................

IMPALA-9531: Dropped support for dateless timestamps

Removed the support for dateless timestamps.
During dateless timestamp casts if the format doesn't contain
date part we get an error during tokenization of the format.
If the input str doesn't contain a date part then we get null result.

Examples:
select cast('01:02:59' as timestamp);
This will come back as NULL value.

select to_timestamp('01:01:01', 'HH:mm:ss');
select cast('01:02:59' as timestamp format 'HH12:MI:SS');
select cast('12 AM' as timestamp FORMAT 'AM.HH12');
These will come back with a parsing errors.

Casting from a table will generate similar results.

Testing:
Modified the previous tests related to dateless timestamps.
Added test to read fromtables which are still containing dateless
timestamps and covered timestamp to string path when no date tokens
are requested in the output string.

Change-Id: I48c49bf027cc4b917849b3d58518facba372b322
---
M be/src/exec/text-converter.inline.h
M be/src/exprs/cast-functions-ir.cc
M be/src/exprs/expr-test.cc
M be/src/exprs/scalar-expr-evaluator.cc
M be/src/exprs/timestamp-functions-ir.cc
M be/src/exprs/timestamp-functions.cc
M be/src/exprs/timestamp-functions.h
M be/src/runtime/date-parse-util.cc
M be/src/runtime/datetime-iso-sql-format-tokenizer.cc
M be/src/runtime/datetime-parser-common.cc
M be/src/runtime/datetime-parser-common.h
M be/src/runtime/datetime-simple-date-format-parser.cc
M be/src/runtime/datetime-simple-date-format-parser.h
M be/src/runtime/timestamp-parse-util.cc
M be/src/runtime/timestamp-test.cc
M be/src/runtime/timestamp-value.h
M common/function-registry/impala_functions.py
M fe/src/test/java/org/apache/impala/analysis/AnalyzeKuduDDLTest.java
M testdata/data/README
M testdata/data/lazy_timestamp.csv
A testdata/data/timestamp_parquet_test.parq
A testdata/data/timestamp_text_test.txt
M testdata/workloads/functional-query/queries/QueryTest/date.test
A testdata/workloads/functional-query/queries/QueryTest/dateless_timestamp_parquet.test
A testdata/workloads/functional-query/queries/QueryTest/dateless_timestamp_text.test
M testdata/workloads/functional-query/queries/QueryTest/exprs.test
M testdata/workloads/functional-query/queries/QueryTest/select-lazy-timestamp.test
M tests/data_errors/test_data_errors.py
M tests/query_test/test_cast_with_format.py
M tests/query_test/test_scanners.py
30 files changed, 263 insertions(+), 217 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/66/15866/4
-- 
To view, visit http://gerrit.cloudera.org:8080/15866
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I48c49bf027cc4b917849b3d58518facba372b322
Gerrit-Change-Number: 15866
Gerrit-PatchSet: 4
Gerrit-Owner: Adam Tamas <ta...@cloudera.com>
Gerrit-Reviewer: Adam Tamas <ta...@cloudera.com>
Gerrit-Reviewer: Gabor Kaszab <ga...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>

[Impala-ASF-CR] IMPALA-9531: Dropped support for dateless timestamps

Posted by "Gabor Kaszab (Code Review)" <ge...@cloudera.org>.
Gabor Kaszab has posted comments on this change. ( http://gerrit.cloudera.org:8080/15866 )

Change subject: IMPALA-9531: Dropped support for dateless timestamps
......................................................................


Patch Set 6:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/15866/6/be/src/exprs/timestamp-functions.cc
File be/src/exprs/timestamp-functions.cc:

http://gerrit.cloudera.org:8080/#/c/15866/6/be/src/exprs/timestamp-functions.cc@154
PS6, Line 154: void UnixAndFromUnixPrepare(FunctionContext* context,
This function is not included in the header because you didn't want to include the whole datatime-common for CastDirection, right? Could you mention it here?


http://gerrit.cloudera.org:8080/#/c/15866/6/be/src/runtime/datetime-simple-date-format-parser.h
File be/src/runtime/datetime-simple-date-format-parser.h:

http://gerrit.cloudera.org:8080/#/c/15866/6/be/src/runtime/datetime-simple-date-format-parser.h@87
PS6, Line 87: CastDirection cast_mode = FORMAT
This still has a default value. (I saw that you actually added the parameters on the callsites.)



-- 
To view, visit http://gerrit.cloudera.org:8080/15866
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I48c49bf027cc4b917849b3d58518facba372b322
Gerrit-Change-Number: 15866
Gerrit-PatchSet: 6
Gerrit-Owner: Adam Tamas <ta...@cloudera.com>
Gerrit-Reviewer: Adam Tamas <ta...@cloudera.com>
Gerrit-Reviewer: Gabor Kaszab <ga...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Comment-Date: Fri, 26 Jun 2020 13:58:38 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-9531: Dropped support for dateless timestamps

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/15866 )

Change subject: IMPALA-9531: Dropped support for dateless timestamps
......................................................................


Patch Set 8:

Build Successful 

https://jenkins.impala.io/job/gerrit-code-review-checks/6462/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests.


-- 
To view, visit http://gerrit.cloudera.org:8080/15866
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I48c49bf027cc4b917849b3d58518facba372b322
Gerrit-Change-Number: 15866
Gerrit-PatchSet: 8
Gerrit-Owner: Adam Tamas <ta...@cloudera.com>
Gerrit-Reviewer: Adam Tamas <ta...@cloudera.com>
Gerrit-Reviewer: Gabor Kaszab <ga...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Comment-Date: Tue, 30 Jun 2020 07:29:26 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-9531: Dropped support for dateless timestamps

Posted by "Adam Tamas (Code Review)" <ge...@cloudera.org>.
Adam Tamas has uploaded a new patch set (#5). ( http://gerrit.cloudera.org:8080/15866 )

Change subject: IMPALA-9531: Dropped support for dateless timestamps
......................................................................

IMPALA-9531: Dropped support for dateless timestamps

Removed the support for dateless timestamps.
During dateless timestamp casts if the format doesn't contain
date part we get an error during tokenization of the format.
If the input str doesn't contain a date part then we get null result.

Examples:
select cast('01:02:59' as timestamp);
This will come back as NULL value.

select to_timestamp('01:01:01', 'HH:mm:ss');
select cast('01:02:59' as timestamp format 'HH12:MI:SS');
select cast('12 AM' as timestamp FORMAT 'AM.HH12');
These will come back with a parsing errors.

Casting from a table will generate similar results.

Testing:
Modified the previous tests related to dateless timestamps.
Added test to read fromtables which are still containing dateless
timestamps and covered timestamp to string path when no date tokens
are requested in the output string.

Change-Id: I48c49bf027cc4b917849b3d58518facba372b322
---
M be/src/benchmarks/convert-timestamp-benchmark.cc
M be/src/benchmarks/parse-timestamp-benchmark.cc
M be/src/exec/text-converter.inline.h
M be/src/exprs/cast-functions-ir.cc
M be/src/exprs/expr-test.cc
M be/src/exprs/scalar-expr-evaluator.cc
M be/src/exprs/timestamp-functions-ir.cc
M be/src/exprs/timestamp-functions.cc
M be/src/exprs/timestamp-functions.h
M be/src/runtime/date-parse-util.cc
M be/src/runtime/date-test.cc
M be/src/runtime/datetime-iso-sql-format-tokenizer.cc
M be/src/runtime/datetime-parser-common.cc
M be/src/runtime/datetime-parser-common.h
M be/src/runtime/datetime-simple-date-format-parser.cc
M be/src/runtime/datetime-simple-date-format-parser.h
M be/src/runtime/timestamp-parse-util.cc
M be/src/runtime/timestamp-test.cc
M be/src/runtime/timestamp-value.h
M common/function-registry/impala_functions.py
M fe/src/test/java/org/apache/impala/analysis/AnalyzeKuduDDLTest.java
M testdata/data/README
A testdata/data/dateless_timestamps.parq
A testdata/data/dateless_timestamps.txt
M testdata/data/lazy_timestamp.csv
M testdata/workloads/functional-query/queries/QueryTest/date.test
A testdata/workloads/functional-query/queries/QueryTest/dateless_timestamp_parquet.test
A testdata/workloads/functional-query/queries/QueryTest/dateless_timestamp_text.test
M testdata/workloads/functional-query/queries/QueryTest/exprs.test
M testdata/workloads/functional-query/queries/QueryTest/select-lazy-timestamp.test
M tests/data_errors/test_data_errors.py
M tests/query_test/test_cast_with_format.py
M tests/query_test/test_scanners.py
33 files changed, 274 insertions(+), 230 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/66/15866/5
-- 
To view, visit http://gerrit.cloudera.org:8080/15866
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I48c49bf027cc4b917849b3d58518facba372b322
Gerrit-Change-Number: 15866
Gerrit-PatchSet: 5
Gerrit-Owner: Adam Tamas <ta...@cloudera.com>
Gerrit-Reviewer: Adam Tamas <ta...@cloudera.com>
Gerrit-Reviewer: Gabor Kaszab <ga...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>

[Impala-ASF-CR] IMPALA-9531: Dropped support for dateless timestamps

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/15866 )

Change subject: IMPALA-9531: Dropped support for dateless timestamps
......................................................................


Patch Set 13:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/6108/ DRY_RUN=false


-- 
To view, visit http://gerrit.cloudera.org:8080/15866
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I48c49bf027cc4b917849b3d58518facba372b322
Gerrit-Change-Number: 15866
Gerrit-PatchSet: 13
Gerrit-Owner: Adam Tamas <ta...@cloudera.com>
Gerrit-Reviewer: Adam Tamas <ta...@cloudera.com>
Gerrit-Reviewer: Gabor Kaszab <ga...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Comment-Date: Wed, 08 Jul 2020 12:05:00 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-9531: Dropped support for dateless timestamps

Posted by "Adam Tamas (Code Review)" <ge...@cloudera.org>.
Adam Tamas has uploaded a new patch set (#3). ( http://gerrit.cloudera.org:8080/15866 )

Change subject: IMPALA-9531: Dropped support for dateless timestamps
......................................................................

IMPALA-9531: Dropped support for dateless timestamps

Removed the support for dateless timestamps.
During dateless timestamp casts if the format doesn't contain
date part we get an error during tokenization of the format.
If the input str doesn't contain a date part then we get null result.

Examples:
select to_timestamp('01:01:01', 'HH:mm:ss');
select cast('01:02:59' as timestamp);
This will come back as NULL values

select cast(string_col as timestamp) from table_name;
Casting from a table with a similar method like this
will also give back NULL values similar to the above example.

select cast('01:02:59' as timestamp format 'HH12:MI:SS');
select cast('12 AM' as timestamp FORMAT 'AM.HH12');
These will come back with a parsing error which is:
ERROR: PARSE ERROR: No date tokens provided.

Testing:
modified the previous test related to dateless timestamps and
added new tests to check if the conversions backwards(timestamp to
string) is still working

Change-Id: I48c49bf027cc4b917849b3d58518facba372b322
---
M be/src/exec/text-converter.inline.h
M be/src/exprs/cast-functions-ir.cc
M be/src/exprs/expr-test.cc
M be/src/exprs/timestamp-functions-ir.cc
M be/src/runtime/date-parse-util.cc
M be/src/runtime/datetime-iso-sql-format-tokenizer.cc
M be/src/runtime/datetime-parser-common.cc
M be/src/runtime/datetime-parser-common.h
M be/src/runtime/datetime-simple-date-format-parser.cc
M be/src/runtime/datetime-simple-date-format-parser.h
M be/src/runtime/timestamp-parse-util.cc
M be/src/runtime/timestamp-test.cc
M be/src/runtime/timestamp-value.h
M fe/src/test/java/org/apache/impala/analysis/AnalyzeKuduDDLTest.java
M testdata/data/lazy_timestamp.csv
M testdata/workloads/functional-query/queries/QueryTest/date.test
M testdata/workloads/functional-query/queries/QueryTest/exprs.test
M testdata/workloads/functional-query/queries/QueryTest/select-lazy-timestamp.test
M tests/data_errors/test_data_errors.py
M tests/query_test/test_cast_with_format.py
20 files changed, 131 insertions(+), 198 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/66/15866/3
-- 
To view, visit http://gerrit.cloudera.org:8080/15866
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I48c49bf027cc4b917849b3d58518facba372b322
Gerrit-Change-Number: 15866
Gerrit-PatchSet: 3
Gerrit-Owner: Adam Tamas <ta...@cloudera.com>
Gerrit-Reviewer: Adam Tamas <ta...@cloudera.com>
Gerrit-Reviewer: Gabor Kaszab <ga...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>

[Impala-ASF-CR] IMPALA-9531: Dropped support for dateless timestamps

Posted by "Gabor Kaszab (Code Review)" <ge...@cloudera.org>.
Gabor Kaszab has posted comments on this change. ( http://gerrit.cloudera.org:8080/15866 )

Change subject: IMPALA-9531: Dropped support for dateless timestamps
......................................................................


Patch Set 3:

(18 comments)

Thanks for taking care of my comments! In general the casting part of this change is getting into shape in my opinion. However, I miss the part where the timestamp is not a result of a cast() or to_timestamp() or such, and it comes from reading a table. e.g. I don't see now any changes that could prevent us reading dateless timestamp when we run e.g. "select ts_col from table_name".

We discussed offline that some of the file formats are unable to hold these values because only Hive can write them and there Hive doesn't allow writing dateless TSs but for the ones where Impala can write them it should at leas be covered by tests that a previously written file with dateless timestamps can be read after this changes and the dateless TSs will be nulls.

http://gerrit.cloudera.org:8080/#/c/15866/3//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/15866/3//COMMIT_MSG@15
PS3, Line 15: select to_timestamp('01:01:01', 'HH:mm:ss');
This should return parse error because the format itself doesn't contain date just time.


http://gerrit.cloudera.org:8080/#/c/15866/3//COMMIT_MSG@17
PS3, Line 17: values
nit: dot at the end of a sentence.


http://gerrit.cloudera.org:8080/#/c/15866/3//COMMIT_MSG@17
PS3, Line 17: This
nit: This should be plural in my opinion as it stands for 2 separate examples.


http://gerrit.cloudera.org:8080/#/c/15866/3//COMMIT_MSG@28
PS3, Line 28: Testing
I still don't see tests where you use a file populated by dateless timestamps (created by an Impala pre this change) and try to read it with Impala containing your changes.


http://gerrit.cloudera.org:8080/#/c/15866/3//COMMIT_MSG@29
PS3, Line 29: modified
nit: start a sentence with capital letter.


http://gerrit.cloudera.org:8080/#/c/15866/3//COMMIT_MSG@29
PS3, Line 29: test
nit: tests


http://gerrit.cloudera.org:8080/#/c/15866/3//COMMIT_MSG@30
PS3, Line 30: backwards
It's just a matter of your point of view which direction you call backwards so I wouldn't use that word here. Simply "Added test to cover timestamp to string path when no date tokens are requested in the output string" or something like that.


http://gerrit.cloudera.org:8080/#/c/15866/3/be/src/exprs/expr-test.cc
File be/src/exprs/expr-test.cc:

http://gerrit.cloudera.org:8080/#/c/15866/3/be/src/exprs/expr-test.cc@4507
PS3, Line 4507: 1.00
Is this related to your change or does it come with a rebase you've made in the background?
If the second, then a general comment is to try to push rebase related changes to gerrit separately from your actual changes to make the reviewers life easier.


http://gerrit.cloudera.org:8080/#/c/15866/3/be/src/exprs/expr-test.cc@7536
PS3, Line 7536: HH:mm:ss
Actually, we should get an error when parsing the format and see that there is no date part.


http://gerrit.cloudera.org:8080/#/c/15866/3/be/src/runtime/datetime-simple-date-format-parser.h
File be/src/runtime/datetime-simple-date-format-parser.h:

http://gerrit.cloudera.org:8080/#/c/15866/3/be/src/runtime/datetime-simple-date-format-parser.h@88
PS3, Line 88: parse
Can you re-use the CastDirection type similarly as it is used in e.g. ISO SQL format tokenizer?


http://gerrit.cloudera.org:8080/#/c/15866/3/be/src/runtime/datetime-simple-date-format-parser.cc
File be/src/runtime/datetime-simple-date-format-parser.cc:

http://gerrit.cloudera.org:8080/#/c/15866/3/be/src/runtime/datetime-simple-date-format-parser.cc@179
PS3, Line 179:   //return (dt_ctx->has_date_toks || dt_ctx->has_time_toks);
Please don't leave commented code


http://gerrit.cloudera.org:8080/#/c/15866/3/be/src/runtime/datetime-simple-date-format-parser.cc@180
PS3, Line 180:   return parse?(dt_ctx->has_date_toks):(dt_ctx->has_date_toks || dt_ctx->has_time_toks);
This seems logically correct but not that readable. Additionally, this is not how we format ternary operators in Impala:
 - Use space around the question marks
 - As the condition is simple no need to surround with brackets.
 - Use space around the colon
 
Anyway, I feel that rewriting to something like this would help a bit:

if (parse) return dt_ctx->has_date_toks;
return dt_ctx->has_date_toks || dt_ctx->has_time_toks;

Or something similar.


http://gerrit.cloudera.org:8080/#/c/15866/3/be/src/runtime/timestamp-test.cc
File be/src/runtime/timestamp-test.cc:

http://gerrit.cloudera.org:8080/#/c/15866/3/be/src/runtime/timestamp-test.cc@610
PS3, Line 610:     (TimestampTC("yyyy-MM-dd HH:m:ss", "2020-05-11 1:24:34", false, true))
Here the intention was to also cover the case when the minute part of the input is 1 char long, right? Instead this tests that the hour part is 1 long.
Additionally, shouldn't you also give params to test the expected result?


http://gerrit.cloudera.org:8080/#/c/15866/3/be/src/runtime/timestamp-test.cc@612
PS3, Line 612: (TimestampTC("yyyy-MM-dd HH:mm:s", "2020-05-11 14:24:34", false, true, true, 2020,
             :         05, 11, 14, 24, 34))
Similarly to the other tests, please add another one where the second part of the input is 1 long.


http://gerrit.cloudera.org:8080/#/c/15866/3/be/src/runtime/timestamp-test.cc@679
PS3, Line 679: Test just formatting time tokens
nit: This sentence sounds a bit weird for me.


http://gerrit.cloudera.org:8080/#/c/15866/3/be/src/runtime/timestamp-test.cc@695
PS3, Line 695: Test just formatting time
nit: same as above


http://gerrit.cloudera.org:8080/#/c/15866/3/testdata/workloads/functional-query/queries/QueryTest/exprs.test
File testdata/workloads/functional-query/queries/QueryTest/exprs.test:

http://gerrit.cloudera.org:8080/#/c/15866/3/testdata/workloads/functional-query/queries/QueryTest/exprs.test@2976
PS3, Line 2976: select to_timestamp('01:01:01', 'HH:mm:ss');
As mentioned elsewhere, this should result a parse error as the format doesn't contain date.


http://gerrit.cloudera.org:8080/#/c/15866/3/testdata/workloads/functional-query/queries/QueryTest/exprs.test@2981
PS3, Line 2981: select to_timestamp('01:01:01.123', 'HH:mm:ss.SSS');
same as above



-- 
To view, visit http://gerrit.cloudera.org:8080/15866
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I48c49bf027cc4b917849b3d58518facba372b322
Gerrit-Change-Number: 15866
Gerrit-PatchSet: 3
Gerrit-Owner: Adam Tamas <ta...@cloudera.com>
Gerrit-Reviewer: Adam Tamas <ta...@cloudera.com>
Gerrit-Reviewer: Gabor Kaszab <ga...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Comment-Date: Wed, 27 May 2020 13:49:55 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-9531: Dropped support for dateless timestamps

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/15866 )

Change subject: IMPALA-9531: Dropped support for dateless timestamps
......................................................................


Patch Set 4:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/15866/4/be/src/exprs/scalar-expr-evaluator.cc
File be/src/exprs/scalar-expr-evaluator.cc:

http://gerrit.cloudera.org:8080/#/c/15866/4/be/src/exprs/scalar-expr-evaluator.cc@444
PS4, Line 444:   TimestampFunctions::UnixAndFromUnixPrepare(nullptr, FunctionContext::FRAGMENT_LOCAL, true);
line too long (93 > 90)


http://gerrit.cloudera.org:8080/#/c/15866/4/tests/query_test/test_scanners.py
File tests/query_test/test_scanners.py:

http://gerrit.cloudera.org:8080/#/c/15866/4/tests/query_test/test_scanners.py@391
PS4, Line 391: a
flake8: E501 line too long (98 > 90 characters)



-- 
To view, visit http://gerrit.cloudera.org:8080/15866
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I48c49bf027cc4b917849b3d58518facba372b322
Gerrit-Change-Number: 15866
Gerrit-PatchSet: 4
Gerrit-Owner: Adam Tamas <ta...@cloudera.com>
Gerrit-Reviewer: Adam Tamas <ta...@cloudera.com>
Gerrit-Reviewer: Gabor Kaszab <ga...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Comment-Date: Tue, 16 Jun 2020 13:56:41 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-9531: Dropped support for dateless timestamps

Posted by "Adam Tamas (Code Review)" <ge...@cloudera.org>.
Adam Tamas has posted comments on this change. ( http://gerrit.cloudera.org:8080/15866 )

Change subject: IMPALA-9531: Dropped support for dateless timestamps
......................................................................


Patch Set 5:

(10 comments)

http://gerrit.cloudera.org:8080/#/c/15866/4/be/src/exprs/scalar-expr-evaluator.cc
File be/src/exprs/scalar-expr-evaluator.cc:

http://gerrit.cloudera.org:8080/#/c/15866/4/be/src/exprs/scalar-expr-evaluator.cc@444
PS4, Line 444:   TimestampFunctions::FromUnixPrepare(nullptr, FunctionContext::FRAGMENT_LOCAL);
> line too long (93 > 90)
Done


http://gerrit.cloudera.org:8080/#/c/15866/3/be/src/exprs/timestamp-functions-ir.cc
File be/src/exprs/timestamp-functions-ir.cc:

http://gerrit.cloudera.org:8080/#/c/15866/3/be/src/exprs/timestamp-functions-ir.cc@167
PS3, Line 167: if (!context->IsArgConstant(1)) {
> Did you find out what this check is for? Don't you break anything with simp
Done


http://gerrit.cloudera.org:8080/#/c/15866/4/be/src/exprs/timestamp-functions-ir.cc
File be/src/exprs/timestamp-functions-ir.cc:

http://gerrit.cloudera.org:8080/#/c/15866/4/be/src/exprs/timestamp-functions-ir.cc@168
PS4, Line 168:   dt_ctx->Reset(reinterpret_cast<const char*>(fmt.ptr), fmt.len)
> Anyway, if you tokenize the format here then you do the tokenization for ea
Done


http://gerrit.cloudera.org:8080/#/c/15866/4/be/src/exprs/timestamp-functions.h
File be/src/exprs/timestamp-functions.h:

http://gerrit.cloudera.org:8080/#/c/15866/4/be/src/exprs/timestamp-functions.h@104
PS4, Line 104: 
> Please use datetime_parse_util::CastDirection
We can't reach datetime_parse_util::CastDirection from here that is why I used bool.

I spoke with Csaba Ringhofer about it and in the end we come to the conclusion that it is not necessary to be declared here, it is enough in the .cc because it is just there to avoid code duplication.


http://gerrit.cloudera.org:8080/#/c/15866/4/be/src/exprs/timestamp-functions.cc
File be/src/exprs/timestamp-functions.cc:

http://gerrit.cloudera.org:8080/#/c/15866/4/be/src/exprs/timestamp-functions.cc@155
PS4, Line 155: CastDirection cast
> Why don't you use the parameter type that could be directly passed to Token
Done


http://gerrit.cloudera.org:8080/#/c/15866/4/be/src/exprs/timestamp-functions.cc@167
PS4, Line 167: if (!parse_result) {
             :       delete dt_ctx;
             :       ReportBadFormat(context, datetime_parse_util::GENERAL_ERROR, fmt_val, true);
             :       return;
> Please check how to format if-else in Impala
Done


http://gerrit.cloudera.org:8080/#/c/15866/4/be/src/runtime/datetime-simple-date-format-parser.h
File be/src/runtime/datetime-simple-date-format-parser.h:

http://gerrit.cloudera.org:8080/#/c/15866/4/be/src/runtime/datetime-simple-date-format-parser.h@88
PS4, Line 88: 
> I wouldn't give this a default value unless you have any specific reason ma
Done


http://gerrit.cloudera.org:8080/#/c/15866/4/testdata/data/README
File testdata/data/README:

http://gerrit.cloudera.org:8080/#/c/15866/4/testdata/data/README@503
PS4, Line 503: dateless_timestamps.pa
> The file name is not that verbose. After reading it it doesn't give any hin
Done


http://gerrit.cloudera.org:8080/#/c/15866/4/testdata/data/timestamp_text_test.txt
File testdata/data/timestamp_text_test.txt:

http://gerrit.cloudera.org:8080/#/c/15866/4/testdata/data/timestamp_text_test.txt@1
PS4, Line 1: 
> Same comment for the file name as for the parquet file: "dateless_timestamp
Done


http://gerrit.cloudera.org:8080/#/c/15866/4/tests/query_test/test_scanners.py
File tests/query_test/test_scanners.py:

http://gerrit.cloudera.org:8080/#/c/15866/4/tests/query_test/test_scanners.py@391
PS4, Line 391: 
> flake8: E501 line too long (98 > 90 characters)
Done



-- 
To view, visit http://gerrit.cloudera.org:8080/15866
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I48c49bf027cc4b917849b3d58518facba372b322
Gerrit-Change-Number: 15866
Gerrit-PatchSet: 5
Gerrit-Owner: Adam Tamas <ta...@cloudera.com>
Gerrit-Reviewer: Adam Tamas <ta...@cloudera.com>
Gerrit-Reviewer: Gabor Kaszab <ga...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Comment-Date: Sun, 21 Jun 2020 10:58:49 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-9531: Dropped support for dateless timestamps

Posted by "Gabor Kaszab (Code Review)" <ge...@cloudera.org>.
Gabor Kaszab has posted comments on this change. ( http://gerrit.cloudera.org:8080/15866 )

Change subject: IMPALA-9531: Dropped support for dateless timestamps
......................................................................


Patch Set 13: Code-Review+2


-- 
To view, visit http://gerrit.cloudera.org:8080/15866
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I48c49bf027cc4b917849b3d58518facba372b322
Gerrit-Change-Number: 15866
Gerrit-PatchSet: 13
Gerrit-Owner: Adam Tamas <ta...@cloudera.com>
Gerrit-Reviewer: Adam Tamas <ta...@cloudera.com>
Gerrit-Reviewer: Gabor Kaszab <ga...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Comment-Date: Wed, 08 Jul 2020 19:32:13 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-9531: Dropped support for dateless timestamps

Posted by "Gabor Kaszab (Code Review)" <ge...@cloudera.org>.
Gabor Kaszab has posted comments on this change. ( http://gerrit.cloudera.org:8080/15866 )

Change subject: IMPALA-9531: Dropped support for dateless timestamps
......................................................................


Patch Set 11: Code-Review+2


-- 
To view, visit http://gerrit.cloudera.org:8080/15866
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I48c49bf027cc4b917849b3d58518facba372b322
Gerrit-Change-Number: 15866
Gerrit-PatchSet: 11
Gerrit-Owner: Adam Tamas <ta...@cloudera.com>
Gerrit-Reviewer: Adam Tamas <ta...@cloudera.com>
Gerrit-Reviewer: Gabor Kaszab <ga...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Comment-Date: Tue, 07 Jul 2020 13:08:34 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-9531: Dropped support for dateless timestamps

Posted by "Gabor Kaszab (Code Review)" <ge...@cloudera.org>.
Gabor Kaszab has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/15866 )

Change subject: IMPALA-9531: Dropped support for dateless timestamps
......................................................................

IMPALA-9531: Dropped support for dateless timestamps

Removed the support for dateless timestamps.
During dateless timestamp casts if the format doesn't contain
date part we get an error during tokenization of the format.
If the input str doesn't contain a date part then we get null result.

Examples:
select cast('01:02:59' as timestamp);
This will come back as NULL value.

select to_timestamp('01:01:01', 'HH:mm:ss');
select cast('01:02:59' as timestamp format 'HH12:MI:SS');
select cast('12 AM' as timestamp FORMAT 'AM.HH12');
These will come back with a parsing errors.

Casting from a table will generate similar results.

Testing:
Modified the previous tests related to dateless timestamps.
Added test to read fromtables which are still containing dateless
timestamps and covered timestamp to string path when no date tokens
are requested in the output string.

Change-Id: I48c49bf027cc4b917849b3d58518facba372b322
Reviewed-on: http://gerrit.cloudera.org:8080/15866
Tested-by: Impala Public Jenkins <im...@cloudera.com>
Reviewed-by: Gabor Kaszab <ga...@cloudera.com>
---
M be/src/benchmarks/convert-timestamp-benchmark.cc
M be/src/benchmarks/parse-timestamp-benchmark.cc
M be/src/exec/text-converter.inline.h
M be/src/exprs/cast-functions-ir.cc
M be/src/exprs/expr-test.cc
M be/src/exprs/scalar-expr-evaluator.cc
M be/src/exprs/timestamp-functions-ir.cc
M be/src/exprs/timestamp-functions.cc
M be/src/exprs/timestamp-functions.h
M be/src/runtime/date-parse-util.cc
M be/src/runtime/date-test.cc
M be/src/runtime/datetime-iso-sql-format-tokenizer.cc
M be/src/runtime/datetime-parser-common.cc
M be/src/runtime/datetime-parser-common.h
M be/src/runtime/datetime-simple-date-format-parser.cc
M be/src/runtime/datetime-simple-date-format-parser.h
M be/src/runtime/timestamp-parse-util.cc
M be/src/runtime/timestamp-test.cc
M be/src/runtime/timestamp-value.h
M bin/rat_exclude_files.txt
M common/function-registry/impala_functions.py
M fe/src/test/java/org/apache/impala/analysis/AnalyzeKuduDDLTest.java
M testdata/data/README
A testdata/data/dateless_timestamps.parq
A testdata/data/dateless_timestamps.txt
M testdata/data/lazy_timestamp.csv
M testdata/workloads/functional-query/queries/QueryTest/date.test
A testdata/workloads/functional-query/queries/QueryTest/dateless_timestamp_parquet.test
A testdata/workloads/functional-query/queries/QueryTest/dateless_timestamp_text.test
M testdata/workloads/functional-query/queries/QueryTest/exprs.test
M testdata/workloads/functional-query/queries/QueryTest/select-lazy-timestamp.test
M tests/data_errors/test_data_errors.py
M tests/query_test/test_cast_with_format.py
M tests/query_test/test_scanners.py
34 files changed, 278 insertions(+), 231 deletions(-)

Approvals:
  Impala Public Jenkins: Verified
  Gabor Kaszab: Looks good to me, approved

-- 
To view, visit http://gerrit.cloudera.org:8080/15866
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: I48c49bf027cc4b917849b3d58518facba372b322
Gerrit-Change-Number: 15866
Gerrit-PatchSet: 14
Gerrit-Owner: Adam Tamas <ta...@cloudera.com>
Gerrit-Reviewer: Adam Tamas <ta...@cloudera.com>
Gerrit-Reviewer: Gabor Kaszab <ga...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>

[Impala-ASF-CR] IMPALA-9531: Dropped support for dateless timestamps

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/15866 )

Change subject: IMPALA-9531: Dropped support for dateless timestamps
......................................................................


Patch Set 11: Verified-1

Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/6100/


-- 
To view, visit http://gerrit.cloudera.org:8080/15866
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I48c49bf027cc4b917849b3d58518facba372b322
Gerrit-Change-Number: 15866
Gerrit-PatchSet: 11
Gerrit-Owner: Adam Tamas <ta...@cloudera.com>
Gerrit-Reviewer: Adam Tamas <ta...@cloudera.com>
Gerrit-Reviewer: Gabor Kaszab <ga...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Comment-Date: Tue, 07 Jul 2020 18:17:12 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-9531: Dropped support for dateless timestamps

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/15866 )

Change subject: IMPALA-9531: Dropped support for dateless timestamps
......................................................................


Patch Set 12:

Build Successful 

https://jenkins.impala.io/job/gerrit-code-review-checks/6527/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests.


-- 
To view, visit http://gerrit.cloudera.org:8080/15866
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I48c49bf027cc4b917849b3d58518facba372b322
Gerrit-Change-Number: 15866
Gerrit-PatchSet: 12
Gerrit-Owner: Adam Tamas <ta...@cloudera.com>
Gerrit-Reviewer: Adam Tamas <ta...@cloudera.com>
Gerrit-Reviewer: Gabor Kaszab <ga...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Comment-Date: Wed, 08 Jul 2020 09:12:24 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-9531: Dropped support for dateless timestamps

Posted by "Adam Tamas (Code Review)" <ge...@cloudera.org>.
Adam Tamas has posted comments on this change. ( http://gerrit.cloudera.org:8080/15866 )

Change subject: IMPALA-9531: Dropped support for dateless timestamps
......................................................................


Patch Set 10:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/15866/7/be/src/exprs/timestamp-functions.cc
File be/src/exprs/timestamp-functions.cc:

http://gerrit.cloudera.org:8080/#/c/15866/7/be/src/exprs/timestamp-functions.cc@154
PS7, Line 154: // The purpose of making this .cc only is to avoid moving the code of
> I'd rather say something to indicate that the purpose of making this .cc on
Done


http://gerrit.cloudera.org:8080/#/c/15866/7/be/src/runtime/datetime-simple-date-format-parser.cc
File be/src/runtime/datetime-simple-date-format-parser.cc:

http://gerrit.cloudera.org:8080/#/c/15866/7/be/src/runtime/datetime-simple-date-format-parser.cc@56
PS7, Line 56: Tokenize(&DEFAULT_DATE_TIME_CTX[i], PARSE);
> It's no longer needed to give the 3rd parameter in case it's true, right? L
Done



-- 
To view, visit http://gerrit.cloudera.org:8080/15866
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I48c49bf027cc4b917849b3d58518facba372b322
Gerrit-Change-Number: 15866
Gerrit-PatchSet: 10
Gerrit-Owner: Adam Tamas <ta...@cloudera.com>
Gerrit-Reviewer: Adam Tamas <ta...@cloudera.com>
Gerrit-Reviewer: Gabor Kaszab <ga...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Comment-Date: Tue, 07 Jul 2020 11:27:41 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-9531: Dropped support for dateless timestamps

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/15866 )

Change subject: IMPALA-9531: Dropped support for dateless timestamps
......................................................................


Patch Set 2:

Build Successful 

https://jenkins.impala.io/job/gerrit-code-review-checks/6135/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests.


-- 
To view, visit http://gerrit.cloudera.org:8080/15866
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I48c49bf027cc4b917849b3d58518facba372b322
Gerrit-Change-Number: 15866
Gerrit-PatchSet: 2
Gerrit-Owner: Adam Tamas <ta...@cloudera.com>
Gerrit-Reviewer: Adam Tamas <ta...@cloudera.com>
Gerrit-Reviewer: Gabor Kaszab <ga...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Comment-Date: Mon, 25 May 2020 09:46:54 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-9531: Dropped support for dateless timestamps

Posted by "Adam Tamas (Code Review)" <ge...@cloudera.org>.
Adam Tamas has posted comments on this change. ( http://gerrit.cloudera.org:8080/15866 )

Change subject: IMPALA-9531: Dropped support for dateless timestamps
......................................................................


Patch Set 2:

(19 comments)

http://gerrit.cloudera.org:8080/#/c/15866/1//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/15866/1//COMMIT_MSG@7
PS1, Line 7: d
> no need for quotation marks
Done


http://gerrit.cloudera.org:8080/#/c/15866/1//COMMIT_MSG@10
PS1, Line 10: During dateless timestamp casts if the format doesn't contain
            : date part we get an error during tokenization of the format
> I'd rather mention that when a format is provided for a cast then if the fo
Done


http://gerrit.cloudera.org:8080/#/c/15866/1//COMMIT_MSG@14
PS1, Line 14: s:
> I'd use this example:
Done


http://gerrit.cloudera.org:8080/#/c/15866/1//COMMIT_MSG@15
PS1, Line 15: estam
> I'd use an example as '01:02:59'
Done


http://gerrit.cloudera.org:8080/#/c/15866/1//COMMIT_MSG@16
PS1, Line 16: select cast('01:02:59' as timestamp);
> I'd also add an example for to_timestamp(input_str, format_str)
Done


http://gerrit.cloudera.org:8080/#/c/15866/1//COMMIT_MSG@17
PS1, Line 17: This will come back as NULL values
> Another example:
Done


http://gerrit.cloudera.org:8080/#/c/15866/1//COMMIT_MSG@23
PS1, Line 23: select cast('01:02:59' as timestamp format 'HH12:MI:SS');
            : select cast('12 AM' as timestamp FORMAT 'AM.HH12');
            : These will come back with a parsing error which is:
            : ERROR: PARSE ERROR: No date tokens provid
> No need to list the modified test files here. If there is something in part
Done


http://gerrit.cloudera.org:8080/#/c/15866/1/be/src/exprs/expr-test.cc
File be/src/exprs/expr-test.cc:

http://gerrit.cloudera.org:8080/#/c/15866/1/be/src/exprs/expr-test.cc@6777
PS1, Line 6777: TestValue("dayofweek(cast('2011-12-22' as timestamp))", TYPE_INT, 5);
              :   TestValue("dayofweek(cast('2011-12-24' as timestamp))", TYPE_INT, 7);
              :   TestStringValue(
              :       "to_date(cast('2011-12-22 09:10:11.12345678' as timestamp))", "2011-12-22");
              : 
> these are the same tests as in L6759-6762. No need to duplicate them.
Done


http://gerrit.cloudera.org:8080/#/c/15866/1/be/src/exprs/expr-test.cc@7155
PS1, Line 7155: 
              :   TestStringValue(
> These 2 tests lost their purpose with your change, A comment in L7151 refer
If we call trunc() on timestamp is gives back everything that stands before that value and the value that stands in the given place, as I see we can no longer create a valid test which will give back NULL value since we dropped the timeless timestamps.

I will remove these tests since they are indeed misleading and there are other test case where we trunc() a NULL value.


http://gerrit.cloudera.org:8080/#/c/15866/1/be/src/runtime/datetime-iso-sql-format-tokenizer.cc
File be/src/runtime/datetime-iso-sql-format-tokenizer.cc:

http://gerrit.cloudera.org:8080/#/c/15866/1/be/src/runtime/datetime-iso-sql-format-tokenizer.cc@195
PS1, Line 195: 
> I'd move this check after L-197-202. We exit there if cast_mode_ == FORMAT 
Done


http://gerrit.cloudera.org:8080/#/c/15866/1/be/src/runtime/datetime-simple-date-format-parser.h
File be/src/runtime/datetime-simple-date-format-parser.h:

http://gerrit.cloudera.org:8080/#/c/15866/1/be/src/runtime/datetime-simple-date-format-parser.h@77
PS1, Line 77: DEFAULT_SHORT_DATE_T
> Is this needed anymore?
Done


http://gerrit.cloudera.org:8080/#/c/15866/1/be/src/runtime/datetime-simple-date-format-parser.cc
File be/src/runtime/datetime-simple-date-format-parser.cc:

http://gerrit.cloudera.org:8080/#/c/15866/1/be/src/runtime/datetime-simple-date-format-parser.cc@38
PS1, Line 38: DEFAULT_SHORT_DATE_T
> Is this still needed?
Done


http://gerrit.cloudera.org:8080/#/c/15866/1/be/src/runtime/datetime-simple-date-format-parser.cc@101
PS1, Line 101:     bool accept_time_toks, bool parse) {
> It would be nice to see tests for to_timestamp as well. I haven't seem any 
I looked around a bit and as I seen there was no test for to_timestamp() with dateless conversions, but I tried it out with manual tests and it is no longer accepting dateless formats.
I will look around and add a failing test with dateless timestamp to the to_timestamp() tests.


http://gerrit.cloudera.org:8080/#/c/15866/1/be/src/runtime/datetime-simple-date-format-parser.cc@344
PS1, Line 344: // Check if this string starts with a date
> I don't think that this check is needed.
Done


http://gerrit.cloudera.org:8080/#/c/15866/1/be/src/runtime/timestamp-test.cc
File be/src/runtime/timestamp-test.cc:

http://gerrit.cloudera.org:8080/#/c/15866/1/be/src/runtime/timestamp-test.cc@a617
PS1, Line 617: 
             : 
             : 
             : 
             : 
             : 
             : 
             : 
             : 
             : 
             : 
             : 
             : 
             : 
> These still make sense, you should fix them to work with your patch rather 
Done


http://gerrit.cloudera.org:8080/#/c/15866/1/be/src/runtime/timestamp-test.cc@a688
PS1, Line 688: 
             : 
> If I understand this test correctly, here the direction is TS to string so 
Done


http://gerrit.cloudera.org:8080/#/c/15866/1/be/src/runtime/timestamp-test.cc@a704
PS1, Line 704: 
             : 
> Same as above.
Done


http://gerrit.cloudera.org:8080/#/c/15866/1/testdata/workloads/functional-query/queries/QueryTest/date.test
File testdata/workloads/functional-query/queries/QueryTest/date.test:

http://gerrit.cloudera.org:8080/#/c/15866/1/testdata/workloads/functional-query/queries/QueryTest/date.test@a604
PS1, Line 604: 
> Have you checked if this error message/error handling is used anywhere else
Done


http://gerrit.cloudera.org:8080/#/c/15866/1/testdata/workloads/functional-query/queries/QueryTest/exprs.test
File testdata/workloads/functional-query/queries/QueryTest/exprs.test:

http://gerrit.cloudera.org:8080/#/c/15866/1/testdata/workloads/functional-query/queries/QueryTest/exprs.test@a2969
PS1, Line 2969: 
              : 
              : 
              : 
              : 
              : 
              : 
              : 
              : 
              : 
              : 
              : 
              : 
              : 
> Did you leave a test somewhere to cover the case when we get a null for dat
Done



-- 
To view, visit http://gerrit.cloudera.org:8080/15866
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I48c49bf027cc4b917849b3d58518facba372b322
Gerrit-Change-Number: 15866
Gerrit-PatchSet: 2
Gerrit-Owner: Adam Tamas <ta...@cloudera.com>
Gerrit-Reviewer: Adam Tamas <ta...@cloudera.com>
Gerrit-Reviewer: Gabor Kaszab <ga...@cloudera.com>
Gerrit-Comment-Date: Mon, 25 May 2020 09:00:25 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-9531: Dropped support for dateless timestamps

Posted by "Gabor Kaszab (Code Review)" <ge...@cloudera.org>.
Gabor Kaszab has posted comments on this change. ( http://gerrit.cloudera.org:8080/15866 )

Change subject: IMPALA-9531: Dropped support for dateless timestamps
......................................................................


Patch Set 11:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/15866/11/testdata/workloads/functional-query/queries/QueryTest/dateless_timestamp_text.test
File testdata/workloads/functional-query/queries/QueryTest/dateless_timestamp_text.test:

http://gerrit.cloudera.org:8080/#/c/15866/11/testdata/workloads/functional-query/queries/QueryTest/dateless_timestamp_text.test@8
PS11, Line 8: localhost:20500
Note, the dockerised tests fail because the hardcoded localhost.



-- 
To view, visit http://gerrit.cloudera.org:8080/15866
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I48c49bf027cc4b917849b3d58518facba372b322
Gerrit-Change-Number: 15866
Gerrit-PatchSet: 11
Gerrit-Owner: Adam Tamas <ta...@cloudera.com>
Gerrit-Reviewer: Adam Tamas <ta...@cloudera.com>
Gerrit-Reviewer: Gabor Kaszab <ga...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Comment-Date: Wed, 08 Jul 2020 06:23:30 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-9531: Dropped support for dateless timestamps

Posted by "Gabor Kaszab (Code Review)" <ge...@cloudera.org>.
Gabor Kaszab has posted comments on this change. ( http://gerrit.cloudera.org:8080/15866 )

Change subject: IMPALA-9531: Dropped support for dateless timestamps
......................................................................


Patch Set 8: Code-Review+1


-- 
To view, visit http://gerrit.cloudera.org:8080/15866
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I48c49bf027cc4b917849b3d58518facba372b322
Gerrit-Change-Number: 15866
Gerrit-PatchSet: 8
Gerrit-Owner: Adam Tamas <ta...@cloudera.com>
Gerrit-Reviewer: Adam Tamas <ta...@cloudera.com>
Gerrit-Reviewer: Gabor Kaszab <ga...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Comment-Date: Mon, 06 Jul 2020 05:28:41 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-9531: Dropped support for dateless timestamps

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/15866 )

Change subject: IMPALA-9531: Dropped support for dateless timestamps
......................................................................


Patch Set 10:

Build Successful 

https://jenkins.impala.io/job/gerrit-code-review-checks/6511/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests.


-- 
To view, visit http://gerrit.cloudera.org:8080/15866
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I48c49bf027cc4b917849b3d58518facba372b322
Gerrit-Change-Number: 15866
Gerrit-PatchSet: 10
Gerrit-Owner: Adam Tamas <ta...@cloudera.com>
Gerrit-Reviewer: Adam Tamas <ta...@cloudera.com>
Gerrit-Reviewer: Gabor Kaszab <ga...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Comment-Date: Tue, 07 Jul 2020 11:56:52 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-9531: Dropped support for dateless timestamps

Posted by "Adam Tamas (Code Review)" <ge...@cloudera.org>.
Adam Tamas has uploaded a new patch set (#12). ( http://gerrit.cloudera.org:8080/15866 )

Change subject: IMPALA-9531: Dropped support for dateless timestamps
......................................................................

IMPALA-9531: Dropped support for dateless timestamps

Removed the support for dateless timestamps.
During dateless timestamp casts if the format doesn't contain
date part we get an error during tokenization of the format.
If the input str doesn't contain a date part then we get null result.

Examples:
select cast('01:02:59' as timestamp);
This will come back as NULL value.

select to_timestamp('01:01:01', 'HH:mm:ss');
select cast('01:02:59' as timestamp format 'HH12:MI:SS');
select cast('12 AM' as timestamp FORMAT 'AM.HH12');
These will come back with a parsing errors.

Casting from a table will generate similar results.

Testing:
Modified the previous tests related to dateless timestamps.
Added test to read fromtables which are still containing dateless
timestamps and covered timestamp to string path when no date tokens
are requested in the output string.

Change-Id: I48c49bf027cc4b917849b3d58518facba372b322
---
M be/src/benchmarks/convert-timestamp-benchmark.cc
M be/src/benchmarks/parse-timestamp-benchmark.cc
M be/src/exec/text-converter.inline.h
M be/src/exprs/cast-functions-ir.cc
M be/src/exprs/expr-test.cc
M be/src/exprs/scalar-expr-evaluator.cc
M be/src/exprs/timestamp-functions-ir.cc
M be/src/exprs/timestamp-functions.cc
M be/src/exprs/timestamp-functions.h
M be/src/runtime/date-parse-util.cc
M be/src/runtime/date-test.cc
M be/src/runtime/datetime-iso-sql-format-tokenizer.cc
M be/src/runtime/datetime-parser-common.cc
M be/src/runtime/datetime-parser-common.h
M be/src/runtime/datetime-simple-date-format-parser.cc
M be/src/runtime/datetime-simple-date-format-parser.h
M be/src/runtime/timestamp-parse-util.cc
M be/src/runtime/timestamp-test.cc
M be/src/runtime/timestamp-value.h
M bin/rat_exclude_files.txt
M common/function-registry/impala_functions.py
M fe/src/test/java/org/apache/impala/analysis/AnalyzeKuduDDLTest.java
M testdata/data/README
A testdata/data/dateless_timestamps.parq
A testdata/data/dateless_timestamps.txt
M testdata/data/lazy_timestamp.csv
M testdata/workloads/functional-query/queries/QueryTest/date.test
A testdata/workloads/functional-query/queries/QueryTest/dateless_timestamp_parquet.test
A testdata/workloads/functional-query/queries/QueryTest/dateless_timestamp_text.test
M testdata/workloads/functional-query/queries/QueryTest/exprs.test
M testdata/workloads/functional-query/queries/QueryTest/select-lazy-timestamp.test
M tests/data_errors/test_data_errors.py
M tests/query_test/test_cast_with_format.py
M tests/query_test/test_scanners.py
34 files changed, 278 insertions(+), 231 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/66/15866/12
-- 
To view, visit http://gerrit.cloudera.org:8080/15866
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I48c49bf027cc4b917849b3d58518facba372b322
Gerrit-Change-Number: 15866
Gerrit-PatchSet: 12
Gerrit-Owner: Adam Tamas <ta...@cloudera.com>
Gerrit-Reviewer: Adam Tamas <ta...@cloudera.com>
Gerrit-Reviewer: Gabor Kaszab <ga...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>

[Impala-ASF-CR] IMPALA-9531: Dropped support for dateless timestamps

Posted by "Adam Tamas (Code Review)" <ge...@cloudera.org>.
Adam Tamas has uploaded a new patch set (#6). ( http://gerrit.cloudera.org:8080/15866 )

Change subject: IMPALA-9531: Dropped support for dateless timestamps
......................................................................

IMPALA-9531: Dropped support for dateless timestamps

Removed the support for dateless timestamps.
During dateless timestamp casts if the format doesn't contain
date part we get an error during tokenization of the format.
If the input str doesn't contain a date part then we get null result.

Examples:
select cast('01:02:59' as timestamp);
This will come back as NULL value.

select to_timestamp('01:01:01', 'HH:mm:ss');
select cast('01:02:59' as timestamp format 'HH12:MI:SS');
select cast('12 AM' as timestamp FORMAT 'AM.HH12');
These will come back with a parsing errors.

Casting from a table will generate similar results.

Testing:
Modified the previous tests related to dateless timestamps.
Added test to read fromtables which are still containing dateless
timestamps and covered timestamp to string path when no date tokens
are requested in the output string.

Change-Id: I48c49bf027cc4b917849b3d58518facba372b322
---
M be/src/benchmarks/convert-timestamp-benchmark.cc
M be/src/benchmarks/parse-timestamp-benchmark.cc
M be/src/exec/text-converter.inline.h
M be/src/exprs/cast-functions-ir.cc
M be/src/exprs/expr-test.cc
M be/src/exprs/scalar-expr-evaluator.cc
M be/src/exprs/timestamp-functions-ir.cc
M be/src/exprs/timestamp-functions.cc
M be/src/exprs/timestamp-functions.h
M be/src/runtime/date-parse-util.cc
M be/src/runtime/date-test.cc
M be/src/runtime/datetime-iso-sql-format-tokenizer.cc
M be/src/runtime/datetime-parser-common.cc
M be/src/runtime/datetime-parser-common.h
M be/src/runtime/datetime-simple-date-format-parser.cc
M be/src/runtime/datetime-simple-date-format-parser.h
M be/src/runtime/timestamp-parse-util.cc
M be/src/runtime/timestamp-test.cc
M be/src/runtime/timestamp-value.h
M bin/rat_exclude_files.txt
M common/function-registry/impala_functions.py
M fe/src/test/java/org/apache/impala/analysis/AnalyzeKuduDDLTest.java
M testdata/data/README
A testdata/data/dateless_timestamps.parq
A testdata/data/dateless_timestamps.txt
M testdata/data/lazy_timestamp.csv
M testdata/workloads/functional-query/queries/QueryTest/date.test
A testdata/workloads/functional-query/queries/QueryTest/dateless_timestamp_parquet.test
A testdata/workloads/functional-query/queries/QueryTest/dateless_timestamp_text.test
M testdata/workloads/functional-query/queries/QueryTest/exprs.test
M testdata/workloads/functional-query/queries/QueryTest/select-lazy-timestamp.test
M tests/data_errors/test_data_errors.py
M tests/query_test/test_cast_with_format.py
M tests/query_test/test_scanners.py
34 files changed, 275 insertions(+), 230 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/66/15866/6
-- 
To view, visit http://gerrit.cloudera.org:8080/15866
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I48c49bf027cc4b917849b3d58518facba372b322
Gerrit-Change-Number: 15866
Gerrit-PatchSet: 6
Gerrit-Owner: Adam Tamas <ta...@cloudera.com>
Gerrit-Reviewer: Adam Tamas <ta...@cloudera.com>
Gerrit-Reviewer: Gabor Kaszab <ga...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>