You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@impala.apache.org by "Paul Rogers (Code Review)" <ge...@cloudera.org> on 2019/02/01 06:10:44 UTC

[Impala-ASF-CR] IMPALA-4018 Part1: Add FORMAT clause in CAST()

Paul Rogers has posted comments on this change. ( http://gerrit.cloudera.org:8080/12267 )

Change subject: IMPALA-4018 Part1: Add FORMAT clause in CAST()
......................................................................


Patch Set 3:

(1 comment)

Key question here is the syntax of the format string. There are two questions.

First, does our syntax here have to be identical to that of Hive? As I understand it, Impala can write view definitions to HMS that Hive should be able to read and visa/versa. If we used a different format string than Hive, would we know which format to use for a given view?

The other consideration is that we'd like to be ISO SQL compliant. The SQL formats are similar to, but distinct from, the Java8 date/time formats. If we go the ISO SQL route, we'd have to parse the SQL format into the Impala (that is, Java 8) format.

Finally, if we use ISO SQL formats in CAST, we should also create new functions to do the same tasks using the SQL format strings. (These would be in addition to the existing Hive-compatible functions.) Unfortunately, unless Hive adds the same functions, these will break view compatibility if used in views.

http://gerrit.cloudera.org:8080/#/c/12267/2/fe/src/main/java/org/apache/impala/analysis/CastExpr.java
File fe/src/main/java/org/apache/impala/analysis/CastExpr.java:

http://gerrit.cloudera.org:8080/#/c/12267/2/fe/src/main/java/org/apache/impala/analysis/CastExpr.java@220
PS2, Line 220:     }
> It would also be doable. I know because I went for that approach initially 
Probably doable. As you noted, we load functions on startup. Cast functions are declared in CastExpr.initBuiltins(). You would have to specially handle the String --> Timestamp and Timestamp --> String functions. The code already has a number of type-specific rules.

Then, when analyzing the cast, the FORMAT string would be moved into the third argument position in the AST. From here on, the normal function mechanism should take you the rest of the way.

You would need corresponding changes on the BE. Basically, if the format is null, you'd treat the call as a two-arg cast.

In fact, you may even be able to register both two- and three- arg overloads of the function and choose between them depending on whether a FORMAT string exists.



-- 
To view, visit http://gerrit.cloudera.org:8080/12267
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ia514aaa9e8f5487d396587d5ed24c7348a492697
Gerrit-Change-Number: 12267
Gerrit-PatchSet: 3
Gerrit-Owner: Gabor Kaszab <ga...@cloudera.com>
Gerrit-Reviewer: Attila Jeges <at...@cloudera.com>
Gerrit-Reviewer: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Gabor Kaszab <ga...@cloudera.com>
Gerrit-Reviewer: Greg Rahn <gr...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Paul Rogers <pr...@cloudera.com>
Gerrit-Comment-Date: Fri, 01 Feb 2019 06:10:44 +0000
Gerrit-HasComments: Yes