Posted to dev@phoenix.apache.org by "Nick Dimiduk (JIRA)" <ji...@apache.org> on 2016/05/01 22:27:12 UTC

[jira] [Commented] (PHOENIX-2784) phoenix-spark: Allow coercion of DATE fields to TIMESTAMP when loading DataFrames

    [ https://issues.apache.org/jira/browse/PHOENIX-2784?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15265928#comment-15265928 ] 

Nick Dimiduk commented on PHOENIX-2784:
---------------------------------------

I'm not following the intended use-case here. Date, Time, and Timestamp are all different types with different precisions. Instead of overriding the meaning through configuration, the type information in the schema should be preserved throughout. If you want Timestamp, use it in your schema.

> phoenix-spark: Allow coercion of DATE fields to TIMESTAMP when loading DataFrames
> ---------------------------------------------------------------------------------
>
>                 Key: PHOENIX-2784
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-2784
>             Project: Phoenix
>          Issue Type: Improvement
>    Affects Versions: 4.7.0
>            Reporter: Josh Mahonin
>            Assignee: Josh Mahonin
>            Priority: Minor
>         Attachments: PHOENIX-2784.patch
>
>
> The Phoenix DATE type is internally represented as 8 bytes, which can store a full 'yyyy-MM-dd hh:mm:ss' value, including the time component. However, Spark SQL follows the SQL Date spec and keeps only the 'yyyy-MM-dd' portion as a 4-byte type. When loading Phoenix DATE columns using the Spark DataFrame API, the 'hh:mm:ss' component is lost.
> This patch allows setting a new 'dateAsTimestamp' option when loading a DataFrame, which will coerce the underlying Date object to a Timestamp so that the full time component is loaded.
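
The precision loss described in the issue can be illustrated with a minimal sketch. It assumes the 8-byte value is a count of milliseconds since the epoch and that the 4-byte date is a count of days since the epoch. The class and method names are hypothetical; nothing here calls Phoenix or Spark, and it is not the code in PHOENIX-2784.patch.

```java
import java.sql.Timestamp;
import java.time.Instant;
import java.time.LocalDate;

// Hypothetical illustration only; not part of phoenix-spark.
public class DateCoercionSketch {
    static final long MILLIS_PER_DAY = 86_400_000L;

    // Assumed 4-byte representation: whole days since the epoch.
    // Everything below day granularity is discarded here.
    static int toEpochDays(long epochMillis) {
        return (int) Math.floorDiv(epochMillis, MILLIS_PER_DAY);
    }

    // Reading the same 8-byte value as a Timestamp keeps the time of day.
    static Timestamp toTimestamp(long epochMillis) {
        return new Timestamp(epochMillis);
    }

    public static void main(String[] args) {
        // An 8-byte value carrying both a date and a time of day (UTC).
        long stored = Instant.parse("2016-05-01T22:27:12Z").toEpochMilli();

        int days = toEpochDays(stored);
        System.out.println("as date:      " + LocalDate.ofEpochDay(days));
        System.out.println("as timestamp: " + toTimestamp(stored).toInstant());
    }
}
```

The first line printed carries only the calendar date, while the second still carries the hh:mm:ss portion, which is the difference the 'dateAsTimestamp' option is meant to address.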



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)