
Posted to user@spark.apache.org by Devender Yadav <de...@impetus.co.in> on 2017/03/27 12:29:00 UTC

How to insert nano seconds in the TimestampType in Spark

Hi All,

I am using Spark version 1.6.1.

I have a text-format table in Hive with a `timestamp` column whose values have nanosecond precision.

Hive Table Schema:

    c_timestamp             timestamp

Hive Table data:

    hive> select * from tbl1;
    OK
    00:00:00.000000001
    12:12:12.123456789
    23:59:59.999999999

But according to the Spark docs, since Spark 1.5:

"Timestamps are now stored at a precision of 1us, rather than 1ns"

Sample code:

    SparkConf conf = new SparkConf(true).setMaster("yarn-cluster").setAppName("SAMPLE_APP");
    SparkContext sc = new SparkContext(conf);
    HiveContext hc = new HiveContext(sc);
    DataFrame df = hc.table("testdb.tbl1");

When the table is read this way, the values are truncated to microseconds:

    00:00:00
    12:12:12.123456
    23:59:59.999999


Is there any way to keep nanosecond precision here?


Regards,
Devender



Re: How to insert nano seconds in the TimestampType in Spark

Posted by Michael Armbrust <mi...@databricks.com>.

The timestamp type has only microsecond precision. If you require
nanosecond precision, you would need to store the value yourself (for
example as binary, or as a long with a limited range).
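
As a rough illustration of the "limited range long" idea, here is a minimal sketch that packs a java.sql.Timestamp into a single long holding nanoseconds since the epoch, and unpacks it again. The class name NanoTimestampCodec and its two methods are hypothetical helpers, not part of Spark or Hive, and the sketch assumes Java 8 (for Math.floorDiv and friends). How the nanosecond value gets out of Hive intact in the first place is not covered here.

```java
import java.sql.Timestamp;

// Hypothetical helper (not a Spark or Hive API): stores a timestamp as
// nanoseconds since 1970-01-01T00:00:00Z in one signed 64-bit long.
public final class NanoTimestampCodec {
    private static final long NANOS_PER_SECOND = 1_000_000_000L;

    private NanoTimestampCodec() {}

    /** Packs a Timestamp into epoch nanoseconds; throws ArithmeticException on overflow. */
    public static long toEpochNanos(Timestamp ts) {
        // getTime() is in milliseconds; floorDiv keeps pre-1970 values
        // consistent with getNanos(), which is always non-negative.
        long epochSeconds = Math.floorDiv(ts.getTime(), 1000L);
        return Math.addExact(
                Math.multiplyExact(epochSeconds, NANOS_PER_SECOND),
                (long) ts.getNanos());
    }

    /** Rebuilds a Timestamp, including its full nanosecond fraction. */
    public static Timestamp fromEpochNanos(long epochNanos) {
        long epochSeconds = Math.floorDiv(epochNanos, NANOS_PER_SECOND);
        int nanos = (int) Math.floorMod(epochNanos, NANOS_PER_SECOND);
        Timestamp ts = new Timestamp(epochSeconds * 1000L);
        ts.setNanos(nanos); // replaces the whole fractional-second part
        return ts;
    }
}
```

A signed 64-bit count of nanoseconds covers only about 292 years on either side of 1970 (roughly 1677 to 2262), which is the range limitation mentioned above. The resulting long could then be kept in an ordinary long (bigint) column and converted back only where the full precision is needed.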
