Posted to user@spark.apache.org by Serega Sheypak <se...@gmail.com> on 2018/03/11 10:55:19 UTC

how "hour" function in Spark SQL is supposed to work?

Hi, I'm desperately trying to extract the hour from unix seconds.

The year, month, and dayofmonth functions work as expected.
The hour function always returns 0.

val ds = dataset
  .withColumn("year", year(to_date(from_unixtime(dataset.col("ts") / 1000))))
  .withColumn("month", month(to_date(from_unixtime(dataset.col("ts") / 1000))))
  .withColumn("day", dayofmonth(to_date(from_unixtime(dataset.col("ts") / 1000))))
  .withColumn("hour", hour(from_utc_timestamp(dataset.col("ts") / 1000, "UTC")))

  //.withColumn("hour", hour(dataset.col("ts") / 1000))
  //.withColumn("hour1", hour(dataset.col("ts")))
  //.withColumn("hour", hour(dataset.col("ts")))
  //.withColumn("hour", hour("2009-07-30 12:58:59"))

I took a look at the source code.

year, month, and dayofmonth expect this input type:

override def inputTypes: Seq[AbstractDataType] = Seq(DateType)

The hour function expects something different:

override def inputTypes: Seq[AbstractDataType] = Seq(TimestampType)
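
Which, I guess, explains why the to_date(...) pattern used for year/month/day can't simply be reused for hour: a date carries no time-of-day, and casting it to a timestamp gives midnight. A quick illustrative sketch with a made-up value (assumes a spark-shell session; from_unixtime and hour follow the session time zone):

import org.apache.spark.sql.functions._
import spark.implicits._

// made-up epoch milliseconds (2018-03-11 10:55:19 UTC)
val demo = Seq(1520765719000L).toDF("ts")

// year over a DateType works as expected...
demo.select(year(to_date(from_unixtime(demo("ts") / 1000)))).show()  // 2018

// ...but hour over the same DateType is always 0: date -> timestamp gives midnight
demo.select(hour(to_date(from_unixtime(demo("ts") / 1000)))).show()  // 0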

from_utc_timestamp returns a Timestamp:

override def dataType: DataType = TimestampType

but it didn't help.

What am I doing wrong? How can I get the hour from unix seconds?
Thanks!

Re: how "hour" function in Spark SQL is supposed to work?

Posted by Serega Sheypak <se...@gmail.com>.
Ok, this one works:

.withColumn("hour", hour(from_unixtime(typedDataset.col("ts") / 1000)))




Re: how "hour" function in Spark SQL is supposed to work?

Posted by Serega Sheypak <se...@gmail.com>.
Hi, any updates? Looks like some API inconsistency or a bug?


Re: how "hour" function in Spark SQL is supposed to work?

Posted by Serega Sheypak <se...@gmail.com>.
> Not sure why you are dividing by 1000. from_unixtime expects a long type
It expects seconds; I have milliseconds.
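
A quick way to see the difference with a made-up value (spark-shell; output follows the session time zone):

import org.apache.spark.sql.functions._
import spark.implicits._

// made-up epoch milliseconds (2018-03-11 10:55:19 UTC)
val df = Seq(1520765719000L).toDF("ts")

// from_unixtime treats its input as seconds since 1970-01-01,
// so passing milliseconds straight in lands tens of thousands of years in the future
df.select(from_unixtime(df("ts"))).show(false)

// dividing by 1000 first gives the expected 2018-03-11 string
df.select(from_unixtime(df("ts") / 1000)).show(false)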




Re: how "hour" function in Spark SQL is supposed to work?

Posted by vermanurag <an...@fnmathlogic.com>.
Not sure why you are dividing by 1000. from_unixtime expects a long type
which is the time in milliseconds from the reference date.

The following should work:

val ds = dataset.withColumn("hour",hour(from_unixtime(dataset.col("ts"))))





