You are viewing a plain text version of this content. The canonical link for it is here.
Posted to user@hive.apache.org by Chunky Gupta <ch...@vizury.com> on 2013/02/13 10:43:24 UTC

Change timestamp format in hive

Hi,

I have a log file which has timestamp in format "YYYY-MM-DD-HH:MM:SS". But
since the timestamp datatype format in hive is "YYYY-MM-DD HH:MM:SS".
I created a table with datatype of that column as TIMESTAMP. But when I
load the data it is throwing error. I think it is because of difference in
format.

Is there any way to set the timestamp format while creating the table. Or
is there some other solution for this issue ?

Thanks,
Chunky.

Re: Change timestamp format in hive

Posted by Dean Wampler <de...@thinkbiganalytics.com>.
I'll mention some workarounds, but they all add overhead:

1. Use STRING for the column, then parse it with the date functions
Alexander mentioned.
2. Use STRING, then replace the offending '-' with a space, e.g.,

select printf("%s %s", substr('2013-02-13-08:11:22', 0, 10),
substr('2013-02-13-08:11:22', 12)) as time from hadoop.stocks limit 1;

Obviously I hard coded a value for the string; you would pass in a column
name.

You could even cast the result to TIMESTAMP:

select cast(printf("%s %s", substr('2013-02-13-08:11:22', 0, 10),
substr('2013-02-13-08:11:22', 12)) as TIMESTAMP) as time from hadoop.stocks
limit 1;

3. If you just need the year-month-day, i.e., the date, the to_date
function appears to work fine with these strings. However, there isn't a
corresponding to_time function for the HMS.

Ugly and not something you would want to do for every query. However, you
could hide any of these hacks behind a view.

The best thing to do would be to have your ETL process convert these
strings while loading into HDFS, if possible.

I'm experimenting with Hive v0.10, by the way.

dean

On Wed, Feb 13, 2013 at 3:48 AM, Alexander Alten-Lorenz <wget.null@gmail.com
> wrote:

> May
> https://cwiki.apache.org/Hive/languagemanual-udf.html#LanguageManualUDF-DateFunctionshelp you?
>
> - Alex
>
> On Feb 13, 2013, at 10:43 AM, Chunky Gupta <ch...@vizury.com>
> wrote:
>
> > Hi,
> >
> > I have a log file which has timestamp in format "YYYY-MM-DD-HH:MM:SS".
> But since the timestamp datatype format in hive is "YYYY-MM-DD HH:MM:SS".
> > I created a table with datatype of that column as TIMESTAMP. But when I
> load the data it is throwing error. I think it is because of difference in
> format.
> >
> > Is there any way to set the timestamp format while creating the table.
> Or is there some other solution for this issue ?
> >
> > Thanks,
> > Chunky.
>
> --
> Alexander Alten-Lorenz
> http://mapredit.blogspot.com
> German Hadoop LinkedIn Group: http://goo.gl/N8pCF
>
>


-- 
*Dean Wampler, Ph.D.*
thinkbiganalytics.com
+1-312-339-1330

Re: Change timestamp format in hive

Posted by Alexander Alten-Lorenz <wg...@gmail.com>.
May https://cwiki.apache.org/Hive/languagemanual-udf.html#LanguageManualUDF-DateFunctions help you?

- Alex

On Feb 13, 2013, at 10:43 AM, Chunky Gupta <ch...@vizury.com> wrote:

> Hi,
> 
> I have a log file which has timestamp in format "YYYY-MM-DD-HH:MM:SS". But since the timestamp datatype format in hive is "YYYY-MM-DD HH:MM:SS". 
> I created a table with datatype of that column as TIMESTAMP. But when I load the data it is throwing error. I think it is because of difference in format. 
> 
> Is there any way to set the timestamp format while creating the table. Or is there some other solution for this issue ?
> 
> Thanks,
> Chunky.

--
Alexander Alten-Lorenz
http://mapredit.blogspot.com
German Hadoop LinkedIn Group: http://goo.gl/N8pCF