You are viewing a plain text version of this content. The canonical link for it is here.
Posted to user@hive.apache.org by Manish R <li...@gmail.com> on 2016/09/22 17:05:07 UTC

iso 8601 to utc with timezone conversion

Hi Guys,

There is a scenario here that I am trying to implement

I have a table say table1 which contains aid and request_date in ISO 8601
format. I have one more table say table2 which contains aid and timezone
details. Now I want to convert request_date from table1 to UTC and apply
the timezone that is in table2 format for that corresponding aid.

Table 1 example data
*2016-09-15T23:45:22.943762Z abs123*
*2016-09-16T22:48:12.943762Z erty456*

Table 2 example data
*abs123   Asia/Kolkata*
*erty456  Europe/Amsterdam*

Re: iso 8601 to utc with timezone conversion

Posted by Manish R <li...@gmail.com>.
Guys,

I tried below query but it looks like the conversion is somehow not
correct. The logs are from AWS ELB in Singapore region. Can you guys please
check and let me know what am I missing?

hive> SELECT s.request_date, a.tz,
FROM_UTC_TIMESTAMP(UNIX_TIMESTAMP(s.request_date,
"yyyy-MM-dd'T'hh:mm:ss.SSS'Z'") * 1000, a.tz) from small_table s JOIN
aid_tz a on (a.aid = s.aid) limit 3;

*2016-09-15T23:04:18.013208Z Australia/Sydney 2016-09-16 09:04:31*
*2016-09-15T23:04:17.397250Z Australia/Sydney 2016-09-16 09:10:54*
*2016-09-15T23:04:18.972533Z UTC 2016-09-15 23:20:30*


On Fri, Sep 23, 2016 at 1:24 PM, Manish R <li...@gmail.com>
wrote:

> Yes Sekine I am talking about AWS ELB logs in Mumbai region. Let me try
> implementing what Andres suggested and I also in a verge of implementing
> some other solution  as well. I will let you all know once any of the
> solution works.
> On Sep 23, 2016 1:11 PM, "Sékine Coulibaly" <sc...@gmail.com> wrote:
>
>> Manish,
>>
>> UTC is not a format (but, ISO 8601 is).
>> Consider UTC as +0000 at the end of a ISO 8601 time.
>>
>> Eg:
>> 2016-01-01T*23:45:22.943762*+0000
>>
>> is stricylt equivalent to :
>> 2016-01-01T*23:45:22.943762Z*
>>
>> *and is also strictly equivalent to the same time expressed in another
>> timezone such as Europe/Pa**ris (CEST) :*
>> 2016-01-02T01*:45:22.943762+0200*
>>
>> The documentation you cite says that the format is a ISO 8601, but
>> doesn't specify the timezone used in the format. I guess you are using AWS
>> Load Balancer logs. Please find an example of mine below :
>>
>> 2016-06-15T14:03:00.784203Z awseb-e-9-xxxxxxxxxx 194.154.216.89:58603 1.2.3.4:80 0.000041 0.426008 0.000047 200 200 163 168 "POST https://whatever.domain.org:443/oauth/token HTTP/1.1" "UA" AES256-SHA256 TLSv1.2
>>
>> As you can see, the timestamp is ISO8601 compliant, and does specify Z, so as Andres explained, this is a UTC time.
>>
>>
>> Regards
>>
>> Sekine
>>
>>
>> 2016-09-23 5:48 GMT+02:00 Manish R <li...@gmail.com>:
>>
>>> Hi Andres,
>>>
>>> No that is not in UTC format. Plz see the description of that field
>>> below. so if timezone of table2 is Europe/Amsterdam then we have to convert
>>> the request_date of table1 in UTC Europe/Amsterdam timezone ( for example
>>> 2016-09-18 23:30:52). We have a lot of timezone entries in table2 and I
>>> wonder how am I going to convert all the request_date field according to
>>> timezone field. Do I have to maintains separate table for that?
>>>
>>> timestamp
>>>
>>> The time when the load balancer received the request from the client, in
>>> ISO 8601 format.
>>>
>>> On Fri, Sep 23, 2016 at 1:26 AM, Andres Koitmäe <
>>> andres.koitmae@gmail.com> wrote:
>>>
>>>> Hi!
>>>>
>>>> It seems that in Table1 you already have request_date in UTC format. *Z
>>>> *at the end of the timezone is the zone designator for the zero UTC
>>>> offset.
>>>>
>>>> Now all you have to do is to use standard Hive functions which you can
>>>> find from Hive wiki https://cwiki.apache.org/
>>>> confluence/display/Hive/LanguageManual+UDF#LanguageManualUDF
>>>> -TypeConversionFunctions
>>>>
>>>> Use from_utc_timestamp to convert request_date to timestamp to
>>>> timezone specified in Table 2 (join two tables using aid column)
>>>>
>>>> Regards,
>>>>
>>>> Andres Koitmäe
>>>>
>>>> On 22 September 2016 at 20:05, Manish R <linuxtricksfordevops@gmail.com
>>>> > wrote:
>>>>
>>>>> Hi Guys,
>>>>>
>>>>> There is a scenario here that I am trying to implement
>>>>>
>>>>> I have a table say table1 which contains aid and request_date in ISO
>>>>> 8601 format. I have one more table say table2 which contains aid and
>>>>> timezone details. Now I want to convert request_date from table1 to UTC and
>>>>> apply the timezone that is in table2 format for that corresponding aid.
>>>>>
>>>>> Table 1 example data
>>>>> *2016-09-15T23:45:22.943762Z abs123*
>>>>> *2016-09-16T22:48:12.943762Z erty456*
>>>>>
>>>>> Table 2 example data
>>>>> *abs123   Asia/Kolkata*
>>>>> *erty456  Europe/Amsterdam*
>>>>>
>>>>
>>>>
>>>
>>

Re: iso 8601 to utc with timezone conversion

Posted by Manish R <li...@gmail.com>.
Yes Sekine I am talking about AWS ELB logs in Mumbai region. Let me try
implementing what Andres suggested and I also in a verge of implementing
some other solution  as well. I will let you all know once any of the
solution works.
On Sep 23, 2016 1:11 PM, "Sékine Coulibaly" <sc...@gmail.com> wrote:

> Manish,
>
> UTC is not a format (but, ISO 8601 is).
> Consider UTC as +0000 at the end of a ISO 8601 time.
>
> Eg:
> 2016-01-01T*23:45:22.943762*+0000
>
> is stricylt equivalent to :
> 2016-01-01T*23:45:22.943762Z*
>
> *and is also strictly equivalent to the same time expressed in another
> timezone such as Europe/Pa**ris (CEST) :*
> 2016-01-02T01*:45:22.943762+0200*
>
> The documentation you cite says that the format is a ISO 8601, but doesn't
> specify the timezone used in the format. I guess you are using AWS Load
> Balancer logs. Please find an example of mine below :
>
> 2016-06-15T14:03:00.784203Z awseb-e-9-xxxxxxxxxx 194.154.216.89:58603 1.2.3.4:80 0.000041 0.426008 0.000047 200 200 163 168 "POST https://whatever.domain.org:443/oauth/token HTTP/1.1" "UA" AES256-SHA256 TLSv1.2
>
> As you can see, the timestamp is ISO8601 compliant, and does specify Z, so as Andres explained, this is a UTC time.
>
>
> Regards
>
> Sekine
>
>
> 2016-09-23 5:48 GMT+02:00 Manish R <li...@gmail.com>:
>
>> Hi Andres,
>>
>> No that is not in UTC format. Plz see the description of that field
>> below. so if timezone of table2 is Europe/Amsterdam then we have to convert
>> the request_date of table1 in UTC Europe/Amsterdam timezone ( for example
>> 2016-09-18 23:30:52). We have a lot of timezone entries in table2 and I
>> wonder how am I going to convert all the request_date field according to
>> timezone field. Do I have to maintains separate table for that?
>>
>> timestamp
>>
>> The time when the load balancer received the request from the client, in
>> ISO 8601 format.
>>
>> On Fri, Sep 23, 2016 at 1:26 AM, Andres Koitmäe <andres.koitmae@gmail.com
>> > wrote:
>>
>>> Hi!
>>>
>>> It seems that in Table1 you already have request_date in UTC format. *Z
>>> *at the end of the timezone is the zone designator for the zero UTC
>>> offset.
>>>
>>> Now all you have to do is to use standard Hive functions which you can
>>> find from Hive wiki https://cwiki.apache.org/
>>> confluence/display/Hive/LanguageManual+UDF#LanguageManualUDF
>>> -TypeConversionFunctions
>>>
>>> Use from_utc_timestamp to convert request_date to timestamp to timezone
>>> specified in Table 2 (join two tables using aid column)
>>>
>>> Regards,
>>>
>>> Andres Koitmäe
>>>
>>> On 22 September 2016 at 20:05, Manish R <li...@gmail.com>
>>> wrote:
>>>
>>>> Hi Guys,
>>>>
>>>> There is a scenario here that I am trying to implement
>>>>
>>>> I have a table say table1 which contains aid and request_date in ISO
>>>> 8601 format. I have one more table say table2 which contains aid and
>>>> timezone details. Now I want to convert request_date from table1 to UTC and
>>>> apply the timezone that is in table2 format for that corresponding aid.
>>>>
>>>> Table 1 example data
>>>> *2016-09-15T23:45:22.943762Z abs123*
>>>> *2016-09-16T22:48:12.943762Z erty456*
>>>>
>>>> Table 2 example data
>>>> *abs123   Asia/Kolkata*
>>>> *erty456  Europe/Amsterdam*
>>>>
>>>
>>>
>>
>

Re: iso 8601 to utc with timezone conversion

Posted by Sékine Coulibaly <sc...@gmail.com>.
Manish,

UTC is not a format (but, ISO 8601 is).
Consider UTC as +0000 at the end of a ISO 8601 time.

Eg:
2016-01-01T*23:45:22.943762*+0000

is stricylt equivalent to :
2016-01-01T*23:45:22.943762Z*

*and is also strictly equivalent to the same time expressed in another
timezone such as Europe/Pa**ris (CEST) :*
2016-01-02T01*:45:22.943762+0200*

The documentation you cite says that the format is a ISO 8601, but doesn't
specify the timezone used in the format. I guess you are using AWS Load
Balancer logs. Please find an example of mine below :

2016-06-15T14:03:00.784203Z awseb-e-9-xxxxxxxxxx 194.154.216.89:58603
1.2.3.4:80 0.000041 0.426008 0.000047 200 200 163 168 "POST
https://whatever.domain.org:443/oauth/token HTTP/1.1" "UA"
AES256-SHA256 TLSv1.2

As you can see, the timestamp is ISO8601 compliant, and does specify
Z, so as Andres explained, this is a UTC time.


Regards

Sekine


2016-09-23 5:48 GMT+02:00 Manish R <li...@gmail.com>:

> Hi Andres,
>
> No that is not in UTC format. Plz see the description of that field below.
> so if timezone of table2 is Europe/Amsterdam then we have to convert the
> request_date of table1 in UTC Europe/Amsterdam timezone ( for example
> 2016-09-18 23:30:52). We have a lot of timezone entries in table2 and I
> wonder how am I going to convert all the request_date field according to
> timezone field. Do I have to maintains separate table for that?
>
> timestamp
>
> The time when the load balancer received the request from the client, in
> ISO 8601 format.
>
> On Fri, Sep 23, 2016 at 1:26 AM, Andres Koitmäe <an...@gmail.com>
> wrote:
>
>> Hi!
>>
>> It seems that in Table1 you already have request_date in UTC format. *Z *at
>> the end of the timezone is the zone designator for the zero UTC offset.
>>
>> Now all you have to do is to use standard Hive functions which you can
>> find from Hive wiki https://cwiki.apache.org/
>> confluence/display/Hive/LanguageManual+UDF#LanguageManualUDF
>> -TypeConversionFunctions
>>
>> Use from_utc_timestamp to convert request_date to timestamp to timezone
>> specified in Table 2 (join two tables using aid column)
>>
>> Regards,
>>
>> Andres Koitmäe
>>
>> On 22 September 2016 at 20:05, Manish R <li...@gmail.com>
>> wrote:
>>
>>> Hi Guys,
>>>
>>> There is a scenario here that I am trying to implement
>>>
>>> I have a table say table1 which contains aid and request_date in ISO
>>> 8601 format. I have one more table say table2 which contains aid and
>>> timezone details. Now I want to convert request_date from table1 to UTC and
>>> apply the timezone that is in table2 format for that corresponding aid.
>>>
>>> Table 1 example data
>>> *2016-09-15T23:45:22.943762Z abs123*
>>> *2016-09-16T22:48:12.943762Z erty456*
>>>
>>> Table 2 example data
>>> *abs123   Asia/Kolkata*
>>> *erty456  Europe/Amsterdam*
>>>
>>
>>
>

Re: iso 8601 to utc with timezone conversion

Posted by Manish R <li...@gmail.com>.
Hi Andres,

No that is not in UTC format. Plz see the description of that field below.
so if timezone of table2 is Europe/Amsterdam then we have to convert the
request_date of table1 in UTC Europe/Amsterdam timezone ( for example
2016-09-18 23:30:52). We have a lot of timezone entries in table2 and I
wonder how am I going to convert all the request_date field according to
timezone field. Do I have to maintains separate table for that?

timestamp

The time when the load balancer received the request from the client, in
ISO 8601 format.

On Fri, Sep 23, 2016 at 1:26 AM, Andres Koitmäe <an...@gmail.com>
wrote:

> Hi!
>
> It seems that in Table1 you already have request_date in UTC format. *Z *at
> the end of the timezone is the zone designator for the zero UTC offset.
>
> Now all you have to do is to use standard Hive functions which you can
> find from Hive wiki https://cwiki.apache.org/confluence/display/Hive/
> LanguageManual+UDF#LanguageManualUDF-TypeConversionFunctions
>
> Use from_utc_timestamp to convert request_date to timestamp to timezone
> specified in Table 2 (join two tables using aid column)
>
> Regards,
>
> Andres Koitmäe
>
> On 22 September 2016 at 20:05, Manish R <li...@gmail.com>
> wrote:
>
>> Hi Guys,
>>
>> There is a scenario here that I am trying to implement
>>
>> I have a table say table1 which contains aid and request_date in ISO 8601
>> format. I have one more table say table2 which contains aid and timezone
>> details. Now I want to convert request_date from table1 to UTC and apply
>> the timezone that is in table2 format for that corresponding aid.
>>
>> Table 1 example data
>> *2016-09-15T23:45:22.943762Z abs123*
>> *2016-09-16T22:48:12.943762Z erty456*
>>
>> Table 2 example data
>> *abs123   Asia/Kolkata*
>> *erty456  Europe/Amsterdam*
>>
>
>

Re: iso 8601 to utc with timezone conversion

Posted by Andres Koitmäe <an...@gmail.com>.
Hi!

It seems that in Table1 you already have request_date in UTC format. *Z *at
the end of the timezone is the zone designator for the zero UTC offset.

Now all you have to do is to use standard Hive functions which you can find
from Hive wiki
https://cwiki.apache.org/confluence/display/Hive/LanguageManual+UDF#LanguageManualUDF-TypeConversionFunctions

Use from_utc_timestamp to convert request_date to timestamp to timezone
specified in Table 2 (join two tables using aid column)

Regards,

Andres Koitmäe

On 22 September 2016 at 20:05, Manish R <li...@gmail.com>
wrote:

> Hi Guys,
>
> There is a scenario here that I am trying to implement
>
> I have a table say table1 which contains aid and request_date in ISO 8601
> format. I have one more table say table2 which contains aid and timezone
> details. Now I want to convert request_date from table1 to UTC and apply
> the timezone that is in table2 format for that corresponding aid.
>
> Table 1 example data
> *2016-09-15T23:45:22.943762Z abs123*
> *2016-09-16T22:48:12.943762Z erty456*
>
> Table 2 example data
> *abs123   Asia/Kolkata*
> *erty456  Europe/Amsterdam*
>