Posted to user@oozie.apache.org by 朱健 <zh...@jd.com> on 2015/06/15 12:51:19 UTC
Questions about oozie timezone
Hi,
Thanks for reading this email.
I have used Oozie for about 2 years. Now I have run into a problem with time zones.
Because we are located in the GMT+08:00 time zone, our Hadoop system follows the convention that all data paths on HDFS are named using GMT+08:00 time. That means:
At UTC 2015-01-01T00:00Z, the hourly output data is located under the folder $root/2015010108, not $root/2015010100
At UTC 2015-01-01T01:00Z, the hourly output data is located under the folder $root/2015010109, not $root/2015010101
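The naming convention described above can be sketched in a few lines of Python. This is only an illustration of the path arithmetic, not anything Oozie runs: the function name `hourly_folder` is hypothetical, and GMT+08:00 is modelled as a fixed offset with no DST, which is an assumption.

```python
from datetime import datetime, timedelta, timezone

# Assumption: the data convention uses a fixed UTC+08:00 offset (no DST).
LOCAL_TZ = timezone(timedelta(hours=8))


def hourly_folder(utc_time: datetime, root: str = "$root") -> str:
    """Return the folder name used for hourly data produced at utc_time.

    The folder is named after the local (GMT+08:00) wall-clock hour,
    not the UTC hour.
    """
    local = utc_time.astimezone(LOCAL_TZ)
    return f"{root}/{local:%Y%m%d%H}"


if __name__ == "__main__":
    # The two cases from the message above.
    print(hourly_folder(datetime(2015, 1, 1, 0, 0, tzinfo=timezone.utc)))
    print(hourly_folder(datetime(2015, 1, 1, 1, 0, tzinfo=timezone.utc)))
```

Running it prints $root/2015010108 and $root/2015010109, the two folders named in the message.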
So if I set the timezone in the coordinator to UTC, the Oozie job will read the data for hour 00, but I want it to read hour 08. For me in Beijing, China, it is natural to expect the Oozie job to read the 08 data at 08:00 local time.
I also tried setting the timezone to GMT+08:00, but it didn't work. It seems the timezone only affects Daylight Saving Time handling.
Currently I add 8 to the instance number in the coordinator as a temporary fix: I changed <instance>0</instance> to <instance>8</instance>
This may be acceptable for hourly jobs, but it is really ugly for minute-level or daily jobs, and almost unreadable for humans.
So how can I solve this problem?
Thanks,
Jian
Re: Questions about oozie timezone
Posted by David Morel <da...@amakuru.net>.
On 16 Jun 2015, at 2:00, Laurent H wrote:
> I've got the same issue Jian, it's could be great to have an answer oozie
> experts! ;)
Hi,
the timezone spec in the coordinator node only serves to figure out whether
there are 23, 24 or 25 hours in a given day (DST switches); timezone
calculations and anything related to time offsets are done in the datasets
section; try something like:
<coordinator-app xmlns="uri:oozie:coordinator:0.1" timezone="UTC"
                 name="${appName}"
                 frequency="${coord:hours(1)}"
                 start="${startTime}"
                 end="${endTime}">
  ...
  <datasets>
    <dataset name="hourly-partition"
             frequency="${coord:hours(1)}"
             initial-instance="${startTime}"
             timezone="Asia/Shanghai">
      <uri-template><!--whatever path -->/yyyymmddhh=${YEAR}${MONTH}${DAY}${HOUR}</uri-template>
    </dataset>
  </datasets>
  <input-events>
    <data-in name="in" dataset="hourly-partition">
      <instance>${coord:current(coord:tzOffset()/60)}</instance>
    </data-in>
  </input-events>
  ...
</coordinator-app>
David
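To see what the snippet above should evaluate to, here is a minimal Python sketch of the arithmetic. It is not Oozie code and rests on several assumptions: that coord:tzOffset() yields the difference in minutes between the dataset timezone and the coordinator timezone (480 for Asia/Shanghai against UTC), that coord:current(n) selects the dataset instance n frequency steps after the nominal time, and that ${YEAR}, ${MONTH}, ${DAY} and ${HOUR} are filled in from the instance time in UTC. The names `resolve_instance` and `uri_suffix` are hypothetical.

```python
from datetime import datetime, timedelta, timezone

# Assumption: Asia/Shanghai vs UTC is a constant +480 minutes (no DST),
# which is what coord:tzOffset() is expected to yield in the snippet above.
TZ_OFFSET_MINUTES = 8 * 60

# Assumption: the dataset is hourly, as in frequency="${coord:hours(1)}".
FREQUENCY = timedelta(hours=1)


def resolve_instance(nominal_utc: datetime) -> datetime:
    """Mimic coord:current(coord:tzOffset()/60) for an hourly dataset."""
    n = TZ_OFFSET_MINUTES // 60  # instance index handed to coord:current
    return nominal_utc + n * FREQUENCY


def uri_suffix(instance_utc: datetime) -> str:
    """Fill the yyyymmddhh part of the uri-template from the instance time."""
    return f"yyyymmddhh={instance_utc:%Y%m%d%H}"


if __name__ == "__main__":
    nominal = datetime(2015, 1, 1, 0, 0, tzinfo=timezone.utc)
    print(uri_suffix(resolve_instance(nominal)))
```

Under these assumptions, the action with nominal time 2015-01-01T00:00Z resolves to yyyymmddhh=2015010108, which is the folder Jian wants to read, and the offset no longer has to be hard-coded as 8 in the instance.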
Re: Questions about oozie timezone
Posted by Laurent H <la...@gmail.com>.
it could be* sorry !
--
Laurent HATIER - Consultant Big Data & Business Intelligence chez CapGemini
fr.linkedin.com/pub/laurent-hatier/25/36b/a86/
<http://fr.linkedin.com/pub/laurent-h/25/36b/a86/>
Re: Questions about oozie timezone
Posted by Laurent H <la...@gmail.com>.
I've got the same issue Jian, it's could be great to have an answer oozie
experts! ;)
--
Laurent HATIER - Consultant Big Data & Business Intelligence chez CapGemini
fr.linkedin.com/pub/laurent-hatier/25/36b/a86/
<http://fr.linkedin.com/pub/laurent-h/25/36b/a86/>