You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@oozie.apache.org by "Robert Kanter (JIRA)" <ji...@apache.org> on 2015/02/26 22:41:06 UTC

[jira] [Commented] (OOZIE-2155) Incorrect DST Shifts are occurring based on the Database timezone

    [ https://issues.apache.org/jira/browse/OOZIE-2155?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14339212#comment-14339212 ] 

Robert Kanter commented on OOZIE-2155:
--------------------------------------

We were able to test this on Oracle, and it is also affected by this issue; but we couldn't figure out how to set the timezone over JDBC.  I found a [Stack Overflow comment|http://stackoverflow.com/questions/2805907/setting-session-timezone-with-spring-jdbc-oracle] describing how we could call a proprietary {{OracleConnection.setSessionTimeZone(...)}} method using reflection to do this, but I haven't tried it.

> Incorrect DST Shifts are occurring based on the Database timezone
> -----------------------------------------------------------------
>
>                 Key: OOZIE-2155
>                 URL: https://issues.apache.org/jira/browse/OOZIE-2155
>             Project: Oozie
>          Issue Type: Bug
>          Components: core
>    Affects Versions: trunk, 4.1.0
>            Reporter: Robert Kanter
>            Assignee: Robert Kanter
>            Priority: Blocker
>
> We discovered a critical bug where incorrect Daylight Saving Time shifts were occurring based on the Database timezone.
> Try running this Coordinator:
> {code:xml}
> <coordinator-app name=“foo” frequency="${coord:hours(1)}" start="2014-11-02T04:15Z" end="2014-11-02T16:00Z" timezone=“UTC” xmlns="uri:oozie:coordinator:0.1">
>    <controls>
>       <concurrency>10</concurrency>
>    </controls>
>    <action>
>       <workflow>
>          <app-path>${appPath}</app-path>
>          <configuration>
>             <property>
>                <name>jobTracker</name>
>                <value>${jobTracker}</value>
>             </property>
>             <property>
>                <name>nameNode</name>
>                <value>${nameNode}</value>
>             </property>
>             <property>
>                <name>queueName</name>
>                <value>${queueName}</value>
>             </property>
>          </configuration>
>       </workflow>
>    </action>     
> </coordinator-app>
> {code}
> Note that it runs over a DST shift (at least in most US timezones).
> Here's a sample of some of the actions, along with their Nominal Times:
> {noformat}
> 0000013-150212140838307-oozie-rkan-C@4     2014-11-02 07:15 GMT 
> 0000013-150212140838307-oozie-rkan-C@6     2014-11-02 09:15 GMT 
> 0000013-150212140838307-oozie-rkan-C@5     2014-11-02 09:15 GMT 
> 0000013-150212140838307-oozie-rkan-C@7     2014-11-02 10:15 GMT 
> {noformat}
> Note that actions 5 and 6 have the same time.  This is incorrect because the times are in GMT (with no DST).  I've also confirmed via a debugger that these dates have the same number of seconds since epoch (so it's not just a rendering issue).  
> By the way, if you're in a different timezone, you'll see this problem occur on different actions because the issue is related to the timezone that your database is in.  It depends on when the DST shift occurs in your timezone; for example, in "America/New_York", it happens with actions 2 and 3.  
> On that note, if I ask Oozie to use "America/Los_Angeles" to print the dates, I get this:
> {noformat}
> 0000013-150212140838307-oozie-rkan-C@4     2014-11-02 00:15 PDT 
> 0000013-150212140838307-oozie-rkan-C@6     2014-11-02 01:15 PST 
> 0000013-150212140838307-oozie-rkan-C@5     2014-11-02 01:15 PST 
> 0000013-150212140838307-oozie-rkan-C@7     2014-11-02 02:15 PST 
> {noformat}
> Action 5's nominal time should be {{2014-11-02 01:15 PDT}}, not {{2014-11-02 01:15 PST}}.  
> Using the debugger some more, I verified that Oozie is creating the nominal times correctly, and writing them to the database correctly (at least, it's converting them to Java's SQL TimeStamp objects correctly; OpenJPA handles writing them).  But when the problematic value is read back from the database, it has the wrong value!
> Here's something interesting from the [MySQL documentation|http://dev.mysql.com/doc/refman/5.5/en/datetime.html]:
> {quote}MySQL converts TIMESTAMP values from the current time zone to UTC for storage, and back from UTC to the current time zone for retrieval. (This does not occur for other types such as DATETIME.) By default, the current time zone for each connection is the server's time. The time zone can be set on a per-connection basis. As long as the time zone setting remains constant, you get back the same value you store. If you store a TIMESTAMP value, and then change the time zone and retrieve the value, the retrieved value is different from the value you stored. This occurs because the same time zone was not used for conversion in both directions.
> {quote}
> So, I think what's happening is that it's interpreting everything in PST, and not in PST and PDT depending on the time itself.  Ideally, it would just store the time since epoch, like Java's Date does, but it's doing this broken interpretation instead.  In fact, this [Stack Overflow|http://stackoverflow.com/questions/1646171/mysql-datetime-fields-and-daylight-savings-time-how-do-i-reference-the-extra/1650910#1650910] I found talks about this problem.
> I tried to create a unit test to show the problem without all the action stuff, but it seems like HSQLDB doesn't have this problem, so the test passes.  I've confirmed that this affects Derby and MySQL; I'm not sure about Postgres, Oracle, or SQLServer.
> I was able to find a workaround for Derby and one of my colleagues found one for MySQL:
> - For Derby, you just need to change the JVM's timezone to GMT (see [here|http://objectmix.com/apache/647950-moving-derby-database-across-timezones.html]).  All you have to do is add {{-Duser.timezone=GMT}} to {{CATALINA_OPTS}}.
> - For MySQL, you can either change the global timezone to GMT (which the DB admin probably won't go for), or you can add {{useLegacyDatetimeCode=false&serverTimezone=GMT}} to JDBC URL.  For example:
> {code:xml}
> <property>
>    <name>oozie.service.JPAService.jdbc.url</name>
>    <value>jdbc:mysql://HOST/oozie?useUnicode=true&characterEncoding=UTF-8&useLegacyDatetimeCode=false&serverTimezone=GMT</value>
> </property>
> {code}
> I imagine the Derby workaround won't work for anything else, but perhaps the MySQL workaround would?  
> The easiest way to fix this is to add the {{user.timezone}} thing (to fix Derby), and have Oozie always add the JDBC configs to the JDBC URL (for the others).  If Oracle, Postgres, and SQLServer don't support that, we'll have to come up with other ideas.  The other ideas I had were to replace TIMESTAMP with DATETIME or the SQL equivalent of {{long}}; though these have the downside of being schema changes and possibly other side effects.
> I have some homework for everyone :)
> Please check if the DB type listed below is affected by this, and if so, does the above workaround (or some  similar configs) fix it?
> - [~bowenzhangusa]: SQLServer
> - [~chitnis], [~puru], [~rohini], or [~egashira]: Oracle
> - Me: Postgres



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)