Posted to dev@oozie.apache.org by "Artem Ervits (JIRA)" <ji...@apache.org> on 2017/06/20 18:50:00 UTC

[jira] [Assigned] (OOZIE-2310) If the Hadoop configuration is not configured, you get a NullPointerException on job submission

     [ https://issues.apache.org/jira/browse/OOZIE-2310?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Artem Ervits reassigned OOZIE-2310:
-----------------------------------

    Assignee: Artem Ervits

> If the Hadoop configuration is not configured, you get a NullPointerException on job submission
> -----------------------------------------------------------------------------------------------
>
>                 Key: OOZIE-2310
>                 URL: https://issues.apache.org/jira/browse/OOZIE-2310
>             Project: Oozie
>          Issue Type: Bug
>          Components: core
>    Affects Versions: 4.1.0
>            Reporter: Robert Kanter
>            Assignee: Artem Ervits
>            Priority: Blocker
>
> A user reported an NPE on startup here:
> http://mail-archives.apache.org/mod_mbox/oozie-user/201507.mbox/%3cCALBGZ8oZ0GZ+hf76nQYKxiATHH5g2gbQ_0sQ78uQv_=r4Hct=Q@mail.gmail.com%3e
> I did some digging and the problem is that Oozie is trying to load the ShareLib, but the {{FileSystem}} class variable is {{null}} because the {{ShareLibService}} wasn't able to create it on {{init}}.  That would normally cause Oozie to fail on startup, but the default value of {{oozie.service.ShareLibService.fail.fast.on.startup}} is {{false}}, so the failure is ignored.
> The code in question is this:
> {code:java}
> try {
>     fs = FileSystem.get(has.createJobConf(uri.getAuthority()));
>     //cache action key sharelib conf list
>     cacheActionKeySharelibConfList();
>     updateLauncherLib();
>     updateShareLib();
> }
> catch (Throwable e) {
>     if (failOnfailure) {
>         LOG.error("Sharelib initialization fails", e);
>         throw new ServiceException(ErrorCode.E0104, getClass().getName(), "Sharelib initialization fails. ", e);
>     }
>     else {
>         // We don't want to actually fail init by throwing an Exception, so only create the ServiceException and
>         // log it
>         ServiceException se = new ServiceException(ErrorCode.E0104, getClass().getName(),
>                 "Not able to cache sharelib. An Admin needs to install the sharelib with oozie-setup.sh and issue the "
>                         + "'oozie admin' CLI command to update the sharelib", e);
>         LOG.error(se);
>     }
> }
> {code}
> where {{failOnfailure}} is {{false}} by default.  So, {{fs}} ends up being {{null}}, and if anything later tries to use it, you get an NPE.
> I think we should do two things here:
> # Creating the {{FileSystem}} should be in a different try-catch so that {{failOnfailure}} doesn't affect it.  The original intention of that behavior was to ignore ShareLib failures, not Hadoop failures.
> # We should improve the default Hadoop configuration (i.e. {{oozie.service.HadoopAccessorService.hadoop.configurations}}).  This has been a problem for a while: out of the box, Oozie doesn't work even for a local pseudo-cluster because of this config's default.  If that's not possible, we need to make it more obvious that users must configure this before doing anything.
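
The first proposal above (a separate try-catch for creating the FileSystem) could look roughly like the sketch below. It is only an illustration of the control flow, not a patch against Oozie: ShareLibInitSketch, FileSystemHandle, the nested ServiceException and the fsFactory/shareLibLoader parameters are hypothetical stand-ins for ShareLibService, Hadoop's FileSystem, Oozie's ServiceException and the real init calls, so that the example compiles without Hadoop or Oozie on the classpath.

```java
import java.util.concurrent.Callable;

// Hypothetical stand-in for the init logic quoted above; NOT Oozie's actual code.
class ShareLibInitSketch {

    /** Stand-in for a Hadoop FileSystem handle (assumption, for illustration only). */
    interface FileSystemHandle {}

    /** Stand-in for Oozie's ServiceException (assumption, for illustration only). */
    static class ServiceException extends Exception {
        ServiceException(String message, Throwable cause) {
            super(message, cause);
        }
    }

    private FileSystemHandle fs;
    private boolean shareLibCached;

    void init(Callable<FileSystemHandle> fsFactory, Runnable shareLibLoader, boolean failOnFailure)
            throws ServiceException {
        // Step 1: create the FileSystem in its own try-catch. A failure here is a
        // Hadoop problem, so it always fails init, regardless of failOnFailure.
        try {
            fs = fsFactory.call();
        } catch (Exception e) {
            throw new ServiceException("Could not create the FileSystem; check the Hadoop configuration", e);
        }
        if (fs == null) {
            throw new ServiceException("FileSystem factory returned null", null);
        }

        // Step 2: only sharelib failures are subject to failOnFailure.
        try {
            shareLibLoader.run();
            shareLibCached = true;
        } catch (RuntimeException e) {
            if (failOnFailure) {
                throw new ServiceException("Sharelib initialization fails", e);
            }
            // Log and continue: fs is guaranteed to be non-null at this point.
            System.err.println("Not able to cache sharelib: " + e.getMessage());
        }
    }

    FileSystemHandle getFileSystem() {
        return fs;
    }

    boolean isShareLibCached() {
        return shareLibCached;
    }

    public static void main(String[] args) {
        ShareLibInitSketch service = new ShareLibInitSketch();
        try {
            // Simulate a missing Hadoop configuration with failOnFailure=false.
            service.init(() -> { throw new IllegalStateException("no Hadoop configuration found"); },
                    () -> { }, false);
        } catch (ServiceException e) {
            System.out.println("init failed fast: " + e.getMessage());
        }
    }
}
```

With this split, a broken Hadoop configuration surfaces at startup even when failOnFailure is false, while a missing sharelib is still only logged, which matches the original intention described in the first item.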



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)