Posted to dev@oozie.apache.org by "Robert Kanter (JIRA)" <ji...@apache.org> on 2015/07/17 20:46:05 UTC

[jira] [Created] (OOZIE-2310) If the Hadoop configuration is not configured, you get a NullPointerException on job submission

Robert Kanter created OOZIE-2310:
------------------------------------

             Summary: If the Hadoop configuration is not configured, you get a NullPointerException on job submission
                 Key: OOZIE-2310
                 URL: https://issues.apache.org/jira/browse/OOZIE-2310
             Project: Oozie
          Issue Type: Bug
          Components: core
    Affects Versions: 4.1.0
            Reporter: Robert Kanter
            Priority: Blocker


A user reported an NPE on startup here:
http://mail-archives.apache.org/mod_mbox/oozie-user/201507.mbox/%3cCALBGZ8oZ0GZ+hf76nQYKxiATHH5g2gbQ_0sQ78uQv_=r4Hct=Q@mail.gmail.com%3e

I did some digging, and the problem is that Oozie is trying to load the ShareLib, but the {{FileSystem}} class variable is {{null}} because the {{ShareLibService}} wasn't able to create it on {{init}}.  That would normally cause Oozie to fail on startup, but the default value of {{oozie.service.ShareLibService.fail.fast.on.startup}} is {{false}}, so the failure gets ignored.

The code in question is this:
{code:java}
        try {
            fs = FileSystem.get(has.createJobConf(uri.getAuthority()));
            //cache action key sharelib conf list
            cacheActionKeySharelibConfList();
            updateLauncherLib();
            updateShareLib();
        }
        catch (Throwable e) {
            if (failOnfailure) {
                LOG.error("Sharelib initialization fails", e);
                throw new ServiceException(ErrorCode.E0104, getClass().getName(), "Sharelib initialization fails. ", e);
            }
            else {
                // We don't want to actually fail init by throwing an Exception, so only create the ServiceException and
                // log it
                ServiceException se = new ServiceException(ErrorCode.E0104, getClass().getName(),
                        "Not able to cache sharelib. An Admin needs to install the sharelib with oozie-setup.sh and issue the "
                                + "'oozie admin' CLI command to update the sharelib", e);
                LOG.error(se);
            }
        }
{code}
where {{failOnfailure}} is {{false}} by default.  So, {{fs}} ends up being {{null}}, and if anything later tries to use it, you get an NPE.

I think we should do two things here:
# Creating the {{FileSystem}} should be in a separate try-catch so that {{failOnfailure}} doesn't affect it.  The original intention of that behavior was to ignore ShareLib failures, not Hadoop failures.
# We should improve the default Hadoop configuration (i.e. {{oozie.service.HadoopAccessorService.hadoop.configurations}}).  This has been a problem for a while now: out-of-the-box, Oozie doesn't work even for a local pseudo-cluster because of this config's default.  If that's not possible, we need to make it more obvious that users must configure this before doing anything.
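
To illustrate the first point, here is a rough, self-contained sketch of the split. It is not a patch against {{ShareLibService}}: {{ShareLibInitSketch}}, {{StartupException}}, {{fsFactory}} and {{shareLibLoader}} are hypothetical stand-ins for the real class, {{ServiceException}}, the {{FileSystem.get(...)}} call and the sharelib caching calls, so that it compiles without Hadoop on the classpath.

```java
import java.util.concurrent.Callable;

// Hypothetical stand-in for Oozie's ServiceException; not the real class.
class StartupException extends Exception {
    StartupException(String message, Throwable cause) {
        super(message, cause);
    }
}

// Hypothetical sketch of the proposed init flow; names do not match ShareLibService.
class ShareLibInitSketch {
    private final boolean failOnFailure;
    private Object fs;               // stands in for the FileSystem field
    private boolean shareLibCached;

    ShareLibInitSketch(boolean failOnFailure) {
        this.failOnFailure = failOnFailure;
    }

    void init(Callable<Object> fsFactory, Runnable shareLibLoader) throws StartupException {
        // Step 1: creating the file system always fails init, regardless of failOnFailure.
        try {
            fs = fsFactory.call();
        } catch (Exception e) {
            throw new StartupException("Could not create FileSystem; check the Hadoop configuration", e);
        }

        // Step 2: only sharelib problems are subject to the fail-fast flag.
        try {
            shareLibLoader.run();
            shareLibCached = true;
        } catch (RuntimeException e) {
            if (failOnFailure) {
                throw new StartupException("Sharelib initialization fails", e);
            }
            // Log and continue, as the current code does for all failures.
            System.err.println("Not able to cache sharelib: " + e.getMessage());
        }
    }

    boolean hasFileSystem() {
        return fs != null;
    }

    boolean isShareLibCached() {
        return shareLibCached;
    }
}
```

With this shape, a missing Hadoop configuration stops startup with a clear error even when the fail-fast flag is {{false}}, while a missing sharelib is still only logged.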


