Posted to dev@oozie.apache.org by "Robert Kanter (JIRA)" <ji...@apache.org> on 2015/07/17 20:46:05 UTC
[jira] [Created] (OOZIE-2310) If the Hadoop configuration is not
configured, you get a NullPointerException on job submission
Robert Kanter created OOZIE-2310:
------------------------------------
Summary: If the Hadoop configuration is not configured, you get a NullPointerException on job submission
Key: OOZIE-2310
URL: https://issues.apache.org/jira/browse/OOZIE-2310
Project: Oozie
Issue Type: Bug
Components: core
Affects Versions: 4.1.0
Reporter: Robert Kanter
Priority: Blocker
A user reported an NPE on startup here:
http://mail-archives.apache.org/mod_mbox/oozie-user/201507.mbox/%3cCALBGZ8oZ0GZ+hf76nQYKxiATHH5g2gbQ_0sQ78uQv_=r4Hct=Q@mail.gmail.com%3e
I did some digging, and the problem is that Oozie is trying to load the ShareLib, but the {{FileSystem}} class variable is {{null}} because the {{ShareLibService}} wasn't able to create it on {{init}}. That would normally cause Oozie to fail on startup, but the default value of {{oozie.service.ShareLibService.fail.fast.on.startup}} is {{false}}, so the failure gets ignored.
The code in question is this:
{code:java}
try {
    fs = FileSystem.get(has.createJobConf(uri.getAuthority()));
    //cache action key sharelib conf list
    cacheActionKeySharelibConfList();
    updateLauncherLib();
    updateShareLib();
}
catch (Throwable e) {
    if (failOnfailure) {
        LOG.error("Sharelib initialization fails", e);
        throw new ServiceException(ErrorCode.E0104, getClass().getName(), "Sharelib initialization fails. ", e);
    }
    else {
        // We don't want to actually fail init by throwing an Exception, so only create the ServiceException and
        // log it
        ServiceException se = new ServiceException(ErrorCode.E0104, getClass().getName(),
                "Not able to cache sharelib. An Admin needs to install the sharelib with oozie-setup.sh and issue the "
                        + "'oozie admin' CLI command to update the sharelib", e);
        LOG.error(se);
    }
}
{code}
where {{failOnfailure}} is {{false}} by default. So, {{fs}} ends up being {{null}}, and if anything later tries to use it, you get an NPE.
I think we should do two things here:
# Creating the {{FileSystem}} should be in a separate try-catch so that {{failOnfailure}} doesn't affect it. The original intention of that behavior was to ignore ShareLib failures, not Hadoop failures.
# We should improve the default Hadoop configuration (i.e. {{oozie.service.HadoopAccessorService.hadoop.configurations}}). This has been a problem for a while now: out of the box, Oozie doesn't work even for a local pseudo-cluster because of this config's default. If that's not possible, we need to make it more obvious that users must configure this before doing anything.
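
A rough sketch of what the first suggestion could look like follows. It is self-contained and does not use the real Oozie or Hadoop classes: {{FileSystemHandle}}, {{FileSystemFactory}}, {{ServiceInitException}} and {{ShareLibInitSketch}} are hypothetical stand-ins for {{FileSystem}}, the {{FileSystem.get(...)}} call, {{ServiceException}} and {{ShareLibService}}. It is only meant to illustrate the control flow, not to be the actual patch.

```java
import java.io.IOException;

// Hypothetical stand-in for ShareLibService; not the real Oozie class.
class ShareLibInitSketch {

    /** Hypothetical stand-in for org.apache.hadoop.fs.FileSystem. */
    interface FileSystemHandle {}

    /** Hypothetical stand-in for FileSystem.get(has.createJobConf(...)). */
    interface FileSystemFactory {
        FileSystemHandle create() throws IOException;
    }

    /** Hypothetical stand-in for Oozie's ServiceException. */
    static class ServiceInitException extends Exception {
        ServiceInitException(String message, Throwable cause) {
            super(message, cause);
        }
    }

    private final boolean failOnFailure;
    private FileSystemHandle fs;

    ShareLibInitSketch(boolean failOnFailure) {
        this.failOnFailure = failOnFailure;
    }

    void init(FileSystemFactory factory, Runnable shareLibLoader) throws ServiceInitException {
        // Step 1: creating the FileSystem is a Hadoop concern, so a failure here
        // is always fatal, regardless of failOnFailure.
        try {
            fs = factory.create();
        }
        catch (Exception e) {
            throw new ServiceInitException("Could not create the FileSystem; check the Hadoop configuration", e);
        }

        // Step 2: only ShareLib loading failures honor failOnFailure.
        try {
            shareLibLoader.run();
        }
        catch (RuntimeException e) {
            if (failOnFailure) {
                throw new ServiceInitException("Sharelib initialization fails", e);
            }
            // Log and continue; fs is guaranteed to be non-null at this point.
            System.err.println("Not able to cache sharelib: " + e.getMessage());
        }
    }

    boolean hasFileSystem() {
        return fs != null;
    }

    public static void main(String[] args) throws Exception {
        ShareLibInitSketch service = new ShareLibInitSketch(false);
        // The ShareLib load fails, but init still succeeds and fs is set.
        service.init(() -> new FileSystemHandle() {},
                () -> { throw new IllegalStateException("sharelib not installed"); });
        System.out.println("FileSystem available: " + service.hasFileSystem());
    }
}
```

With this split, a missing ShareLib is still tolerated when {{failOnFailure}} is {{false}}, while a broken Hadoop configuration fails during {{init}} instead of surfacing later as an NPE.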