Posted to dev@oozie.apache.org by Srivastava Rachna - rasriv <Ra...@acxiom.com> on 2014/10/09 15:32:44 UTC
What are the causes of Oozie failure at the Prep State
Hi,
I am trying to test a simple MapReduce sample workflow, but the action is stuck in the PREP state. When I run the same MapReduce program outside Oozie, it works fine. No dashboard logs are generated, and I do not see any errors.
Excerpt from oozie-cmf-oozie1-OOZIE_SERVER-localhost.localdomain.log.out.
2014-10-09 06:16:26,678 WARN org.apache.hadoop.security.authentication.server.AuthenticationFilter: AuthenticationToken ignored: AuthenticationToken expired
2014-10-09 06:16:58,136 INFO org.apache.oozie.service.CoordMaterializeTriggerService$CoordMaterializeTriggerRunnable: USER[-] GROUP[-] TOKEN[-] APP[-] JOB[-] ACTION[-] CoordMaterializeTriggerService - Curr Date= Thu Oct 09 06:21:58 PDT 2014, Num jobs to materialize = 0
2014-10-09 06:16:59,876 INFO org.apache.oozie.service.StatusTransitService$StatusTransitRunnable: USER[-] GROUP[-] Acquired lock for [org.apache.oozie.service.StatusTransitService]
2014-10-09 06:16:59,876 INFO org.apache.oozie.service.StatusTransitService$StatusTransitRunnable: USER[-] GROUP[-] Running coordinator status service from last instance time = 2014-10-08T21:30Z
2014-10-09 06:16:59,879 INFO org.apache.oozie.service.StatusTransitService$StatusTransitRunnable: USER[-] GROUP[-] Running bundle status service from last instance time = 2014-10-08T21:30Z
2014-10-09 06:16:59,880 INFO org.apache.oozie.service.StatusTransitService$StatusTransitRunnable: USER[-] GROUP[-] Released lock for [org.apache.oozie.service.StatusTransitService]
2014-10-09 06:17:02,577 INFO org.apache.oozie.service.PauseTransitService: USER[-] GROUP[-] Acquired lock for [org.apache.oozie.service.PauseTransitService]
2014-10-09 06:17:02,584 INFO org.apache.oozie.service.PauseTransitService: USER[-] GROUP[-] Released lock for [org.apache.oozie.service.PauseTransitService]
2014-10-09 06:17:59,880 INFO org.apache.oozie.service.StatusTransitService$StatusTransitRunnable: USER[-] GROUP[-] Acquired lock for [org.apache.oozie.service.StatusTransitService]
2014-10-09 06:17:59,881 INFO org.apache.oozie.service.StatusTransitService$StatusTransitRunnable: USER[-] GROUP[-] Running coordinator status service from last instance time = 2014-10-09T13:16Z
2014-10-09 06:17:59,883 INFO org.apache.oozie.service.StatusTransitService$StatusTransitRunnable: USER[-] GROUP[-] Running bundle status service from last instance time = 2014-10-09T13:16Z
2014-10-09 06:17:59,885 INFO org.apache.oozie.service.StatusTransitService$StatusTransitRunnable: USER[-] GROUP[-] Released lock for [org.apache.oozie.service.StatusTransitService]
2014-10-09 06:18:02,584 INFO org.apache.oozie.service.PauseTransitService: USER[-] GROUP[-] Acquired lock for [org.apache.oozie.service.PauseTransitService]
2014-10-09 06:18:02,594 INFO org.apache.oozie.service.PauseTransitService: USER[-] GROUP[-] Released lock for [org.apache.oozie.service.PauseTransitService]
2014-10-09 06:18:59,885 INFO org.apache.oozie.service.StatusTransitService$StatusTransitRunnable: USER[-] GROUP[-] Acquired lock for [org.apache.oozie.service.StatusTransitService]
2014-10-09 06:18:59,885 INFO org.apache.oozie.service.StatusTransitService$StatusTransitRunnable: USER[-] GROUP[-] Running coordinator status service from last instance time = 2014-10-09T13:17Z
2014-10-09 06:18:59,887 INFO org.apache.oozie.service.StatusTransitService$StatusTransitRunnable: USER[-] GROUP[-] Running bundle status service from last instance time = 2014-10-09T13:17Z
2014-10-09 06:18:59,889 INFO org.apache.oozie.service.StatusTransitService$StatusTransitRunnable: USER[-] GROUP[-] Released lock for [org.apache.oozie.service.StatusTransitService]
Catalina server has started
Oct 8, 2014 8:01:17 AM org.apache.coyote.http11.Http11Protocol start
INFO: Starting Coyote HTTP/1.1 on http-11000
Oct 8, 2014 8:01:17 AM org.apache.catalina.startup.Catalina start
INFO: Server startup in 12244 ms
Job.properties
[cloudera@localhost oozieProject]$ cat job.properties
nameNode=hdfs\://localhost.localdomain\:8020
jobTracker=localhost.localdomain\:8021
queueName=default
oozie.use.system.libpath=true
oozieProjectRoot=${nameNode}/user/${user.name}/oozieProject
oozie.wf.application.path=${oozieProjectRoot}/
outputDir=oozieProject
workflow.xml
[cloudera@localhost oozieProject]$ cat workflow.xml
<workflow-app xmlns="uri:oozie:workflow:0.2" name="java-main-wf">
<start to="java-node-one"/>
<action name="java-node-one">
<java>
<job-tracker>${jobTracker}</job-tracker>
<name-node>${nameNode}</name-node>
<configuration>
<property>
<name>mapred.job.queue.name</name>
<value>${queueName}</value>
</property>
</configuration>
<main-class>com.acxiom.oozieproject.ChangeCase</main-class>
</java>
<ok to="end"/>
<error to="fail"/>
</action>
<kill name="fail">
<message>Java failed, error message[${wf:errorMessage(wf:lastErrorNode())}]</message>
</kill>
<end name="end"/>
Command to invoke oozie
hadoop jar oozieProject/lib/LogEventCount.jar sample.LogEventCount oozieProject/input/testdata oozieProject/output
The only errors I could find under /var/log/oozie are these:
[cloudera@localhost oozie]$ grep ERROR *
oozie-audit.log:2014-10-08 06:54:13,005 INFO oozieaudit:539 - USER [cloudera], GROUP [null], APP [java-main-wf], JOBID [0000000-141007184507168-oozie-oozi-W], OPERATION [kill], PARAMETER [0000000-141007184507168-oozie-oozi-W], STATUS [SUCCESS], HTTPCODE [200], ERRORCODE [null], ERRORMESSAGE [null]
oozie-audit.log:2014-10-08 07:04:32,688 INFO oozieaudit:539 - USER [cloudera], GROUP [null], APP [pig-app-hue-script], JOBID [0000002-141007184507168-oozie-oozi-W], OPERATION [start], PARAMETER [0000002-141007184507168-oozie-oozi-W], STATUS [SUCCESS], HTTPCODE [200], ERRORCODE [null], ERRORMESSAGE [null]
oozie-audit.log:2014-10-08 07:09:51,035 INFO oozieaudit:539 - USER [cloudera], GROUP [null], APP [java-main-wf], JOBID [0000001-141007184507168-oozie-oozi-W], OPERATION [kill], PARAMETER [0000001-141007184507168-oozie-oozi-W], STATUS [SUCCESS], HTTPCODE [200], ERRORCODE [null], ERRORMESSAGE [null]
oozie-audit.log:2014-10-08 07:42:21,419 INFO oozieaudit:539 - USER [cloudera], GROUP [null], APP [WorkflowJavaMainAction], JOBID [0000003-141007184507168-oozie-oozi-W], OPERATION [kill], PARAMETER [0000003-141007184507168-oozie-oozi-W], STATUS [SUCCESS], HTTPCODE [200], ERRORCODE [null], ERRORMESSAGE [null]
oozie-audit.log:2014-10-08 07:58:56,599 INFO oozieaudit:539 - USER [cloudera], GROUP [null], APP [WorkflowJavaMainAction], JOBID [0000004-141007184507168-oozie-oozi-W], OPERATION [kill], PARAMETER [0000004-141007184507168-oozie-oozi-W], STATUS [SUCCESS], HTTPCODE [200], ERRORCODE [null], ERRORMESSAGE [null]
oozie-audit.log:2014-10-08 08:58:40,822 INFO oozieaudit:539 - USER [cloudera], GROUP [null], APP [WorkflowJavaMainAction], JOBID [0000000-141008080106562-oozie-oozi-W], OPERATION [kill], PARAMETER [0000000-141008080106562-oozie-oozi-W], STATUS [SUCCESS], HTTPCODE [200], ERRORCODE [null], ERRORMESSAGE [null]
oozie-audit.log:2014-10-08 13:14:53,245 INFO oozieaudit:539 - USER [cloudera], GROUP [null], APP [WorkFlowJavaMapReduceAction], JOBID [0000001-141008080106562-oozie-oozi-W], OPERATION [kill], PARAMETER [0000001-141008080106562-oozie-oozi-W], STATUS [SUCCESS], HTTPCODE [200], ERRORCODE [null], ERRORMESSAGE [null]
oozie-audit.log:2014-10-08 13:29:41,458 INFO oozieaudit:539 - USER [cloudera], GROUP [null], APP [WorkFlowJavaMapReduceAction], JOBID [0000002-141008080106562-oozie-oozi-W], OPERATION [kill], PARAMETER [0000002-141008080106562-oozie-oozi-W], STATUS [SUCCESS], HTTPCODE [200], ERRORCODE [null], ERRORMESSAGE [null]
oozie-audit.log:2014-10-08 14:08:39,993 INFO oozieaudit:539 - USER [cloudera], GROUP [null], APP [null], JOBID [null], OPERATION [start], PARAMETER [null], STATUS [FAILED], HTTPCODE [401], ERRORCODE [E0901], ERRORMESSAGE [E0901: Namenode [debian:8020] not allowed, not in Oozies whitelist]
oozie-audit.log:2014-10-08 14:11:00,746 INFO oozieaudit:539 - USER [cloudera], GROUP [null], APP [null], JOBID [null], OPERATION [start], PARAMETER [null], STATUS [FAILED], HTTPCODE [401], ERRORCODE [E0504], ERRORMESSAGE [E0504: App directory [hdfs://localhost.localdomain:8020/workflows/oozie-examples] does not exist]
oozie-cmf-oozie1-OOZIE_SERVER-localhost.localdomain.log.out.2014-10-08-07:2014-10-08 07:05:01,549 WARN org.apache.oozie.action.hadoop.PigActionExecutor: USER[cloudera] GROUP[-] TOKEN[] APP[pig-app-hue-script] JOB[0000002-141007184507168-oozie-oozi-W] ACTION[0000002-141007184507168-oozie-oozi-W@pig] Launcher ERROR, reason: Main class [org.apache.oozie.action.hadoop.PigMain], exit code [2]
oozie-cmf-oozie1-OOZIE_SERVER-localhost.localdomain.log.out.2014-10-08-07:2014-10-08 07:05:01,627 INFO org.apache.oozie.command.wf.ActionEndXCommand: USER[cloudera] GROUP[-] TOKEN[] APP[pig-app-hue-script] JOB[0000002-141007184507168-oozie-oozi-W] ACTION[0000002-141007184507168-oozie-oozi-W@pig] ERROR is considered as FAILED for SLA
[cloudera@localhost oozie]$
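Most of the matches above are hits on the ERRORCODE and ERRORMESSAGE field names rather than real errors. As a rough sketch only, assuming the audit lines always carry `STATUS [...]` and `ERRORCODE [...]` fields in the form shown above, a narrower filter could look like this (the function name `failed_audit_entries` is made up for illustration and is not part of Oozie):

```python
import re

# Field patterns taken from the oozie-audit.log lines pasted above.
STATUS = re.compile(r"STATUS \[(\w+)\]")
ERRORCODE = re.compile(r"ERRORCODE \[(\w+)\]")


def failed_audit_entries(lines):
    """Yield (error_code, line) for audit lines whose STATUS is not SUCCESS.

    Lines without a STATUS field (for example server log lines) are skipped.
    """
    for line in lines:
        status = STATUS.search(line)
        if status is None or status.group(1) == "SUCCESS":
            continue
        code = ERRORCODE.search(line)
        yield (code.group(1) if code else None, line.rstrip("\n"))
```

Run over the grep output above, this would keep only the two `STATUS [FAILED]` entries (E0901 and E0504). The Pig launcher lines from the server log have no STATUS field, so they would need to be checked separately.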
Thanks for your input.
Rachana