Posted to dev@oozie.apache.org by "pavan kumar kolamuri (JIRA)" <ji...@apache.org> on 2015/07/13 08:35:04 UTC

[jira] [Commented] (OOZIE-2253) Spark Job is failing when it is running in standalone server

    [ https://issues.apache.org/jira/browse/OOZIE-2253?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14624264#comment-14624264 ] 

pavan kumar kolamuri commented on OOZIE-2253:
---------------------------------------------

[~rkanter] [~shwethags] Please review and merge this. I have addressed the comments from shwetha.

> Spark Job is failing when it is running in standalone server
> ------------------------------------------------------------
>
>                 Key: OOZIE-2253
>                 URL: https://issues.apache.org/jira/browse/OOZIE-2253
>             Project: Oozie
>          Issue Type: Bug
>            Reporter: pavan kumar kolamuri
>            Assignee: pavan kumar kolamuri
>         Attachments: OOZIE-2253-v1.patch, OOZIE-2253.patch
>
>
> When a Spark job runs on a Spark standalone cluster, the job is launched and succeeds, but new jobs then keep getting launched in the Spark cluster indefinitely. The Oozie workflow stays in the running state forever because Spark keeps relaunching the job.
> This might be because Spark always calls System.exit(0) when the job succeeds, and LauncherSecurityManager throws an exception for that call. It looks like Spark (through the Akka framework) catches that exception and launches one more attempt of the same job, and this repeats indefinitely.
> {noformat}
> Sending launch command to spark://inmobi-Precision-T3610:7077
> Driver successfully submitted as driver-20150526105806-0000
> ... waiting before polling master for driver state
> ... polling master for driver state
> State of driver-20150526105806-0000 is SUBMITTED
> Sending launch command to spark://inmobi-Precision-T3610:7077
> Driver successfully submitted as driver-20150526105811-0001
> ... waiting before polling master for driver state
> ... polling master for driver state
> State of driver-20150526105811-0001 is SUBMITTED
> Sending launch command to spark://inmobi-Precision-T3610:7077
> Driver successfully submitted as driver-20150526105816-0002
> ... waiting before polling master for driver state
> ... polling master for driver state
> State of driver-20150526105816-0002 is SUBMITTED
> Sending launch command to spark://inmobi-Precision-T3610:7077
> Driver successfully submitted as driver-20150526105821-0003
> ... waiting before polling master for driver state
> ... polling master for driver state
> State of driver-20150526105821-0003 is SUBMITTED
> Sending launch command to spark://inmobi-Precision-T3610:7077
> Driver successfully submitted as driver-20150526105826-0004
> ... waiting before polling master for driver state
> {noformat}
> {noformat}
> 2015-05-26 10:58:11,573 ERROR [driverClient-akka.actor.default-dispatcher-4] akka.actor.OneForOneStrategy: Intercepted System.exit(0)
> java.lang.SecurityException: Intercepted System.exit(0)
> 	at org.apache.oozie.action.hadoop.LauncherSecurityManager.checkExit(LauncherMapper.java:601)
> 	at java.lang.Runtime.exit(Runtime.java:107)
> 	at java.lang.System.exit(System.java:962)
> 	at org.apache.spark.deploy.ClientActor.pollAndReportStatus(Client.scala:115)
> 	at org.apache.spark.deploy.ClientActor$$anonfun$receiveWithLogging$1.applyOrElse(Client.scala:123)
> 	at scala.runtime.AbstractPartialFunction$mcVL$sp.apply$mcVL$sp(AbstractPartialFunction.scala:33)
> 	at scala.runtime.AbstractPartialFunction$mcVL$sp.apply(AbstractPartialFunction.scala:33)
> 	at scala.runtime.AbstractPartialFunction$mcVL$sp.apply(AbstractPartialFunction.scala:25)
> 	at org.apache.spark.util.ActorLogReceive$$anon$1.apply(ActorLogReceive.scala:53)
> 	at org.apache.spark.util.ActorLogReceive$$anon$1.apply(ActorLogReceive.scala:42)
> 	at scala.PartialFunction$class.applyOrElse(PartialFunction.scala:118)
> 	at org.apache.spark.util.ActorLogReceive$$anon$1.applyOrElse(ActorLogReceive.scala:42)
> 	at akka.actor.ActorCell.receiveMessage(ActorCell.scala:498)
> 	at akka.actor.ActorCell.invoke(ActorCell.scala:456)
> 	at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:237)
> 	at akka.dispatch.Mailbox.run(Mailbox.scala:219)
> 	at akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:386)
> 	at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
> 	at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
> 	at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
> 	at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
> {noformat}
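
The suspected loop described above can be sketched in plain Java. This is only an illustration of the hypothesis in the description, not code from Oozie, Spark, or Akka: the class and method names (ExitInterceptSketch, interceptedExit, submitAndExit, runSupervised) are made up, the exit interception is simulated by throwing SecurityException directly rather than by installing a real SecurityManager, and the restart-on-exception behaviour is assumed. The restart count is bounded so that the sketch terminates.

```java
import java.util.concurrent.atomic.AtomicInteger;

public class ExitInterceptSketch {

    /** Hypothetical stand-in for an exit check that refuses to let the JVM exit. */
    static void interceptedExit(int status) {
        throw new SecurityException("Intercepted System.exit(" + status + ")");
    }

    /** Hypothetical stand-in for the client: submit a driver, then exit on success. */
    static void submitAndExit(AtomicInteger submissions) {
        submissions.incrementAndGet(); // corresponds to "Sending launch command ..."
        interceptedExit(0);            // the success path ends in an exit call, which throws
    }

    /**
     * Hypothetical stand-in for a supervisor that restarts the task whenever it
     * throws. Returns the number of submissions made. The real behaviour in the
     * report is unbounded; maxRestarts exists only so this sketch terminates.
     */
    static int runSupervised(int maxRestarts) {
        AtomicInteger submissions = new AtomicInteger();
        for (int attempt = 0; attempt <= maxRestarts; attempt++) {
            try {
                submitAndExit(submissions);
                break; // never reached, because the exit call always throws
            } catch (SecurityException e) {
                // The supervisor treats the exception as a failure and restarts,
                // which submits the same job again.
            }
        }
        return submissions.get();
    }

    public static void main(String[] args) {
        // One initial attempt plus four restarts gives five submissions.
        System.out.println("submissions: " + runSupervised(4)); // prints "submissions: 5"
    }
}
```

Under these assumptions, every successful run is reported to the supervisor as a failure, so the job is resubmitted on each restart. That would match the repeated "Sending launch command" lines in the log above.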



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)