Posted to issues@spark.apache.org by "Xu Chen (JIRA)" <ji...@apache.org> on 2015/12/08 07:15:10 UTC

[jira] [Commented] (SPARK-11487) Spark Master shutdown automatically after some applications execution

    [ https://issues.apache.org/jira/browse/SPARK-11487?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15046438#comment-15046438 ] 

Xu Chen commented on SPARK-11487:
---------------------------------


{code}
java.lang.OutOfMemoryError: Java heap space 
{code}

Increase the Master's heap memory.
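
As a minimal sketch of one way to do that in standalone mode: the heap of the Master daemon is controlled by the SPARK_DAEMON_MEMORY environment variable (default 1g), usually set in conf/spark-env.sh. The 4g value below is an assumption for illustration only, not a figure from this report. The stack trace shows the heap being exhausted while the Master replays an application's event log to rebuild its UI, so the right value likely depends on how large those logs are.

```shell
# conf/spark-env.sh (sourced by the standalone start scripts)
# Heap size for the standalone Master and Worker daemons themselves;
# this does not change executor or driver memory.
# 4g is an illustrative value, not taken from the report.
export SPARK_DAEMON_MEMORY=4g
```

The Master has to be restarted to pick up the new value, for example with sbin/stop-master.sh followed by sbin/start-master.sh. Note that the same variable also applies to Worker daemons started with this spark-env.sh.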


> Spark Master shutdown automatically after some applications execution
> ---------------------------------------------------------------------
>
>                 Key: SPARK-11487
>                 URL: https://issues.apache.org/jira/browse/SPARK-11487
>             Project: Spark
>          Issue Type: Bug
>          Components: Spark Core
>    Affects Versions: 1.5.0
>         Environment: Spark Standalone on CentOS 6.6,
> one Master and 5 worker nodes (each node: > 150 GB memory, 72 cores)
>            Reporter: Sandeep Pal
>              Labels: master
>
> The Master logs after the automatic Spark shutdown are as follows:
> 15/11/02 20:50:01 INFO master.Master: Registering app PythonWordCount
> 15/11/02 20:50:01 INFO master.Master: Registered app PythonWordCount with ID app-20151102205001-0025
> 15/11/02 20:50:01 INFO master.Master: Launching executor app-20151102205001-0025/0 on worker worker-20151030135450-x.x.x.76-42502
> 15/11/02 20:50:01 INFO master.Master: Launching executor app-20151102205001-0025/1 on worker worker-20151030135450-x.x.x.86-51916
> 15/11/02 20:50:01 INFO master.Master: Launching executor app-20151102205001-0025/2 on worker worker-20151030135450-x.x.x.85-47388
> 15/11/02 20:50:01 INFO master.Master: Launching executor app-20151102205001-0025/3 on worker worker-20151030125450-x.x.x.69-51604
> 15/11/02 20:50:01 INFO master.Master: Launching executor app-20151102205001-0025/4 on worker worker-20151030135450-x.x.x.87-35705
> 15/11/02 20:57:35 INFO master.Master: Received unregister request from application app-20151102205001-0025
> 15/11/02 20:57:35 INFO master.Master: Removing app app-20151102205001-0025
> 15/11/02 20:57:35 WARN master.Master: Application PythonWordCount is still in progress, it may be terminated abnormally.
> 15/11/02 20:57:35 INFO spark.SecurityManager: Changing view acls to: root
> 15/11/02 20:57:35 INFO spark.SecurityManager: Changing modify acls to: root
> 15/11/02 20:57:35 INFO spark.SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(root); users with modify permissions: Set(root)
> 15/11/02 20:57:43 INFO master.Master: x.x.x.x:47502 got disassociated, removing it.
> 15/11/02 20:57:43 WARN master.Master: Got status update for unknown executor app-20151102205001-0025/4
> 15/11/02 20:57:43 WARN master.Master: Got status update for unknown executor app-20151102205001-0025/3
> 15/11/02 20:57:43 WARN master.Master: Got status update for unknown executor app-20151102205001-0025/0
> 15/11/02 20:57:43 WARN master.Master: Got status update for unknown executor app-20151102205001-0025/2
> 15/11/02 20:57:43 WARN master.Master: Got status update for unknown executor app-20151102205001-0025/1
> 15/11/02 20:58:28 INFO master.Master: Registering app App Test
> 15/11/02 20:58:28 INFO master.Master: Registered app App Test with ID app-20151102205828-0026
> 15/11/02 20:58:28 INFO master.Master: Launching executor app-20151102205828-0026/0 on worker worker-20151030135450-x.x.x.76-42502
> 15/11/02 20:58:28 INFO master.Master: Launching executor app-20151102205828-0026/1 on worker worker-20151030135450-x.x.x.86-51916
> 15/11/02 20:58:28 INFO master.Master: Launching executor app-20151102205828-0026/2 on worker worker-20151030135450-x.x.x.85-47388
> 15/11/02 20:58:28 INFO master.Master: Launching executor app-20151102205828-0026/3 on worker worker-20151030125450-x.x.x.69-51604
> 15/11/02 20:58:28 INFO master.Master: Launching executor app-20151102205828-0026/4 on worker worker-20151030135450-x.x.x.87-35705
> 15/11/02 20:59:35 INFO master.Master: Received unregister request from application app-20151102205828-0026
> 15/11/02 20:59:35 INFO master.Master: Removing app app-20151102205828-0026
> 15/11/02 20:59:35 WARN master.Master: Application App Test is still in progress, it may be terminated abnormally.
> 15/11/02 20:59:35 INFO spark.SecurityManager: Changing view acls to: root
> 15/11/02 20:59:35 INFO spark.SecurityManager: Changing modify acls to: root
> 15/11/02 20:59:35 INFO spark.SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(root); users with modify permissions: Set(root)
> 15/11/02 21:17:46 INFO master.Master: x.x.x.x:40954 got disassociated, removing it.
> 15/11/02 21:17:46 WARN master.Master: Got status update for unknown executor app-20151102205828-0026/3
> 15/11/02 21:17:46 WARN master.Master: Got status update for unknown executor app-20151102205828-0026/1
> 15/11/02 21:17:46 WARN master.Master: Got status update for unknown executor app-20151102205828-0026/0
> 15/11/02 21:17:46 WARN master.Master: Got status update for unknown executor app-20151102205828-0026/2
> 15/11/02 21:17:46 WARN master.Master: Got status update for unknown executor app-20151102205828-0026/4
> 15/11/02 21:17:46 INFO master.Master: x.x.x.x:37676 got disassociated, removing it.
> 15/11/02 21:17:48 ERROR akka.ErrorMonitor: Uncaught fatal error from thread [sparkMaster-akka.actor.default-dispatcher-3] shutting down ActorSystem [sparkMaster]
> java.lang.OutOfMemoryError: Java heap space
>         at com.fasterxml.jackson.core.util.BufferRecycler.calloc(BufferRecycler.java:156)
>         at com.fasterxml.jackson.core.util.BufferRecycler.allocCharBuffer(BufferRecycler.java:124)
>         at com.fasterxml.jackson.core.io.IOContext.allocTokenBuffer(IOContext.java:181)
>         at com.fasterxml.jackson.core.JsonFactory.createParser(JsonFactory.java:830)
>         at com.fasterxml.jackson.databind.ObjectMapper.readValue(ObjectMapper.java:2161)
>         at org.json4s.jackson.JsonMethods$class.parse(JsonMethods.scala:19)
>         at org.json4s.jackson.JsonMethods$.parse(JsonMethods.scala:44)
>         at org.apache.spark.scheduler.ReplayListenerBus.replay(ReplayListenerBus.scala:58)
>         at org.apache.spark.deploy.master.Master.rebuildSparkUI(Master.scala:950)
>         at org.apache.spark.deploy.master.Master.removeApplication(Master.scala:812)
>         at org.apache.spark.deploy.master.Master.org$apache$spark$deploy$master$Master$$finishApplication(Master.scala:790)
>         at org.apache.spark.deploy.master.Master$$anonfun$receive$1$$anonfun$applyOrElse$21.apply(Master.scala:382)
>         at org.apache.spark.deploy.master.Master$$anonfun$receive$1$$anonfun$applyOrElse$21.apply(Master.scala:382)
>         at scala.Option.foreach(Option.scala:236)
>         at org.apache.spark.deploy.master.Master$$anonfun$receive$1.applyOrElse(Master.scala:382)
>         at org.apache.spark.rpc.akka.AkkaRpcEnv.org$apache$spark$rpc$akka$AkkaRpcEnv$$processMessage(AkkaRpcEnv.scala:177)
>         at org.apache.spark.rpc.akka.AkkaRpcEnv$$anonfun$actorRef$lzycompute$1$1$$anon$1$$anonfun$receiveWithLogging$1$$anonfun$applyOrElse$4.apply$mcV$sp(AkkaRpcEnv.scala:126)
>         at org.apache.spark.rpc.akka.AkkaRpcEnv.org$apache$spark$rpc$akka$AkkaRpcEnv$$safelyCall(AkkaRpcEnv.scala:197)
>         at org.apache.spark.rpc.akka.AkkaRpcEnv$$anonfun$actorRef$lzycompute$1$1$$anon$1$$anonfun$receiveWithLogging$1.applyOrElse(AkkaRpcEnv.scala:125)
>         at scala.runtime.AbstractPartialFunction$mcVL$sp.apply$mcVL$sp(AbstractPartialFunction.scala:33)
>         at scala.runtime.AbstractPartialFunction$mcVL$sp.apply(AbstractPartialFunction.scala:33)
>         at scala.runtime.AbstractPartialFunction$mcVL$sp.apply(AbstractPartialFunction.scala:25)
>         at org.apache.spark.util.ActorLogReceive$$anon$1.apply(ActorLogReceive.scala:59)
>         at org.apache.spark.util.ActorLogReceive$$anon$1.apply(ActorLogReceive.scala:42)
>         at scala.PartialFunction$class.applyOrElse(PartialFunction.scala:118)
>         at org.apache.spark.util.ActorLogReceive$$anon$1.applyOrElse(ActorLogReceive.scala:42)
>         at akka.actor.Actor$class.aroundReceive(Actor.scala:467)
>         at org.apache.spark.rpc.akka.AkkaRpcEnv$$anonfun$actorRef$lzycompute$1$1$$anon$1.aroundReceive(AkkaRpcEnv.scala:92)
>         at akka.actor.ActorCell.receiveMessage(ActorCell.scala:516)
>         at akka.actor.ActorCell.invoke(ActorCell.scala:487)
>         at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:238)
>         at akka.dispatch.Mailbox.run(Mailbox.scala:220)
> 15/11/02 21:17:48 ERROR actor.ActorSystemImpl: Uncaught fatal error from thread [sparkMaster-akka.actor.default-dispatcher-3] shutting down ActorSystem [sparkMaster]
> java.lang.OutOfMemoryError: Java heap space
>         at com.fasterxml.jackson.core.util.BufferRecycler.calloc(BufferRecycler.java:156)
>         at com.fasterxml.jackson.core.util.BufferRecycler.allocCharBuffer(BufferRecycler.java:124)
>         at com.fasterxml.jackson.core.io.IOContext.allocTokenBuffer(IOContext.java:181)
>         at com.fasterxml.jackson.core.JsonFactory.createParser(JsonFactory.java:830)
>         at com.fasterxml.jackson.databind.ObjectMapper.readValue(ObjectMapper.java:2161)
>         at org.json4s.jackson.JsonMethods$class.parse(JsonMethods.scala:19)
>         at org.json4s.jackson.JsonMethods$.parse(JsonMethods.scala:44)
>         at org.apache.spark.scheduler.ReplayListenerBus.replay(ReplayListenerBus.scala:58)
>         at org.apache.spark.deploy.master.Master.rebuildSparkUI(Master.scala:950)
>         at org.apache.spark.deploy.master.Master.removeApplication(Master.scala:812)
>         at org.apache.spark.deploy.master.Master.org$apache$spark$deploy$master$Master$$finishApplication(Master.scala:790)
>         at org.apache.spark.deploy.master.Master$$anonfun$receive$1$$anonfun$applyOrElse$21.apply(Master.scala:382)
>         at org.apache.spark.deploy.master.Master$$anonfun$receive$1$$anonfun$applyOrElse$21.apply(Master.scala:382)
>         at scala.Option.foreach(Option.scala:236)
>         at org.apache.spark.deploy.master.Master$$anonfun$receive$1.applyOrElse(Master.scala:382)
>         at org.apache.spark.rpc.akka.AkkaRpcEnv.org$apache$spark$rpc$akka$AkkaRpcEnv$$processMessage(AkkaRpcEnv.scala:177)
>         at org.apache.spark.rpc.akka.AkkaRpcEnv$$anonfun$actorRef$lzycompute$1$1$$anon$1$$anonfun$receiveWithLogging$1$$anonfun$applyOrElse$4.apply$mcV$sp(AkkaRpcEnv.scala:126)
>         at org.apache.spark.rpc.akka.AkkaRpcEnv.org$apache$spark$rpc$akka$AkkaRpcEnv$$safelyCall(AkkaRpcEnv.scala:197)
>         at org.apache.spark.rpc.akka.AkkaRpcEnv$$anonfun$actorRef$lzycompute$1$1$$anon$1$$anonfun$receiveWithLogging$1.applyOrElse(AkkaRpcEnv.scala:125)
>         at scala.runtime.AbstractPartialFunction$mcVL$sp.apply$mcVL$sp(AbstractPartialFunction.scala:33)
>         at scala.runtime.AbstractPartialFunction$mcVL$sp.apply(AbstractPartialFunction.scala:33)
>         at scala.runtime.AbstractPartialFunction$mcVL$sp.apply(AbstractPartialFunction.scala:25)
>         at org.apache.spark.util.ActorLogReceive$$anon$1.apply(ActorLogReceive.scala:59)
>         at org.apache.spark.util.ActorLogReceive$$anon$1.apply(ActorLogReceive.scala:42)
>         at scala.PartialFunction$class.applyOrElse(PartialFunction.scala:118)
>         at org.apache.spark.util.ActorLogReceive$$anon$1.applyOrElse(ActorLogReceive.scala:42)
>         at akka.actor.Actor$class.aroundReceive(Actor.scala:467)
>         at org.apache.spark.rpc.akka.AkkaRpcEnv$$anonfun$actorRef$lzycompute$1$1$$anon$1.aroundReceive(AkkaRpcEnv.scala:92)
>         at akka.actor.ActorCell.receiveMessage(ActorCell.scala:516)
>         at akka.actor.ActorCell.invoke(ActorCell.scala:487)
>         at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:238)
>         at akka.dispatch.Mailbox.run(Mailbox.scala:220)


