Posted to issues@spark.apache.org by "Dan Osipov (JIRA)" <ji...@apache.org> on 2014/08/19 01:19:18 UTC

[jira] [Created] (SPARK-3113) Using local spark-submit with an EC2 cluster fails to execute job

Dan Osipov created SPARK-3113:
---------------------------------

             Summary: Using local spark-submit with an EC2 cluster fails to execute job
                 Key: SPARK-3113
                 URL: https://issues.apache.org/jira/browse/SPARK-3113
             Project: Spark
          Issue Type: Bug
    Affects Versions: 1.0.2
            Reporter: Dan Osipov


Steps taken:
* Start a new cluster using the EC2 script. Command:
{code}
./spark-ec2 -k keypairname -i ~/path/tokeypair.pem -s 2 launch sp-test
{code}
* SSH into the master and execute a job -> Success. Command:
{code}
/root/spark/bin/spark-submit --verbose --executor-memory 6G --master spark://ec2-174-129-92-3.compute-1.amazonaws.com:7077 --class JobApp --name Job /root/Job-1.0.0.jar s3n://input-bucket/logs/year=2014/month=6/day=21/* s3n://input-bucket/output/
{code}
* On a local build of Spark, execute the same command with the local path to the jar -> Failure:
{code}
./spark-submit --verbose --executor-memory 6G --master spark://ec2-174-129-92-3.compute-1.amazonaws.com:7077 --class JobApp --name Job ~/local/path/to/Job-1.0.0.jar s3n://input-bucket/logs/year=2014/month=6/day=21/* s3n://input-bucket/output/
{code}

Local output:
{code}
Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
14/08/18 16:04:22 INFO SecurityManager: Changing view acls to: daniil.osipov,
14/08/18 16:04:22 INFO SecurityManager: Changing modify acls to: daniil.osipov,
14/08/18 16:04:22 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(daniil.osipov, ); users with modify permissions: Set(daniil.osipov, )
14/08/18 16:04:22 INFO Slf4jLogger: Slf4jLogger started
14/08/18 16:04:22 INFO Remoting: Starting remoting
14/08/18 16:04:22 INFO Remoting: Remoting started; listening on addresses :[akka.tcp://spark@192.168.115.108:57076]
14/08/18 16:04:22 INFO Remoting: Remoting now listens on addresses: [akka.tcp://spark@192.168.115.108:57076]
14/08/18 16:04:22 INFO Utils: Successfully started service 'spark' on port 57076.
14/08/18 16:04:22 INFO SparkEnv: Registering MapOutputTracker
14/08/18 16:04:22 INFO SparkEnv: Registering BlockManagerMaster
14/08/18 16:04:22 INFO DiskBlockManager: Created local directory at /var/folders/cs/651p8b5x0pb4ytl7zsv2fb7r0000gp/T/spark-local-20140818160422-5a2d
14/08/18 16:04:22 INFO Utils: Successfully started service 'Connection manager for block manager' on port 57077.
14/08/18 16:04:22 INFO ConnectionManager: Bound socket to port 57077 with id = ConnectionManagerId(192.168.115.108,57077)
14/08/18 16:04:22 INFO MemoryStore: MemoryStore started with capacity 265.1 MB
14/08/18 16:04:22 INFO BlockManagerMaster: Trying to register BlockManager
14/08/18 16:04:22 INFO BlockManagerMasterActor: Registering block manager 192.168.115.108:57077 with 265.1 MB RAM
14/08/18 16:04:22 INFO BlockManagerMaster: Registered BlockManager
14/08/18 16:04:22 INFO HttpFileServer: HTTP File server directory is /var/folders/cs/651p8b5x0pb4ytl7zsv2fb7r0000gp/T/spark-1c5079ae-eb09-457d-bfc3-a7724fb15768
14/08/18 16:04:22 INFO HttpServer: Starting HTTP Server
14/08/18 16:04:23 INFO Utils: Successfully started service 'HTTP file server' on port 57078.
14/08/18 16:04:23 INFO Utils: Successfully started service 'SparkUI' on port 4040.
14/08/18 16:04:23 INFO SparkUI: Started SparkUI at http://192.168.115.108:4040
14/08/18 16:04:23 INFO SparkContext: Added JAR file:/.../path/to/target/scala-2.10/Job-1.0.0.jar at http://192.168.115.108:57078/jars/Job-1.0.0.jar with timestamp 1408403063777
14/08/18 16:04:23 INFO AppClient$ClientActor: Connecting to master spark://ec2-174-129-92-3.compute-1.amazonaws.com:7077...
14/08/18 16:04:23 INFO SparkDeploySchedulerBackend: SchedulerBackend is ready for scheduling beginning after reached minRegisteredResourcesRatio: 0.0
processing s3n://input-bucket/logs/year=2014/month=6/day=21/*
14/08/18 16:04:24 INFO MemoryStore: ensureFreeSpace(34006) called with curMem=0, maxMem=278019440
14/08/18 16:04:24 INFO MemoryStore: Block broadcast_0 stored as values in memory (estimated size 33.2 KB, free 265.1 MB)
14/08/18 16:04:24 INFO MemoryStore: ensureFreeSpace(56) called with curMem=34006, maxMem=278019440
14/08/18 16:04:24 INFO MemoryStore: Block broadcast_0_meta stored as values in memory (estimated size 56.0 B, free 265.1 MB)
14/08/18 16:04:24 INFO BlockManagerInfo: Added broadcast_0_meta in memory on 192.168.115.108:57077 (size: 56.0 B, free: 265.1 MB)
14/08/18 16:04:24 INFO BlockManagerMaster: Updated info of block broadcast_0_meta
14/08/18 16:04:24 INFO MemoryStore: ensureFreeSpace(3824) called with curMem=34062, maxMem=278019440
14/08/18 16:04:24 INFO MemoryStore: Block broadcast_0_piece0 stored as values in memory (estimated size 3.7 KB, free 265.1 MB)
14/08/18 16:04:24 INFO BlockManagerInfo: Added broadcast_0_piece0 in memory on 192.168.115.108:57077 (size: 3.7 KB, free: 265.1 MB)
14/08/18 16:04:24 INFO BlockManagerMaster: Updated info of block broadcast_0_piece0
14/08/18 16:04:24 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
14/08/18 16:04:24 WARN LoadSnappy: Snappy native library not loaded
14/08/18 16:04:30 INFO FileInputFormat: Total input paths to process : 144
14/08/18 16:04:33 INFO SparkContext: Starting job: saveAsTextFile at JobApp.scala:65
14/08/18 16:04:33 INFO DAGScheduler: Registering RDD 6 (groupBy at JobApp.scala:32)
14/08/18 16:04:33 INFO DAGScheduler: Registering RDD 10 (groupBy at JobApp.scala:48)
14/08/18 16:04:33 INFO DAGScheduler: Got job 0 (saveAsTextFile at JobApp.scala:65) with 288 output partitions (allowLocal=false)
14/08/18 16:04:33 INFO DAGScheduler: Final stage: Stage 0(saveAsTextFile at RecRateApp.scala:65)
14/08/18 16:04:33 INFO DAGScheduler: Parents of final stage: List(Stage 2)
14/08/18 16:04:33 INFO DAGScheduler: Missing parents: List(Stage 2)
14/08/18 16:04:33 INFO DAGScheduler: Submitting Stage 1 (MappedRDD[6] at groupBy at JobApp.scala:32), which has no missing parents
14/08/18 16:04:33 INFO MemoryStore: ensureFreeSpace(3888) called with curMem=37886, maxMem=278019440
14/08/18 16:04:33 INFO MemoryStore: Block broadcast_1 stored as values in memory (estimated size 3.8 KB, free 265.1 MB)
14/08/18 16:04:33 INFO MemoryStore: ensureFreeSpace(56) called with curMem=41774, maxMem=278019440
14/08/18 16:04:33 INFO MemoryStore: Block broadcast_1_meta stored as values in memory (estimated size 56.0 B, free 265.1 MB)
14/08/18 16:04:33 INFO BlockManagerInfo: Added broadcast_1_meta in memory on 192.168.115.108:57077 (size: 56.0 B, free: 265.1 MB)
14/08/18 16:04:33 INFO BlockManagerMaster: Updated info of block broadcast_1_meta
14/08/18 16:04:33 INFO MemoryStore: ensureFreeSpace(2376) called with curMem=41830, maxMem=278019440
14/08/18 16:04:33 INFO MemoryStore: Block broadcast_1_piece0 stored as values in memory (estimated size 2.3 KB, free 265.1 MB)
14/08/18 16:04:33 INFO BlockManagerInfo: Added broadcast_1_piece0 in memory on 192.168.115.108:57077 (size: 2.3 KB, free: 265.1 MB)
14/08/18 16:04:33 INFO BlockManagerMaster: Updated info of block broadcast_1_piece0
14/08/18 16:04:33 INFO DAGScheduler: Submitting 144 missing tasks from Stage 1 (MappedRDD[6] at groupBy at JobApp.scala:32)
14/08/18 16:04:33 INFO TaskSchedulerImpl: Adding task set 1.0 with 144 tasks
14/08/18 16:04:43 INFO AppClient$ClientActor: Connecting to master spark://ec2-174-129-92-3.compute-1.amazonaws.com:7077...
14/08/18 16:04:48 WARN TaskSchedulerImpl: Initial job has not accepted any resources; check your cluster UI to ensure that workers are registered and have sufficient memory
14/08/18 16:05:03 WARN TaskSchedulerImpl: Initial job has not accepted any resources; check your cluster UI to ensure that workers are registered and have sufficient memory
14/08/18 16:05:03 INFO AppClient$ClientActor: Connecting to master spark://ec2-174-129-92-3.compute-1.amazonaws.com:7077...
14/08/18 16:05:18 WARN TaskSchedulerImpl: Initial job has not accepted any resources; check your cluster UI to ensure that workers are registered and have sufficient memory
14/08/18 16:05:23 ERROR SparkDeploySchedulerBackend: Application has been killed. Reason: All masters are unresponsive! Giving up.
14/08/18 16:05:23 INFO TaskSchedulerImpl: Removed TaskSet 1.0, whose tasks have all completed, from pool
14/08/18 16:05:23 INFO TaskSchedulerImpl: Cancelling stage 1
14/08/18 16:05:23 INFO DAGScheduler: Failed to run saveAsTextFile at RecRateApp.scala:65
Exception in thread "main" org.apache.spark.SparkException: Job aborted due to stage failure: All masters are unresponsive! Giving up.
	at org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$failJobAndIndependentStages(DAGScheduler.scala:1153)
	at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1142)
	at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1141)
	at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
	at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:47)
	at org.apache.spark.scheduler.DAGScheduler.abortStage(DAGScheduler.scala:1141)
	at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:682)
	at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:682)
	at scala.Option.foreach(Option.scala:236)
	at org.apache.spark.scheduler.DAGScheduler.handleTaskSetFailed(DAGScheduler.scala:682)
	at org.apache.spark.scheduler.DAGSchedulerEventProcessActor$$anonfun$receive$2.applyOrElse(DAGScheduler.scala:1359)
	at akka.actor.ActorCell.receiveMessage(ActorCell.scala:498)
	at akka.actor.ActorCell.invoke(ActorCell.scala:456)
	at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:237)
	at akka.dispatch.Mailbox.run(Mailbox.scala:219)
	at akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:386)
	at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
	at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
	at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
	at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
{code}

Server log:
{code}
14/08/18 23:04:29 INFO master.Master: akka.tcp://spark@192.168.115.108:57076 got disassociated, removing it.
14/08/18 23:04:29 INFO actor.LocalActorRef: Message [akka.remote.transport.AssociationHandle$Disassociated] from Actor[akka://sparkMaster/deadLetters] to Actor[akka://sparkMaster/system/endpointManager/reliableEndpointWriter-akka.tcp%3A%2F%2Fspark%40192.168.115.108%3A57076-12/endpointWriter/endpointReader-akka.tcp%3A%2F%2Fspark%40192.168.115.108%3A57076-0#1367767755] was not delivered. [13] dead letters encountered. This logging can be turned off or adjusted with configuration settings 'akka.log-dead-letters' and 'akka.log-dead-letters-during-shutdown'.
14/08/18 23:04:29 INFO actor.LocalActorRef: Message [akka.remote.transport.AssociationHandle$Disassociated] from Actor[akka://sparkMaster/deadLetters] to Actor[akka://sparkMaster/system/transports/akkaprotocolmanager.tcp0/akkaProtocol-tcp%3A%2F%2FsparkMaster%4050.201.227.222%3A57082-55#48069465] was not delivered. [14] dead letters encountered. This logging can be turned off or adjusted with configuration settings 'akka.log-dead-letters' and 'akka.log-dead-letters-during-shutdown'.
14/08/18 23:04:47 ERROR remote.EndpointWriter: AssociationError [akka.tcp://sparkMaster@ec2-174-129-92-3.compute-1.amazonaws.com:7077] -> [akka.tcp://spark@192.168.115.108:56477]: Error [Association failed with [akka.tcp://spark@192.168.115.108:56477]] [
akka.remote.EndpointAssociationException: Association failed with [akka.tcp://spark@192.168.115.108:56477]
Caused by: akka.remote.transport.netty.NettyTransport$$anonfun$associate$1$$anon$2: Connection timed out: /192.168.115.108:56477
]
14/08/18 23:04:47 INFO master.Master: akka.tcp://spark@192.168.115.108:56477 got disassociated, removing it.
14/08/18 23:05:04 ERROR remote.EndpointWriter: AssociationError [akka.tcp://sparkMaster@ec2-174-129-92-3.compute-1.amazonaws.com:7077] -> [akka.tcp://spark@192.168.115.108:56502]: Error [Association failed with [akka.tcp://spark@192.168.115.108:56502]] [
akka.remote.EndpointAssociationException: Association failed with [akka.tcp://spark@192.168.115.108:56502]
Caused by: akka.remote.transport.netty.NettyTransport$$anonfun$associate$1$$anon$2: Connection timed out: /192.168.115.108:56502
]
14/08/18 23:05:04 INFO master.Master: akka.tcp://spark@192.168.115.108:56502 got disassociated, removing it.
14/08/18 23:05:32 ERROR remote.EndpointWriter: AssociationError [akka.tcp://sparkMaster@ec2-174-129-92-3.compute-1.amazonaws.com:7077] -> [akka.tcp://spark@192.168.115.108:57076]: Error [Association failed with [akka.tcp://spark@192.168.115.108:57076]] [
akka.remote.EndpointAssociationException: Association failed with [akka.tcp://spark@192.168.115.108:57076]
Caused by: akka.remote.transport.netty.NettyTransport$$anonfun$associate$1$$anon$2: Connection timed out: /192.168.115.108:57076
]
14/08/18 23:05:32 INFO master.Master: akka.tcp://spark@192.168.115.108:57076 got disassociated, removing it.
14/08/18 23:05:50 ERROR remote.EndpointWriter: AssociationError [akka.tcp://sparkMaster@ec2-174-129-92-3.compute-1.amazonaws.com:7077] -> [akka.tcp://spark@192.168.115.108:56477]: Error [Association failed with [akka.tcp://spark@192.168.115.108:56477]] [
akka.remote.EndpointAssociationException: Association failed with [akka.tcp://spark@192.168.115.108:56477]
Caused by: akka.remote.transport.netty.NettyTransport$$anonfun$associate$1$$anon$2: Connection timed out: /192.168.115.108:56477
]
14/08/18 23:05:50 INFO master.Master: akka.tcp://spark@192.168.115.108:56477 got disassociated, removing it.
14/08/18 23:06:07 ERROR remote.EndpointWriter: AssociationError [akka.tcp://sparkMaster@ec2-174-129-92-3.compute-1.amazonaws.com:7077] -> [akka.tcp://spark@192.168.115.108:56502]: Error [Association failed with [akka.tcp://spark@192.168.115.108:56502]] [
akka.remote.EndpointAssociationException: Association failed with [akka.tcp://spark@192.168.115.108:56502]
Caused by: akka.remote.transport.netty.NettyTransport$$anonfun$associate$1$$anon$2: Connection timed out: /192.168.115.108:56502
]
14/08/18 23:06:07 INFO master.Master: akka.tcp://spark@192.168.115.108:56502 got disassociated, removing it.
14/08/18 23:06:35 ERROR remote.EndpointWriter: AssociationError [akka.tcp://sparkMaster@ec2-174-129-92-3.compute-1.amazonaws.com:7077] -> [akka.tcp://spark@192.168.115.108:57076]: Error [Association failed with [akka.tcp://spark@192.168.115.108:57076]] [
akka.remote.EndpointAssociationException: Association failed with [akka.tcp://spark@192.168.115.108:57076]
Caused by: akka.remote.transport.netty.NettyTransport$$anonfun$associate$1$$anon$2: Connection timed out: /192.168.115.108:57076
]
14/08/18 23:06:35 INFO master.Master: akka.tcp://spark@192.168.115.108:57076 got disassociated, removing it.
14/08/18 23:06:53 ERROR remote.EndpointWriter: AssociationError [akka.tcp://sparkMaster@ec2-174-129-92-3.compute-1.amazonaws.com:7077] -> [akka.tcp://spark@192.168.115.108:56477]: Error [Association failed with [akka.tcp://spark@192.168.115.108:56477]] [
akka.remote.EndpointAssociationException: Association failed with [akka.tcp://spark@192.168.115.108:56477]
Caused by: akka.remote.transport.netty.NettyTransport$$anonfun$associate$1$$anon$2: Connection timed out: /192.168.115.108:56477
]
14/08/18 23:06:53 INFO master.Master: akka.tcp://spark@192.168.115.108:56477 got disassociated, removing it.
14/08/18 23:07:10 ERROR remote.EndpointWriter: AssociationError [akka.tcp://sparkMaster@ec2-174-129-92-3.compute-1.amazonaws.com:7077] -> [akka.tcp://spark@192.168.115.108:56502]: Error [Association failed with [akka.tcp://spark@192.168.115.108:56502]] [
akka.remote.EndpointAssociationException: Association failed with [akka.tcp://spark@192.168.115.108:56502]
Caused by: akka.remote.transport.netty.NettyTransport$$anonfun$associate$1$$anon$2: Connection timed out: /192.168.115.108:56502
]
14/08/18 23:07:10 INFO master.Master: akka.tcp://spark@192.168.115.108:56502 got disassociated, removing it.
14/08/18 23:07:38 ERROR remote.EndpointWriter: AssociationError [akka.tcp://sparkMaster@ec2-174-129-92-3.compute-1.amazonaws.com:7077] -> [akka.tcp://spark@192.168.115.108:57076]: Error [Association failed with [akka.tcp://spark@192.168.115.108:57076]] [
akka.remote.EndpointAssociationException: Association failed with [akka.tcp://spark@192.168.115.108:57076]
Caused by: akka.remote.transport.netty.NettyTransport$$anonfun$associate$1$$anon$2: Connection timed out: /192.168.115.108:57076
]
14/08/18 23:07:38 INFO master.Master: akka.tcp://spark@192.168.115.108:57076 got disassociated, removing it.
14/08/18 23:07:56 ERROR remote.EndpointWriter: AssociationError [akka.tcp://sparkMaster@ec2-174-129-92-3.compute-1.amazonaws.com:7077] -> [akka.tcp://spark@192.168.115.108:56477]: Error [Association failed with [akka.tcp://spark@192.168.115.108:56477]] [
akka.remote.EndpointAssociationException: Association failed with [akka.tcp://spark@192.168.115.108:56477]
Caused by: akka.remote.transport.netty.NettyTransport$$anonfun$associate$1$$anon$2: Connection timed out: /192.168.115.108:56477
]
14/08/18 23:07:56 INFO master.Master: akka.tcp://spark@192.168.115.108:56477 got disassociated, removing it.
14/08/18 23:08:13 ERROR remote.EndpointWriter: AssociationError [akka.tcp://sparkMaster@ec2-174-129-92-3.compute-1.amazonaws.com:7077] -> [akka.tcp://spark@192.168.115.108:56502]: Error [Association failed with [akka.tcp://spark@192.168.115.108:56502]] [
akka.remote.EndpointAssociationException: Association failed with [akka.tcp://spark@192.168.115.108:56502]
Caused by: akka.remote.transport.netty.NettyTransport$$anonfun$associate$1$$anon$2: Connection timed out: /192.168.115.108:56502
]
14/08/18 23:08:13 INFO master.Master: akka.tcp://spark@192.168.115.108:56502 got disassociated, removing it.
14/08/18 23:08:41 ERROR remote.EndpointWriter: AssociationError [akka.tcp://sparkMaster@ec2-174-129-92-3.compute-1.amazonaws.com:7077] -> [akka.tcp://spark@192.168.115.108:57076]: Error [Association failed with [akka.tcp://spark@192.168.115.108:57076]] [
akka.remote.EndpointAssociationException: Association failed with [akka.tcp://spark@192.168.115.108:57076]
Caused by: akka.remote.transport.netty.NettyTransport$$anonfun$associate$1$$anon$2: Connection timed out: /192.168.115.108:57076
]
14/08/18 23:08:41 INFO master.Master: akka.tcp://spark@192.168.115.108:57076 got disassociated, removing it.
14/08/18 23:08:59 ERROR remote.EndpointWriter: AssociationError [akka.tcp://sparkMaster@ec2-174-129-92-3.compute-1.amazonaws.com:7077] -> [akka.tcp://spark@192.168.115.108:56477]: Error [Association failed with [akka.tcp://spark@192.168.115.108:56477]] [
akka.remote.EndpointAssociationException: Association failed with [akka.tcp://spark@192.168.115.108:56477]
Caused by: akka.remote.transport.netty.NettyTransport$$anonfun$associate$1$$anon$2: Connection timed out: /192.168.115.108:56477
]
14/08/18 23:08:59 INFO master.Master: akka.tcp://spark@192.168.115.108:56477 got disassociated, removing it.
14/08/18 23:09:16 ERROR remote.EndpointWriter: AssociationError [akka.tcp://sparkMaster@ec2-174-129-92-3.compute-1.amazonaws.com:7077] -> [akka.tcp://spark@192.168.115.108:56502]: Error [Association failed with [akka.tcp://spark@192.168.115.108:56502]] [
akka.remote.EndpointAssociationException: Association failed with [akka.tcp://spark@192.168.115.108:56502]
Caused by: akka.remote.transport.netty.NettyTransport$$anonfun$associate$1$$anon$2: Connection timed out: /192.168.115.108:56502
]
14/08/18 23:09:16 INFO master.Master: akka.tcp://spark@192.168.115.108:56502 got disassociated, removing it.
14/08/18 23:09:44 ERROR remote.EndpointWriter: AssociationError [akka.tcp://sparkMaster@ec2-174-129-92-3.compute-1.amazonaws.com:7077] -> [akka.tcp://spark@192.168.115.108:57076]: Error [Association failed with [akka.tcp://spark@192.168.115.108:57076]] [
akka.remote.EndpointAssociationException: Association failed with [akka.tcp://spark@192.168.115.108:57076]
Caused by: akka.remote.transport.netty.NettyTransport$$anonfun$associate$1$$anon$2: Connection timed out: /192.168.115.108:57076
]
14/08/18 23:09:44 INFO master.Master: akka.tcp://spark@192.168.115.108:57076 got disassociated, removing it.
14/08/18 23:10:02 ERROR remote.EndpointWriter: AssociationError [akka.tcp://sparkMaster@ec2-174-129-92-3.compute-1.amazonaws.com:7077] -> [akka.tcp://spark@192.168.115.108:56477]: Error [Association failed with [akka.tcp://spark@192.168.115.108:56477]] [
akka.remote.EndpointAssociationException: Association failed with [akka.tcp://spark@192.168.115.108:56477]
Caused by: akka.remote.transport.netty.NettyTransport$$anonfun$associate$1$$anon$2: Connection timed out: /192.168.115.108:56477
]
14/08/18 23:10:02 INFO master.Master: akka.tcp://spark@192.168.115.108:56477 got disassociated, removing it.
14/08/18 23:10:19 ERROR remote.EndpointWriter: AssociationError [akka.tcp://sparkMaster@ec2-174-129-92-3.compute-1.amazonaws.com:7077] -> [akka.tcp://spark@192.168.115.108:56502]: Error [Association failed with [akka.tcp://spark@192.168.115.108:56502]] [
akka.remote.EndpointAssociationException: Association failed with [akka.tcp://spark@192.168.115.108:56502]
Caused by: akka.remote.transport.netty.NettyTransport$$anonfun$associate$1$$anon$2: Connection timed out: /192.168.115.108:56502
]
14/08/18 23:10:19 INFO master.Master: akka.tcp://spark@192.168.115.108:56502 got disassociated, removing it.
14/08/18 23:10:47 ERROR remote.EndpointWriter: AssociationError [akka.tcp://sparkMaster@ec2-174-129-92-3.compute-1.amazonaws.com:7077] -> [akka.tcp://spark@192.168.115.108:57076]: Error [Association failed with [akka.tcp://spark@192.168.115.108:57076]] [
akka.remote.EndpointAssociationException: Association failed with [akka.tcp://spark@192.168.115.108:57076]
Caused by: akka.remote.transport.netty.NettyTransport$$anonfun$associate$1$$anon$2: Connection timed out: /192.168.115.108:57076
]
14/08/18 23:10:47 INFO master.Master: akka.tcp://spark@192.168.115.108:57076 got disassociated, removing it.
14/08/18 23:11:05 ERROR remote.EndpointWriter: AssociationError [akka.tcp://sparkMaster@ec2-174-129-92-3.compute-1.amazonaws.com:7077] -> [akka.tcp://spark@192.168.115.108:56477]: Error [Association failed with [akka.tcp://spark@192.168.115.108:56477]] [
akka.remote.EndpointAssociationException: Association failed with [akka.tcp://spark@192.168.115.108:56477]
Caused by: akka.remote.transport.netty.NettyTransport$$anonfun$associate$1$$anon$2: Connection timed out: /192.168.115.108:56477
]
14/08/18 23:11:05 INFO master.Master: akka.tcp://spark@192.168.115.108:56477 got disassociated, removing it.
14/08/18 23:11:22 ERROR remote.EndpointWriter: AssociationError [akka.tcp://sparkMaster@ec2-174-129-92-3.compute-1.amazonaws.com:7077] -> [akka.tcp://spark@192.168.115.108:56502]: Error [Association failed with [akka.tcp://spark@192.168.115.108:56502]] [
akka.remote.EndpointAssociationException: Association failed with [akka.tcp://spark@192.168.115.108:56502]
Caused by: akka.remote.transport.netty.NettyTransport$$anonfun$associate$1$$anon$2: Connection timed out: /192.168.115.108:56502
]
14/08/18 23:11:22 INFO master.Master: akka.tcp://spark@192.168.115.108:56502 got disassociated, removing it.
14/08/18 23:11:51 ERROR remote.EndpointWriter: AssociationError [akka.tcp://sparkMaster@ec2-174-129-92-3.compute-1.amazonaws.com:7077] -> [akka.tcp://spark@192.168.115.108:57076]: Error [Association failed with [akka.tcp://spark@192.168.115.108:57076]] [
akka.remote.EndpointAssociationException: Association failed with [akka.tcp://spark@192.168.115.108:57076]
Caused by: akka.remote.transport.netty.NettyTransport$$anonfun$associate$1$$anon$2: Connection timed out: /192.168.115.108:57076
]
14/08/18 23:11:51 INFO master.Master: akka.tcp://spark@192.168.115.108:57076 got disassociated, removing it.
14/08/18 23:12:08 ERROR remote.EndpointWriter: AssociationError [akka.tcp://sparkMaster@ec2-174-129-92-3.compute-1.amazonaws.com:7077] -> [akka.tcp://spark@192.168.115.108:56477]: Error [Association failed with [akka.tcp://spark@192.168.115.108:56477]] [
akka.remote.EndpointAssociationException: Association failed with [akka.tcp://spark@192.168.115.108:56477]
Caused by: akka.remote.transport.netty.NettyTransport$$anonfun$associate$1$$anon$2: Connection timed out: /192.168.115.108:56477
]
14/08/18 23:12:08 INFO master.Master: akka.tcp://spark@192.168.115.108:56477 got disassociated, removing it.
{code}
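
One possible reading of the server log, offered as a guess rather than a confirmed diagnosis: every AssociationError is the master timing out while connecting back to the driver at 192.168.115.108, which is a private address and would not be reachable from EC2. The Python sketch below pulls the driver endpoints out of log lines like the ones above and reports those in a private range. The function name, the regex, and the two-line sample are illustrative assumptions, not part of Spark.

```python
import ipaddress
import re

# Matches driver endpoints such as akka.tcp://spark@192.168.115.108:57076.
# The master's own address (akka.tcp://sparkMaster@...) does not match.
DRIVER_ADDR = re.compile(r"akka\.tcp://spark@(\d{1,3}(?:\.\d{1,3}){3}):(\d+)")


def private_driver_endpoints(log_text):
    """Return sorted, de-duplicated (ip, port) pairs whose IP is private."""
    found = set()
    for ip, port in DRIVER_ADDR.findall(log_text):
        try:
            address = ipaddress.ip_address(ip)
        except ValueError:
            # Skip anything that looks like an IP but is not a valid one.
            continue
        if address.is_private:
            found.add((ip, int(port)))
    return sorted(found)


# Two lines copied from the server log above.
sample = (
    "14/08/18 23:04:47 INFO master.Master: "
    "akka.tcp://spark@192.168.115.108:56477 got disassociated, removing it.\n"
    "14/08/18 23:05:04 INFO master.Master: "
    "akka.tcp://spark@192.168.115.108:56502 got disassociated, removing it.\n"
)

print(private_driver_endpoints(sample))
```

If the list is non-empty, the driver is advertising an address that only exists on the local network, which would be consistent with the "Connection timed out" entries on the master.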

Is the local spark-submit script not meant to be used this way, to submit a job to a remote cluster?



