Posted to user@spark.apache.org by Ognen Duzlevski <og...@nengoiksvelzud.com> on 2014/01/19 13:08:30 UTC

Problems with Spark (Lost executor error)

Hello,

I am trying to read a 20GB file from an S3 bucket. I have verified that I
can read small files from my cluster. The cluster has 15 slaves and a
master; each slave has 16GB of RAM, and the machines are Amazon m1.xlarge
instances.

Everything I am doing is shown below; however, about a minute into execution
I get the ERROR and the subsequent WARNings. Does anyone have any idea what
is going on? Why is this so difficult? ;)

Thanks!
Ognen

scala> val f = sc.textFile("s3n://ognen-data-pipeline/large_data/2013-11-30.json")
f: org.apache.spark.rdd.RDD[String] = MappedRDD[1] at textFile at <console>:12

scala> f.count
14/01/19 12:03:31 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
14/01/19 12:03:31 WARN LoadSnappy: Snappy native library not loaded
14/01/19 12:04:23 ERROR Client$ClientActor: Master removed our application: FAILED; stopping client
14/01/19 12:04:23 WARN SparkDeploySchedulerBackend: Disconnected from Spark cluster! Waiting for reconnection...
14/01/19 12:04:24 ERROR ClusterScheduler: Lost executor 10 on 10.10.0.200: remote Akka client shutdown
14/01/19 12:04:24 WARN ClusterTaskSetManager: Lost TID 3 (task 0.0:3)
14/01/19 12:04:24 WARN ClusterTaskSetManager: Lost TID 1 (task 0.0:1)
14/01/19 12:04:24 WARN ClusterTaskSetManager: Lost TID 2 (task 0.0:2)
14/01/19 12:04:24 WARN ClusterTaskSetManager: Lost TID 0 (task 0.0:0)