Posted to user@spark.apache.org by Shangyu Luo <ls...@gmail.com> on 2013/10/29 02:52:20 UTC

Questions about the files that Spark produces while it is running

Hello,
I have some questions about the files that Spark creates and uses while it
is running.
(1) I am running a Python program on Spark on an EC2 cluster. The data
comes from HDFS. I hit the following error in the console of the master
node:
java.io.FileNotFoundException:
/data2/tmp/spark-local-20131029003412-c340/1b/shuffle_1_527_79 (No space
left on device)
        at java.io.FileOutputStream.openAppend(Native Method)
        at java.io.FileOutputStream.<init>(FileOutputStream.java:207)
        at org.apache.spark.storage.DiskStore$DiskBlockObjectWriter.open(DiskStore.scala:58)
        at org.apache.spark.storage.DiskStore$DiskBlockObjectWriter.write(DiskStore.scala:107)
        at org.apache.spark.scheduler.ShuffleMapTask$$anonfun$run$1.apply(ShuffleMapTask.scala:152)
        at org.apache.spark.scheduler.ShuffleMapTask$$anonfun$run$1.apply(ShuffleMapTask.scala:149)
        at scala.collection.Iterator$class.foreach(Iterator.scala:772)
        at scala.collection.Iterator$$anon$19.foreach(Iterator.scala:399)
        at org.apache.spark.scheduler.ShuffleMapTask.run(ShuffleMapTask.scala:149)
        at org.apache.spark.scheduler.ShuffleMapTask.run(ShuffleMapTask.scala:88)
        at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:158)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1146)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:679)
I set spark.local.dir=/data2/tmp in spark-env.sh, and there is about 800G
of space on the data2 volume. I have checked the usage of data2, and only
about 3G is used.
So why does Spark think there is no space left on the device?
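
One thing worth ruling out, since the volume is clearly not full by bytes:
"No space left on device" (ENOSPC) is also raised when a filesystem runs
out of inodes, and a shuffle writes one file per map/reduce task pair. A
minimal check of both for the /data2/tmp path above:

    import os

    # ENOSPC can mean either no free blocks or no free inodes;
    # check both on the spark.local.dir volume.
    st = os.statvfs("/data2/tmp")
    print("free space: %.1f GB" % (st.f_bavail * st.f_frsize / 1024.0 ** 3))
    print("free inodes: %d of %d" % (st.f_favail, st.f_files))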

(2) Moreover, I am wondering whether Spark will create files under
directories other than spark.local.dir. Presently I use
a = b.map(...).persist(storage.disk_only) in part of my program; where
will the persisted data be stored?
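
For reference, the same call spelled out with the actual PySpark
identifiers (a minimal sketch; the RDD names and the map function are
illustrative):

    from pyspark import SparkContext, StorageLevel

    sc = SparkContext("local", "persist-demo")
    b = sc.parallelize(range(1000))

    # DISK_ONLY stores the partitions as files under spark.local.dir on
    # each worker instead of keeping them deserialized in memory.
    a = b.map(lambda x: x * 2).persist(StorageLevel.DISK_ONLY)
    print(a.count())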

(3) Lastly, I also sometimes get the error "Removing BlockManager xxx with
no recent heart beats: xxxxxms exceeds 45000ms". I have set the
corresponding parameters in spark-env.sh:
SPARK_JAVA_OPTS+="-Dspark.akka.timeout=300000 "
SPARK_JAVA_OPTS+="-Dspark.worker.timeout=300000 "
SPARK_JAVA_OPTS+="-Dspark.akka.askTimeout=3000 "
SPARK_JAVA_OPTS+="-Dspark.storage.blockManagerHeartBeatMs=300000 "
SPARK_JAVA_OPTS+="-Dspark.akka.retry.wait=300000 "
But it does not help. Can someone give me some suggestions for solving
this problem?

Any help would be appreciated!
Thanks!

Best,
Shangyu

Re: Questions about the files that Spark produces while it is running

Posted by Shangyu Luo <ls...@gmail.com>.
Yes,
I broadcast the spark-env.sh file to all worker nodes before I run my
program, and then execute bin/stop-all.sh and bin/start-all.sh.
I have also checked the size of the data2 directory on each worker node,
and it is also about 800G.
Thanks!


2013/10/29 Matei Zaharia <ma...@gmail.com>

> The error is from a worker node -- did you check that /data2 is set up
> properly on the worker nodes too? In general that should be the only
> directory used.
>
> Matei


--

Shangyu, Luo
Department of Computer Science
Rice University

--
Not Just Think About It, But Do It!
--
Success is never final.
--
Losers always whine about their best

Re: Questions about the files that Spark produces while it is running

Posted by Matei Zaharia <ma...@gmail.com>.
The error is from a worker node -- did you check that /data2 is set up properly on the worker nodes too? In general that should be the only directory used.
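
A quick way to verify that from the driver, rather than by SSHing into
every node, is a throwaway job that reports each worker's view of the
directory; a rough sketch, assuming a live SparkContext sc and the
/data2/tmp path from the original message:

    import os
    import socket

    def local_dir_free(_):
        # Free GB on the volume holding this worker's spark.local.dir.
        st = os.statvfs("/data2/tmp")
        return (socket.gethostname(), st.f_bavail * st.f_frsize / 1024.0 ** 3)

    # Enough partitions that every worker should run at least one task;
    # dict() collapses the results to one entry per host.
    stats = dict(sc.parallelize(range(64), 64).map(local_dir_free).collect())
    for host, free_gb in sorted(stats.items()):
        print("%s: %.1f GB free" % (host, free_gb))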

Matei
