Posted to issues@spark.apache.org by "ding (JIRA)" <ji...@apache.org> on 2015/01/27 03:04:35 UTC
[jira] [Created] (SPARK-5418) Output directory for shuffle should
consider left space of each directory set in conf
ding created SPARK-5418:
---------------------------
Summary: Output directory for shuffle should consider left space of each directory set in conf
Key: SPARK-5418
URL: https://issues.apache.org/jira/browse/SPARK-5418
Project: Spark
Issue Type: Bug
Components: Shuffle
Affects Versions: 1.2.0
Environment: Ubuntu, others should be similar
Reporter: ding
Priority: Minor
I set multiple directories in the conf spark.local.dir as "scratch" space. One of them (e.g. /mnt/disk1) has 30G of free space, while the others (e.g. /mnt/disk2) have 100G. In the current version, Spark uses a hash to decide which directory is used for "scratch" space, which means each directory has the same chance of being chosen. After hundreds of iterations of PageRank, a "No space left" exception is thrown and the driver crashes. This does not make sense, since there is still 70G+ of free space in the other directories. We should take the free space of each directory into account when deciding which directory should be the map output dir. I will send a PR for this.
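To illustrate the idea in the report, here is a minimal sketch in Python that contrasts a hash-based choice with one that considers free space. It is not Spark's code and not the proposed patch: the function names (pick_by_hash, pick_by_free_space), the "needed" and "probe" parameters, and the weighting by free bytes are all assumptions made for illustration. Only shutil.disk_usage, zlib.crc32 and random.choices are real standard-library calls.

```python
import random
import shutil
import zlib
from typing import Callable, Sequence


def free_bytes(path: str) -> int:
    """Free bytes on the filesystem that holds `path`."""
    return shutil.disk_usage(path).free


def pick_by_hash(filename: str, dirs: Sequence[str]) -> str:
    """Hash-based choice as described in the report.

    Every directory is equally likely, regardless of how full it is.
    """
    return dirs[zlib.crc32(filename.encode("utf-8")) % len(dirs)]


def pick_by_free_space(
    dirs: Sequence[str],
    needed: int = 0,
    probe: Callable[[str], int] = free_bytes,
    rng=random,
) -> str:
    """Hypothetical alternative: weight each directory by its free space.

    Directories with no free space, or with less than `needed` bytes free,
    are skipped. Among the rest, one is drawn with probability proportional
    to its free bytes.
    """
    free = {d: probe(d) for d in dirs}
    candidates = [d for d in dirs if free[d] > 0 and free[d] >= needed]
    if not candidates:
        raise OSError("no configured directory has enough free space")
    weights = [free[d] for d in candidates]
    return rng.choices(candidates, weights=weights, k=1)[0]


if __name__ == "__main__":
    # Stand-in for real disks, using the sizes from the report (30G vs 100G).
    fake = {"/mnt/disk1": 30 * 2**30, "/mnt/disk2": 100 * 2**30}
    dirs = list(fake)
    print(pick_by_hash("shuffle_0_0_0.data", dirs))
    print(pick_by_free_space(dirs, needed=50 * 2**30, probe=fake.__getitem__))
```

With the sizes from the report, the weighted choice would send roughly 100 of every 130 files to /mnt/disk2, so the smaller disk fills more slowly. Weighting is only one possible policy. Always picking the directory with the most free space is another, at the cost of concentrating I/O on one disk.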
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)