Posted to issues@spark.apache.org by "ding (JIRA)" <ji...@apache.org> on 2015/01/27 03:04:35 UTC

[jira] [Created] (SPARK-5418) Output directory for shuffle should consider left space of each directory set in conf

ding created SPARK-5418:
---------------------------

             Summary: Output directory for shuffle should consider left space of each directory set in conf
                 Key: SPARK-5418
                 URL: https://issues.apache.org/jira/browse/SPARK-5418
             Project: Spark
          Issue Type: Bug
          Components: Shuffle
    Affects Versions: 1.2.0
         Environment: Ubuntu, others should be similar
            Reporter: ding
            Priority: Minor


I set multiple directories in spark.local.dir as "scratch" space. One of them (e.g. /mnt/disk1) has 30G of free space, while the others (e.g. /mnt/disk2) have 100G. In the current version, Spark uses a hash to decide which directory to use for "scratch" space, so every directory has the same chance of being chosen regardless of how full it is. After hundreds of iterations of PageRank, a "No space left" exception is thrown and the driver crashes. This does not make sense, since the other directories still have 70G+ of free space. We should take the free space of each directory into account when deciding which directory should be the map output dir. I will send a PR for this.
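
To illustrate the idea, here is a minimal sketch in Python (not Spark's Scala code, and not the PR itself). It contrasts a hash-based choice with one weighted by free space. The function names, the `min_free` threshold, and the use of CRC32 as the hash are all assumptions made for illustration; the proposed change may well select directories differently.

```python
import random
import shutil
import zlib


def free_bytes(path):
    """Bytes available on the filesystem that holds `path`."""
    return shutil.disk_usage(path).free


def pick_by_hash(dirs, filename):
    """Hash-based choice: every directory is equally likely, free space is ignored.

    CRC32 is only a stand-in here for whatever hash is actually used.
    """
    return dirs[zlib.crc32(filename.encode("utf-8")) % len(dirs)]


def pick_by_free_space(dirs, min_free=0, free_fn=free_bytes, rng=random):
    """Choose a directory with probability proportional to its free space.

    Directories with `min_free` bytes or less are skipped entirely
    (hypothetical threshold, not a Spark setting).
    """
    candidates = [(d, free_fn(d)) for d in dirs]
    candidates = [(d, f) for d, f in candidates if f > min_free]
    if not candidates:
        raise OSError("No space left in any configured local directory")
    names, weights = zip(*candidates)
    return rng.choices(names, weights=weights, k=1)[0]


if __name__ == "__main__":
    # Free space figures from the report, faked so the sketch runs anywhere.
    GIB = 2 ** 30
    fake_free = {"/mnt/disk1": 30 * GIB, "/mnt/disk2": 100 * GIB}
    dirs = list(fake_free)
    print(pick_by_hash(dirs, "shuffle_0_0_0"))
    print(pick_by_free_space(dirs, free_fn=fake_free.get))
```

With the figures above, the weighted choice would be expected to send roughly 100/130 (about 77%) of new files to /mnt/disk2, whereas the hash-based choice splits them about evenly, so the 30G disk fills up first.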



