Posted to issues@spark.apache.org by "shane knapp (JIRA)" <ji...@apache.org> on 2015/12/18 20:17:46 UTC

[jira] [Created] (SPARK-12427) spark builds filling up jenkins' disk

shane knapp created SPARK-12427:
-----------------------------------

             Summary: spark builds filling up jenkins' disk
                 Key: SPARK-12427
                 URL: https://issues.apache.org/jira/browse/SPARK-12427
             Project: Spark
          Issue Type: Bug
          Components: Build
            Reporter: shane knapp
            Priority: Critical


problem summary:

a few spark builds are filling up the jenkins master's disk with millions of little log files as build artifacts.  

currently, we have a raid10 array set up with 5.4T of storage, of which we're using 4.0T.  99.9% of that usage is spark unit test and junit logs.

the worst offenders, with more than 100G of disk usage per job, are:
193G    ./Spark-1.6-Maven-with-YARN
194G    ./Spark-1.5-Maven-with-YARN
205G    ./Spark-1.6-Maven-pre-YARN
216G    ./Spark-1.5-Maven-pre-YARN
387G    ./Spark-Master-Maven-with-YARN
420G    ./Spark-Master-Maven-pre-YARN
520G    ./Spark-1.6-SBT
733G    ./Spark-1.5-SBT
812G    ./Spark-Master-SBT

i have attached a full report w/all builds listed as well.

each of these builds is keeping its build history for 90 days.

keep in mind that for each new matrix build, we're looking at another 200-500G apiece for the SBT/pre-YARN/with-YARN jobs.

a straw man, back of napkin estimate for spark 1.7 is 2T of additional disk usage.
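
as a rough cross-check on these figures, the per-job numbers listed above can be grouped by branch.  the sketch below is only illustrative: the helper names (parse_du, by_branch) are made up for this example, and it assumes the branch is the second dash-separated field of the job name.

```python
from collections import defaultdict

# Sizes (in G) copied from the per-job listing in this report.
DU_LINES = """\
193G    ./Spark-1.6-Maven-with-YARN
194G    ./Spark-1.5-Maven-with-YARN
205G    ./Spark-1.6-Maven-pre-YARN
216G    ./Spark-1.5-Maven-pre-YARN
387G    ./Spark-Master-Maven-with-YARN
420G    ./Spark-Master-Maven-pre-YARN
520G    ./Spark-1.6-SBT
733G    ./Spark-1.5-SBT
812G    ./Spark-Master-SBT
"""


def parse_du(text):
    """Turn 'NNNG    ./job-name' lines into {job_name: size_in_G}."""
    usage = {}
    for line in text.splitlines():
        size, path = line.split()
        job = path[2:] if path.startswith("./") else path
        usage[job] = int(size.rstrip("G"))
    return usage


def by_branch(usage):
    """Sum usage per branch, assuming names look like Spark-<branch>-..."""
    totals = defaultdict(int)
    for job, gigs in usage.items():
        totals[job.split("-")[1]] += gigs
    return dict(totals)


if __name__ == "__main__":
    usage = parse_du(DU_LINES)
    for branch, gigs in sorted(by_branch(usage).items()):
        print(f"{branch}: {gigs}G")
    print(f"total for listed jobs: {sum(usage.values())}G")
```

with the numbers above, this gives 918G for 1.6, 1143G for 1.5 and 1619G for master, or 3680G across the nine listed jobs.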

on the hardware config side, we can move from raid10 to raid5 and get ~3T additional storage.  if we ditch raid altogether and put in bigger disks, we can get a total of 16-20T storage on master.  another option is to have an NFS mount to a deep storage server.  all of these options will require significant downtime.

questions:
* can we lower the number of days that we keep build information?
* there are other options in jenkins that we can set as well:  max number of builds to keep, max # days to keep artifacts, max # of builds to keep w/artifacts
* can we make the junit and unit test logs smaller? (probably not)
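
to get a feel for what a shorter retention window might buy, here is a back of napkin sketch.  it assumes disk usage scales linearly with the number of days kept and that a job already holds a full 90 days of history; neither is guaranteed (newer branches such as 1.6 likely have less).  the function name is made up for this example.

```python
def projected_usage(current_g, current_days, new_days):
    """Estimate usage after changing retention.

    Assumes log volume per day is roughly constant, so usage scales
    linearly with the retention window. This is an assumption, not a
    measured property of these jobs.
    """
    if current_days <= 0 or new_days < 0:
        raise ValueError("retention days must be positive")
    return current_g * new_days / current_days


if __name__ == "__main__":
    # 812G is the Spark-Master-SBT figure from this report, kept for 90 days.
    for days in (60, 30, 14):
        est = projected_usage(812, 90, days)
        print(f"{days} days: ~{est:.0f}G")
```

under those assumptions, dropping Spark-Master-SBT from 90 to 30 days would take it from 812G to roughly 271G.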




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
