You are viewing a plain text version of this content. The canonical link for it is here.
Posted to builds@apache.org by Chris Lambertus <cm...@apache.org> on 2018/04/24 23:13:14 UTC

purging of old job artifacts

Hi builders,

We are in the initial phases of speccing out a replacement Jenkins server using NVMe disks for improved performance. At this time, the jobs/ directory on jenkins-master has artifacts dating back to at least 2012, and these artifacts are taking up a considerable amount of space on the master, and equate to a significant expense in purchasing a new machine.

On 30 April 2018 Infra will be purging ALL job artifacts older than 180 days on the jenkins master.  This will include any saved builds, logs, and any binary artifacts on the master older than 180 days.

If anyone has concerns over this course of action, please reply here.

Thanks,
Chris
ASF Infra


Re: purging of old job artifacts

Posted by Andrew Bayer <an...@gmail.com>.
Yeah, fwiw, any build that doesn't have a build.xml (and therefore can't be
loaded by Jenkins/displayed in the UI/etc) or any job directory that
doesn't have a config.xml should just be rm'd. That's just eating disk
space with no way of being used.

A.

On Wed, Apr 25, 2018 at 11:07 AM, Chris Lambertus <cm...@apache.org> wrote:

>
>
> > On Apr 25, 2018, at 7:49 AM, Allen Wittenauer <aw...@effectivemachines.com>
> wrote:
> >
> >> Using the yetus jobs as a reference, yetus-java builds 480 and 481 are
> nearly a year old, but only contain a few kilobytes of data. While removing
> them saves no space, they also provide no value,
> >
> >       … to infra.
> >
> >       The value to the communities that any job services is really up to
> those communities to decide.
>
>
> I mean builds 480 and 481 literally provide no value, since they are not
> accessible via the jenkins UI and only contain a polling.log file from
> 2017. I recognize that some projects may want to retain specific older
> builds, but I personally question the utility of data kept older than 6
> months. For build data that’s significantly outside the nominal 30 day / 10
> job window, we’d ask that this be exported and managed locally by the
> project rather than remaining “live” on the master. Is there a reason other
> than convenience for it to remain live?
>
> Just based on some initial review, excluding the build logs and job
> metadata is probably do-able for an initial pass at purging old data, but
> I’ll want to generate some data on how many old build histories exist. As I
> stated earlier the main goal here is to remove dead jobs and binary
> artifacts. There do appear to be a fair few jobs which no longer exist as
> mentioned else-thread, so hopefully we’ll realize some notable performance
> and space improvements by culling the low hanging fruit before a more
> drastic approach is required.
>
> -Chris
>
>

Re: purging of old job artifacts

Posted by Chris Lambertus <cm...@apache.org>.

> On Apr 25, 2018, at 7:49 AM, Allen Wittenauer <aw...@effectivemachines.com> wrote:
> 
>> Using the yetus jobs as a reference, yetus-java builds 480 and 481 are nearly a year old, but only contain a few kilobytes of data. While removing them saves no space, they also provide no value,
> 
> 	… to infra.
> 
> 	The value to the communities that any job services is really up to those communities to decide.


I mean builds 480 and 481 literally provide no value, since they are not accessible via the jenkins UI and only contain a polling.log file from 2017. I recognize that some projects may want to retain specific older builds, but I personally question the utility of data kept older than 6 months. For build data that’s significantly outside the nominal 30 day / 10 job window, we’d ask that this be exported and managed locally by the project rather than remaining “live” on the master. Is there a reason other than convenience for it to remain live?

Just based on some initial review, excluding the build logs and job metadata is probably do-able for an initial pass at purging old data, but I’ll want to generate some data on how many old build histories exist. As I stated earlier the main goal here is to remove dead jobs and binary artifacts. There do appear to be a fair few jobs which no longer exist as mentioned else-thread, so hopefully we’ll realize some notable performance and space improvements by culling the low hanging fruit before a more drastic approach is required.

-Chris


Re: purging of old job artifacts

Posted by Greg Stein <gs...@gmail.com>.
On Wed, Apr 25, 2018 at 9:49 AM, Allen Wittenauer <aw...@effectivemachines.com>
wrote:
>...

>         I apologize.  I took Greg’s reply as Infra’s official “Go Pound
> Sand” response to what I felt was a reasonable request for more information.
>

It was "Please stop asking ChrisL for more work. The todo work is upon
users of the build system to find/justify retention.". So look through the
areas you're responsible for, find anything that needs to be kept on the
master (as opposed to offloaded), and then we can discuss those items.

Cheers,
-g

Re: purging of old job artifacts

Posted by Allen Wittenauer <aw...@effectivemachines.com>.
> On Apr 25, 2018, at 12:04 AM, Chris Lambertus <cm...@apache.org> wrote:
> 
> The artifacts do not need to be kept in perpetuity. When every project does this, there are significant costs in both disk space and performance. Our policy has been 30 days or 10 jobs retention. 

	That policy wasn’t always in place.

> Please dispense with the passive aggressive “unwilling to provide” nonsense. This is inflammatory and anti-Infra for no valid reason. This process is meant to be a pragmatic approach to cleaning up and improving a service used by a large number of projects. The fact that I didn’t have time to post the job list in the 4 hours since my last reply does not need to be construed as reticence on Infra’s part to provide it.

	I apologize.  I took Greg’s reply as Infra’s official “Go Pound Sand” response to what I felt was a reasonable request for more information.


> Using the yetus jobs as a reference, yetus-java builds 480 and 481 are nearly a year old, but only contain a few kilobytes of data. While removing them saves no space, they also provide no value,

	… to infra.

	The value to the communities that any job services is really up to those communities to decide.  

	Thank you for providing the data.  Now the projects can determine what they need to save and perhaps change process/procedures before infra wipes it out.




Re: purging of old job artifacts

Posted by Greg Stein <gs...@gmail.com>.
On Thu, Apr 26, 2018 at 5:08 PM, Alex Harui <ah...@adobe.com.invalid>
wrote:

> On 4/26/18, 2:57 PM, "Greg Stein" <gs...@gmail.com> wrote:
>     On Thu, Apr 26, 2018, 16:42 Greg Stein <gs...@gmail.com> wrote:
>
>     > On Thu, Apr 26, 2018, 15:06 Alex Harui <ah...@adobe.com.invalid>
> wrote:
>     >
>     >> Perhaps I wasn't clear.  I am not concerned about Jenkins projects
> being
>     >> removed, just the artifacts of those projects.  I am trying to make
> two
>     >> points:
>     >>
>     >> 1)  Old artifacts might have value as a known good build for a job
> that
>     >> may not get run for years.  So please don't run a clean up that
> deletes all
>     >> artifacts older than N days.
>     >
>     > Nope. Download those artifacts if you want to keep them. The Jenkins
>     > Master is not an archival system. It exists for our projects to do
> builds.
>     > We have identified the overabundance of old artifacts as detrimental
> to the
>     > service, which affects everyone.
>
>     To be clearer here, Chris earlier said/asked:
>     we’d ask that this be exported and managed locally by the project
> rather
>     than remaining “live” on the master. Is there a reason other than
>     convenience for it to remain live?
>
> I thought we were allowed to provide feedback instead of following strict
> orders against old artifacts.


Sigh. And there is the "us vs them" crap again. This is not about
"following strict orders" from Infra. Yes, feedback is and has been
requested, and discussion even better.

So: provide that ... is there a good reason to keep the artifacts there?
That is the entire nature of Chris' original email. "We need to delete
these to improve the service. Is there a concern/reason we should not?" And
that has not been answered.

  Sorry for my misunderstanding.  For sure, lots of resources are being
> consumed by old artifacts.  I was just hoping we could save one old build
> per Jenkins project regardless of age for convenience.  But if not, I guess
> we will take the time to figure out where to store it and document where we
> put it.  I guess we have to create a repo for these so it isn't under some
> committer's home.a.o account or on somebody's machine.  Or is there other
> ASF shared storage I'm not remembering?
>

You've listed them. We don't have a generalized, unversioned file storage
mechanism. home.a.o is likely the best. Even if the committer who uploads
them goes AWOL, they'll still remain accessible.

Will old files in Jenkins workspaces for projects also get purged over time?
>

Somebody else will need to answer this. I dunno if we're purging old files
from the workspaces, or how that is different from what *is* being purged.

Cheers,
-g

Re: purging of old job artifacts

Posted by Alex Harui <ah...@adobe.com.INVALID>.

On 4/26/18, 2:57 PM, "Greg Stein" <gs...@gmail.com> wrote:

    On Thu, Apr 26, 2018, 16:42 Greg Stein <gs...@gmail.com> wrote:
    
    > On Thu, Apr 26, 2018, 15:06 Alex Harui <ah...@adobe.com.invalid> wrote:
    >
    >> Perhaps I wasn't clear.  I am not concerned about Jenkins projects being
    >> removed, just the artifacts of those projects.  I am trying to make two
    >> points:
    >>
    >> 1)  Old artifacts might have value as a known good build for a job that
    >> may not get run for years.  So please don't run a clean up that deletes all
    >> artifacts older than N days.
    >>
    >
    > Nope. Download those artifacts if you want to keep them. The Jenkins
    > Master is not an archival system. It exists for our projects to do builds.
    > We have identified the overabundance of old artifacts as detrimental to the
    > service, which affects everyone.
    >
    
    To be clearer here, Chris earlier said/asked:
    we’d ask that this be exported and managed locally by the project rather
    than remaining “live” on the master. Is there a reason other than
    convenience for it to remain live?
    
I thought we were allowed to provide feedback instead of following strict orders against old artifacts.  Sorry for my misunderstanding.  For sure, lots of resources are being consumed by old artifacts.  I was just hoping we could save one old build per Jenkins project regardless of age for convenience.  But if not, I guess we will take the time to figure out where to store it and document where we put it.  I guess we have to create a repo for these so it isn't under some committer's home.a.o account or on somebody's machine.  Or is there other ASF shared storage I'm not remembering?

Will old files in Jenkins workspaces for projects also get purged over time?

Sorry for the confusion,
-Alex
    
    
    > 2)  Somebody else pointed this out as well, but there appear to be folders
    >> listed for which there is no current Jenkins job.
    >>
    >
    > Knew about that. I simply read your note as "please keep these jobs". Some
    > disabled, some not working, etc. It seemed you were talking about jobs
    > rather than artifacts within them.
    >
    > Cheers,
    > -g
    >
    >
    >> Thanks,
    >> -Alex
    >>
    >> On 4/26/18, 11:07 AM, "Greg Stein" <gs...@gmail.com> wrote:
    >>
    >>     Note that jobs will remain.
    >>
    >>     This is only about deleting build artifacts.
    >>
    >>     On Thu, Apr 26, 2018 at 12:40 PM, Alex Harui <aharui@adobe.com.invalid
    >> >
    >>     wrote:
    >>
    >>     > HI Chris,
    >>     >
    >>     > Thanks for the list.
    >>     >
    >>     > I’m going through the Flex-related jobs and have some feedback:
    >>     >
    >>     > flex-blazeds (maven)  We’ve kept this build around even though it
    >> hasn’t
    >>     > run in a while in case we need to do another release of blazeds.  I
    >> would
    >>     > like to keep at least one known good build in case we have trouble
    >>     > resurrecting it later if we need it, even though it may sit idle
    >> for years.
    >>     >
    >>     > flex-flexunit_1
    >>     > flex-sdk_1
    >>     > flex-sdk_pixelbender_1
    >>     > flex-sdk_release_1
    >>     > flex-tlf_1
    >>     > flex_sdk_version  I cannot find a project with these names in
    >> Jenkins.  So
    >>     > feel free to toss it.
    >>     >
    >>     >
    >>     > flex-flexunit (maven)  This project was never completed to build
    >>     > successfully, but it would be nice to keep it around in case we
    >> need it.
    >>     >
    >>     > FlexJS Compiler (maven)
    >>     > FlexJS Framework (maven)
    >>     > FlexJS Pipeline
    >>     > FlexJS Typedefs (maven)  Looks like we never set the build limit,
    >> so I
    >>     > just did that.  The project is disabled, we are keeping it around as
    >>     > archival for Royale, so not sure it will clean itself up.
    >>     >
    >>     > flex-productdashboard I deleted this project.
    >>     >
    >>     > flex-tool-api (maven)
    >>     > flex-sdk-converter (maven)  I’m not seeing old artifacts in these
    >>     > projects, but they may also sit idle for years until some bug needs
    >> fixing.
    >>     >
    >>     > Flex-Site (Maven) This project never took off, but again it would
    >> be nice
    >>     > to keep it around in case it gets revived
    >>     >
    >>     > In sum, a project like Flex may have several kinds of “products”
    >> with
    >>     > varying activity levels and thus may have jobs that are idle for
    >> years and
    >>     > it can be helpful to keep at least the last build around as a
    >> reference in
    >>     > case the next time we run the build there is a failure.  Please
    >> notify us
    >>     > if we miss limiting the number of old builds.  I think I fixed the
    >> ones
    >>     > that didn’t have limits.  But there does seem to be folders left
    >> around for
    >>     > builds I think we deleted.
    >>     >
    >>     > Thanks,
    >>     > -Alex
    >>     >
    >>     >
    >>     > From: Chris Lambertus <cm...@apache.org>
    >>     > Reply-To: <bu...@apache.org>
    >>     > Date: Wednesday, April 25, 2018 at 12:04 AM
    >>     > To: <bu...@apache.org>
    >>     > Subject: Re: purging of old job artifacts
    >>     >
    >>     >
    >>     >
    >>     >
    >>     > On Apr 24, 2018, at 8:04 PM, Allen Wittenauer <
    >> aw@effectivemachines.com<
    >>     > mailto:aw@effectivemachines.com>> wrote:
    >>     >
    >>     >
    >>     >
    >>     > On Apr 24, 2018, at 5:01 PM, Greg Stein <gstein@gmail.com<mailto:
    >> gstei
    >>     > n@gmail.com>> wrote:
    >>     >
    >>     > Let's go back to the start: stuff older than six months will be
    >> deleted.
    >>     > What could possibly need to be retained?
    >>     >
    >>     >                 - Not every job runs every day.  Some are extremely
    >>     > situational.
    >>     >
    >>     > The artifacts do not need to be kept in perpetuity. When every
    >> project
    >>     > does this, there are significant costs in both disk space and
    >> performance.
    >>     > Our policy has been 30 days or 10 jobs retention.
    >>     >
    >>     >
    >>     >
    >>     >
    >>     >                 - Some users might have specifically marked certain
    >> data
    >>     > to be retained for very specific reasons.
    >>     >
    >>     >                 I know in my case I marked some logs to not be
    >> deleted
    >>     > because I was using them to debug the systemic Jenkins build node
    >> crashes.
    >>     > I want to keep the data to see if the usage numbers, etc, go down
    >> over time.
    >>     >
    >>     >
    >>     > Part of the systemic problems are due to copious amounts of
    >> historical
    >>     > data which are loaded into jenkins on startup, inflating the memory
    >> usage
    >>     > and startup times. Again, when every job does this, it adds up, and
    >> many of
    >>     > the problems we’re facing appear to be rooted in the very large
    >> number of
    >>     > artifacts we have.
    >>     >
    >>     >
    >>     >
    >>     >                 So yes, there may be some value to some of that
    >> data that
    >>     > will not be obvious to an outside observer.
    >>     >
    >>     >
    >>     > Assume all jobs will be touched.
    >>     >
    >>     >                 … which is why giving a directory listing of just
    >> the base
    >>     > directory would be useful to see who needs to look. If INFRA is
    >> unwilling
    >>     > to provide that data, then keep any directories that reference:
    >>     >
    >>     >
    >>     > Please dispense with the passive aggressive “unwilling to provide”
    >>     > nonsense. This is inflammatory and anti-Infra for no valid reason.
    >> This
    >>     > process is meant to be a pragmatic approach to cleaning up and
    >> improving a
    >>     > service used by a large number of projects. The fact that I didn’t
    >> have
    >>     > time to post the job list in the 4 hours since my last reply does
    >> not need
    >>     > to be construed as reticence on Infra’s part to provide it.
    >>     >
    >>     > The top-level list of jobs is available here:
    >>     >
    >> https://na01.safelinks.protection.outlook.com/?url=https%3A%2F%2Fpaste.apache.org%2Fr37e&data=02%7C01%7Caharui%40adobe.com%7C1099543a29ab494553cc08d5aba08f41%7Cfa7b1b5a7b34438794aed2c178decee1%7C0%7C0%7C636603628504178337&sdata=dZOVSreSfQIqbL%2FScGSEW5m93SmDcKeb2%2FZMbPaIUzA%3D&reserved=0
    >>     >
    >>     > I am happy to provide further information, however, due to the disk
    >> IO
    >>     > issues on jenkins-master and the size of the jobs/ dir, multiple
    >> scans and
    >>     > data analytics are difficult to provide due to the timescale.
    >>     >
    >>     >
    >>     > As I previously mentioned, the list of actual artifacts currently
    >> slated
    >>     > for deletion is 590MB and took several hours to generate. I also
    >> misspoke
    >>     > earlier, that list is for artifacts over one year old. The space
    >> which
    >>     > would be freed up is over 480GB. The list of artifacts over 180
    >> days old is
    >>     > going to be much longer, but I can look into making it available
    >> somewhere.
    >>     > I question the utility though, as the 1 year data is over 3 million
    >> lines.
    >>     >
    >>     >
    >>     >
    >>     >
    >>     >                 - precommit
    >>     >                 - hadoop
    >>     >                 - yarn
    >>     >                 - hdfs
    >>     >                 - mapreduce
    >>     >                 - hbase
    >>     >                 - yetus
    >>     >
    >>     >
    >>     > We will not be cherry-picking jobs to exclude from the purge unless
    >> there
    >>     > is a compelling operational reason to do so. Jenkins is a shared
    >> resource,
    >>     > and all projects are affected equally.
    >>     >
    >>     >
    >>     > Let me do some further research and compare the size and file
    >> counts for
    >>     > artifacts vs. build metadata (logs, etc.)
    >>     >
    >>     > The main things we want to purge are:
    >>     >
    >>     > - all artifacts and metadata where the job/project longer exists
    >>     > - binary artifacts with no value older than 180 days
    >>     >
    >>     > and, to a lesser extent, jobs which fall outside our general 30
    >> day/10
    >>     > jobs retention policy.
    >>     >
    >>     >
    >>     > As an example of ancient binary artifacts, there are 22MB of
    >> javadocs from
    >>     > 2013 in /x1/jenkins/jenkins-home/jobs/ManifoldCF-mvn
    >>     >
    >>     > Using the yetus jobs as a reference, yetus-java builds 480 and 481
    >> are
    >>     > nearly a year old, but only contain a few kilobytes of data. While
    >> removing
    >>     > them saves no space, they also provide no value, but are still
    >>     > loaded/parsed by jenkins. Since they don’t contain valid jenkins
    >> objects,
    >>     > they don’t even show up in the build history, but are still part of
    >> the
    >>     > constant scanning of the jobs/ directory that jenkins does, and
    >> contribute
    >>     > to high load and disk IO. Those two are the only +180 day artifacts
    >> for
    >>     > yetus with the exception of a zero-byte legacyIds file for -qbt.
    >>     >
    >>     > root@jenkins-master:/x1/jenkins/jenkins-home/jobs# find yetus-*
    >> -mtime
    >>     > +180 -ls
    >>     >  69210803      4 drwxr-xr-x   2 jenkins  jenkins      4096 Jul 12
    >> 2017
    >>     > yetus-java/builds/481
    >>     >  69210815      4 -rw-r--r--   1 jenkins  jenkins       457 Jul  8
    >> 2017
    >>     > yetus-java/builds/481/polling.log
    >>     >  65813999      0 lrwxrwxrwx   1 jenkins  jenkins         2 May 23
    >> 2016
    >>     > yetus-java/builds/lastUnstableBuild -> -1
    >>     >  65814012      0 -rw-r--r--   1 jenkins  jenkins         0 May 23
    >> 2016
    >>     > yetus-java/builds/legacyIds
    >>     >  69210796      4 drwxr-xr-x   2 jenkins  jenkins      4096 Jul 12
    >> 2017
    >>     > yetus-java/builds/480
    >>     >  69210810      4 -rw-r--r--   1 jenkins  jenkins       456 Jul  7
    >> 2017
    >>     > yetus-java/builds/480/polling.log
    >>     >  23725477      0 lrwxrwxrwx   1 jenkins  jenkins         2 Jun 15
    >> 2017
    >>     > yetus-qbt/builds/lastStableBuild -> -1
    >>     >  23741645      0 lrwxrwxrwx   1 jenkins  jenkins         2 Apr 14
    >> 2016
    >>     > yetus-qbt/builds/lastUnstableBuild -> -1
    >>     >  23725478      0 lrwxrwxrwx   1 jenkins  jenkins         2 Jun 15
    >> 2017
    >>     > yetus-qbt/builds/lastSuccessfulBuild -> -1
    >>     >  23741647      0 -rw-r--r--   1 jenkins  jenkins         0 Apr 14
    >> 2016
    >>     > yetus-qbt/builds/legacyIds
    >>     >
    >>     > For mapreduce, there is an empty Mapreduce-Patch-vesta.apache.org<
    >>     >
    >> https://na01.safelinks.protection.outlook.com/?url=http%3A%2F%2FMapreduce-Patch-vesta.apache.org&data=02%7C01%7Caharui%40adobe.com%7C1099543a29ab494553cc08d5aba08f41%7Cfa7b1b5a7b34438794aed2c178decee1%7C0%7C0%7C636603628504178337&sdata=KxgEx5GbH1omcfeO7sw0to9qBk7eJ%2BY3zhuM4k%2FlyUA%3D&reserved=0>
    >> from 2010, and a bunch of jobs
    >>     > from June 2017 for PreCommit-MAPREDUCE-Build (6999-7006.) Again,
    >> while they
    >>     > take up very little space, they are still loaded into jenkins and
    >> scanned
    >>     > by the threads which watch the jobs/ dir for changes. Multiply this
    >> times
    >>     > 2381 top level job configs, and you can see why we’re hoping this
    >> type of
    >>     > purge will help improve jenkins performance and the frequent
    >> crashing.
    >>     >
    >>     >
    >>     > Since we are looking to move to expensive NVMe disks (nearly 4TB
    >> worth) we
    >>     > also need to perform due diligence to insure that we are not
    >> migrating and
    >>     > maintaining ancient data.
    >>     >
    >>     > -Chris
    >>     >
    >>     >
    >>     >
    >>     >
    >>
    >>
    >>
    


Re: purging of old job artifacts

Posted by Greg Stein <gs...@gmail.com>.
On Thu, Apr 26, 2018, 16:42 Greg Stein <gs...@gmail.com> wrote:

> On Thu, Apr 26, 2018, 15:06 Alex Harui <ah...@adobe.com.invalid> wrote:
>
>> Perhaps I wasn't clear.  I am not concerned about Jenkins projects being
>> removed, just the artifacts of those projects.  I am trying to make two
>> points:
>>
>> 1)  Old artifacts might have value as a known good build for a job that
>> may not get run for years.  So please don't run a clean up that deletes all
>> artifacts older than N days.
>>
>
> Nope. Download those artifacts if you want to keep them. The Jenkins
> Master is not an archival system. It exists for our projects to do builds.
> We have identified the overabundance of old artifacts as detrimental to the
> service, which affects everyone.
>

To be clearer here, Chris earlier said/asked:
we’d ask that this be exported and managed locally by the project rather
than remaining “live” on the master. Is there a reason other than
convenience for it to remain live?

Cheers,
-g


> 2)  Somebody else pointed this out as well, but there appear to be folders
>> listed for which there is no current Jenkins job.
>>
>
> Knew about that. I simply read your note as "please keep these jobs". Some
> disabled, some not working, etc. It seemed you were talking about jobs
> rather than artifacts within them.
>
> Cheers,
> -g
>
>
>> Thanks,
>> -Alex
>>
>> On 4/26/18, 11:07 AM, "Greg Stein" <gs...@gmail.com> wrote:
>>
>>     Note that jobs will remain.
>>
>>     This is only about deleting build artifacts.
>>
>>     On Thu, Apr 26, 2018 at 12:40 PM, Alex Harui <aharui@adobe.com.invalid
>> >
>>     wrote:
>>
>>     > HI Chris,
>>     >
>>     > Thanks for the list.
>>     >
>>     > I’m going through the Flex-related jobs and have some feedback:
>>     >
>>     > flex-blazeds (maven)  We’ve kept this build around even though it
>> hasn’t
>>     > run in a while in case we need to do another release of blazeds.  I
>> would
>>     > like to keep at least one known good build in case we have trouble
>>     > resurrecting it later if we need it, even though it may sit idle
>> for years.
>>     >
>>     > flex-flexunit_1
>>     > flex-sdk_1
>>     > flex-sdk_pixelbender_1
>>     > flex-sdk_release_1
>>     > flex-tlf_1
>>     > flex_sdk_version  I cannot find a project with these names in
>> Jenkins.  So
>>     > feel free to toss it.
>>     >
>>     >
>>     > flex-flexunit (maven)  This project was never completed to build
>>     > successfully, but it would be nice to keep it around in case we
>> need it.
>>     >
>>     > FlexJS Compiler (maven)
>>     > FlexJS Framework (maven)
>>     > FlexJS Pipeline
>>     > FlexJS Typedefs (maven)  Looks like we never set the build limit,
>> so I
>>     > just did that.  The project is disabled, we are keeping it around as
>>     > archival for Royale, so not sure it will clean itself up.
>>     >
>>     > flex-productdashboard I deleted this project.
>>     >
>>     > flex-tool-api (maven)
>>     > flex-sdk-converter (maven)  I’m not seeing old artifacts in these
>>     > projects, but they may also sit idle for years until some bug needs
>> fixing.
>>     >
>>     > Flex-Site (Maven) This project never took off, but again it would
>> be nice
>>     > to keep it around in case it gets revived
>>     >
>>     > In sum, a project like Flex may have several kinds of “products”
>> with
>>     > varying activity levels and thus may have jobs that are idle for
>> years and
>>     > it can be helpful to keep at least the last build around as a
>> reference in
>>     > case the next time we run the build there is a failure.  Please
>> notify us
>>     > if we miss limiting the number of old builds.  I think I fixed the
>> ones
>>     > that didn’t have limits.  But there does seem to be folders left
>> around for
>>     > builds I think we deleted.
>>     >
>>     > Thanks,
>>     > -Alex
>>     >
>>     >
>>     > From: Chris Lambertus <cm...@apache.org>
>>     > Reply-To: <bu...@apache.org>
>>     > Date: Wednesday, April 25, 2018 at 12:04 AM
>>     > To: <bu...@apache.org>
>>     > Subject: Re: purging of old job artifacts
>>     >
>>     >
>>     >
>>     >
>>     > On Apr 24, 2018, at 8:04 PM, Allen Wittenauer <
>> aw@effectivemachines.com<
>>     > mailto:aw@effectivemachines.com>> wrote:
>>     >
>>     >
>>     >
>>     > On Apr 24, 2018, at 5:01 PM, Greg Stein <gstein@gmail.com<mailto:
>> gstei
>>     > n@gmail.com>> wrote:
>>     >
>>     > Let's go back to the start: stuff older than six months will be
>> deleted.
>>     > What could possibly need to be retained?
>>     >
>>     >                 - Not every job runs every day.  Some are extremely
>>     > situational.
>>     >
>>     > The artifacts do not need to be kept in perpetuity. When every
>> project
>>     > does this, there are significant costs in both disk space and
>> performance.
>>     > Our policy has been 30 days or 10 jobs retention.
>>     >
>>     >
>>     >
>>     >
>>     >                 - Some users might have specifically marked certain
>> data
>>     > to be retained for very specific reasons.
>>     >
>>     >                 I know in my case I marked some logs to not be
>> deleted
>>     > because I was using them to debug the systemic Jenkins build node
>> crashes.
>>     > I want to keep the data to see if the usage numbers, etc, go down
>> over time.
>>     >
>>     >
>>     > Part of the systemic problems are due to copious amounts of
>> historical
>>     > data which are loaded into jenkins on startup, inflating the memory
>> usage
>>     > and startup times. Again, when every job does this, it adds up, and
>> many of
>>     > the problems we’re facing appear to be rooted in the very large
>> number of
>>     > artifacts we have.
>>     >
>>     >
>>     >
>>     >                 So yes, there may be some value to some of that
>> data that
>>     > will not be obvious to an outside observer.
>>     >
>>     >
>>     > Assume all jobs will be touched.
>>     >
>>     >                 … which is why giving a directory listing of just
>> the base
>>     > directory would be useful to see who needs to look. If INFRA is
>> unwilling
>>     > to provide that data, then keep any directories that reference:
>>     >
>>     >
>>     > Please dispense with the passive aggressive “unwilling to provide”
>>     > nonsense. This is inflammatory and anti-Infra for no valid reason.
>> This
>>     > process is meant to be a pragmatic approach to cleaning up and
>> improving a
>>     > service used by a large number of projects. The fact that I didn’t
>> have
>>     > time to post the job list in the 4 hours since my last reply does
>> not need
>>     > to be construed as reticence on Infra’s part to provide it.
>>     >
>>     > The top-level list of jobs is available here:
>>     >
>> https://na01.safelinks.protection.outlook.com/?url=https%3A%2F%2Fpaste.apache.org%2Fr37e&data=02%7C01%7Caharui%40adobe.com%7C1099543a29ab494553cc08d5aba08f41%7Cfa7b1b5a7b34438794aed2c178decee1%7C0%7C0%7C636603628504178337&sdata=dZOVSreSfQIqbL%2FScGSEW5m93SmDcKeb2%2FZMbPaIUzA%3D&reserved=0
>>     >
>>     > I am happy to provide further information, however, due to the disk
>> IO
>>     > issues on jenkins-master and the size of the jobs/ dir, multiple
>> scans and
>>     > data analytics are difficult to provide due to the timescale.
>>     >
>>     >
>>     > As I previously mentioned, the list of actual artifacts currently
>> slated
>>     > for deletion is 590MB and took several hours to generate. I also
>> misspoke
>>     > earlier, that list is for artifacts over one year old. The space
>> which
>>     > would be freed up is over 480GB. The list of artifacts over 180
>> days old is
>>     > going to be much longer, but I can look into making it available
>> somewhere.
>>     > I question the utility though, as the 1 year data is over 3 million
>> lines.
>>     >
>>     >
>>     >
>>     >
>>     >                 - precommit
>>     >                 - hadoop
>>     >                 - yarn
>>     >                 - hdfs
>>     >                 - mapreduce
>>     >                 - hbase
>>     >                 - yetus
>>     >
>>     >
>>     > We will not be cherry-picking jobs to exclude from the purge unless
>> there
>>     > is a compelling operational reason to do so. Jenkins is a shared
>> resource,
>>     > and all projects are affected equally.
>>     >
>>     >
>>     > Let me do some further research and compare the size and file
>> counts for
>>     > artifacts vs. build metadata (logs, etc.)
>>     >
>>     > The main things we want to purge are:
>>     >
>>     > - all artifacts and metadata where the job/project longer exists
>>     > - binary artifacts with no value older than 180 days
>>     >
>>     > and, to a lesser extent, jobs which fall outside our general 30
>> day/10
>>     > jobs retention policy.
>>     >
>>     >
>>     > As an example of ancient binary artifacts, there are 22MB of
>> javadocs from
>>     > 2013 in /x1/jenkins/jenkins-home/jobs/ManifoldCF-mvn
>>     >
>>     > Using the yetus jobs as a reference, yetus-java builds 480 and 481
>> are
>>     > nearly a year old, but only contain a few kilobytes of data. While
>> removing
>>     > them saves no space, they also provide no value, but are still
>>     > loaded/parsed by jenkins. Since they don’t contain valid jenkins
>> objects,
>>     > they don’t even show up in the build history, but are still part of
>> the
>>     > constant scanning of the jobs/ directory that jenkins does, and
>> contribute
>>     > to high load and disk IO. Those two are the only +180 day artifacts
>> for
>>     > yetus with the exception of a zero-byte legacyIds file for -qbt.
>>     >
>>     > root@jenkins-master:/x1/jenkins/jenkins-home/jobs# find yetus-*
>> -mtime
>>     > +180 -ls
>>     >  69210803      4 drwxr-xr-x   2 jenkins  jenkins      4096 Jul 12
>> 2017
>>     > yetus-java/builds/481
>>     >  69210815      4 -rw-r--r--   1 jenkins  jenkins       457 Jul  8
>> 2017
>>     > yetus-java/builds/481/polling.log
>>     >  65813999      0 lrwxrwxrwx   1 jenkins  jenkins         2 May 23
>> 2016
>>     > yetus-java/builds/lastUnstableBuild -> -1
>>     >  65814012      0 -rw-r--r--   1 jenkins  jenkins         0 May 23
>> 2016
>>     > yetus-java/builds/legacyIds
>>     >  69210796      4 drwxr-xr-x   2 jenkins  jenkins      4096 Jul 12
>> 2017
>>     > yetus-java/builds/480
>>     >  69210810      4 -rw-r--r--   1 jenkins  jenkins       456 Jul  7
>> 2017
>>     > yetus-java/builds/480/polling.log
>>     >  23725477      0 lrwxrwxrwx   1 jenkins  jenkins         2 Jun 15
>> 2017
>>     > yetus-qbt/builds/lastStableBuild -> -1
>>     >  23741645      0 lrwxrwxrwx   1 jenkins  jenkins         2 Apr 14
>> 2016
>>     > yetus-qbt/builds/lastUnstableBuild -> -1
>>     >  23725478      0 lrwxrwxrwx   1 jenkins  jenkins         2 Jun 15
>> 2017
>>     > yetus-qbt/builds/lastSuccessfulBuild -> -1
>>     >  23741647      0 -rw-r--r--   1 jenkins  jenkins         0 Apr 14
>> 2016
>>     > yetus-qbt/builds/legacyIds
>>     >
>>     > For mapreduce, there is an empty Mapreduce-Patch-vesta.apache.org<
>>     >
>> https://na01.safelinks.protection.outlook.com/?url=http%3A%2F%2FMapreduce-Patch-vesta.apache.org&data=02%7C01%7Caharui%40adobe.com%7C1099543a29ab494553cc08d5aba08f41%7Cfa7b1b5a7b34438794aed2c178decee1%7C0%7C0%7C636603628504178337&sdata=KxgEx5GbH1omcfeO7sw0to9qBk7eJ%2BY3zhuM4k%2FlyUA%3D&reserved=0>
>> from 2010, and a bunch of jobs
>>     > from June 2017 for PreCommit-MAPREDUCE-Build (6999-7006.) Again,
>> while they
>>     > take up very little space, they are still loaded into jenkins and
>> scanned
>>     > by the threads which watch the jobs/ dir for changes. Multiply this
>> times
>>     > 2381 top level job configs, and you can see why we’re hoping this
>> type of
>>     > purge will help improve jenkins performance and the frequent
>> crashing.
>>     >
>>     >
>>     > Since we are looking to move to expensive NVMe disks (nearly 4TB
>> worth) we
>>     > also need to perform due diligence to insure that we are not
>> migrating and
>>     > maintaining ancient data.
>>     >
>>     > -Chris
>>     >
>>     >
>>     >
>>     >
>>
>>
>>

Re: purging of old job artifacts

Posted by Greg Stein <gs...@gmail.com>.
On Thu, Apr 26, 2018, 15:06 Alex Harui <ah...@adobe.com.invalid> wrote:

> Perhaps I wasn't clear.  I am not concerned about Jenkins projects being
> removed, just the artifacts of those projects.  I am trying to make two
> points:
>
> 1)  Old artifacts might have value as a known good build for a job that
> may not get run for years.  So please don't run a clean up that deletes all
> artifacts older than N days.
>

Nope. Download those artifacts if you want to keep them. The Jenkins Master
is not an archival system. It exists for our projects to do builds. We have
identified the overabundance of old artifacts as detrimental to the
service, which affects everyone.

2)  Somebody else pointed this out as well, but there appear to be folders
> listed for which there is no current Jenkins job.
>

Knew about that. I simply read your note as "please keep these jobs". Some
disabled, some not working, etc. It seemed you were talking about jobs
rather than artifacts within them.

Cheers,
-g


> Thanks,
> -Alex
>
> On 4/26/18, 11:07 AM, "Greg Stein" <gs...@gmail.com> wrote:
>
>     Note that jobs will remain.
>
>     This is only about deleting build artifacts.
>
>     On Thu, Apr 26, 2018 at 12:40 PM, Alex Harui <aharui@adobe.com.invalid
> >
>     wrote:
>
>     > HI Chris,
>     >
>     > Thanks for the list.
>     >
>     > I’m going through the Flex-related jobs and have some feedback:
>     >
>     > flex-blazeds (maven)  We’ve kept this build around even though it
> hasn’t
>     > run in a while in case we need to do another release of blazeds.  I
> would
>     > like to keep at least one known good build in case we have trouble
>     > resurrecting it later if we need it, even though it may sit idle for
> years.
>     >
>     > flex-flexunit_1
>     > flex-sdk_1
>     > flex-sdk_pixelbender_1
>     > flex-sdk_release_1
>     > flex-tlf_1
>     > flex_sdk_version  I cannot find a project with these names in
> Jenkins.  So
>     > feel free to toss it.
>     >
>     >
>     > flex-flexunit (maven)  This project was never completed to build
>     > successfully, but it would be nice to keep it around in case we need
> it.
>     >
>     > FlexJS Compiler (maven)
>     > FlexJS Framework (maven)
>     > FlexJS Pipeline
>     > FlexJS Typedefs (maven)  Looks like we never set the build limit, so
> I
>     > just did that.  The project is disabled, we are keeping it around as
>     > archival for Royale, so not sure it will clean itself up.
>     >
>     > flex-productdashboard I deleted this project.
>     >
>     > flex-tool-api (maven)
>     > flex-sdk-converter (maven)  I’m not seeing old artifacts in these
>     > projects, but they may also sit idle for years until some bug needs
> fixing.
>     >
>     > Flex-Site (Maven) This project never took off, but again it would be
> nice
>     > to keep it around in case it gets revived
>     >
>     > In sum, a project like Flex may have several kinds of “products” with
>     > varying activity levels and thus may have jobs that are idle for
> years and
>     > it can be helpful to keep at least the last build around as a
> reference in
>     > case the next time we run the build there is a failure.  Please
> notify us
>     > if we miss limiting the number of old builds.  I think I fixed the
> ones
>     > that didn’t have limits.  But there does seem to be folders left
> around for
>     > builds I think we deleted.
>     >
>     > Thanks,
>     > -Alex
>     >
>     >
>     > From: Chris Lambertus <cm...@apache.org>
>     > Reply-To: <bu...@apache.org>
>     > Date: Wednesday, April 25, 2018 at 12:04 AM
>     > To: <bu...@apache.org>
>     > Subject: Re: purging of old job artifacts
>     >
>     >
>     >
>     >
>     > On Apr 24, 2018, at 8:04 PM, Allen Wittenauer <
> aw@effectivemachines.com<
>     > mailto:aw@effectivemachines.com>> wrote:
>     >
>     >
>     >
>     > On Apr 24, 2018, at 5:01 PM, Greg Stein <gstein@gmail.com<mailto:
> gstei
>     > n@gmail.com>> wrote:
>     >
>     > Let's go back to the start: stuff older than six months will be
> deleted.
>     > What could possibly need to be retained?
>     >
>     >                 - Not every job runs every day.  Some are extremely
>     > situational.
>     >
>     > The artifacts do not need to be kept in perpetuity. When every
> project
>     > does this, there are significant costs in both disk space and
> performance.
>     > Our policy has been 30 days or 10 jobs retention.
>     >
>     >
>     >
>     >
>     >                 - Some users might have specifically marked certain
> data
>     > to be retained for very specific reasons.
>     >
>     >                 I know in my case I marked some logs to not be
> deleted
>     > because I was using them to debug the systemic Jenkins build node
> crashes.
>     > I want to keep the data to see if the usage numbers, etc, go down
> over time.
>     >
>     >
>     > Part of the systemic problems are due to copious amounts of
> historical
>     > data which are loaded into jenkins on startup, inflating the memory
> usage
>     > and startup times. Again, when every job does this, it adds up, and
> many of
>     > the problems we’re facing appear to be rooted in the very large
> number of
>     > artifacts we have.
>     >
>     >
>     >
>     >                 So yes, there may be some value to some of that data
> that
>     > will not be obvious to an outside observer.
>     >
>     >
>     > Assume all jobs will be touched.
>     >
>     >                 … which is why giving a directory listing of just
> the base
>     > directory would be useful to see who needs to look. If INFRA is
> unwilling
>     > to provide that data, then keep any directories that reference:
>     >
>     >
>     > Please dispense with the passive aggressive “unwilling to provide”
>     > nonsense. This is inflammatory and anti-Infra for no valid reason.
> This
>     > process is meant to be a pragmatic approach to cleaning up and
> improving a
>     > service used by a large number of projects. The fact that I didn’t
> have
>     > time to post the job list in the 4 hours since my last reply does
> not need
>     > to be construed as reticence on Infra’s part to provide it.
>     >
>     > The top-level list of jobs is available here:
>     >
> https://na01.safelinks.protection.outlook.com/?url=https%3A%2F%2Fpaste.apache.org%2Fr37e&data=02%7C01%7Caharui%40adobe.com%7C1099543a29ab494553cc08d5aba08f41%7Cfa7b1b5a7b34438794aed2c178decee1%7C0%7C0%7C636603628504178337&sdata=dZOVSreSfQIqbL%2FScGSEW5m93SmDcKeb2%2FZMbPaIUzA%3D&reserved=0
>     >
>     > I am happy to provide further information, however, due to the disk
> IO
>     > issues on jenkins-master and the size of the jobs/ dir, multiple
> scans and
>     > data analytics are difficult to provide due to the timescale.
>     >
>     >
>     > As I previously mentioned, the list of actual artifacts currently
> slated
>     > for deletion is 590MB and took several hours to generate. I also
> misspoke
>     > earlier, that list is for artifacts over one year old. The space
> which
>     > would be freed up is over 480GB. The list of artifacts over 180 days
> old is
>     > going to be much longer, but I can look into making it available
> somewhere.
>     > I question the utility though, as the 1 year data is over 3 million
> lines.
>     >
>     >
>     >
>     >
>     >                 - precommit
>     >                 - hadoop
>     >                 - yarn
>     >                 - hdfs
>     >                 - mapreduce
>     >                 - hbase
>     >                 - yetus
>     >
>     >
>     > We will not be cherry-picking jobs to exclude from the purge unless
> there
>     > is a compelling operational reason to do so. Jenkins is a shared
> resource,
>     > and all projects are affected equally.
>     >
>     >
>     > Let me do some further research and compare the size and file counts
> for
>     > artifacts vs. build metadata (logs, etc.)
>     >
>     > The main things we want to purge are:
>     >
>     > - all artifacts and metadata where the job/project longer exists
>     > - binary artifacts with no value older than 180 days
>     >
>     > and, to a lesser extent, jobs which fall outside our general 30
> day/10
>     > jobs retention policy.
>     >
>     >
>     > As an example of ancient binary artifacts, there are 22MB of
> javadocs from
>     > 2013 in /x1/jenkins/jenkins-home/jobs/ManifoldCF-mvn
>     >
>     > Using the yetus jobs as a reference, yetus-java builds 480 and 481
> are
>     > nearly a year old, but only contain a few kilobytes of data. While
> removing
>     > them saves no space, they also provide no value, but are still
>     > loaded/parsed by jenkins. Since they don’t contain valid jenkins
> objects,
>     > they don’t even show up in the build history, but are still part of
> the
>     > constant scanning of the jobs/ directory that jenkins does, and
> contribute
>     > to high load and disk IO. Those two are the only +180 day artifacts
> for
>     > yetus with the exception of a zero-byte legacyIds file for -qbt.
>     >
>     > root@jenkins-master:/x1/jenkins/jenkins-home/jobs# find yetus-*
> -mtime
>     > +180 -ls
>     >  69210803      4 drwxr-xr-x   2 jenkins  jenkins      4096 Jul 12
> 2017
>     > yetus-java/builds/481
>     >  69210815      4 -rw-r--r--   1 jenkins  jenkins       457 Jul  8
> 2017
>     > yetus-java/builds/481/polling.log
>     >  65813999      0 lrwxrwxrwx   1 jenkins  jenkins         2 May 23
> 2016
>     > yetus-java/builds/lastUnstableBuild -> -1
>     >  65814012      0 -rw-r--r--   1 jenkins  jenkins         0 May 23
> 2016
>     > yetus-java/builds/legacyIds
>     >  69210796      4 drwxr-xr-x   2 jenkins  jenkins      4096 Jul 12
> 2017
>     > yetus-java/builds/480
>     >  69210810      4 -rw-r--r--   1 jenkins  jenkins       456 Jul  7
> 2017
>     > yetus-java/builds/480/polling.log
>     >  23725477      0 lrwxrwxrwx   1 jenkins  jenkins         2 Jun 15
> 2017
>     > yetus-qbt/builds/lastStableBuild -> -1
>     >  23741645      0 lrwxrwxrwx   1 jenkins  jenkins         2 Apr 14
> 2016
>     > yetus-qbt/builds/lastUnstableBuild -> -1
>     >  23725478      0 lrwxrwxrwx   1 jenkins  jenkins         2 Jun 15
> 2017
>     > yetus-qbt/builds/lastSuccessfulBuild -> -1
>     >  23741647      0 -rw-r--r--   1 jenkins  jenkins         0 Apr 14
> 2016
>     > yetus-qbt/builds/legacyIds
>     >
>     > For mapreduce, there is an empty Mapreduce-Patch-vesta.apache.org<
>     >
> https://na01.safelinks.protection.outlook.com/?url=http%3A%2F%2FMapreduce-Patch-vesta.apache.org&data=02%7C01%7Caharui%40adobe.com%7C1099543a29ab494553cc08d5aba08f41%7Cfa7b1b5a7b34438794aed2c178decee1%7C0%7C0%7C636603628504178337&sdata=KxgEx5GbH1omcfeO7sw0to9qBk7eJ%2BY3zhuM4k%2FlyUA%3D&reserved=0>
> from 2010, and a bunch of jobs
>     > from June 2017 for PreCommit-MAPREDUCE-Build (6999-7006.) Again,
> while they
>     > take up very little space, they are still loaded into jenkins and
> scanned
>     > by the threads which watch the jobs/ dir for changes. Multiply this
> times
>     > 2381 top level job configs, and you can see why we’re hoping this
> type of
>     > purge will help improve jenkins performance and the frequent
> crashing.
>     >
>     >
>     > Since we are looking to move to expensive NVMe disks (nearly 4TB
> worth) we
>     > also need to perform due diligence to insure that we are not
> migrating and
>     > maintaining ancient data.
>     >
>     > -Chris
>     >
>     >
>     >
>     >
>
>
>

Re: purging of old job artifacts

Posted by Alex Harui <ah...@adobe.com.INVALID>.
Perhaps I wasn't clear.  I am not concerned about Jenkins projects being removed, just the artifacts of those projects.  I am trying to make two points:

1)  Old artifacts might have value as a known good build for a job that may not get run for years.  So please don't run a clean up that deletes all artifacts older than N days.
2)  Somebody else pointed this out as well, but there appear to be folders listed for which there is no current Jenkins job.

Thanks,
-Alex

On 4/26/18, 11:07 AM, "Greg Stein" <gs...@gmail.com> wrote:

    Note that jobs will remain.
    
    This is only about deleting build artifacts.
    
    On Thu, Apr 26, 2018 at 12:40 PM, Alex Harui <ah...@adobe.com.invalid>
    wrote:
    
    > HI Chris,
    >
    > Thanks for the list.
    >
    > I’m going through the Flex-related jobs and have some feedback:
    >
    > flex-blazeds (maven)  We’ve kept this build around even though it hasn’t
    > run in a while in case we need to do another release of blazeds.  I would
    > like to keep at least one known good build in case we have trouble
    > resurrecting it later if we need it, even though it may sit idle for years.
    >
    > flex-flexunit_1
    > flex-sdk_1
    > flex-sdk_pixelbender_1
    > flex-sdk_release_1
    > flex-tlf_1
    > flex_sdk_version  I cannot find a project with these names in Jenkins.  So
    > feel free to toss it.
    >
    >
    > flex-flexunit (maven)  This project was never completed to build
    > successfully, but it would be nice to keep it around in case we need it.
    >
    > FlexJS Compiler (maven)
    > FlexJS Framework (maven)
    > FlexJS Pipeline
    > FlexJS Typedefs (maven)  Looks like we never set the build limit, so I
    > just did that.  The project is disabled, we are keeping it around as
    > archival for Royale, so not sure it will clean itself up.
    >
    > flex-productdashboard I deleted this project.
    >
    > flex-tool-api (maven)
    > flex-sdk-converter (maven)  I’m not seeing old artifacts in these
    > projects, but they may also sit idle for years until some bug needs fixing.
    >
    > Flex-Site (Maven) This project never took off, but again it would be nice
    > to keep it around in case it gets revived
    >
    > In sum, a project like Flex may have several kinds of “products” with
    > varying activity levels and thus may have jobs that are idle for years and
    > it can be helpful to keep at least the last build around as a reference in
    > case the next time we run the build there is a failure.  Please notify us
    > if we miss limiting the number of old builds.  I think I fixed the ones
    > that didn’t have limits.  But there does seem to be folders left around for
    > builds I think we deleted.
    >
    > Thanks,
    > -Alex
    >
    >
    > From: Chris Lambertus <cm...@apache.org>
    > Reply-To: <bu...@apache.org>
    > Date: Wednesday, April 25, 2018 at 12:04 AM
    > To: <bu...@apache.org>
    > Subject: Re: purging of old job artifacts
    >
    >
    >
    >
    > On Apr 24, 2018, at 8:04 PM, Allen Wittenauer <aw@effectivemachines.com<
    > mailto:aw@effectivemachines.com>> wrote:
    >
    >
    >
    > On Apr 24, 2018, at 5:01 PM, Greg Stein <gstein@gmail.com<mailto:gstei
    > n@gmail.com>> wrote:
    >
    > Let's go back to the start: stuff older than six months will be deleted.
    > What could possibly need to be retained?
    >
    >                 - Not every job runs every day.  Some are extremely
    > situational.
    >
    > The artifacts do not need to be kept in perpetuity. When every project
    > does this, there are significant costs in both disk space and performance.
    > Our policy has been 30 days or 10 jobs retention.
    >
    >
    >
    >
    >                 - Some users might have specifically marked certain data
    > to be retained for very specific reasons.
    >
    >                 I know in my case I marked some logs to not be deleted
    > because I was using them to debug the systemic Jenkins build node crashes.
    > I want to keep the data to see if the usage numbers, etc, go down over time.
    >
    >
    > Part of the systemic problems are due to copious amounts of historical
    > data which are loaded into jenkins on startup, inflating the memory usage
    > and startup times. Again, when every job does this, it adds up, and many of
    > the problems we’re facing appear to be rooted in the very large number of
    > artifacts we have.
    >
    >
    >
    >                 So yes, there may be some value to some of that data that
    > will not be obvious to an outside observer.
    >
    >
    > Assume all jobs will be touched.
    >
    >                 … which is why giving a directory listing of just the base
    > directory would be useful to see who needs to look. If INFRA is unwilling
    > to provide that data, then keep any directories that reference:
    >
    >
    > Please dispense with the passive aggressive “unwilling to provide”
    > nonsense. This is inflammatory and anti-Infra for no valid reason. This
    > process is meant to be a pragmatic approach to cleaning up and improving a
    > service used by a large number of projects. The fact that I didn’t have
    > time to post the job list in the 4 hours since my last reply does not need
    > to be construed as reticence on Infra’s part to provide it.
    >
    > The top-level list of jobs is available here:
    > https://na01.safelinks.protection.outlook.com/?url=https%3A%2F%2Fpaste.apache.org%2Fr37e&data=02%7C01%7Caharui%40adobe.com%7C1099543a29ab494553cc08d5aba08f41%7Cfa7b1b5a7b34438794aed2c178decee1%7C0%7C0%7C636603628504178337&sdata=dZOVSreSfQIqbL%2FScGSEW5m93SmDcKeb2%2FZMbPaIUzA%3D&reserved=0
    >
    > I am happy to provide further information, however, due to the disk IO
    > issues on jenkins-master and the size of the jobs/ dir, multiple scans and
    > data analytics are difficult to provide due to the timescale.
    >
    >
    > As I previously mentioned, the list of actual artifacts currently slated
    > for deletion is 590MB and took several hours to generate. I also misspoke
    > earlier, that list is for artifacts over one year old. The space which
    > would be freed up is over 480GB. The list of artifacts over 180 days old is
    > going to be much longer, but I can look into making it available somewhere.
    > I question the utility though, as the 1 year data is over 3 million lines.
    >
    >
    >
    >
    >                 - precommit
    >                 - hadoop
    >                 - yarn
    >                 - hdfs
    >                 - mapreduce
    >                 - hbase
    >                 - yetus
    >
    >
    > We will not be cherry-picking jobs to exclude from the purge unless there
    > is a compelling operational reason to do so. Jenkins is a shared resource,
    > and all projects are affected equally.
    >
    >
    > Let me do some further research and compare the size and file counts for
    > artifacts vs. build metadata (logs, etc.)
    >
    > The main things we want to purge are:
    >
    > - all artifacts and metadata where the job/project longer exists
    > - binary artifacts with no value older than 180 days
    >
    > and, to a lesser extent, jobs which fall outside our general 30 day/10
    > jobs retention policy.
    >
    >
    > As an example of ancient binary artifacts, there are 22MB of javadocs from
    > 2013 in /x1/jenkins/jenkins-home/jobs/ManifoldCF-mvn
    >
    > Using the yetus jobs as a reference, yetus-java builds 480 and 481 are
    > nearly a year old, but only contain a few kilobytes of data. While removing
    > them saves no space, they also provide no value, but are still
    > loaded/parsed by jenkins. Since they don’t contain valid jenkins objects,
    > they don’t even show up in the build history, but are still part of the
    > constant scanning of the jobs/ directory that jenkins does, and contribute
    > to high load and disk IO. Those two are the only +180 day artifacts for
    > yetus with the exception of a zero-byte legacyIds file for -qbt.
    >
    > root@jenkins-master:/x1/jenkins/jenkins-home/jobs# find yetus-* -mtime
    > +180 -ls
    >  69210803      4 drwxr-xr-x   2 jenkins  jenkins      4096 Jul 12  2017
    > yetus-java/builds/481
    >  69210815      4 -rw-r--r--   1 jenkins  jenkins       457 Jul  8  2017
    > yetus-java/builds/481/polling.log
    >  65813999      0 lrwxrwxrwx   1 jenkins  jenkins         2 May 23  2016
    > yetus-java/builds/lastUnstableBuild -> -1
    >  65814012      0 -rw-r--r--   1 jenkins  jenkins         0 May 23  2016
    > yetus-java/builds/legacyIds
    >  69210796      4 drwxr-xr-x   2 jenkins  jenkins      4096 Jul 12  2017
    > yetus-java/builds/480
    >  69210810      4 -rw-r--r--   1 jenkins  jenkins       456 Jul  7  2017
    > yetus-java/builds/480/polling.log
    >  23725477      0 lrwxrwxrwx   1 jenkins  jenkins         2 Jun 15  2017
    > yetus-qbt/builds/lastStableBuild -> -1
    >  23741645      0 lrwxrwxrwx   1 jenkins  jenkins         2 Apr 14  2016
    > yetus-qbt/builds/lastUnstableBuild -> -1
    >  23725478      0 lrwxrwxrwx   1 jenkins  jenkins         2 Jun 15  2017
    > yetus-qbt/builds/lastSuccessfulBuild -> -1
    >  23741647      0 -rw-r--r--   1 jenkins  jenkins         0 Apr 14  2016
    > yetus-qbt/builds/legacyIds
    >
    > For mapreduce, there is an empty Mapreduce-Patch-vesta.apache.org<
    > https://na01.safelinks.protection.outlook.com/?url=http%3A%2F%2FMapreduce-Patch-vesta.apache.org&data=02%7C01%7Caharui%40adobe.com%7C1099543a29ab494553cc08d5aba08f41%7Cfa7b1b5a7b34438794aed2c178decee1%7C0%7C0%7C636603628504178337&sdata=KxgEx5GbH1omcfeO7sw0to9qBk7eJ%2BY3zhuM4k%2FlyUA%3D&reserved=0> from 2010, and a bunch of jobs
    > from June 2017 for PreCommit-MAPREDUCE-Build (6999-7006.) Again, while they
    > take up very little space, they are still loaded into jenkins and scanned
    > by the threads which watch the jobs/ dir for changes. Multiply this times
    > 2381 top level job configs, and you can see why we’re hoping this type of
    > purge will help improve jenkins performance and the frequent crashing.
    >
    >
    > Since we are looking to move to expensive NVMe disks (nearly 4TB worth) we
    > also need to perform due diligence to insure that we are not migrating and
    > maintaining ancient data.
    >
    > -Chris
    >
    >
    >
    >
    


Re: purging of old job artifacts

Posted by Greg Stein <gs...@gmail.com>.
Note that jobs will remain.

This is only about deleting build artifacts.

On Thu, Apr 26, 2018 at 12:40 PM, Alex Harui <ah...@adobe.com.invalid>
wrote:

> HI Chris,
>
> Thanks for the list.
>
> I’m going through the Flex-related jobs and have some feedback:
>
> flex-blazeds (maven)  We’ve kept this build around even though it hasn’t
> run in a while in case we need to do another release of blazeds.  I would
> like to keep at least one known good build in case we have trouble
> resurrecting it later if we need it, even though it may sit idle for years.
>
> flex-flexunit_1
> flex-sdk_1
> flex-sdk_pixelbender_1
> flex-sdk_release_1
> flex-tlf_1
> flex_sdk_version  I cannot find a project with these names in Jenkins.  So
> feel free to toss it.
>
>
> flex-flexunit (maven)  This project was never completed to build
> successfully, but it would be nice to keep it around in case we need it.
>
> FlexJS Compiler (maven)
> FlexJS Framework (maven)
> FlexJS Pipeline
> FlexJS Typedefs (maven)  Looks like we never set the build limit, so I
> just did that.  The project is disabled, we are keeping it around as
> archival for Royale, so not sure it will clean itself up.
>
> flex-productdashboard I deleted this project.
>
> flex-tool-api (maven)
> flex-sdk-converter (maven)  I’m not seeing old artifacts in these
> projects, but they may also sit idle for years until some bug needs fixing.
>
> Flex-Site (Maven) This project never took off, but again it would be nice
> to keep it around in case it gets revived
>
> In sum, a project like Flex may have several kinds of “products” with
> varying activity levels and thus may have jobs that are idle for years and
> it can be helpful to keep at least the last build around as a reference in
> case the next time we run the build there is a failure.  Please notify us
> if we miss limiting the number of old builds.  I think I fixed the ones
> that didn’t have limits.  But there does seem to be folders left around for
> builds I think we deleted.
>
> Thanks,
> -Alex
>
>
> From: Chris Lambertus <cm...@apache.org>
> Reply-To: <bu...@apache.org>
> Date: Wednesday, April 25, 2018 at 12:04 AM
> To: <bu...@apache.org>
> Subject: Re: purging of old job artifacts
>
>
>
>
> On Apr 24, 2018, at 8:04 PM, Allen Wittenauer <aw@effectivemachines.com<
> mailto:aw@effectivemachines.com>> wrote:
>
>
>
> On Apr 24, 2018, at 5:01 PM, Greg Stein <gstein@gmail.com<mailto:gstei
> n@gmail.com>> wrote:
>
> Let's go back to the start: stuff older than six months will be deleted.
> What could possibly need to be retained?
>
>                 - Not every job runs every day.  Some are extremely
> situational.
>
> The artifacts do not need to be kept in perpetuity. When every project
> does this, there are significant costs in both disk space and performance.
> Our policy has been 30 days or 10 jobs retention.
>
>
>
>
>                 - Some users might have specifically marked certain data
> to be retained for very specific reasons.
>
>                 I know in my case I marked some logs to not be deleted
> because I was using them to debug the systemic Jenkins build node crashes.
> I want to keep the data to see if the usage numbers, etc, go down over time.
>
>
> Part of the systemic problems are due to copious amounts of historical
> data which are loaded into jenkins on startup, inflating the memory usage
> and startup times. Again, when every job does this, it adds up, and many of
> the problems we’re facing appear to be rooted in the very large number of
> artifacts we have.
>
>
>
>                 So yes, there may be some value to some of that data that
> will not be obvious to an outside observer.
>
>
> Assume all jobs will be touched.
>
>                 … which is why giving a directory listing of just the base
> directory would be useful to see who needs to look. If INFRA is unwilling
> to provide that data, then keep any directories that reference:
>
>
> Please dispense with the passive aggressive “unwilling to provide”
> nonsense. This is inflammatory and anti-Infra for no valid reason. This
> process is meant to be a pragmatic approach to cleaning up and improving a
> service used by a large number of projects. The fact that I didn’t have
> time to post the job list in the 4 hours since my last reply does not need
> to be construed as reticence on Infra’s part to provide it.
>
> The top-level list of jobs is available here:
> https://paste.apache.org/r37e
>
> I am happy to provide further information, however, due to the disk IO
> issues on jenkins-master and the size of the jobs/ dir, multiple scans and
> data analytics are difficult to provide due to the timescale.
>
>
> As I previously mentioned, the list of actual artifacts currently slated
> for deletion is 590MB and took several hours to generate. I also misspoke
> earlier, that list is for artifacts over one year old. The space which
> would be freed up is over 480GB. The list of artifacts over 180 days old is
> going to be much longer, but I can look into making it available somewhere.
> I question the utility though, as the 1 year data is over 3 million lines.
>
>
>
>
>                 - precommit
>                 - hadoop
>                 - yarn
>                 - hdfs
>                 - mapreduce
>                 - hbase
>                 - yetus
>
>
> We will not be cherry-picking jobs to exclude from the purge unless there
> is a compelling operational reason to do so. Jenkins is a shared resource,
> and all projects are affected equally.
>
>
> Let me do some further research and compare the size and file counts for
> artifacts vs. build metadata (logs, etc.)
>
> The main things we want to purge are:
>
> - all artifacts and metadata where the job/project longer exists
> - binary artifacts with no value older than 180 days
>
> and, to a lesser extent, jobs which fall outside our general 30 day/10
> jobs retention policy.
>
>
> As an example of ancient binary artifacts, there are 22MB of javadocs from
> 2013 in /x1/jenkins/jenkins-home/jobs/ManifoldCF-mvn
>
> Using the yetus jobs as a reference, yetus-java builds 480 and 481 are
> nearly a year old, but only contain a few kilobytes of data. While removing
> them saves no space, they also provide no value, but are still
> loaded/parsed by jenkins. Since they don’t contain valid jenkins objects,
> they don’t even show up in the build history, but are still part of the
> constant scanning of the jobs/ directory that jenkins does, and contribute
> to high load and disk IO. Those two are the only +180 day artifacts for
> yetus with the exception of a zero-byte legacyIds file for -qbt.
>
> root@jenkins-master:/x1/jenkins/jenkins-home/jobs# find yetus-* -mtime
> +180 -ls
>  69210803      4 drwxr-xr-x   2 jenkins  jenkins      4096 Jul 12  2017
> yetus-java/builds/481
>  69210815      4 -rw-r--r--   1 jenkins  jenkins       457 Jul  8  2017
> yetus-java/builds/481/polling.log
>  65813999      0 lrwxrwxrwx   1 jenkins  jenkins         2 May 23  2016
> yetus-java/builds/lastUnstableBuild -> -1
>  65814012      0 -rw-r--r--   1 jenkins  jenkins         0 May 23  2016
> yetus-java/builds/legacyIds
>  69210796      4 drwxr-xr-x   2 jenkins  jenkins      4096 Jul 12  2017
> yetus-java/builds/480
>  69210810      4 -rw-r--r--   1 jenkins  jenkins       456 Jul  7  2017
> yetus-java/builds/480/polling.log
>  23725477      0 lrwxrwxrwx   1 jenkins  jenkins         2 Jun 15  2017
> yetus-qbt/builds/lastStableBuild -> -1
>  23741645      0 lrwxrwxrwx   1 jenkins  jenkins         2 Apr 14  2016
> yetus-qbt/builds/lastUnstableBuild -> -1
>  23725478      0 lrwxrwxrwx   1 jenkins  jenkins         2 Jun 15  2017
> yetus-qbt/builds/lastSuccessfulBuild -> -1
>  23741647      0 -rw-r--r--   1 jenkins  jenkins         0 Apr 14  2016
> yetus-qbt/builds/legacyIds
>
> For mapreduce, there is an empty Mapreduce-Patch-vesta.apache.org<
> http://Mapreduce-Patch-vesta.apache.org> from 2010, and a bunch of jobs
> from June 2017 for PreCommit-MAPREDUCE-Build (6999-7006.) Again, while they
> take up very little space, they are still loaded into jenkins and scanned
> by the threads which watch the jobs/ dir for changes. Multiply this times
> 2381 top level job configs, and you can see why we’re hoping this type of
> purge will help improve jenkins performance and the frequent crashing.
>
>
> Since we are looking to move to expensive NVMe disks (nearly 4TB worth) we
> also need to perform due diligence to insure that we are not migrating and
> maintaining ancient data.
>
> -Chris
>
>
>
>

Re: purging of old job artifacts

Posted by Alex Harui <ah...@adobe.com.INVALID>.
HI Chris,

Thanks for the list.

I’m going through the Flex-related jobs and have some feedback:

flex-blazeds (maven)  We’ve kept this build around even though it hasn’t run in a while in case we need to do another release of blazeds.  I would like to keep at least one known good build in case we have trouble resurrecting it later if we need it, even though it may sit idle for years.

flex-flexunit_1
flex-sdk_1
flex-sdk_pixelbender_1
flex-sdk_release_1
flex-tlf_1
flex_sdk_version  I cannot find a project with these names in Jenkins.  So feel free to toss it.


flex-flexunit (maven)  This project was never completed to build successfully, but it would be nice to keep it around in case we need it.

FlexJS Compiler (maven)
FlexJS Framework (maven)
FlexJS Pipeline
FlexJS Typedefs (maven)  Looks like we never set the build limit, so I just did that.  The project is disabled, we are keeping it around as archival for Royale, so not sure it will clean itself up.

flex-productdashboard I deleted this project.

flex-tool-api (maven)
flex-sdk-converter (maven)  I’m not seeing old artifacts in these projects, but they may also sit idle for years until some bug needs fixing.

Flex-Site (Maven) This project never took off, but again it would be nice to keep it around in case it gets revived

In sum, a project like Flex may have several kinds of “products” with varying activity levels and thus may have jobs that are idle for years and it can be helpful to keep at least the last build around as a reference in case the next time we run the build there is a failure.  Please notify us if we miss limiting the number of old builds.  I think I fixed the ones that didn’t have limits.  But there does seem to be folders left around for builds I think we deleted.

Thanks,
-Alex


From: Chris Lambertus <cm...@apache.org>
Reply-To: <bu...@apache.org>
Date: Wednesday, April 25, 2018 at 12:04 AM
To: <bu...@apache.org>
Subject: Re: purging of old job artifacts




On Apr 24, 2018, at 8:04 PM, Allen Wittenauer <aw...@effectivemachines.com>> wrote:



On Apr 24, 2018, at 5:01 PM, Greg Stein <gs...@gmail.com>> wrote:

Let's go back to the start: stuff older than six months will be deleted.
What could possibly need to be retained?

                - Not every job runs every day.  Some are extremely situational.

The artifacts do not need to be kept in perpetuity. When every project does this, there are significant costs in both disk space and performance. Our policy has been 30 days or 10 jobs retention.




                - Some users might have specifically marked certain data to be retained for very specific reasons.

                I know in my case I marked some logs to not be deleted because I was using them to debug the systemic Jenkins build node crashes. I want to keep the data to see if the usage numbers, etc, go down over time.


Part of the systemic problems are due to copious amounts of historical data which are loaded into jenkins on startup, inflating the memory usage and startup times. Again, when every job does this, it adds up, and many of the problems we’re facing appear to be rooted in the very large number of artifacts we have.



                So yes, there may be some value to some of that data that will not be obvious to an outside observer.


Assume all jobs will be touched.

                … which is why giving a directory listing of just the base directory would be useful to see who needs to look. If INFRA is unwilling to provide that data, then keep any directories that reference:


Please dispense with the passive aggressive “unwilling to provide” nonsense. This is inflammatory and anti-Infra for no valid reason. This process is meant to be a pragmatic approach to cleaning up and improving a service used by a large number of projects. The fact that I didn’t have time to post the job list in the 4 hours since my last reply does not need to be construed as reticence on Infra’s part to provide it.

The top-level list of jobs is available here: https://paste.apache.org/r37e

I am happy to provide further information, however, due to the disk IO issues on jenkins-master and the size of the jobs/ dir, multiple scans and data analytics are difficult to provide due to the timescale.


As I previously mentioned, the list of actual artifacts currently slated for deletion is 590MB and took several hours to generate. I also misspoke earlier, that list is for artifacts over one year old. The space which would be freed up is over 480GB. The list of artifacts over 180 days old is going to be much longer, but I can look into making it available somewhere. I question the utility though, as the 1 year data is over 3 million lines.




                - precommit
                - hadoop
                - yarn
                - hdfs
                - mapreduce
                - hbase
                - yetus


We will not be cherry-picking jobs to exclude from the purge unless there is a compelling operational reason to do so. Jenkins is a shared resource, and all projects are affected equally.


Let me do some further research and compare the size and file counts for artifacts vs. build metadata (logs, etc.)

The main things we want to purge are:

- all artifacts and metadata where the job/project longer exists
- binary artifacts with no value older than 180 days

and, to a lesser extent, jobs which fall outside our general 30 day/10 jobs retention policy.


As an example of ancient binary artifacts, there are 22MB of javadocs from 2013 in /x1/jenkins/jenkins-home/jobs/ManifoldCF-mvn

Using the yetus jobs as a reference, yetus-java builds 480 and 481 are nearly a year old, but only contain a few kilobytes of data. While removing them saves no space, they also provide no value, but are still loaded/parsed by jenkins. Since they don’t contain valid jenkins objects, they don’t even show up in the build history, but are still part of the constant scanning of the jobs/ directory that jenkins does, and contribute to high load and disk IO. Those two are the only +180 day artifacts for yetus with the exception of a zero-byte legacyIds file for -qbt.

root@jenkins-master:/x1/jenkins/jenkins-home/jobs# find yetus-* -mtime +180 -ls
 69210803      4 drwxr-xr-x   2 jenkins  jenkins      4096 Jul 12  2017 yetus-java/builds/481
 69210815      4 -rw-r--r--   1 jenkins  jenkins       457 Jul  8  2017 yetus-java/builds/481/polling.log
 65813999      0 lrwxrwxrwx   1 jenkins  jenkins         2 May 23  2016 yetus-java/builds/lastUnstableBuild -> -1
 65814012      0 -rw-r--r--   1 jenkins  jenkins         0 May 23  2016 yetus-java/builds/legacyIds
 69210796      4 drwxr-xr-x   2 jenkins  jenkins      4096 Jul 12  2017 yetus-java/builds/480
 69210810      4 -rw-r--r--   1 jenkins  jenkins       456 Jul  7  2017 yetus-java/builds/480/polling.log
 23725477      0 lrwxrwxrwx   1 jenkins  jenkins         2 Jun 15  2017 yetus-qbt/builds/lastStableBuild -> -1
 23741645      0 lrwxrwxrwx   1 jenkins  jenkins         2 Apr 14  2016 yetus-qbt/builds/lastUnstableBuild -> -1
 23725478      0 lrwxrwxrwx   1 jenkins  jenkins         2 Jun 15  2017 yetus-qbt/builds/lastSuccessfulBuild -> -1
 23741647      0 -rw-r--r--   1 jenkins  jenkins         0 Apr 14  2016 yetus-qbt/builds/legacyIds

For mapreduce, there is an empty Mapreduce-Patch-vesta.apache.org<http://Mapreduce-Patch-vesta.apache.org> from 2010, and a bunch of jobs from June 2017 for PreCommit-MAPREDUCE-Build (6999-7006.) Again, while they take up very little space, they are still loaded into jenkins and scanned by the threads which watch the jobs/ dir for changes. Multiply this times 2381 top level job configs, and you can see why we’re hoping this type of purge will help improve jenkins performance and the frequent crashing.


Since we are looking to move to expensive NVMe disks (nearly 4TB worth) we also need to perform due diligence to insure that we are not migrating and maintaining ancient data.

-Chris




Re: purging of old job artifacts

Posted by Daniel Kulp <dk...@apache.org>.

> On Apr 25, 2018, at 3:04 AM, Chris Lambertus <cm...@apache.org> wrote:
> 
> The top-level list of jobs is available here: https://paste.apache.org/r37e
> 

That list includes:

	• CXF-2.0-deploy
	• CXF-2.0.x-JDK15
	• CXF-2.1-deploy
	• CXF-2.1.x-JDK15

Those jobs were deleted from Jenkins years ago.    They aren’t available from the front end at all.    Is it possible that more of the directories on the list don’t have jobs associated with them anymore and could just be wiped out?


-- 
Daniel Kulp
dkulp@apache.org - http://dankulp.com/blog
Talend Community Coder - http://coders.talend.com


Re: purging of old job artifacts

Posted by Chris Lambertus <cm...@apache.org>.

> On Apr 24, 2018, at 8:04 PM, Allen Wittenauer <aw...@effectivemachines.com> wrote:
> 
> 
>> On Apr 24, 2018, at 5:01 PM, Greg Stein <gs...@gmail.com> wrote:
>> 
>> Let's go back to the start: stuff older than six months will be deleted.
>> What could possibly need to be retained?
> 
> 	- Not every job runs every day.  Some are extremely situational.

The artifacts do not need to be kept in perpetuity. When every project does this, there are significant costs in both disk space and performance. Our policy has been 30 days or 10 jobs retention.



> 	- Some users might have specifically marked certain data to be retained for very specific reasons.
> 
> 	I know in my case I marked some logs to not be deleted because I was using them to debug the systemic Jenkins build node crashes. I want to keep the data to see if the usage numbers, etc, go down over time.


Part of the systemic problems are due to copious amounts of historical data which are loaded into jenkins on startup, inflating the memory usage and startup times. Again, when every job does this, it adds up, and many of the problems we’re facing appear to be rooted in the very large number of artifacts we have.


> 
> 	So yes, there may be some value to some of that data that will not be obvious to an outside observer.
> 
>> Assume all jobs will be touched.
> 
> 	… which is why giving a directory listing of just the base directory would be useful to see who needs to look. If INFRA is unwilling to provide that data, then keep any directories that reference:


Please dispense with the passive aggressive “unwilling to provide” nonsense. This is inflammatory and anti-Infra for no valid reason. This process is meant to be a pragmatic approach to cleaning up and improving a service used by a large number of projects. The fact that I didn’t have time to post the job list in the 4 hours since my last reply does not need to be construed as reticence on Infra’s part to provide it.

The top-level list of jobs is available here: https://paste.apache.org/r37e <https://paste.apache.org/r37e>

I am happy to provide further information, however, due to the disk IO issues on jenkins-master and the size of the jobs/ dir, multiple scans and data analytics are difficult to provide due to the timescale.


As I previously mentioned, the list of actual artifacts currently slated for deletion is 590MB and took several hours to generate. I also misspoke earlier, that list is for artifacts over one year old. The space which would be freed up is over 480GB. The list of artifacts over 180 days old is going to be much longer, but I can look into making it available somewhere. I question the utility though, as the 1 year data is over 3 million lines.


> 
> 	- precommit
> 	- hadoop
> 	- yarn
> 	- hdfs
> 	- mapreduce
> 	- hbase
> 	- yetus



We will not be cherry-picking jobs to exclude from the purge unless there is a compelling operational reason to do so. Jenkins is a shared resource, and all projects are affected equally.


Let me do some further research and compare the size and file counts for artifacts vs. build metadata (logs, etc.)

The main things we want to purge are:

- all artifacts and metadata where the job/project longer exists
- binary artifacts with no value older than 180 days

and, to a lesser extent, jobs which fall outside our general 30 day/10 jobs retention policy.


As an example of ancient binary artifacts, there are 22MB of javadocs from 2013 in /x1/jenkins/jenkins-home/jobs/ManifoldCF-mvn

Using the yetus jobs as a reference, yetus-java builds 480 and 481 are nearly a year old, but only contain a few kilobytes of data. While removing them saves no space, they also provide no value, but are still loaded/parsed by jenkins. Since they don’t contain valid jenkins objects, they don’t even show up in the build history, but are still part of the constant scanning of the jobs/ directory that jenkins does, and contribute to high load and disk IO. Those two are the only +180 day artifacts for yetus with the exception of a zero-byte legacyIds file for -qbt.

root@jenkins-master:/x1/jenkins/jenkins-home/jobs# find yetus-* -mtime +180 -ls
 69210803      4 drwxr-xr-x   2 jenkins  jenkins      4096 Jul 12  2017 yetus-java/builds/481
 69210815      4 -rw-r--r--   1 jenkins  jenkins       457 Jul  8  2017 yetus-java/builds/481/polling.log
 65813999      0 lrwxrwxrwx   1 jenkins  jenkins         2 May 23  2016 yetus-java/builds/lastUnstableBuild -> -1
 65814012      0 -rw-r--r--   1 jenkins  jenkins         0 May 23  2016 yetus-java/builds/legacyIds
 69210796      4 drwxr-xr-x   2 jenkins  jenkins      4096 Jul 12  2017 yetus-java/builds/480
 69210810      4 -rw-r--r--   1 jenkins  jenkins       456 Jul  7  2017 yetus-java/builds/480/polling.log
 23725477      0 lrwxrwxrwx   1 jenkins  jenkins         2 Jun 15  2017 yetus-qbt/builds/lastStableBuild -> -1
 23741645      0 lrwxrwxrwx   1 jenkins  jenkins         2 Apr 14  2016 yetus-qbt/builds/lastUnstableBuild -> -1
 23725478      0 lrwxrwxrwx   1 jenkins  jenkins         2 Jun 15  2017 yetus-qbt/builds/lastSuccessfulBuild -> -1
 23741647      0 -rw-r--r--   1 jenkins  jenkins         0 Apr 14  2016 yetus-qbt/builds/legacyIds

For mapreduce, there is an empty Mapreduce-Patch-vesta.apache.org <http://mapreduce-patch-vesta.apache.org/> from 2010, and a bunch of jobs from June 2017 for PreCommit-MAPREDUCE-Build (6999-7006.) Again, while they take up very little space, they are still loaded into jenkins and scanned by the threads which watch the jobs/ dir for changes. Multiply this times 2381 top level job configs, and you can see why we’re hoping this type of purge will help improve jenkins performance and the frequent crashing.


Since we are looking to move to expensive NVMe disks (nearly 4TB worth) we also need to perform due diligence to insure that we are not migrating and maintaining ancient data.

-Chris




Re: purging of old job artifacts

Posted by Allen Wittenauer <aw...@effectivemachines.com>.
> On Apr 24, 2018, at 5:01 PM, Greg Stein <gs...@gmail.com> wrote:
> 
> Let's go back to the start: stuff older than six months will be deleted.
> What could possibly need to be retained?

	- Not every job runs every day.  Some are extremely situational.

	- Some users might have specifically marked certain data to be retained for very specific reasons.

	I know in my case I marked some logs to not be deleted because I was using them to debug the systemic Jenkins build node crashes. I want to keep the data to see if the usage numbers, etc, go down over time.

	So yes, there may be some value to some of that data that will not be obvious to an outside observer.

> Assume all jobs will be touched.

	… which is why giving a directory listing of just the base directory would be useful to see who needs to look. If INFRA is unwilling to provide that data, then keep any directories that reference:

	- precommit
	- hadoop
	- yarn
	- hdfs
	- mapreduce
	- hbase
	- yetus

Thanks!


Re: purging of old job artifacts

Posted by Greg Stein <gs...@gmail.com>.
Let's go back to the start: stuff older than six months will be deleted.
What could possibly need to be retained?

Assume all jobs will be touched.


On Tue, Apr 24, 2018, 18:32 Allen Wittenauer <aw...@effectivemachines.com>
wrote:

>
> > On Apr 24, 2018, at 4:27 PM, Chris Lambertus <cm...@apache.org> wrote:
> >
> > The initial artifact list is over 3 million lines long and 590MB.
>
>         Yikes. OK.  How big is the list of jobs?  [IIRC, that should be
> the second part of the file path. e.g., test-ulimit ]  That’d give us some
> sort of scope, who is actually impacted, and hopefully allow everyone to
> clean up their stuff. :)
>
>
> Thanks

Re: purging of old job artifacts

Posted by Allen Wittenauer <aw...@effectivemachines.com>.
> On Apr 24, 2018, at 4:27 PM, Chris Lambertus <cm...@apache.org> wrote:
> 
> The initial artifact list is over 3 million lines long and 590MB.

	Yikes. OK.  How big is the list of jobs?  [IIRC, that should be the second part of the file path. e.g., test-ulimit ]  That’d give us some sort of scope, who is actually impacted, and hopefully allow everyone to clean up their stuff. :)


Thanks

Re: purging of old job artifacts

Posted by Chris Lambertus <cm...@apache.org>.

> On Apr 24, 2018, at 4:23 PM, Allen Wittenauer <aw...@effectivemachines.com> wrote:
> 
> 
>> On Apr 24, 2018, at 4:13 PM, Chris Lambertus <cm...@apache.org> wrote:
>> 
>> If anyone has concerns over this course of action, please reply here.
> 
> 	Could we get a list?
> 
> 	Thanks!
> 


The initial artifact list is over 3 million lines long and 590MB. Here’s a brief random sampling:

 30807873      4 drwxr-xr-x   2 jenkins  jenkins      4096 Mar  7  2013 ./oozie-trunk-w-hadoop-1/modules/org.apache.oozie$oozie-hadoop-distcp/builds/2012-08-07_04-40-46


 74462300       4 -rw-r--r--    1 jenkins  jenkins        1990 Aug 26  2016 ./airavata/modules/org.apache.airavata$airavata-api/builds/4/log
 74724835       4 drwxr-xr-x    3 jenkins  jenkins        4096 Aug 26  2016 ./airavata/modules/org.apache.airavata$airavata-api/builds/4/archive
 74871554       4 drwxr-xr-x    3 jenkins  jenkins        4096 Aug 26  2016 ./airavata/modules/org.apache.airavata$airavata-api/builds/4/archive/org.apache.airavata
 74994300       4 drwxr-xr-x    3 jenkins  jenkins        4096 Aug 26  2016 ./airavata/modules/org.apache.airavata$airavata-api/builds/4/archive/org.apache.airavata/airavata-api
 75118070       4 drwxr-xr-x    2 jenkins  jenkins        4096 Aug 26  2016 ./airavata/modules/org.apache.airavata$airavata-api/builds/4/archive/org.apache.airavata/airavata-api/0.17-SNAPSHOT
 75118071       4 -rw-r--r--    1 jenkins  jenkins        2141 Aug 24  2016 ./airavata/modules/org.apache.airavata$airavata-api/builds/4/archive/org.apache.airavata/airavata-api/0.17-SNAPSHOT/airavata-api-0.17-SNAPSHOT.pom


  2681769       4 -rw-r--r--    1 jenkins  jenkins         266 May  8  2014 ./test-ulimit/disk-usage.xml
  2681760       0 lrwxrwxrwx    1 jenkins  jenkins          26 Oct  2  2014 ./test-ulimit/lastSuccessful -> builds/lastSuccessfulBuild
  2681770       4 -rw-r--r--    1 jenkins  jenkins           3 Oct  2  2014 ./test-ulimit/nextBuildNumber
  2681759       0 lrwxrwxrwx    1 jenkins  jenkins          22 Oct  2  2014 ./test-ulimit/lastStable -> builds/lastStableBuild

 76299516       4 drwxr-xr-x    2 jenkins  jenkins        4096 Mar  7  2013 ./Shindig\ 1.0.x\ branch/modules/org.apache.shindig$shindig-common/builds/2010-02-02_02-05-27
 76299517      16 -rw-r--r--    1 jenkins  jenkins       12459 Aug  9  2012 ./Shindig\ 1.0.x\ branch/modules/org.apache.shindig$shindig-common/builds/2010-02-02_02-05-27/build.xml
 75781607       4 drwxr-xr-x    2 jenkins  jenkins        4096 Mar  7  2013 ./Shindig\ 1.0.x\ branch/modules/org.apache.shindig$shindig-common/builds/2010-01-29_02-19-13
 75781608      16 -rw-r--r--    1 jenkins  jenkins       12460 Aug  9  2012 ./Shindig\ 1.0.x\ branch/modules/org.apache.shindig$shindig-common/builds/2010-01-29_02-19-13/build.xml
 76567455       4 drwxr-xr-x    2 jenkins  jenkins        4096 Mar  7  2013 ./Shindig\ 1.0.x\ branch/modules/org.apache.shindig$shindig-common/builds/2010-02-04_05-44-59
 76567456      16 -rw-r--r--    1 jenkins  jenkins       12460 Aug  9  2012 ./Shindig\ 1.0.x\ branch/modules/org.apache.shindig$shindig-common/builds/2010-02-04_05-44-59/build.xml
 76443978       4 drwxr-xr-x    2 jenkins  jenkins        4096 Mar  7  2013 ./Shindig\ 1.0.x\ branch/modules/org.apache.shindig$shindig-common/builds/2010-02-03_02-27-01
 76443979      16 -rw-r--r--    1 jenkins  jenkins       12471 Aug  9  2012 ./Shindig\ 1.0.x\ branch/modules/org.apache.shindig$shindig-common/builds/2010-02-03_02-27-01/build.xml
 76173989       4 drwxr-xr-x    2 jenkins  jenkins        4096 Mar  7  2013 ./Shindig\ 1.0.x\ branch/modules/org.apache.shindig$shindig-common/builds/2010-02-01_01-13-55

  3551748       8 -rw-r--r--    1 jenkins  jenkins        5933 Feb 21  2015 ./maven-project-resources/modules/org.apache.resources.sample$resources-bundles-sample/builds/47/log
  3588219       4 drwxr-xr-x    3 jenkins  jenkins        4096 Feb 21  2015 ./maven-project-resources/modules/org.apache.resources.sample$resources-bundles-sample/builds/47/archive
  3588220       4 drwxr-xr-x    3 jenkins  jenkins        4096 Feb 21  2015 ./maven-project-resources/modules/org.apache.resources.sample$resources-bundles-sample/builds/47/archive/org.apache.resources.sample
  3588221       4 drwxr-xr-x    3 jenkins  jenkins        4096 Feb 21  2015 ./maven-project-resources/modules/org.apache.resources.sample$resources-bundles-sample/builds/47/archive/org.apache.resources.sample/resources-bundles-sample
  3588222       4 drwxr-xr-x    2 jenkins  jenkins        4096 Feb 21  2015 ./maven-project-resources/modules/org.apache.resources.sample$resources-bundles-sample/builds/47/archive/org.apache.resources.sample/resources-bundles-sample/1.0-SNAPSHOT
  3551749      16 -rw-r--r--    1 jenkins  jenkins       14366 Feb 21  2015 ./maven-project-resources/modules/org.apache.resources.sample$resources-bundles-sample/builds/47/archive/org.apache.resources.sample/resources-bundles-sample/1.0-SNAPSHOT/resources-bu




Re: purging of old job artifacts

Posted by Allen Wittenauer <aw...@effectivemachines.com>.
> On Apr 24, 2018, at 4:13 PM, Chris Lambertus <cm...@apache.org> wrote:
> 
> If anyone has concerns over this course of action, please reply here.

	Could we get a list?

	Thanks!