Posted to repository@apache.org by Brett Porter <br...@apache.org> on 2008/09/02 08:15:48 UTC

Re: Monitoring the snapshot repo

Joe,

I've only done a little Perl, so I didn't really follow the scripts.
However, the rules you've described and the output look spot on (at a
cursory glance; I checked a couple of artifacts).

I'd be happy for us to run this regularly.

Cheers,
Brett

On 18/08/2008, at 4:38 AM, Joe Schaefer wrote:

> I wrote:
>
>> Wendy wrote:
>
>>>> We'll get some sort of automated notification going to repository@ so
>>>> the volunteers there can keep an eye on the size of the snapshot repo
>>>> before it causes problems for the rest of the infra team.
>
>>> Any volunteers for this part?  I know Hen already has some scripts
>>> running against the repos looking for new items, it might make sense
>>> to bring those into infra svn somewhere along with this (and possibly
>>> Henk's signature checking scripts as well.)
>
>> How about a cron that runs once a month that just does
>
>> du -sh /x1/www/people.apache.org/repo/m2-snapshot-repository/org/apache/*
>
>>>> And we'll figure out whether it makes sense to automatically purge
>>>> this repo using some code that understands how to fix the metadata and
>>>> keep the latest snapshot.  (There might be a prerequisite to that,
>>>> getting people to fix the permissions in that repo.)
>
>>> I believe Brett sent Joe one script to do this, and there's also code
>>> in Archiva and/or Continuum that knows how to purge a repo, keeping
>>> the latest snapshot, fixing the metadata, etc.
>
>> The script Brett sent me doesn't preserve snapshots older than 1 month,
>> and it doesn't do anything with the metadata.
>
>>> Again, volunteers to figure out the best way to do this and get it
>>> in place are welcome.
>
>> I'd be more than happy to write a simple script for our committers'
>> use which cleans up their old snapshots, if folks here would be
>> willing to actually spec it out.
>
> Well I took a crack at it sans-spec, after talking with Wendy a bit on
> #asfinfra.  Rather than decide whether we should create a central cron
> that monitors the entire repo, or give committers the tools necessary to
> clean up after themselves, I chose to take both roads for now.
>
> In ~joes/bin on people.apache.org there are 2 scripts:
>
>   list_stale_snapshots.pl
>   find_leaf_dirs.pl
>
> The first script is meant for committers to use to clean up their
> snapshot dirs themselves.  You feed it a list of directories to monitor
> on stdin, and pass it an argument which represents the number of days
> worth of snapshots you'd like to keep, and it will list all the files
> in those dirs which are stale, while preserving the most recent set of
> snapshot artifacts should all the files in a dir be considered stale.
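
For illustration only, the stale-file rule described above could be
sketched in Python roughly as follows. This is not the Perl script in
~joes/bin. The function names are hypothetical, staleness is judged by
file modification time (an assumption), and "most recent set" is
assumed to mean the newest file plus anything modified within an hour
of it.

```python
import os
import sys
import time

SECONDS_PER_DAY = 86400


def stale_in_dir(mtimes, keep_days, now, set_window=3600):
    """Return the stale file names from one directory, sorted.

    mtimes maps file name -> modification time (seconds since the epoch).
    A file is stale when it is older than keep_days. If every file is
    stale, the newest file and anything modified within set_window
    seconds of it are kept (an assumed reading of "most recent set").
    """
    cutoff = now - keep_days * SECONDS_PER_DAY
    stale = {name for name, mtime in mtimes.items() if mtime < cutoff}
    if mtimes and len(stale) == len(mtimes):
        newest = max(mtimes.values())
        stale = {n for n in stale if newest - mtimes[n] > set_window}
    return sorted(stale)


def list_stale(directories, keep_days, now=None):
    """Yield full paths of stale regular files in each directory."""
    now = time.time() if now is None else now
    for directory in directories:
        mtimes = {}
        for entry in os.scandir(directory):
            if entry.is_file(follow_symlinks=False):
                mtimes[entry.name] = entry.stat(follow_symlinks=False).st_mtime
        for name in stale_in_dir(mtimes, keep_days, now):
            yield os.path.join(directory, name)


def main(argv, stdin):
    """Read directories (one per line) from stdin; argv[1] is days to keep."""
    dirs = [line.strip() for line in stdin if line.strip()]
    for path in list_stale(dirs, int(argv[1])):
        print(path)
```

Calling main(sys.argv, sys.stdin) would give the same interface as
described: directories on stdin, days to keep as the only argument.
The sketch only lists files; it deletes nothing and does not touch
the repository metadata.
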
>
> The second script is meant for us to use to list the directories where
> we might find snapshot artifacts.  I'm assuming that snapshots are only
> located at the ends of the filesystem, within directories that contain
> no subdirs.  The argument you pass the script is the base directory
> where the search begins.
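
As a rough illustration of the leaf-directory rule described above
(directories that contain no subdirs), here is a minimal Python
sketch. It is not the Perl script itself, and the function name is
hypothetical.

```python
import os


def find_leaf_dirs(base):
    """Yield every directory under base that contains no subdirectories."""
    # os.walk reports each directory along with the names of its
    # immediate subdirectories; an empty list marks a leaf.
    for dirpath, dirnames, _filenames in os.walk(base):
        if not dirnames:
            yield dirpath
```

Printing each yielded path on its own line would produce a directory
list suitable for piping into a stale-snapshot lister. Note that the
base directory itself is reported if it has no subdirectories.
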
>
> I've posted today's output of
>
> %  find_leaf_dirs.pl \
>    /x1/www/people.apache.org/repo/m2-snapshot-repository/org/apache \
>    | list_stale_snapshots.pl 30
>
> at http://people.apache.org/~joes/stale_snapshots.txt
> Note this output represents a list of all snapshots
> currently older than 30 days, and the listing is already
> over 4000 lines long.
>
> Please look over the output for errors and the scripts themselves
> (if you can read perl), give them a try on a few directories here
> and there, and let's work together towards a solution to the
> snapshot growth problem that we can all be happy with.
>

--
Brett Porter
brett@apache.org
http://blogs.exist.com/bporter/