Posted to users@subversion.apache.org by Joshua McKinnon <jm...@gmail.com> on 2011/11/29 21:00:21 UTC

Unreferenced pristines behavior in 1.7

Can anyone comment on the state of unreferenced items not being deleted
in the pristine area in 1.7? The release notes state there are plans to
purge unreferenced items automatically, but the issue link appears to
be a placeholder (excerpt from the 1.7 release notes):


Note: In 1.7, we recommend to run svn cleanup periodically in order to
claim back the disk space of unreferenced pristines. We expect a
future Subversion release to purge unreferenced (and thus unused)
pristines automatically; see issue #XXX for details.


Having it happen automatically instead of needing to regularly run
"svn cleanup" is definitely preferable. I've never had to run it in
the past except in the event of a problem. (e.g. a command did not
complete properly, or something else got borked)

Is there an issue tracking this that I can follow? I did a quick search
but didn't find one.

----------
Joshua

Re: Unreferenced pristines behavior in 1.7

Posted by Mark Phippard <ma...@gmail.com>.
On Tue, Nov 29, 2011 at 3:57 PM, Joshua McKinnon <jm...@gmail.com> wrote:
> On Tue, Nov 29, 2011 at 3:06 PM, Mark Phippard <ma...@gmail.com> wrote:
>> Note that the difference is that now your pristines are shared.  So if
>> you have files in your working copy that are identical there is only a
>> single pristine.  Imagine a checkout of an entire repository,
>> including tags and branches.  For all of those files in your tags and
>> branches that are just duplicates of each other there is only a
>> single pristine version stored now, whereas before every file had a
>> pristine.
>
> Oh the new working copy format is absolutely great. The point is only
> that the pristine files appear to build up over time, which seems new.
> Any extra build up is removed with an svn cleanup, but I think a lot of
> people (like me) will not realize they need to perform that step
> regularly.
>
> I am actually in the process of doing an all-branches checkout right
> now, to try and take advantage of the consolidation available in the
> new working copy format. When using SSDs, disk usage matters.
>
>> I could not find any issue # for this either.
>
> If no one else replies within a few days, should I create an issue to
> track this?

I filed an issue and updated the release notes:

http://subversion.apache.org/docs/release-notes/1.7.html#wc-pristines

-- 
Thanks

Mark Phippard
http://markphip.blogspot.com/

Re: Unreferenced pristines behavior in 1.7

Posted by Les Mikesell <le...@gmail.com>.
On Wed, Nov 30, 2011 at 9:45 AM, Gleason, Todd <To...@elekta.com> wrote:
>
> Something similar occurred to me some time ago.  It seems problematic that a user might move or copy WCs outside the root location though; Subversion would have no easy way to track that and might end up needing to re-fetch all the pristines somehow.

I think it would be great to have an option to fetch the pristines
_only_ on demand.

-- 
  Les Mikesell
      lesmikesell@gmail.com

RE: Unreferenced pristines behavior in 1.7

Posted by "Gleason, Todd" <To...@elekta.com>.
> -----Original Message-----
> From: Mark Phippard [mailto:markphip@gmail.com]
> Sent: Tuesday, November 29, 2011 5:14 PM
> To: Talden
> Cc: Joshua McKinnon; users@subversion.apache.org
> Subject: Re: Unreferenced pristines behavior in 1.7
>
> On Tue, Nov 29, 2011 at 6:28 PM, Talden <ta...@gmail.com> wrote:
> > I'd actually like the ability to separate the pristine-store from the
> > WC root since I'd like to have several WCs for the same trunk or
> > branch with various pieces of work-in-progress - sharing pristines
> > there would be great.
> >
> > Maybe something like the Bazaar shared-repositories. Just look up the
> > path until you hit a .svn that contains a pristine-store.
>
> Interesting idea.  The top-level .svn folder would need to include the
> wc.db and the pristines, so maybe the other .svn folders would just be
> empty with some kind of pointer file.
>
> A central pristine store needs a central wc.db to go with it though,
> or else the ref counting could not work.

Something similar occurred to me some time ago.  It seems problematic that a user might move or copy WCs outside the root location though; Subversion would have no easy way to track that and might end up needing to re-fetch all the pristines somehow.  You could put the store into the user's profile or something to make it fully user-global, but you would still have the problem of dealing with users moving or copying WCs around.

To deal with this you might want a more sophisticated management system, where the user could specify locations containing WCs, and Subversion could interrogate their top-level .svn directories to perform a global cleanup (fixing reference counts and whatnot).

Another option would be an MRU-style behavior where you update the global store with indicators of how recently you have seen files (and I have no idea how you could make this scale), and toss pristines that seem stale.  This would be more automatic but it seems like it wouldn't work all that well.

A third option might be something of a hybrid: Subversion would implicitly determine where your WCs were and scan around those locations during certain operations, attempting to update reference counts and whatnot.  Still, if you moved a WC then its pristines might get wiped, unless you accessed them from the new location within a certain period of time.

If you really wanted to make this work well, you'd probably need to install a daemon or service to monitor WC directory (and possibly other directory) locations.  A side advantage of this would be that you could detect changes in real time rather than doing file tree scans to check for modified files.

If you did any of this though, you might want to make these behaviors optional: "share this WC's pristines with other WCs" (aka global storage), "remove pristines if they have not been accessed in x days" (aka MRU behavior), and "watch filesystem for WC changes" (aka monitoring daemon/service).
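
As a rough illustration of the MRU-style option, the sketch below picks the pristines that have not been seen for more than x days. Everything here is hypothetical: the `last_seen` mapping (checksum to Unix timestamp) and the function name are made up for the example, and nothing like this exists in Subversion.

```python
import time


def stale_pristines(last_seen, max_age_days, now=None):
    """Return the checksums whose last-seen time is older than max_age_days.

    `last_seen` is a hypothetical mapping of checksum -> Unix timestamp.
    """
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 86400  # seconds per day
    return sorted(c for c, seen in last_seen.items() if seen < cutoff)


# Made-up data: one pristine seen a minute ago, one seen 90 days ago.
now = time.time()
last_seen = {"aaa111": now - 60, "bbb222": now - 90 * 86400}
print(stale_pristines(last_seen, max_age_days=30, now=now))  # ['bbb222']
```

The hard part raised above (keeping `last_seen` up to date at scale, and coping with moved WCs) is not addressed by this sketch.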

--Todd



Re: Unreferenced pristines behavior in 1.7

Posted by Mark Phippard <ma...@gmail.com>.
On Tue, Nov 29, 2011 at 6:28 PM, Talden <ta...@gmail.com> wrote:
> I'd actually like the ability to separate the pristine-store from the
> WC root since I'd like to have several WCs for the same trunk or
> branch with various pieces of work-in-progress - sharing pristines
> there would be great.
>
> Maybe something like the Bazaar shared-repositories. Just look up the
> path until you hit a .svn that contains a pristine-store.

Interesting idea.  The top-level .svn folder would need to include the
wc.db and the pristines, so maybe the other .svn folders would just be
empty with some kind of pointer file.

A central pristine store needs a central wc.db to go with it though,
or else the ref counting could not work.
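
A toy model of why the counts have to live in one place, assuming a
store that keeps one reference count per checksum. The class and method
names are invented for illustration; this is not how wc.db is
implemented.

```python
from collections import Counter


class SharedPristineStore:
    """Toy model: one reference count per checksum, shared by every WC."""

    def __init__(self):
        self.refs = Counter()

    def add_ref(self, checksum):
        self.refs[checksum] += 1

    def drop_ref(self, checksum):
        if self.refs[checksum] > 0:
            self.refs[checksum] -= 1

    def purge(self):
        """Remove and return the checksums that nothing references any more."""
        dead = sorted(c for c, n in self.refs.items() if n == 0)
        for c in dead:
            del self.refs[c]
        return dead


store = SharedPristineStore()
store.add_ref("abc123")   # referenced from a first working copy
store.add_ref("abc123")   # same content referenced from a second one
store.drop_ref("abc123")  # the first working copy moves on
print(store.purge())      # [] : the second working copy still needs it
store.drop_ref("abc123")
print(store.purge())      # ['abc123']
```

If each WC kept its own count instead, the first WC would see its count
reach zero and delete a pristine the second WC still depends on.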

-- 
Thanks

Mark Phippard
http://markphip.blogspot.com/

Re: Unreferenced pristines behavior in 1.7

Posted by Talden <ta...@gmail.com>.
> I am actually in the process of doing an all-branches checkout right
> now, to try and take advantage of the consolidation available in the
> new working copy format. When using SSDs, disk usage matters.

I used to work (pre 1.7) with many branches, including the trunk, in
separate WCs.

Now I check out the project path with branches and tags at depth empty
(apart from the branches I selectively check out) and check out the
trunk, all nicely in the same WC (and hence with shared pristines)
without checking everything out.

The shared pristines pay off very well here.  That said, an 'svn
cleanup' recently recovered >200 MB and nearly 25000 files, so automatic
pristine cleanup will be very nice when it arrives.
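
For anyone who wants to measure this on their own WC, here is a small
Python sketch that counts the files and bytes under .svn/pristine (the
1.7 pristine location at the WC root). The function name is made up for
the example; run it before and after 'svn cleanup' and compare.

```python
import os


def pristine_usage(wc_root):
    """Return (file_count, total_bytes) for <wc_root>/.svn/pristine."""
    pristine_dir = os.path.join(wc_root, ".svn", "pristine")
    count = 0
    total = 0
    # os.walk yields nothing if the directory does not exist.
    for dirpath, _dirnames, filenames in os.walk(pristine_dir):
        for name in filenames:
            count += 1
            total += os.path.getsize(os.path.join(dirpath, name))
    return count, total


files, size = pristine_usage(".")
print(f"{files} pristine files, {size / 1e6:.1f} MB")
```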

I'd actually like the ability to separate the pristine-store from the
WC root since I'd like to have several WCs for the same trunk or
branch with various pieces of work-in-progress - sharing pristines
there would be great.

Maybe something like the Bazaar shared-repositories. Just look up the
path until you hit a .svn that contains a pristine-store.

.../work/
    .svn/ <-- contains a pristine store, not a WC.  All of the pristines of WCs below.
    projectX/
        default/
            .svn/ <-- the root for this WC
            trunk/ ...
            branches/ <-- @ depth empty - only the branches I want checked out.
                feature-123/ ...
                feature-456/ ...
        that-thing-i-am-doing-on-trunk/  <-- same trunk as .../work/projectX/default/trunk
            .svn/ <-- the root for this WC
        that-other-thing-i-am-doing-on-trunk/  <-- same trunk as .../work/projectX/default/trunk
            .svn/ <-- the root for this WC

I personally think this approach is better than the suggestion of a
pristine-store in your profile for the whole machine.  Sometimes you
do want separation. You can always link the folders off elsewhere.
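
The "look up the path" rule could be sketched roughly as follows. This
is only an illustration of the proposal: Subversion has no such feature,
the function name is invented, and using a .svn/pristine directory as
the marker for a shared store is an assumption.

```python
from pathlib import Path
from typing import Optional


def find_pristine_store(start: Path) -> Optional[Path]:
    """Walk from `start` up to the filesystem root and return the first
    .svn/pristine directory found, or None if there is none."""
    start = start.resolve()
    for directory in [start, *start.parents]:
        candidate = directory / ".svn" / "pristine"
        if candidate.is_dir():
            return candidate
    return None


# Prints the nearest store above the current directory, or None.
print(find_pristine_store(Path.cwd()))
```

With a layout like the one above, a lookup started anywhere under
.../work/projectX/ would skip the per-WC .svn folders (which hold no
pristine store in this proposal) and stop at .../work/.svn.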

--
Talden

Re: Unreferenced pristines behavior in 1.7

Posted by Joshua McKinnon <jm...@gmail.com>.
On Tue, Nov 29, 2011 at 4:06 PM, Andreas Krey <a....@gmx.de> wrote:
> They do. For every changed file that comes to exist in the sandbox
> a new pristine copy will be lying around; after committing twenty versions
> of a file you have nineteen unreferenced pristines there.
>
> ...
>> I am actually in the process of doing an all-branches checkout right
>> now, to try and take advantage of the consolidation available in the
>> new working copy format.
>
> Unfortunately the consolidation only affects the pristines; you still
> have separate copies of identical files in the working copy.
>

Thanks for clarifying how the pristine copies work. Regarding the
savings from a combined checkout: yes indeed, it only consolidates the
pristine area. In my case that alone is a worthwhile saving. It may take
some experimenting to find the right number of branches per checkout.
(I don't need ALL branches, but was curious to see the effect on a
large sample.)

Also, thank you Mark for updating the release notes & creating the
issue!

----------
Joshua

Re: Unreferenced pristines behavior in 1.7

Posted by Andreas Krey <a....@gmx.de>.
On Tue, 29 Nov 2011 15:57:28 +0000, Joshua McKinnon wrote:
...
> Oh the new working copy format is absolutely great. The point is only
> that the pristine files appear to build up over time, which seems new.

They do. For every changed file that comes to exist in the sandbox
a new pristine copy will be lying around; after committing twenty versions
of a file you have nineteen unreferenced pristines there.
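
A small simulation of that arithmetic, assuming a store keyed by the
SHA-1 of the content in which nothing is deleted until a cleanup runs.
The function names are made up and this is a model, not Subversion code.

```python
import hashlib


def simulate_commits(versions):
    """Model one file going through `versions` distinct contents.

    Returns (store, referenced): every checksum ever written to the store,
    and the single checksum the working file still points at.
    """
    store = set()
    referenced = None
    for n in range(1, versions + 1):
        content = f"version {n}\n".encode()
        referenced = hashlib.sha1(content).hexdigest()
        store.add(referenced)  # new pristine written; the old one stays behind
    return store, referenced


def cleanup(store, referenced):
    """Keep only what is still referenced (the 'svn cleanup' step)."""
    return {c for c in store if c == referenced}


store, referenced = simulate_commits(20)
print(len(store) - 1)                   # unreferenced pristines: 19
print(len(cleanup(store, referenced)))  # left after cleanup: 1
```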

...
> I am actually in the process of doing an all-branches checkout right
> now, to try and take advantage of the consolidation available in the
> new working copy format.

Unfortunately the consolidation only affects the pristines; you still
have separate copies of identical files in the working copy.

Andreas

-- 
"Totally trivial. Famous last words."
From: Linus Torvalds <torvalds@*.org>
Date: Fri, 22 Jan 2010 07:29:21 -0800

Re: Unreferenced pristines behavior in 1.7

Posted by Joshua McKinnon <jm...@gmail.com>.
On Tue, Nov 29, 2011 at 3:06 PM, Mark Phippard <ma...@gmail.com> wrote:
> Note that the difference is that now your pristines are shared.  So if
> you have files in your working copy that are identical there is only a
> single pristine.  Imagine a checkout of an entire repository,
> including tags and branches.  For all of those files in your tags and
> branches that are just duplicates of each other there is only a
> single pristine version stored now, whereas before every file had a
> pristine.

Oh the new working copy format is absolutely great. The point is only
that the pristine files appear to build up over time, which seems new.
Any extra build up is removed with an svn cleanup, but I think a lot of
people (like me) will not realize they need to perform that step
regularly.

I am actually in the process of doing an all-branches checkout right
now, to try and take advantage of the consolidation available in the
new working copy format. When using SSDs, disk usage matters.

> I could not find any issue # for this either.

If no one else replies within a few days, should I create an issue to
track this?

----------
Joshua

Re: Unreferenced pristines behavior in 1.7

Posted by Mark Phippard <ma...@gmail.com>.
On Tue, Nov 29, 2011 at 3:00 PM, Joshua McKinnon <jm...@gmail.com> wrote:

> Having it happen automatically instead of needing to regularly run
> "svn cleanup" is definitely preferable. I've never had to run it in
> the past except in the event of a problem. (e.g. a command did not
> complete properly, or something else got borked)

Note that the difference is that now your pristines are shared.  So if
you have files in your working copy that are identical there is only a
single pristine.  Imagine a checkout of an entire repository,
including tags and branches.  For all of those files in your tags and
branches that are just duplicates of each other there is only a
single pristine version stored now, whereas before every file had a
pristine.
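
To put numbers on that, here is a minimal Python sketch that compares
one pristine per file with one pristine per distinct content. It assumes
pristines are keyed by the SHA-1 of the content; the function name and
the sample tree are made up for illustration.

```python
import hashlib


def count_pristines(files):
    """Return (one-pristine-per-file count, shared-store count).

    `files` maps a working-copy path to its pristine content (bytes).
    """
    per_file = len(files)
    # A shared store keeps one entry per distinct checksum.
    shared = len({hashlib.sha1(data).hexdigest() for data in files.values()})
    return per_file, shared


# Hypothetical tree: the README is identical on trunk, a tag and a branch.
sample = {
    "trunk/README": b"hello\n",
    "tags/1.0/README": b"hello\n",
    "branches/feature/README": b"hello\n",
    "trunk/main.c": b"int main(void) { return 0; }\n",
}
print(count_pristines(sample))  # (4, 2)
```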

I could not find any issue # for this either.

-- 
Thanks

Mark Phippard
http://markphip.blogspot.com/