Posted to dev@subversion.apache.org by Daniel Berlin <db...@dberlin.org> on 2005/02/09 15:24:06 UTC

Making blame even faster

The only complaint I have heard from gcc people so far is the speed of
blame/annotate.  I'm pretty sure that if I can make blame run at
reasonable speed, nobody will object to us switching.
I would appreciate *any* help people can offer me in implementing any of
the solutions below.

There are a couple things to note:
1. Blame on some of our files takes > 30 minutes.
2. Nobody cares whether their blame is exactly like cvs, as long as it
has some sane format.
3. Byte level blame is fine (i.e., we don't need to care about lines).

The current blame is slow because it actually expands revisions (using
get_file_revs), AFAICT.  This makes it O(n^2), at best.

There are a couple ways to fix this:

1. For a simple byte level blame, we should be able to get the answers
from the diff format alone, without actually expanding it.  You can
simply mark the affected byte ranges, and stop when the whole file has
been affected.  What the output looks like is up for grabs.

2. You could actually do a similar thing with line level blame.
Discover the line byte ranges of the current revision, throw them into an
interval tree, and then do queries to see which line a given diff
affects, based on its byte ranges.  Stop when every line has been
affected.  (Both options are sketched below.)


2 seems a bit harder than 1.

I've been out of it for a while, so maybe there is something about
implementing #1 I have missed.
Have I missed anything about #1 that makes it hard?

It seems like one can reuse some of the code that does delta
composition, since it probably is already doing a lot of this work.
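
Here are minimal sketches in C of the bookkeeping both options need.
They are illustrative only: every struct and function name is invented,
nothing here parses real svndiff, and #1's sketch assumes we can walk
revisions newest to oldest and learn which target byte ranges each
revision introduced as new data.

For #1, a coverage map that attributes each byte of the youngest fulltext
to the youngest revision that wrote it, and says when the walk can stop:

[[[
/* Hypothetical byte-level blame bookkeeping for option #1.
   Error checking omitted. */
#include <stdlib.h>

typedef struct blame_map_t
{
  long *rev_of_byte;   /* rev_of_byte[i] = rev that wrote byte i, or -1 */
  size_t len;          /* length of the youngest fulltext */
  size_t unassigned;   /* bytes not yet attributed to any rev */
} blame_map_t;

static blame_map_t *blame_map_create(size_t len)
{
  blame_map_t *bm = malloc(sizeof(*bm));
  bm->rev_of_byte = malloc(len * sizeof(long));
  for (size_t i = 0; i < len; i++)
    bm->rev_of_byte[i] = -1;
  bm->len = len;
  bm->unassigned = len;
  return bm;
}

/* Mark [start, start+n) as introduced by REV unless a younger rev
   already claimed those bytes.  Returns nonzero once the whole file
   is covered, i.e. once the revision walk can stop early. */
static int blame_map_mark(blame_map_t *bm, size_t start, size_t n, long rev)
{
  for (size_t i = start; i < start + n && i < bm->len; i++)
    if (bm->rev_of_byte[i] == -1)
      {
        bm->rev_of_byte[i] = rev;
        bm->unassigned--;
      }
  return bm->unassigned == 0;
}
]]]

For #2, the lines of the current revision are disjoint and sorted, so a
sorted array of line-start offsets plus a binary search can stand in for
the interval tree when mapping an affected byte offset to a line:

[[[
/* Hypothetical line lookup for option #2.  line_starts[i] is the byte
   offset where line i begins; line_starts[0] == 0, ascending order. */
#include <stddef.h>

static size_t line_of_byte(const size_t *line_starts, size_t nlines,
                           size_t byte_offset)
{
  size_t lo = 0, hi = nlines;   /* the answer lies in [lo, hi) */
  while (hi - lo > 1)
    {
      size_t mid = lo + (hi - lo) / 2;
      if (line_starts[mid] <= byte_offset)
        lo = mid;
      else
        hi = mid;
    }
  return lo;  /* a diff touching byte_offset dirties this line */
}
]]]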


--Dan




Re: Making blame even faster

Posted by "C. Michael Pilato" <cm...@collab.net>.
Wow, Daniel.  Thanks for that excellent explanation!

/me calls it a day, having already learned something (and so early!)

Daniel Berlin <db...@dberlin.org> writes:

> > Doesn't it require a dump/load though?  I thought that means 2.0.
> 
> No, it doesn't.
> 
> The output of both vdelta/xdelta is converted into a binary diff
> format known as svndiff.  That is what is in your repositories, and
> sent over the wire.
> 
> Both of them generate valid svndiff, and the delta applier can apply
> all valid svndiffs.  So nothing notices.  I.e., it's just the actual
> operations that are in a given svndiff that will change.  Before you
> might have (human version of the svndiff):
> 
> copy from source 0, 50 bytes
> insert new data 4 bytes, "blah"
> copy from target 4 bytes
> 
> with xdelta you might have
> copy from source 0, 50 bytes
> insert new data 8 bytes, "blahblah"
> 
> Of course, your repo will still have svndiff generated by vdelta (which
> can trigger the bad behavior in the combiner) unless you
> dumped/reloaded, but it would still function fine.
> 
> The same is true of all clients and servers. A server could be giving
> you svndiff generated from vdelta, and you could send it back svndiff
> generated by xdelta, and neither would be any the wiser.
> 
> It's good that we didn't directly send xdelta/vdelta results over the
> wire, and instead convert them to an intermediate format, because it
> allows us to do this type of change without backward compatibility
> problems.
> 
> --Dan


Re: Making blame even faster

Posted by Daniel Berlin <db...@dberlin.org>.
On Thu, 2005-02-10 at 10:48 -0500, Mark Phippard wrote:
> cmpilato@localhost.localdomain wrote on 02/10/2005 10:43:40 AM:
> 
> > Branko Čibej <br...@xbc.nu> writes:
> > 
> > > The more I look at your numbers, the more it seems that xdelta is a
> > > better choice. The worse compression seems to be balanced by better
> > > performance, which makes sense. Unless I've done something really
> > > stupid in vdelta or the combiner, we should probably look at switching to
> > > xdelta in 1.2.
> > 
> > +1.
> 
> Doesn't it require a dump/load though?  I thought that means 2.0.

No, it doesn't.

The output of both vdelta/xdelta is converted into a binary diff format
known as svndiff.
That is what is in your repositories, and sent over the wire.

Both of them generate valid svndiff, and the delta applier can apply all
valid svndiffs.
So nothing notices.
I.e., it's just the actual operations that are in a given svndiff that will
change.
Before you might have (human version of the svndiff):

copy from source 0, 50 bytes
insert new data 4 bytes, "blah"
copy from target 4 bytes

with xdelta you might have
copy from source 0, 50 bytes
insert new data 8 bytes, "blahblah"
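
(For reference, a decoded svndiff window is essentially a list of such
instructions.  A rough sketch of the in-memory shape, modeled loosely on
svn_txdelta_op_t from svn_delta.h; the names below are illustrative, not
the real API:)

[[[
/* Illustrative only; the real thing is svn_txdelta_op_t. */
#include <stddef.h>

enum delta_action { COPY_FROM_SOURCE, COPY_FROM_TARGET, INSERT_NEW_DATA };

struct delta_op
{
  enum delta_action action;
  size_t offset;   /* where to copy from; unused for inserts here */
  size_t length;   /* number of bytes to copy or insert */
};

/* The vdelta example above, expressed as ops: */
static const struct delta_op vdelta_example[] = {
  { COPY_FROM_SOURCE, 0, 50 },   /* copy from source 0, 50 bytes    */
  { INSERT_NEW_DATA,  0,  4 },   /* insert new data 4 bytes, "blah" */
  { COPY_FROM_TARGET, 50, 4 },   /* copy from target 4 bytes        */
};
]]]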

Of course, your repo will still have svndiff generated by vdelta (which
can trigger the bad behavior in the combiner) unless you
dumped/reloaded, but it would still function fine.

The same is true of all clients and servers. A server could be giving
you svndiff generated from vdelta, and you could send it back svndiff
generated by xdelta, and neither would be any the wiser.

It's good that we didn't directly send xdelta/vdelta results over the
wire, and instead convert them to an intermediate format, because it
allows us to do this type of change without backward compatibility
problems.

--Dan


Re: Making blame even faster

Posted by Daniel Berlin <db...@dberlin.org>.
On Thu, 2005-02-10 at 08:26 -0800, Justin Erenkrantz wrote:
> --On Thursday, February 10, 2005 11:15 AM -0500 Daniel Berlin 
> <db...@dberlin.org> wrote:
> 
> > Again, you don't have to, since at that level, it's all svndiff, not
> > vdelta or xdelta.
> > I've just changed the svndiff generator, not the svndiff format.
> 
> Right, but in order to use xdelta on an already existing large repository, 
> you'd have to do a dump/load.  Or, am I missing something?  -- justin

Yes.
But that will just buy you some speed.  It's not like your repo won't
work.


Re: Making blame even faster

Posted by Branko Čibej <br...@xbc.nu>.
Justin Erenkrantz wrote:

> --On Thursday, February 10, 2005 11:15 AM -0500 Daniel Berlin 
> <db...@dberlin.org> wrote:
>
>> Again, you don't have to, since at that level, it's all svndiff, not
>> vdelta or xdelta.
>> I've just changed the svndiff generator, not the svndiff format.
>
>
> Right, but in order to use xdelta on an already existing large 
> repository, you'd have to do a dump/load.  Or, am I missing 
> something?  -- justin

Yes. :-)


Once again: The repository stores svndiff, and we send svndiff over the 
wire. What algorithm we use to generate the svndiff doesn't matter, nor 
does it have to be the same algorithm for the whole repository. The code 
that uses the svndiff doesn't in any way depend on the code that 
generates it.

-- Brane



Re: Making blame even faster

Posted by Justin Erenkrantz <ju...@erenkrantz.com>.
--On Thursday, February 10, 2005 11:15 AM -0500 Daniel Berlin 
<db...@dberlin.org> wrote:

> Again, you don't have to, since at that level, it's all svndiff, not
> vdelta or xdelta.
> I've just changed the svndiff generator, not the svndiff format.

Right, but in order to use xdelta on an already existing large repository, 
you'd have to do a dump/load.  Or, am I missing something?  -- justin


Re: Making blame even faster

Posted by Daniel Berlin <db...@dberlin.org>.
On Thu, 2005-02-10 at 10:06 -0600, C. Michael Pilato wrote:
> Mark Phippard <Ma...@softlanding.com> writes:
> > 
> > Doesn't it require a dump/load though?  I thought that means 2.0.
> 
> I suspect we can simply teach the filesystem code to recognize (and
> label) xdelta data in any new writes while continuing to handle vdelta
> present in existing places.

Again, you don't have to, since at that level, it's all svndiff, not
vdelta or xdelta.
I've just changed the svndiff generator, not the svndiff format.


Re: Making blame even faster

Posted by "C. Michael Pilato" <cm...@collab.net>.
Mark Phippard <Ma...@softlanding.com> writes:

> cmpilato@localhost.localdomain wrote on 02/10/2005 10:43:40 AM:
> 
> > Branko Čibej <br...@xbc.nu> writes:
> > 
> > > The more I look at your numbers, the more it seems that xdelta is a
> > > better choice. The worse compression seems to be balanced by better
> > > performance, which makes sense. Unless I've done something really
> > > stupid in vdelta or the combiner, we should probably look at switching to
> > > xdelta in 1.2.
> > 
> > +1.
> 
> Doesn't it require a dump/load though?  I thought that means 2.0.

I suspect we can simply teach the filesystem code to recognize (and
label) xdelta data in any new writes while continuing to handle vdelta
present in existing places.



Re: [PATCH] FSFS speedup (was: Re: Making blame even faster)

Posted by Daniel Berlin <db...@dberlin.org>.
On Sat, 2005-02-26 at 13:09 -0500, Daniel Berlin wrote:
> On Sat, 2005-02-26 at 12:43 -0500, Greg Hudson wrote:
> > On Tue, 2005-02-22 at 20:24, Daniel Berlin wrote:
> > > On Mon, 2005-02-21 at 23:03 -0500, Greg Hudson wrote:
> > > > On Mon, 2005-02-21 at 20:37, Daniel Berlin wrote:
> > > > > ChangeLog is > SVN_DELTA_WINDOW_SIZE when deltified, and you only
> > > > > auto-expand the first window
> > > > 
> > > > That doesn't agree with my reading of Peter's change.  It looks like he
> > > > auto-expands all windows of the last rep.
> > > > 
> > > > (Something is obviously wrong, but I'm not convinced this is it.)
> > > 
> > > Fair enough.
> > > I've put the repo in question (it's about 1/8th the total revisions of
> > > our changelog) at http://www.toolchain.org/~dberlin/repo.dump.bz2
> > 
> > As far as I can tell, what's wrong is the basic premise of this whole
> > optimization.  A chain of deltas with no "copy from target" operations
> > does not necessarily avoid copy_source_ops().
> 
> This is true from the current composition standpoint, however, with no
> overlaps, composition is trivial. Much more trivial than what we do now
> (we shouldn't need the search_offset_index portion, because we can just
> sort the offsets once and walk both windows at the same time).
> 
> Also, the idea was not to avoid copy_source_ops entirely in the current
> scheme, it was to avoid quadratic behavior we got from copy_source_ops,
> which it should.
> 
Just to clarify further, I believe my original message was incorrect,
because we should still see copy_source_ops activity (just not quadratic
behavior).

IOW, nothing to see here at the moment, move along :)

--Dan

P.S. That'll teach me to write emails while in the midst of stressing
over a bar exam.


Re: [PATCH] FSFS speedup (was: Re: Making blame even faster)

Posted by Daniel Berlin <db...@dberlin.org>.
On Sat, 2005-02-26 at 12:43 -0500, Greg Hudson wrote:
> On Tue, 2005-02-22 at 20:24, Daniel Berlin wrote:
> > On Mon, 2005-02-21 at 23:03 -0500, Greg Hudson wrote:
> > > On Mon, 2005-02-21 at 20:37, Daniel Berlin wrote:
> > > > ChangeLog is > SVN_DELTA_WINDOW_SIZE when deltified, and you only
> > > > auto-expand the first window
> > > 
> > > That doesn't agree with my reading of Peter's change.  It looks like he
> > > auto-expands all windows of the last rep.
> > > 
> > > (Something is obviously wrong, but I'm not convinced this is it.)
> > 
> > Fair enough.
> > I've put the repo in question (it's about 1/8th the total revisions of
> > our changelog) at http://www.toolchain.org/~dberlin/repo.dump.bz2
> 
> As far as I can tell, what's wrong is the basic premise of this whole
> optimization.  A chain of deltas with no "copy from target" operations
> does not necessarily avoid copy_source_ops().

This is true from the current composition standpoint; however, with no
overlaps, composition is trivial, much more trivial than what we do now
(we shouldn't need the search_offset_index portion, because we can just
sort the offsets once and walk both windows at the same time).
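
To illustrate why (a toy sketch, not the real combiner; the op struct and
names are invented): with no target-copy ops, window A's ops tile its
output contiguously, so translating one of window B's source ranges back
through A is a single forward scan, and visiting B's ranges in sorted
offset order means the scan position never moves backward:

[[[
/* Toy sketch, not Subversion's combiner. */
#include <stdio.h>
#include <stddef.h>

enum delta_action { COPY_FROM_SOURCE, INSERT_NEW_DATA };
struct delta_op { enum delta_action action; size_t offset, length; };

/* Map the range [off, off+len) of A's output back through window A,
   printing the base copies / inserts it decomposes into. */
static void map_range(const struct delta_op *a, size_t n_ops,
                      size_t off, size_t len)
{
  size_t pos = 0;  /* offset in A's output where a[i] begins */
  for (size_t i = 0; i < n_ops && len > 0; i++)
    {
      size_t op_end = pos + a[i].length;
      if (off < op_end)
        {
          size_t skip = off - pos;
          size_t take = op_end - off < len ? op_end - off : len;
          if (a[i].action == COPY_FROM_SOURCE)
            printf("copy from base %zu, %zu bytes\n",
                   a[i].offset + skip, take);
          else
            printf("insert %zu bytes of A's new data\n", take);
          off += take;
          len -= take;
        }
      pos = op_end;
    }
}
]]]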

Also, the idea was not to avoid copy_source_ops entirely in the current
scheme; it was to avoid the quadratic behavior we got from copy_source_ops,
which it should.

:)


Re: [PATCH] FSFS speedup (was: Re: Making blame even faster)

Posted by Greg Hudson <gh...@MIT.EDU>.
On Tue, 2005-02-22 at 20:24, Daniel Berlin wrote:
> On Mon, 2005-02-21 at 23:03 -0500, Greg Hudson wrote:
> > On Mon, 2005-02-21 at 20:37, Daniel Berlin wrote:
> > > ChangeLog is > SVN_DELTA_WINDOW_SIZE when deltified, and you only
> > > auto-expand the first window
> > 
> > That doesn't agree with my reading of Peter's change.  It looks like he
> > auto-expands all windows of the last rep.
> > 
> > (Something is obviously wrong, but I'm not convinced this is it.)
> 
> Fair enough.
> I've put the repo in question (it's about 1/8th the total revisions of
> our changelog) at http://www.toolchain.org/~dberlin/repo.dump.bz2

As far as I can tell, what's wrong is the basic premise of this whole
optimization.  A chain of deltas with no "copy from target" operations
does not necessarily avoid copy_source_ops().

(This is from loading your sample repository and poking around in gdb
with a checkout of the head of ChangeLog.  I see copy_source_ops()
activity from the very first window composition.  window_B of this
composition comes from rev 4550 and has only four instructions, none of
which are "copy from target" instructions.)



Re: [PATCH] FSFS speedup (was: Re: Making blame even faster)

Posted by Daniel Berlin <db...@dberlin.org>.
On Mon, 2005-02-21 at 23:03 -0500, Greg Hudson wrote:
> On Mon, 2005-02-21 at 20:37, Daniel Berlin wrote:
> > ChangeLog is > SVN_DELTA_WINDOW_SIZE when deltified, and you only
> > auto-expand the first window
> 
> That doesn't agree with my reading of Peter's change.  It looks like he
> auto-expands all windows of the last rep.
> 
> (Something is obviously wrong, but I'm not convinced this is it.)

Fair enough.
I've put the repo in question (it's about 1/8th the total revisions of
our changelog) at http://www.toolchain.org/~dberlin/repo.dump.bz2

It's a 1.9 meg compressed (8 meg uncompressed) dump file, and 55 meg in
repo form (in both bdb and fsfs; the FSFS one is about 50% revprops space
in the case of ChangeLog, of course).

svn blame on the bdb repo doesn't show any copy_source_ops activity.
svn blame on the fsfs repo does.
So something is up, not sure what.


Re: [PATCH] FSFS speedup (was: Re: Making blame even faster)

Posted by Greg Hudson <gh...@MIT.EDU>.
On Mon, 2005-02-21 at 20:37, Daniel Berlin wrote:
> ChangeLog is > SVN_DELTA_WINDOW_SIZE when deltified, and you only
> auto-expand the first window

That doesn't agree with my reading of Peter's change.  It looks like he
auto-expands all windows of the last rep.

(Something is obviously wrong, but I'm not convinced this is it.)



Re: [PATCH] FSFS speedup (was: Re: Making blame even faster)

Posted by Daniel Berlin <db...@dberlin.org>.
On Mon, 2005-02-14 at 14:34 +0100, Peter N. Lundblad wrote:
> On Sun, 13 Feb 2005, Daniel Berlin wrote:
> 
> >
> > > It depends on your repository, but probably yes.
> > >
> > Just to followup:
> >
> > Brane's timings on windows showed that for the gcc combine.c file (which
> > had ~1500 revisions, spread out across branches, etc), we had
> >
> > blame on an bdb repo with vdelta: 40 seconds
> > blame on an bdb repo with xdelta: 10 seconds
> >
> > blame on an fsfs repo with vdelta: 108 seconds
> > blame on an fsfs repo with xdelta: 105 seconds
> >
> >
> > To fully take advantage of the xdelta speedup on fsfs, you need to
> > implement greg hudson's suggestion of making sure the vdelta rep against
> > empty is *not* combined with other windows, or be willing to give up the
> > first rep being compressed (IE it will store one fulltext in the repo).
> >
> OK. Here we go. On my tests it is a big speedup. I tested on FSFS with and
> without xdelta when the dump was loaded. It seems like we win a lot even
> if vdelta was used to create the repo. Anyway, feel free to fill in the
> missing row above:-)


Okay, I noticed a problem with this patch after a lot of testing, and
trying to figure out why, on ChangeLog, copy_source_ops was still at the
top of the list for fsfs.
It then hit me in the shower.

ChangeLog is > SVN_DELTA_WINDOW_SIZE when deltified, and you only
auto-expand the first window.  However, the entire file is vdelta'd,
which means all of the windows.  So the other windows with target ops
get combined, causing quadratic behavior again on this file.

I'm completely in law mode right now, so nothing pops into my head as a
good solution for this problem.

--Dan



[PATCH] BDB speedup (was: FSFS speedup)

Posted by Branko Čibej <br...@xbc.nu>.
Peter N. Lundblad wrote:

>OK. Here we go. On my tests it is a big speedup. I tested on FSFS with and
>without xdelta when the dump was loaded. It seems like we win a lot even
>if vdelta was used to create the repo. Anyway, feel free to fill in the
>missing row above:-)
>  
>
Well, I've done something similar to the BDB combiner, except that it 
only pre-expands windows that are truly self-compressed. At the moment, 
this makes hardly any difference. However, it paves the way for storing 
fulltexts as self-compressed vdeltas in BDB, too, just as FSFS does it. 
This could significantly reduce the size of BDB repositories, especially 
where there are many active branches.

[[[
Treat self-compressed delta windows as virtual fulltexts in the 
delta combiner; expand them first instead of combining them with
the rest of the delta chain.

* subversion/libsvn_fs_base/reps-strings.c (compose_handler_baton):
   New member source_buf; holds expanded window data.
  (compose_handler): When handling a self-compressed window, expand
   it instead of combining it with the existing (combined) window.
  (rep_undeltify_range): If available, use the expanded window from
   the baton instead of the fulltext. Remove empty_buf.
]]]


-- Brane


Re: [PATCH] FSFS speedup (was: Re: Making blame even faster)

Posted by "Peter N. Lundblad" <pe...@famlundblad.se>.
On Mon, 14 Feb 2005, Greg Hudson wrote:

> This patch looks great.  Thanks for implementing it.
>
Thx for reviewing as usual.

> I noticed just one style issue: in many places you created lines of
> exactly 80 characters.  If you could tweak your editing environment such

I actually use Emacs in an 80x25 terminal, so I should have been annoyed
myself :-)  Thanks for clarifying in HACKING.

Regards,
//Peter


Re: [PATCH] FSFS speedup (was: Re: Making blame even faster)

Posted by Greg Hudson <gh...@MIT.EDU>.
This patch looks great.  Thanks for implementing it.

I noticed just one style issue: in many places you created lines of
exactly 80 characters.  If you could tweak your editing environment such
that you restrict lines to 79 characters instead of 80, that would be
good.  (HACKING just says "stay within 80 columns", but if the goal is
to display well on an 80-column terminal, we need to stop at 79
columns.)



Re: [PATCH] FSFS speedup (was: Re: Making blame even faster)

Posted by Daniel Berlin <db...@dberlin.org>.
On Mon, 2005-02-14 at 14:34 +0100, Peter N. Lundblad wrote:
> On Sun, 13 Feb 2005, Daniel Berlin wrote:
> 
> >
> > > It depends on your repository, but probably yes.
> > >
> > Just to followup:
> >
> > Brane's timings on windows showed that for the gcc combine.c file (which
> > had ~1500 revisions, spread out across branches, etc), we had
> >
> > blame on an bdb repo with vdelta: 40 seconds
> > blame on an bdb repo with xdelta: 10 seconds
> >
> > blame on an fsfs repo with vdelta: 108 seconds
> > blame on an fsfs repo with xdelta: 105 seconds
> >
> >
> > To fully take advantage of the xdelta speedup on fsfs, you need to
> > implement greg hudson's suggestion of making sure the vdelta rep against
> > empty is *not* combined with other windows, or be willing to give up the
> > first rep being compressed (IE it will store one fulltext in the repo).
> >
> OK. Here we go. On my tests it is a big speedup. I tested on FSFS with and
> without xdelta when the dump was loaded. It seems like we win a lot even
> if vdelta was used to create the repo. Anyway, feel free to fill in the
> missing row above:-)
> 


Just FYI, I get a large speedup from this as well, and no longer require
the "don't vdelta against empty source" patch I have locally in order to
get good speed.

With current trunk, without patch, blame on large file locally with repo
using fsfs:

real    1m24.982s
user    1m7.414s
sys     0m6.480s

With patch:
real    0m47.240s
user    0m28.449s
sys     0m6.306s


(I have a progress bar printing out how far along it is in getting the
revisions for blame, and you can see it's being given fulltexts *MUCH*
quicker)


[PATCH] FSFS speedup (was: Re: Making blame even faster)

Posted by "Peter N. Lundblad" <pe...@famlundblad.se>.
On Sun, 13 Feb 2005, Daniel Berlin wrote:

>
> > It depends on your repository, but probably yes.
> >
> Just to followup:
>
> Brane's timings on windows showed that for the gcc combine.c file (which
> had ~1500 revisions, spread out across branches, etc), we had
>
> blame on an bdb repo with vdelta: 40 seconds
> blame on an bdb repo with xdelta: 10 seconds
>
> blame on an fsfs repo with vdelta: 108 seconds
> blame on an fsfs repo with xdelta: 105 seconds
>
>
> To fully take advantage of the xdelta speedup on fsfs, you need to
> implement greg hudson's suggestion of making sure the vdelta rep against
> empty is *not* combined with other windows, or be willing to give up the
> first rep being compressed (IE it will store one fulltext in the repo).
>
OK. Here we go. On my tests it is a big speedup. I tested on FSFS with and
without xdelta when the dump was loaded. It seems like we win a lot even
if vdelta was used to create the repo. Anyway, feel free to fill in the
missing row above:-)

(BTW, while there, I replaced apr_pcalloc with apr_palloc. That in fact
gave us 0.5 seconds or so. That's nothing, but there's no reason to change
it back.)
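
(For the curious: apr_pcalloc amounts to apr_palloc plus a zero-fill, per
apr_pools.h, so when the caller immediately overwrites the whole buffer
the memset is pure overhead.  A rough sketch, with a made-up caller name:)

[[[
#include <apr_pools.h>

static void *alloc_window_buf(apr_pool_t *pool, apr_size_t size)
{
  /* apr_pcalloc(pool, size) is roughly:
       void *buf = apr_palloc(pool, size);
       memset(buf, 0, size);
     The caller fills the whole buffer right away, so the memset
     is wasted work: */
  return apr_palloc(pool, size);
}
]]]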

Regards,
//Peter

Re: Making blame even faster

Posted by Vincent Lefevre <vi...@vinc17.org>.
On 2005-02-13 15:17:52 -0500, Daniel Berlin wrote:
> Brane's timings on windows showed that for the gcc combine.c file
> (which had ~1500 revisions, spread out across branches, etc), we had
> 
> blame on an bdb repo with vdelta: 40 seconds
> blame on an bdb repo with xdelta: 10 seconds
> 
> blame on an fsfs repo with vdelta: 108 seconds
> blame on an fsfs repo with xdelta: 105 seconds
[...]
> With that change in place, you will get numbers roughly the same on
> fsfs and bdb.

That is, with the example above, ~ 10 seconds?

> I can provide a patch that makes the first delta against empty a
> fulltext in fsfs as a temporary solution if you really need a fast
> blame and can't wait for Greg Hudson's solution to be implemented
> and tested.

I'm using Debian packages, so I'll wait...

-- 
Vincent Lefèvre <vi...@vinc17.org> - Web: <http://www.vinc17.org/>
100% accessible validated (X)HTML - Blog: <http://www.vinc17.org/blog/>
Work: CR INRIA - computer arithmetic / SPACES project at LORIA


Re: Making blame even faster

Posted by Branko Čibej <br...@xbc.nu>.
Daniel Berlin wrote:

>>It depends on your repository, but probably yes.
>>
>>-- Brane
>
>Just to followup:
>
>Brane's timings on windows showed that for the gcc combine.c file (which
>had ~1500 revisions, spread out across branches, etc), we had
>
>blame on a bdb repo with vdelta: 40 seconds
>blame on a bdb repo with xdelta: 10 seconds
>
>blame on an fsfs repo with vdelta: 108 seconds
>blame on an fsfs repo with xdelta: 105 seconds
>
I've put the timings on http://www.xbc.nu/svn/Xdelta_Performace.html.

There's no doubt that the increase in repository size is more than 
compensated by the increase in undeltification speed. With Peter's 
patch, this now holds for FSFS as well as BDB.

When comparing both speed and repository size, BDB still leads in this 
particular test (3.6 times faster vs. 2.3 times larger repo).

-- Brane



Re: Making blame even faster

Posted by Daniel Berlin <db...@dberlin.org>.
> It depends on your repository, but probably yes.
> 
> -- Brane


Just to followup:

Brane's timings on windows showed that for the gcc combine.c file (which
had ~1500 revisions, spread out across branches, etc), we had

blame on a bdb repo with vdelta: 40 seconds
blame on a bdb repo with xdelta: 10 seconds

blame on an fsfs repo with vdelta: 108 seconds
blame on an fsfs repo with xdelta: 105 seconds


I have mostly the same experience, *except* that blame on an fsfs repo
with xdelta with the patch that was committed was a bit faster than that
(maybe 10 seconds faster).

To fully take advantage of the xdelta speedup on fsfs, you need to
implement Greg Hudson's suggestion of making sure the vdelta rep against
empty is *not* combined with other windows, or be willing to give up the
first rep being compressed (i.e., it will store one fulltext in the repo).

With that change in place, you will get numbers roughly the same on fsfs
and bdb.

I can provide a patch that makes the first delta against empty a
fulltext in fsfs as a temporary solution if you really need a fast blame
and can't wait for Greg Hudson's solution to be implemented and tested.

I should note that I've tried to implement what he proposed, but have
run into problems because the current code tries to seek the src_state,
and other things, so it's not a dead easy conversion to making it a
stream (which is what we really want).

--Dan


Re: Making blame even faster

Posted by Branko Čibej <br...@xbc.nu>.
Vincent Lefevre wrote:

>On 2005-02-11 01:12:33 +0100, Branko Čibej wrote:
>
>>Justin Erenkrantz wrote:
>>
>>>If switching to xdelta increases blame performance by this order of
>>>magnitude, I foresee a dump/load in my future.  =)  -- justin
>>
>>Of course, you'd need a reload to see a performance improvement for
>>existing revisions.
>
>Would this be significant in practice?
>
It depends on your repository, but probably yes.

-- Brane


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org

Re: Making blame even faster

Posted by Vincent Lefevre <vi...@vinc17.org>.
On 2005-02-11 01:12:33 +0100, Branko Čibej wrote:
> Justin Erenkrantz wrote:
> >If switching to xdelta increases blame performance by this order of 
> >magnitude, I foresee a dump/load in my future.  =)  -- justin
> 
> Of course, you'd need a reload to see a performance improvement for 
> existing revisions.

Would this be significant in practice?

-- 
Vincent Lefèvre <vi...@vinc17.org> - Web: <http://www.vinc17.org/>
100% accessible validated (X)HTML - Blog: <http://www.vinc17.org/blog/>
Work: CR INRIA - computer arithmetic / SPACES project at LORIA


Re: Making blame even faster

Posted by Branko Čibej <br...@xbc.nu>.
Justin Erenkrantz wrote:

> --On Thursday, February 10, 2005 10:48 AM -0500 Mark Phippard 
> <Ma...@softlanding.com> wrote:
>
>> Doesn't it require a dump/load though?  I thought that means 2.0.
>
>
> What it'd mean is that we'd be required to maintain vdelta support 
> through 2.0.  However, new repositories could certainly be created 
> with xdelta support.

No, that's the whole point. We could rip out vdelta, put in xdelta, and 
nobody (repository, server or client) would notice.

> If switching to xdelta increases blame performance by this order of 
> magnitude, I foresee a dump/load in my future.  =)  -- justin

Of course, you'd need a reload to see a performance improvement for 
existing revisions.

-- Brane



Re: Making blame even faster

Posted by Justin Erenkrantz <ju...@erenkrantz.com>.
--On Thursday, February 10, 2005 10:48 AM -0500 Mark Phippard 
<Ma...@softlanding.com> wrote:

> Doesn't it require a dump/load though?  I thought that means 2.0.

What it'd mean is that we'd be required to maintain vdelta support through 
2.0.  However, new repositories could certainly be created with xdelta support.

If switching to xdelta increases blame performance by this order of magnitude, 
I foresee a dump/load in my future.  =)  -- justin


Re: Making blame even faster

Posted by Mark Phippard <Ma...@softlanding.com>.
cmpilato@localhost.localdomain wrote on 02/10/2005 10:43:40 AM:

> Branko Čibej <br...@xbc.nu> writes:
> 
> > The more I look at your numbers, the more it seems that xdelta is a
> > better choice. The worse compression seems to be balanced by better
> > performance, which makes sense. Unless I've done something really
> stupid in vdelta or the combiner, we should probably look at switching to
> > xdelta in 1.2.
> 
> +1.

Doesn't it require a dump/load though?  I thought that means 2.0.

Mark





Re: Making blame even faster

Posted by "C. Michael Pilato" <cm...@collab.net>.
Branko Čibej <br...@xbc.nu> writes:

> The more I look at your numbers, the more it seems that xdelta is a
> better choice. The worse compression seems to be balanced by better
> performance, which makes sense. Unless I've done something really
> stupid in vdelta or the combiner, we should probably look at switching to
> xdelta in 1.2.

+1.



Re: Making blame even faster

Posted by Daniel Berlin <db...@dberlin.org>.
On Thu, 2005-02-10 at 11:23 +0100, Branko Čibej wrote:
> Daniel Berlin wrote:
> 
> >On Thu, 2005-02-10 at 03:57 +0100, Branko Čibej wrote:
> >
> >>Daniel Berlin wrote:
> >>
> >>>1. For a simple byte level blame, we should be able to get the answers
> >>>from the diff format alone, without actually expanding it.  You can
> >>>simply mark the affected byte ranges, and stop when the whole file has
> >>>been affected.  What the output looks like is up for grabs.
> >>>
> >>>2. You could actually do a similar thing with line level blame.
> >>>Discover the line byte ranges of the current revision, throw them into
> >>>an interval tree, and then do queries to see which line a given diff
> >>>affects, based on its byte ranges.  Stop when every line has been
> >>>affected.
> >>>
> >>>2 seems a bit harder than 1.
> >>>
> >>>I've been out of it for a while, so maybe there is something about
> >>>implementing #1 I have missed.
> >>>Have I missed anything about #1 that makes it hard?
> >>
> >>If by "diff format" you mean the binary delta, then... There was a time
> >>when I thought this would be possible. Now I'm not so sure. The trouble
> >>is that vdelta doesn't generate an edit stream, it generates a
> >>compressed block-copy stream. Which means that you can never be 100%
> >>sure, just from looking at the delta, which bytes in the target are
> >>really new and which are just (offset) copies from the source. The only
> >>blocks you can really be sure about are those that are represented by
> >>new data in the delta (either NEW blocks or target copies that take
> >>data from NEW blocks). This problem is made worse by our use of skip
> >>deltas in (at least) the BDB back-end.
> >>
> >>I agree that it would be nice if the server could generate some sort of
> >>byte-range oriented info, but I don't think it can be done just by
> >>looking at the deltas. It's sad, I know...
> >
> >After looking at the real slowdown, I'm not actually worried about this
> >so much anymore.
> >For now it's actually not the main source of slowdown for blame.
> >See my followup mail on the delta combiner.
> >
> >The delta combiner falls down badly when you have even a moderate
> >number of target ops in the window. It goes O(n^2) and takes up a large
> >amount of the time.
>
> You probably mean "target ops in the first window" or "source ops in the 
> second window", yes?
> 
yes.

> I think you're right... every single bit in the combiner is linear or 
> logarithmic, except the tiny little recursion in copy_source_ops which 
> can, indeed, go quadratic with the right mix of data. The problem seems 
> to be that the right mix of data is far too easily achievable, at least
> with vdelta...
Yup.
This is why I chose xdelta, and only ran it over the source, because
it's guaranteed not to hit this problem when combining happens (and
combining falls completely off the profile).

It also has another nice property:

You are more or less guaranteed that all the same portions of the source
that exist in target will be represented by copies, and that all
different portions will be represented by inserts.

This is because it looks like so (it's early, so any parts that don't
make sense here should be read in connection with the code :P):
1. Run through with a rolling checksum, checksumming every separate
<blocksize> bytes in the source, creating a match table.
2. Hunt for matching starting points in the target.
3. For every byte in the target that doesn't match a starting point in
the source, create an insert op for that byte (we rely on the fact that
the inserter knows to combine consecutive insert ops to make a single
long insert op out of this :P).
4. For every byte in the target that does match a starting point in the
source, extend the match as long as possible and create a copy from the
source.
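
A toy version of that matcher, to make the steps concrete (illustrative
only: a made-up hash in place of the real rolling checksum, a fixed-size
match table, printf instead of real op emission, and no handling of hash
collisions beyond a memcmp check):

[[[
/* Toy xdelta-style matcher, illustrative only. */
#include <stdio.h>
#include <stddef.h>
#include <string.h>

#define BLOCKSIZE 16
#define TABLE_SIZE 8192

static unsigned hash_block(const unsigned char *p)
{
  unsigned h = 0;
  for (int i = 0; i < BLOCKSIZE; i++)
    h = h * 33 + p[i];
  return h % TABLE_SIZE;
}

static void make_delta(const unsigned char *src, size_t src_len,
                       const unsigned char *tgt, size_t tgt_len)
{
  long table[TABLE_SIZE];
  size_t t = 0, pending_insert = 0;

  for (int i = 0; i < TABLE_SIZE; i++)
    table[i] = -1;

  /* 1. checksum each separate BLOCKSIZE-byte block of the source */
  for (size_t i = 0; i + BLOCKSIZE <= src_len; i += BLOCKSIZE)
    table[hash_block(src + i)] = (long)i;

  /* 2. hunt for matching starting points in the target */
  while (t + BLOCKSIZE <= tgt_len)
    {
      long s = table[hash_block(tgt + t)];
      if (s >= 0 && memcmp(src + s, tgt + t, BLOCKSIZE) == 0)
        {
          size_t match_len = BLOCKSIZE;
          if (pending_insert)  /* consecutive inserts got combined (3) */
            printf("insert new data %zu bytes\n", pending_insert);
          pending_insert = 0;
          /* 4. extend the match as far as possible, copy from source */
          while ((size_t)s + match_len < src_len && t + match_len < tgt_len
                 && src[s + match_len] == tgt[t + match_len])
            match_len++;
          printf("copy from source %ld, %zu bytes\n", s, match_len);
          t += match_len;
        }
      else
        {
          pending_insert++;  /* 3. unmatched byte becomes an insert */
          t++;
        }
    }
  /* flush the short tail plus any pending inserts */
  if (pending_insert + (tgt_len - t) > 0)
    printf("insert new data %zu bytes\n", pending_insert + (tgt_len - t));
}
]]]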

This more or less means that you shouldn't end up with inserts unless
something was really different, because if it was the same, the prior
copy would have been extendable.  This is different from vdelta, where
it was just generating a compressed block copy.  This is a real binary
"edit script" that produces the second from the first.

> 
> /me bangs head against wall
> 
> Problem is I have no idea how to get rid of that recursion in 
> copy_source_ops and still combine deltas correctly.

I'm pretty sure you can't, unfortunately.  The current code is
theoretically sound :P.

> 
> The more I look at your numbers, the more it seems that xdelta is a 
> better choice. The worse compression seems to be balanced by better 
> performance, which makes sense. Unless I've done something really stupid 
> in vdelta or the combiner, we should probably look at switching to xdelta
> in 1.2.
> 
> -- Brane
> 




Re: Making blame even faster

Posted by Branko Čibej <br...@xbc.nu>.
Daniel Berlin wrote:

>On Thu, 2005-02-10 at 03:57 +0100, Branko Čibej wrote:
>
>>Daniel Berlin wrote:
>>
>>>1. For a simple byte level blame, we should be able to get the answers
>>>from the diff format alone, without actually expanding it.  You can
>>>simply mark the affected byte ranges, and stop when the whole file has
>>>been affected.  What the output looks like is up for grabs.
>>>
>>>2. You could actually do a similar thing with line level blame.
>>>Discover the line byte ranges of the current revision, throw them into
>>>an interval tree, and then do queries to see which line a given diff
>>>affects, based on its byte ranges.  Stop when every line has been
>>>affected.
>>>
>>>2 seems a bit harder than 1.
>>>
>>>I've been out of it for a while, so maybe there is something about
>>>implementing #1 I have missed.
>>>Have I missed anything about #1 that makes it hard?
>>
>>If by "diff format" you mean the binary delta, then... There was a time
>>when I thought this would be possible. Now I'm not so sure. The trouble
>>is that vdelta doesn't generate an edit stream, it generates a
>>compressed block-copy stream. Which means that you can never be 100%
>>sure, just from looking at the delta, which bytes in the target are
>>really new and which are just (offset) copies from the source. The only
>>blocks you can really be sure about are those that are represented by
>>new data in the delta (either NEW blocks or target copies that take
>>data from NEW blocks). This problem is made worse by our use of skip
>>deltas in (at least) the BDB back-end.
>>
>>I agree that it would be nice if the server could generate some sort of
>>byte-range oriented info, but I don't think it can be done just by
>>looking at the deltas. It's sad, I know...
>
>After looking at the real slowdown, I'm not actually worried about this
>so much anymore.
>For now it's actually not the main source of slowdown for blame.
>See my followup mail on the delta combiner.
>
>The delta combiner falls down badly when you have even a moderate
>number of target ops in the window. It goes O(n^2) and takes up a large
>amount of the time.

You probably mean "target ops in the first window" or "source ops in the 
second window", yes?

I think you're right... every single bit in the combiner is linear or 
logarithmic, except the tiny little recursion in copy_source_ops which 
can, indeed, go quadratic with the right mix of data. The problem seems 
to be that the right mix of data is far too easily achievable, at least 
with vdelta...

/me bangs head against wall

Problem is I have no idea how to get rid of that recursion in 
copy_source_ops and still combine deltas correctly.


The more I look at your numbers, the more it seems that xdelta is a 
better choice. The worse compression seems to be balanced by better 
performance, which makes sense. Unless I've done something really stupid 
in vdelta or the combiner, we should probably look at switching to xdelta 
in 1.2.

-- Brane



Re: Making blame even faster

Posted by Daniel Berlin <db...@dberlin.org>.
On Thu, 2005-02-10 at 03:57 +0100, Branko Čibej wrote: 
> Daniel Berlin wrote:
> 
> >1. For a simple byte level blame, we should be able to get the answers
> >from the diff format alone, without actually expanding it.  You can
> >simply mark the affected byte ranges,  and stop when the whole file has
> >been affected.  What the output looks like is up for grabs. 
> >
> >2. You could actually do a similar thing with line level blame.
> >Discover the line byte ranges of the current revision, throw them into
> >an interval tree, and then do queries to see which line a given diff
> >affects, based on its byte ranges.  Stop when every line has been
> >affected.
> >
> >
> >2 seems a bit harder than 1.
> >
> >I've been out of it for a while, so maybe there is something about
> >implementing #1 I have missed.
> >Have I missed anything about #1 that makes it hard?
> >  
> >
> If by "diff format" you mean the binary delta, then... There was a time 
> when I thought this would be possible. Now I'm not so sure. The trouble 
> is that vdelta doesn't generate an edit stream, it generates a 
> compressed block-copy stream. Which means that you can never be 100% sure, 
> just from looking at the delta, which bytes in the target are really new 
> and which are just (offset) copies from the source. The only blocks you 
> can really be sure about are those that are represented by new data in 
> the delta (either NEW blocks or target copies that take data from NEW 
> blocks). This problem is made worse by our use of skip deltas in (at 
> least) the BDB back-end.
> 
> I agree that it would be nice if the server could generate some sort of 
> byte-range oriented info, but I don't think it can be done just by 
> looking at the deltas. It's sad, I know...
> 
After looking at the real slowdown, I'm not actually worried about this
so much anymore.
For now it's actually not the main source of slowdown for blame.
See my followup mail on the delta combiner.

The delta combiner falls down badly when you have even a moderate
number of target ops in the window.  It goes O(n^2) and takes up a large
amount of the time.

I have some preliminary numbers comparing vdelta vs. xdelta1 vs. a noop
delta.

The xdelta numbers are literally an xdelta implementation inserted where
we currently have vdelta.  It's untuned, but is faster, and never
compresses against the target data already generated (i.e., it never
generates target ops).

The noop numbers are with the vdelta call simply replaced with the
generation of a single txdelta op representing the entire target text
(i.e., no compression; the delta is the fulltext).

Since they both generate svndiff, no client or server changes are
necessary.  These are over local access to avoid the turnaround.

All repos are fsfs, all loaded from the same fulltext dumpfile,
containing 2230 Subversion revisions, which correspond to ~2400 cvs
revisions of two frequently changed files in gcc; one has 555
changes, the other the rest :).

No regression tests fail on any of the above versions, and the repo text
md5s verify for every revision (i.e., expansion from deltas to
fulltexts is working okay).  All were compiled from the same tree with
-O2.

They exist and were changed on a number of branches as well, and the
branches are included in the dumpfile.

vdelta based repo:

22931   repo/db/revs
9583    repo/db/revprops
32621   repo/db
8       repo/conf
20      repo/hooks
8       repo/locks
32569   repo

xdelta based repo:
[dberlin@dberlin repro]$ du repo3
29379   repo3/db/revs
9583    repo3/db/revprops
38973   repo3/db
8       repo3/conf
20      repo3/hooks
8       repo3/locks
39017   repo3

~20% size increase.

noop based repo:
450835  repo2/db/revs
9583    repo2/db/revprops
460430  repo2/db
8       repo2/conf
20      repo2/hooks
8       repo2/locks
460474  repo2

About what you'd expect :)

However, watch this:


(again, "over the wire" here means over file:///, but it still sends
delta streams anyway, because that's what get_file_revisions does :P).

vdelta based repo, using vdelta over the wire:
time svn blame on file with the smaller number of revisions: 38 seconds
time svn blame on file with the larger number of revisions: 4 minutes 38
seconds

vdelta based repo, using xdelta over the wire:
time svn blame on file with the smaller number of revisions: 38 seconds
time svn blame on file with the larger number of revisions: 4 minutes 16
seconds

noop based repo, using noop over the wire:
time svn blame on file with the smaller number of revisions: 24 seconds
time svn blame on file with the larger number of revisions: 1 minute 19
seconds

noop based repo, using xdelta over the wire:
time svn blame on file with the smaller number of revisions: 18 seconds
time svn blame on file with the larger number of revisions: 1 minute 14
seconds

xdelta based repo, using xdelta over the wire:
time svn blame on file with the smaller number of revisions: 14 seconds
time svn blame on file with the larger number of revisions: 1 minute 3
seconds

These numbers aren't counterintuitive if you think about it, and you are
more than welcome to try the timings yourself if you disbelieve them :P
The dumpfile is the one posted in the other message. 

The xdelta one only has a 20% space premium over vdelta in size of data
it's "transmitting" around, but it causes no O(N^2) behavior in the
delta combiner because it has no target copy ops.

You'll note the delta combiner passes source copy ops through relatively
unchanged; it just fixes them up a small amount.  It does *not* need to
call copy_source_ops on them.

The blame output of all of them is identical, as expected.
I am in the process of loading a much larger repo (4 gig vdelta'd with
79,000 revisions, many files with thousands of revisions).

I unfortunately cannot create a noop-delta version of this repo, as it
would take ~55 gig (the size of the original dumpfile with plaintexts),
and I don't have the extra space ATM.

However, loading is also going much faster on the xdelta version of this
repo, as one would expect.

Note that the profiles for the above xdelta and noop look like what you
would expect.  The delta combiner is completely off the radar because
there are no target ops.  Noop takes more time because it's streaming
more data through the streams.

I have attached the diff for the xdelta and noop implementation, so that
anyone can experiment with the above.
Remember that if you want to accurately test the speed of the delta
combiner with each, you need to redump the dumpfile I posted into
fulltext, and then reload the repo from a dumpfile with the appropriate
delta algorithm defined, since it's all svndiff to the rest of
Subversion :)

If you look at profiles, you'll also see that the untuned xdelta is
about 5x faster than vdelta at just compressing the texts it sends over
"the wire".

The upshot of all of this is that I'll probably switch the gcc repo
server to use xdelta to generate its diffs, until the delta combiner is
fixed so it's not so slow.

It also seems like it may make sense to allow xdelta as a client option,
since it's so much faster CPU-wise than vdelta, even if you exclude the
target ops problem (i.e., just computing diffs).  Again, to the rest of
Subversion, it's all svndiff, so it's not like this is even an
incompatible change.

On a side note, I'm accepting comments on the general idea of including
xdelta as an option for 1.2 or 1.3.  I could clean the diff up and
whatnot relatively quickly, and the algorithm is not very hard to
understand, etc.

--Dan

Re: Making blame even faster

Posted by Branko Čibej <br...@xbc.nu>.
Daniel Berlin wrote:

>1. For a simple byte level blame, we should be able to get the answers
>from the diff format alone, without actually expanding it.  You can
>simply mark the affected byte ranges,  and stop when the whole file has
>been affected.  What the output looks like is up for grabs. 
>
>2. You could actually do a similar thing with line level blame.
>Discover the line byte ranges of the current revision, throw them into
>an interval tree, and then do queries to see which line a given diff
>affects, based on its byte ranges.  Stop when every line has been
>affected.
>
>
>2 seems a bit harder than 1.
>
>I've been out of it for a while, so maybe there is something about
>implementing #1 I have missed.
>Have I missed anything about #1 that makes it hard?
>  
>
If by "diff format" you mean the binary delta, then... There was a time 
when I thought this would be possible. Now I'm not so sure. The trouble 
is that vdelta doesn't generate an edit stream, it generates a 
compressed block-copy stream. Which means that you can never be 100% sure, 
just from looking at the delta, which bytes in the target are really new 
and which are just (offset) copies from the source. The only blocks you 
can really be sure about are those that are represented by new data in 
the delta (either NEW blocks or target copies that take data from NEW 
blocks). This problem is made worse by our use of skip deltas in (at 
least) the BDB back-end.

I agree that it would be nice if the server could generate some sort of 
byte-range oriented info, but I don't think it can be done just by 
looking at the deltas. It's sad, I know...

-- Brane



Re: Making blame even faster

Posted by Daniel Berlin <db...@dberlin.org>.
On Fri, 2005-02-11 at 18:22 +0000, Hugh Gibson wrote:
> > Hugh, are you interested in applying Daniel's patches and
> > going through a dump/load cycle to see the effect on your
> > execution times?
> 
> Unfortunately I don't have control of the server so it would be a bit 
> difficult for me to put a new server version in place.
> 
> > If you'll recall, we weren't sure that your bottleneck was client-side.
> 
> I'm happy to do client-side tests - when I get some time.....

You won't get any speedup from just replacing the client, since it's
still going to receive only vdelta-based svndiffs if you don't replace
the server.

If you can tar and rzip (which is ideal for large repos; gcc's repo is
711 meg rzip'd, 4.9 gig not rzip'd; http://rzip.samba.org) your repo and
make it available to me somehow, I'm happy to do the dumps and loads,
etc., to see if it speeds up blame for you.


Re: Making blame even faster

Posted by Hugh Gibson <hg...@cix.co.uk>.
> Hugh, are you interested in applying Daniel's patches and
> going through a dump/load cycle to see the effect on your
> execution times?

Unfortunately I don't have control of the server so it would be a bit 
difficult for me to put a new server version in place.

> If you'll recall, we weren't sure that your bottleneck was client-side.

I'm happy to do client-side tests - when I get some time.....

Hugh


Re: Making blame even faster

Posted by Mark Benedetto King <mb...@lowlatency.com>.
On Thu, Feb 10, 2005 at 09:06:00AM +0000, Hugh Gibson wrote:
> > I'm pretty sure that if i can make blame run at
> > reasonable speed, nobody will object to us switching.
> 
> See also http://svn.haxx.se/dev/archive-2005-01/0372.shtml
> 
> Hugh
> 

Hugh, are you interested in applying Daniel's patches and
going through a dump/load cycle to see the effect on your
execution times?   If you'll recall, we weren't sure
that your bottleneck was client-side.

--ben


Re: Making blame even faster

Posted by Hugh Gibson <hg...@cix.co.uk>.
> I'm pretty sure that if i can make blame run at
> reasonable speed, nobody will object to us switching.

See also http://svn.haxx.se/dev/archive-2005-01/0372.shtml

Hugh
