You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@subversion.apache.org by Daniel Berlin <db...@dberlin.org> on 2005/02/09 21:57:01 UTC

delta combiner sloooooooowwwwww

So i went to determine where most of the time on gcc blames were being
spent.

Out of 300 seconds taken on one file, roughly 65% of the time is being
spent composing deltas:

 43.71    115.64   115.64 95953710     0.00     0.00
search_offset_index
 13.34    150.93    35.29  3119295     0.00     0.00  copy_source_ops
  8.17    204.50    21.62     4746     0.00     0.04
svn_txdelta__compose_windows

Notice that it only composed 4746 delta windows (this file had 555
revisions with changes that it was getting fulltexts for), but it called
copy_source_ops over 3 million times, which results in over 95 million
calls to search_offset_index.

That seems wrong :)

On another file, where it takes 9 minutes to run blame:

 58.97   1191.25  1191.25 894014688     0.00     0.00
search_offset_index
 14.79   1490.09   298.84 12389033     0.00     0.00  copy_source_ops
  2.55   1757.91    51.44    19359     0.00     0.00
svn_txdelta__compose_windows
  2.15   1801.29    43.38 197575410     0.00     0.00
svn_txdelta__insert_op

This also seems excessive.

This is at -O2.

If delta combination takes longer than not, it's not a win :).  I tried
doing this without delta combining, but it seems that codepath will no
longer work (src_state never gets set anywhere, at least by the fsfs
get_contents handler, because the delta combiner never produces windows
with src_ops in them.  If you change it so that the windows have
src_ops, the code blows up because src_state isn't valid)

This is not on the receiving side, it's on the server side that this is
so slow.

In fact, composing the delta windows for one revision seems to take
almost half a second per revision.

This is true of our other files with many revisions as well.
On some files, it seems to take almost half a second per revision to
compose the delta windows.

There must be a way to make this code faster, but for the life of me, i
can't understand the code in question.

If someone could kindly make the delta combiner run at a reasonable
speed (or let it be disabled at least for blame) :P, i believe i'll be
able to move us gcc'ers over to svn :)

I can happily provide shell access to this machine and the repo, for
debugging/profilig/whatever.


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org

Re: delta combiner sloooooooowwwwww

Posted by Branko Čibej <br...@xbc.nu>.
Peter N. Lundblad wrote:

>On Thu, 10 Feb 2005, [UTF-8] Branko �^Libej wrote:
>
>  
>
>>Daniel Berlin wrote:
>>
>>    
>>
>>>The worst part, is that having stared at the delta combiner for quite a
>>>while myself, i'm actually not sure you can do anything about this
>>>either.  I believe that handling source ops from the second window (or
>>>target ops from the first) is just going to be O(n^2) because of what is
>>>*has* to do to make it work right.
>>>
>>>:(
>>>
>>>
>>>      
>>>
>>Yup. At least I'm glad I was right about that.  :-\
>>
>>We do know from other measurements that using the delta combiner is
>>better than just applying the deltas in series (generating intermediate
>>fulltexts). But that's not much consolation.
>>
>>Since we're looking specifically at blame, I suspect the best way to
>>improve perfoemance would be to do what I suggested earlier, that is,
>>teach get_file_revs to not recrete every fulltext from scratch. That
>>would reduce the whole blame computation from O(N^2) to O(N), modulo the
>>    
>>
>
>Can you clarify this a bit for someone who knows very much less about FS
>internals than you? How can get_file_revs avoid recreating fulltexts?
>  
>
You can't avoid recreating fulltexts, but you can certainly avoid 
combining a whole chain of deltas for each fulltext. If you have the 
fulltext for file F in revision R, there's a good chance  that you can 
get to F in R-1 by applying a single delta that's already stored in the 
repository

In hindsight, it seems that it was a mistake for get_file_revs to return 
the list in aacending revision order... As far as the blame calculation 
is concerned, the direction shouldn't matter.

>I did some experimentation with FSFS and noticed that consecutive file
>revisions often share a common delta predecessor (or how it is spelled).
>For example:
>rev 400's delta may be based on r360's delta, which is base on r200, then
>r80, then r2
>and r401 may be based on r386, then r360...
>(the numbers are faked, but you get what I mean)
>  
>
That's what skip deltas are about.

>So, since we're reconstructing the deltas in paralell, we could avoid some
>combining work, it looks like. Is that what you're talking about?
>  
>
Something along those lines, yes.

>>copy_source_ops anomaly. And we could avoid that if it could just use
>>the deltas that the FS already stores, where possible.
>>
>>    
>>
>Will that really be comon for consecutive revisions? I didn't see it in
>the GCC example repository. And fs_base stores deltas backwards...
>
>I'm surely missing something.
>  
>
I did say "where possible"  :-)

-- Brane


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org

Re: delta combiner sloooooooowwwwww

Posted by "Peter N. Lundblad" <pe...@famlundblad.se>.
On Fri, 11 Feb 2005, Daniel Berlin wrote:

> On Fri, 2005-02-11 at 09:03 +0100, Peter N. Lundblad wrote:
> > On Thu, 10 Feb 2005, [UTF-8] Branko ^Libej wrote:
> >
> > Can you clarify this a bit for someone who knows very much less about FS
> > internals than you? How can get_file_revs avoid recreating fulltexts?
> >
> It can't.
> However, i believe he's saying it could simple send the delta between
> the last revision it sent, and the current revision it is now sending
> (thus only sending a real fulltext for the first revision it sends)
> He is correct.
> However, it already does this :).
>
Yes, I know, since I wrote that code:-) But to be able to be able to
produce deltas between arbitrary revisions, it internally recreates
fulltexts. Look at the implementation of svn_fs_get_file_delta_stream.

> > I did some experimentation with FSFS and noticed that consecutive file
> > revisions often share a common delta predecessor (or how it is spelled).
> > For example:
> > rev 400's delta may be based on r360's delta, which is base on r200, then
> > r80, then r2
> > and r401 may be based on r386, then r360...
> > (the numbers are faked, but you get what I mean)
> > So, since we're reconstructing the deltas in paralell, we could avoid some
> > combining work, it looks like. Is that what you're talking about?
>
> No, that is more what skip deltas were designed to solve, so that you
> could bound the number of predecessors it would have to look at to find
> the deltas.
>
My point is that two consecutive revisions share a common predecessor
pretty soon. Currently, then predecessors for both files are walked and
deltas are combined until the beginning (a delta based on the empty
stream) is reached. Those two walks could share some work meaning less
delta combining. That doesn't solve the O(n^2) of the combiner ofcourse,
but it may make tha constant factor somewhat smaller.

Regards,
//Peter

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org

Re: delta combiner sloooooooowwwwww

Posted by Daniel Berlin <db...@dberlin.org>.
On Fri, 2005-02-11 at 09:03 +0100, Peter N. Lundblad wrote:
> On Thu, 10 Feb 2005, [UTF-8] Branko ^Libej wrote:
> 
> > Daniel Berlin wrote:
> >
> > >The worst part, is that having stared at the delta combiner for quite a
> > >while myself, i'm actually not sure you can do anything about this
> > >either.  I believe that handling source ops from the second window (or
> > >target ops from the first) is just going to be O(n^2) because of what is
> > >*has* to do to make it work right.
> > >
> > >:(
> > >
> > >
> > Yup. At least I'm glad I was right about that.  :-\
> >
> > We do know from other measurements that using the delta combiner is
> > better than just applying the deltas in series (generating intermediate
> > fulltexts). But that's not much consolation.
> >
> > Since we're looking specifically at blame, I suspect the best way to
> > improve perfoemance would be to do what I suggested earlier, that is,
> > teach get_file_revs to not recrete every fulltext from scratch. That
> > would reduce the whole blame computation from O(N^2) to O(N), modulo the
> 
> Can you clarify this a bit for someone who knows very much less about FS
> internals than you? How can get_file_revs avoid recreating fulltexts?
> 
It can't.
However, i believe he's saying it could simple send the delta between
the last revision it sent, and the current revision it is now sending
(thus only sending a real fulltext for the first revision it sends) 
He is correct.
However, it already does this :).

IE it could do

send fulltext of start
last = start

for each revision (current) you asked it for besides start:
	send delta between last and current to client
	last = current

Again, get_file_revs already does this though.


> I did some experimentation with FSFS and noticed that consecutive file
> revisions often share a common delta predecessor (or how it is spelled).
> For example:
> rev 400's delta may be based on r360's delta, which is base on r200, then
> r80, then r2
> and r401 may be based on r386, then r360...
> (the numbers are faked, but you get what I mean)
> So, since we're reconstructing the deltas in paralell, we could avoid some
> combining work, it looks like. Is that what you're talking about?

No, that is more what skip deltas were designed to solve, so that you
could bound the number of predecessors it would have to look at to find
the deltas.



---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org

Re: delta combiner sloooooooowwwwww

Posted by "Peter N. Lundblad" <pe...@famlundblad.se>.
On Thu, 10 Feb 2005, [UTF-8] Branko �^Libej wrote:

> Daniel Berlin wrote:
>
> >The worst part, is that having stared at the delta combiner for quite a
> >while myself, i'm actually not sure you can do anything about this
> >either.  I believe that handling source ops from the second window (or
> >target ops from the first) is just going to be O(n^2) because of what is
> >*has* to do to make it work right.
> >
> >:(
> >
> >
> Yup. At least I'm glad I was right about that.  :-\
>
> We do know from other measurements that using the delta combiner is
> better than just applying the deltas in series (generating intermediate
> fulltexts). But that's not much consolation.
>
> Since we're looking specifically at blame, I suspect the best way to
> improve perfoemance would be to do what I suggested earlier, that is,
> teach get_file_revs to not recrete every fulltext from scratch. That
> would reduce the whole blame computation from O(N^2) to O(N), modulo the

Can you clarify this a bit for someone who knows very much less about FS
internals than you? How can get_file_revs avoid recreating fulltexts?

I did some experimentation with FSFS and noticed that consecutive file
revisions often share a common delta predecessor (or how it is spelled).
For example:
rev 400's delta may be based on r360's delta, which is base on r200, then
r80, then r2
and r401 may be based on r386, then r360...
(the numbers are faked, but you get what I mean)
So, since we're reconstructing the deltas in paralell, we could avoid some
combining work, it looks like. Is that what you're talking about?

> copy_source_ops anomaly. And we could avoid that if it could just use
> the deltas that the FS already stores, where possible.
>
Will that really be comon for consecutive revisions? I didn't see it in
the GCC example repository. And fs_base stores deltas backwards...

I'm surely missing something.

Regards,
//Peter

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org


Re: delta combiner sloooooooowwwwww

Posted by Daniel Berlin <db...@dberlin.org>.
On Thu, 2005-02-10 at 10:46 -0500, Daniel Berlin wrote:
> > Yup. At least I'm glad I was right about that.  :-\
> > 
> > We do know from other measurements that using the delta combiner is 
> > better than just applying the deltas in series (generating intermediate 
> > fulltexts). But that's not much consolation.
> > 
> > Since we're looking specifically at blame, I suspect the best way to 
> > improve perfoemance would be to do what I suggested earlier, that is, 
> > teach get_file_revs to not recrete every fulltext from scratch.
> It doesn't though.
> Look at svn_repos_get_file_revs

I stupidly forgot to include the surrounding loop:


  /* Walk through the revisions in chronological order. */
  for (i = revnums->nelts; i > 0; --i)
    {
      svn_revnum_t rev = APR_ARRAY_IDX (revnums, i - 1, svn_revnum_t);
      const char *rev_path = APR_ARRAY_IDX (paths, i - 1, const char *);
      apr_hash_t *rev_props;
      apr_hash_t *props;
....

      /* Open the revision root. */
      SVN_ERR (svn_fs_revision_root (&root, repos->fs, rev, iter_pool));



> 
>       /* Compute and send delta if client asked for it.
>          Note that this was initialized to NULL, so if !
> contents_changed,
>          no deltas will be computed. */
>       if (delta_handler)
>         {
>           /* Get the content delta. */
>           SVN_ERR (svn_fs_get_file_delta_stream (&delta_stream,
>                                                  last_root, last_path,
>                                                  root, rev_path,
>                                                  iter_pool));
>           /* And send. */
>           SVN_ERR (svn_txdelta_send_txstream (delta_stream,
>                                               delta_handler,
> delta_baton,
>                                               iter_pool));
>         }
> 

...

      /* Remember root, path and props for next iteration. */
      last_root = root;
      last_path = rev_path;
      last_props = props;

      /* Swap the pools. */
      tmp_pool = iter_pool;
      iter_pool = last_pool;
      last_pool = tmp_pool;

}
> Note that it's sending a delta between the last revision it sent, and
> the current revision, and the client is recreating the fulltext by
> applying that delta to the last thing it received.
> 
> Thus, it only sends one actual fulltext.
> 


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org

Re: delta combiner sloooooooowwwwww

Posted by Daniel Berlin <db...@dberlin.org>.
> Yup. At least I'm glad I was right about that.  :-\
> 
> We do know from other measurements that using the delta combiner is 
> better than just applying the deltas in series (generating intermediate 
> fulltexts). But that's not much consolation.
> 
> Since we're looking specifically at blame, I suspect the best way to 
> improve perfoemance would be to do what I suggested earlier, that is, 
> teach get_file_revs to not recrete every fulltext from scratch.
It doesn't though.
Look at svn_repos_get_file_revs

      /* Compute and send delta if client asked for it.
         Note that this was initialized to NULL, so if !
contents_changed,
         no deltas will be computed. */
      if (delta_handler)
        {
          /* Get the content delta. */
          SVN_ERR (svn_fs_get_file_delta_stream (&delta_stream,
                                                 last_root, last_path,
                                                 root, rev_path,
                                                 iter_pool));
          /* And send. */
          SVN_ERR (svn_txdelta_send_txstream (delta_stream,
                                              delta_handler,
delta_baton,
                                              iter_pool));
        }

Note that it's sending a delta between the last revision it sent, and
the current revision, and the client is recreating the fulltext by
applying that delta to the last thing it received.

Thus, it only sends one actual fulltext.



---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org

Re: delta combiner sloooooooowwwwww

Posted by Branko Čibej <br...@xbc.nu>.
Daniel Berlin wrote:

>On Thu, 2005-02-10 at 12:30 +0100, Branko Čibej wrote:
>  
>
>>Branko Čibej wrote:
>>
>>    
>>
>
>  
>
>>The really horrible bit are the timings (in microseconds)
>>
>>Function                      Average   Min     Max
>>
>>svn_txdelta__compose_windows  47.17     0.07    297.67
>>copy_source_ops                0.05     0.04    308.98
>>search_offset_index            0.06     0.01      0.07
>>
>>
>>This shows that the index xearch is fairly stable, but copy_source_ops 
>>is not. Note that about 95% of all calls to copy_source_ops come from 
>>the recursion.
>>    
>>
>
>The worst part, is that having stared at the delta combiner for quite a
>while myself, i'm actually not sure you can do anything about this
>either.  I believe that handling source ops from the second window (or
>target ops from the first) is just going to be O(n^2) because of what is
>*has* to do to make it work right.
>
>:(
>  
>
Yup. At least I'm glad I was right about that.  :-\

We do know from other measurements that using the delta combiner is 
better than just applying the deltas in series (generating intermediate 
fulltexts). But that's not much consolation.

Since we're looking specifically at blame, I suspect the best way to 
improve perfoemance would be to do what I suggested earlier, that is, 
teach get_file_revs to not recrete every fulltext from scratch. That 
would reduce the whole blame computation from O(N^2) to O(N), modulo the 
copy_source_ops anomaly. And we could avoid that if it could just use 
the deltas that the FS already stores, where possible.

Deciding about a particular delta algorithm would then be a minor issue.

-- Brane


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org

Re: Delta directions (was Re: delta combiner sloooooooowwwwww)

Posted by Daniel Berlin <db...@dberlin.org>.
On Thu, 2005-02-10 at 13:22 -0500, Greg Hudson wrote:
> I hadn't read part of the "make blame even faster" thread when I sent
> this.  Having read it, I have one addendum:
> 
> On Thu, 2005-02-10 at 13:08, Greg Hudson wrote:
> > (Obviously not before 2.0, but I'm not sure if we can
> > realize any of these benefits before 2.0.)
> 
> As people pointed out in that other thread, we can realize the "faster
> delta combiner" benefit before 2.0, and it may be worth changing to
> xdelta in 1.2 in order to do so.  My biggest concern is that FSFS
> repositories with very shallow file histories may become significantly
> larger, since we will lose all self-compression.  Perhaps we can
> continue using vdelta for self-compression in FSFS? 

You'd probably also want to do it for sending over the wire for
checkout, so it doesn't end up sending fulltexts over the wire.

>  I'm not sure how much of the speed win (if any) we'd be giving up by doing that.

> 
> At any rate, the more desirable end state is to ditch svndiff and switch
> to a format which doesn't even allow random access, and to have a
> compression layer on the output of the delta routine.  That would let us
> recapture most or all of the space efficiency and would let us use
> deltas in other ways.  But getting to that end state is much more
> difficult than simply changing our delta algorithm from vdelta to
> xdelta.
> 

Yes.
I was, of course, looking for some immediate gratification so i don't
have to wait a year or more to possibly move gcc over to subversion
(assuming people will move :P)
IE make it fast enough for now, make it even better later.



---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org

Re: Delta directions (was Re: delta combiner sloooooooowwwwww)

Posted by Greg Hudson <gh...@MIT.EDU>.
On Sat, 2005-02-12 at 14:18, Branko Čibej wrote:
> Like I said on IRC, it's because I was wrong... the vdelta would be the 
> first in the combination chain, and it's full of target copies, so it 
> triggers the O(n^2) in the combiner from the start.

Well, we can get around this.  With Daniel's patch, an FSFS file history
looks like:

  xdelta --> xdelta --> xdelta --> vdelta --> empty

If we apply all the xdeltas to the vdeltas, we lose.  But if we instead
apply the vdelta to the empty stream and combine the rest of the deltas,
now the history looks like:

  xdelta --> xdelta --> xdelta --> <constructed text>

and we win.

FSFS already has code to deal with the "delta --> delta --> delta -->
plain" case, although it (a) doesn't get used in practice, and (b) isn't
quite generic enough for this purpose.  But it should be pretty simple
to fix it up.  Then we can get the full speed benefit of the xdelta
changeover without sacrificing plaintext compression, and without a
schema change.

(I already presented this idea on IRC, but since I'm not going to
implement it myself in the near future, I thought I should note it on
the list.)


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org


Re: Delta directions (was Re: delta combiner sloooooooowwwwww)

Posted by Branko Čibej <br...@xbc.nu>.
Daniel Berlin wrote:

>On Sat, 2005-02-12 at 03:45 +0100, Branko Čibej wrote:
>  
>
>>Greg Hudson wrote:
>>
>>    
>>
>>>I hadn't read part of the "make blame even faster" thread when I sent
>>>this.  Having read it, I have one addendum:
>>>
>>>On Thu, 2005-02-10 at 13:08, Greg Hudson wrote:
>>> 
>>>
>>>      
>>>
>>>>(Obviously not before 2.0, but I'm not sure if we can
>>>>realize any of these benefits before 2.0.)
>>>>   
>>>>
>>>>        
>>>>
>>>As people pointed out in that other thread, we can realize the "faster
>>>delta combiner" benefit before 2.0, and it may be worth changing to
>>>xdelta in 1.2 in order to do so.  My biggest concern is that FSFS
>>>repositories with very shallow file histories may become significantly
>>>larger, since we will lose all self-compression.  Perhaps we can
>>>continue using vdelta for self-compression in FSFS?  I'm not sure how
>>>much of the speed win (if any) we'd be giving up by doing that.
>>> 
>>>
>>>      
>>>
>>If we use vdelta strictly for self-compression of the first revision of 
>>a file in FSFS, it shouldn't be a problem. That delta would always be 
>>the last in any chain of combinations, but because the self-compressed 
>>vdelta doesn't contain source copy ops, the combiner should behave nicely.
>>    
>>
>
>I actually tried this.
>We still get a very large amount of copy_source_ops time according to
>oprofile (though a factor of 5 or 10 less than you do with vdelta every
>revision).
>
>This was with vdelta only used when source_len == 0
>
>I tested this with two repos. One where vdelta was enabled if source_len
>== 0,and one where vdelta was not enabled for sotring to the repo, but
>was for compressing fulltexts that went over the wire (through a small
>hack to add an argument to compose_window, and disable vdelta for the
>tpush_* call to compose_window, which is only used by fsfs when storing
>to the repo).
>
>I verified that vdelta was happening only during over-the-wire transmits
>by aborting if it triggered vdelta, doing a repository load to make sure
>it didn't happen during that, and then making sure it *did* abort when i
>did a checkout or something that was going to send fulltexts.
>
>No clue why the combiner still behaves badly but i can provide you with
>repos and patches to reproduce.
>  
>
Like I said on IRC, it's because I was wrong... the vdelta would be the 
first in the combination chain, and it's full of target copies, so it 
triggers the O(n^2) in the combiner from the start.

-- Brane


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org

Re: Delta directions (was Re: delta combiner sloooooooowwwwww)

Posted by Daniel Berlin <db...@dberlin.org>.
On Sat, 2005-02-12 at 03:45 +0100, Branko Čibej wrote:
> Greg Hudson wrote:
> 
> >I hadn't read part of the "make blame even faster" thread when I sent
> >this.  Having read it, I have one addendum:
> >
> >On Thu, 2005-02-10 at 13:08, Greg Hudson wrote:
> >  
> >
> >>(Obviously not before 2.0, but I'm not sure if we can
> >>realize any of these benefits before 2.0.)
> >>    
> >>
> >
> >As people pointed out in that other thread, we can realize the "faster
> >delta combiner" benefit before 2.0, and it may be worth changing to
> >xdelta in 1.2 in order to do so.  My biggest concern is that FSFS
> >repositories with very shallow file histories may become significantly
> >larger, since we will lose all self-compression.  Perhaps we can
> >continue using vdelta for self-compression in FSFS?  I'm not sure how
> >much of the speed win (if any) we'd be giving up by doing that.
> >  
> >
> If we use vdelta strictly for self-compression of the first revision of 
> a file in FSFS, it shouldn't be a problem. That delta would always be 
> the last in any chain of combinations, but because the self-compressed 
> vdelta doesn't contain source copy ops, the combiner should behave nicely.

I actually tried this.
We still get a very large amount of copy_source_ops time according to
oprofile (though a factor of 5 or 10 less than you do with vdelta every
revision).

This was with vdelta only used when source_len == 0

I tested this with two repos. One where vdelta was enabled if source_len
== 0,and one where vdelta was not enabled for sotring to the repo, but
was for compressing fulltexts that went over the wire (through a small
hack to add an argument to compose_window, and disable vdelta for the
tpush_* call to compose_window, which is only used by fsfs when storing
to the repo).

I verified that vdelta was happening only during over-the-wire transmits
by aborting if it triggered vdelta, doing a repository load to make sure
it didn't happen during that, and then making sure it *did* abort when i
did a checkout or something that was going to send fulltexts.

No clue why the combiner still behaves badly but i can provide you with
repos and patches to reproduce.

--Dan



---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org


Re: Delta directions (was Re: delta combiner sloooooooowwwwww)

Posted by Branko Čibej <br...@xbc.nu>.
Greg Hudson wrote:

>I hadn't read part of the "make blame even faster" thread when I sent
>this.  Having read it, I have one addendum:
>
>On Thu, 2005-02-10 at 13:08, Greg Hudson wrote:
>  
>
>>(Obviously not before 2.0, but I'm not sure if we can
>>realize any of these benefits before 2.0.)
>>    
>>
>
>As people pointed out in that other thread, we can realize the "faster
>delta combiner" benefit before 2.0, and it may be worth changing to
>xdelta in 1.2 in order to do so.  My biggest concern is that FSFS
>repositories with very shallow file histories may become significantly
>larger, since we will lose all self-compression.  Perhaps we can
>continue using vdelta for self-compression in FSFS?  I'm not sure how
>much of the speed win (if any) we'd be giving up by doing that.
>  
>
If we use vdelta strictly for self-compression of the first revision of 
a file in FSFS, it shouldn't be a problem. That delta would always be 
the last in any chain of combinations, but because the self-compressed 
vdelta doesn't contain source copy ops, the combiner should behave nicely.

-- Brane


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org

Re: Delta directions (was Re: delta combiner sloooooooowwwwww)

Posted by Greg Hudson <gh...@MIT.EDU>.
I hadn't read part of the "make blame even faster" thread when I sent
this.  Having read it, I have one addendum:

On Thu, 2005-02-10 at 13:08, Greg Hudson wrote:
> (Obviously not before 2.0, but I'm not sure if we can
> realize any of these benefits before 2.0.)

As people pointed out in that other thread, we can realize the "faster
delta combiner" benefit before 2.0, and it may be worth changing to
xdelta in 1.2 in order to do so.  My biggest concern is that FSFS
repositories with very shallow file histories may become significantly
larger, since we will lose all self-compression.  Perhaps we can
continue using vdelta for self-compression in FSFS?  I'm not sure how
much of the speed win (if any) we'd be giving up by doing that.

At any rate, the more desirable end state is to ditch svndiff and switch
to a format which doesn't even allow random access, and to have a
compression layer on the output of the delta routine.  That would let us
recapture most or all of the space efficiency and would let us use
deltas in other ways.  But getting to that end state is much more
difficult than simply changing our delta algorithm from vdelta to
xdelta.


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org

Re: Delta directions (was Re: delta combiner sloooooooowwwwww)

Posted by Branko Čibej <br...@xbc.nu>.
Greg Hudson wrote:

>By sacrificing random access, we could get the following benefits:
>
>  * We could get rid of delta windowing, which is a source of
>complexity.  (Obviously not before 2.0, but I'm not sure if we can
>realize any of these benefits before 2.0.)  Eliminating windowing also
>means we could get much better results for some changes, like "insert
>128K of data at the beginning of the file".
>  
>
I'm not sure that's the case. Xdelta and diff both use windows because 
they'd otherwise have to index all the source and target data at once, 
and streaminess is still a requirement. The restriction might possibly 
make it easier to implement sliding windows, though.

-- Brane


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org

Delta directions (was Re: delta combiner sloooooooowwwwww)

Posted by Greg Hudson <gh...@MIT.EDU>.
On Thu, 2005-02-10 at 10:12, Daniel Berlin wrote:
> The worst part, is that having stared at the delta combiner for quite a
> while myself, i'm actually not sure you can do anything about this
> either.  I believe that handling source ops from the second window (or
> target ops from the first) is just going to be O(n^2) because of what is
> *has* to do to make it work right.

Perhaps the long-term answer is to stop using random-access deltas, and
instead use a generic compression algorithm on the output of our delta
routine.

I'll explain that in greater detail for those who aren't Dan Berlin and
Branko.  Right now our deltas are allowed to use instructions like:

  Copy N bytes from location <x> in source
  Copy N bytes from location <x> in already-reconstructed target
  Copy N bytes from new data, and advance new data pointer

(There's no offset for a "new data" copy, because "copy from
already-reconstructed target" handles the case of wanting to copy
earlier bytes from the new data area.)  Compare this to gnu diff, which
is allowed to use instructions like:

  Copy N lines from source, and advance source pointer
  Skip N lines from source (advance source pointer w/o copying)
  Copy N lines from new data, and advance new data pointer

The difference is not just that gnu diff uses lines and we use bytes;
it's that we're allowed to use random access to the source and target,
while gnu diff is not.  Random access allows us to produce more
efficient deltas--for instance, it allows us to get some
self-compression if the delta source is the empty stream.  So if we were
to sacrifice random access, we'd want to compress the output of the
delta routine in order to get savings in the "new data" segment.  The
end result might be more or less space efficient than what we have now.

By sacrificing random access, we could get the following benefits:

  * We could get rid of delta windowing, which is a source of
complexity.  (Obviously not before 2.0, but I'm not sure if we can
realize any of these benefits before 2.0.)  Eliminating windowing also
means we could get much better results for some changes, like "insert
128K of data at the beginning of the file".

  * We could start using deltas for the blame algorithm, like CVS does,
although it would be a little trickier.  Skip-deltas might totally
negate this benefit, unfortunately.

  * Our delta combiner would have an easier time working efficiently.

Of course, if we eliminate random access, then we need a completely
different binary delta algorithm.  But it sounds like xdelta may be such
an algorithm and it can be implemented in a small amount of code, so
that may not be a problem.


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org

Re: delta combiner sloooooooowwwwww

Posted by Daniel Berlin <db...@dberlin.org>.
On Thu, 2005-02-10 at 12:30 +0100, Branko Čibej wrote:
> Branko Čibej wrote:
> 

> The really horrible bit are the timings (in microseconds)
> 
> Function                      Average   Min     Max
> 
> svn_txdelta__compose_windows  47.17     0.07    297.67
> copy_source_ops                0.05     0.04    308.98
> search_offset_index            0.06     0.01      0.07
> 
> 
> This shows that the index xearch is fairly stable, but copy_source_ops 
> is not. Note that about 95% of all calls to copy_source_ops come from 
> the recursion.

The worst part, is that having stared at the delta combiner for quite a
while myself, i'm actually not sure you can do anything about this
either.  I believe that handling source ops from the second window (or
target ops from the first) is just going to be O(n^2) because of what is
*has* to do to make it work right.

:(

--Dan


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org


Re: delta combiner sloooooooowwwwww

Posted by Branko Čibej <br...@xbc.nu>.
Branko Čibej wrote:

> Florian Weimer wrote:
>
>> * Branko Čibej:
>>
>>  
>>
>>> No, the delta combiner isn't slow. And your data seem to be skewed, 
>>> as I get similar call counts but quite different timing results (I'm 
>>> using an instrumenting profiler, not a sampling one). Running with 
>>> current trunk, with both the WC and the repo on a RAM disk, 
>>> measuring only svnserve and ignoring network I/O I get this for rtl.h:
>>>   
>>
>>
>>  
>>
>>> svn_fs_bdb__string_read         17273    0.05    17.12
>>>   
>>
>>
>> I believe Daniel is using the FSFS backend.  Maybe this explains the
>> difference?
>>  
>>
> There's not all that much difference. I think the problem is that 
> rtl.h is way too simple a test. Sigh...
>
> I'll run tests on FSFS with combine.c now.

And indeed...

Function                      Calls       F%     F+D%

svn_txdelta__compose_windows      19359    0.76   72.87
copy_source_ops               447007344   17.91   68.37
search_offset_index           894014688   43.07   43.07


The really horrible bit are the timings (in microseconds)

Function                      Average   Min     Max

svn_txdelta__compose_windows  47.17     0.07    297.67
copy_source_ops                0.05     0.04    308.98
search_offset_index            0.06     0.01      0.07


This shows that the index xearch is fairly stable, but copy_source_ops 
is not. Note that about 95% of all calls to copy_source_ops come from 
the recursion.

-- Brane


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org

Re: delta combiner sloooooooowwwwww

Posted by Branko Čibej <br...@xbc.nu>.
Florian Weimer wrote:

>* Branko Čibej:
>
>  
>
>>No, the delta combiner isn't slow. And your data seem to be skewed, as I 
>>get similar call counts but quite different timing results (I'm using an 
>>instrumenting profiler, not a sampling one). Running with current trunk, 
>>with both the WC and the repo on a RAM disk, measuring only svnserve and 
>>ignoring network I/O I get this for rtl.h:
>>    
>>
>
>  
>
>>svn_fs_bdb__string_read         17273	0.05	17.12
>>    
>>
>
>I believe Daniel is using the FSFS backend.  Maybe this explains the
>difference?
>  
>
There's not all that much difference. I think the problem is that rtl.h 
is way too simple a test. Sigh...

I'll run tests on FSFS with combine.c now.

-- Brane


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org

Re: delta combiner sloooooooowwwwww

Posted by Florian Weimer <fw...@deneb.enyo.de>.
* Branko Čibej:

> No, the delta combiner isn't slow. And your data seem to be skewed, as I 
> get similar call counts but quite different timing results (I'm using an 
> instrumenting profiler, not a sampling one). Running with current trunk, 
> with both the WC and the repo on a RAM disk, measuring only svnserve and 
> ignoring network I/O I get this for rtl.h:

> svn_fs_bdb__string_read         17273	0.05	17.12

I believe Daniel is using the FSFS backend.  Maybe this explains the
difference?

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org


Re: delta combiner sloooooooowwwwww

Posted by Branko Čibej <br...@xbc.nu>.
Daniel Berlin wrote:

>I have placed a test repo dump file that should show this problem at
>http://www.toolchain.org/~dberlin/dumpfile.bz2
>
>It's a bit over 1 meg compressed
>
>svn blame on trunk/combine.c or trunk/rtl.h show the problems below
>(rtl.h is the faster of the two)
>
>
>On Wed, 2005-02-09 at 16:57 -0500, Daniel Berlin wrote:
>  
>
>>So i went to determine where most of the time on gcc blames were being
>>spent.
>>
>>Out of 300 seconds taken on one file, roughly 65% of the time is being
>>spent composing deltas:
>>
>> 43.71    115.64   115.64 95953710     0.00     0.00
>>search_offset_index
>> 13.34    150.93    35.29  3119295     0.00     0.00  copy_source_ops
>>  8.17    204.50    21.62     4746     0.00     0.04
>>svn_txdelta__compose_windows
>>    
>>
No, the delta combiner isn't slow. And your data seem to be skewed, as I 
get similar call counts but quite different timing results (I'm using an 
instrumenting profiler, not a sampling one). Running with current trunk, 
with both the WC and the repo on a RAM disk, measuring only svnserve and 
ignoring network I/O I get this for rtl.h:

Function                        Calls	F%	F+D%

compose_handler                 13464	0.00	33.31
svn_txdelta__compose_windows     5130	3.40	33.15
compute_window                    535	0.00	20.52
svn_txdelta__vdelta               535	0.00	20.52
vdelta                           1070	17.78	17.80
svn_fs_bdb__string_read         17273	0.05	17.12
copy_source_ops               5091068	3.86	14.32
svn_txdelta__insert_op       13710294	10.11	11.07
apr_md5_update                  12849	0.16	9.13
MD5Transform                  1944240	6.79	8.97
locate_key                      19953	0.15	6.48
search_offset_index          10182136	5.77	5.77


(F% is time spent in the function itself, F+D% is the time used by the 
function and its children)
Note that I'm looking at code compiled with optimisation, which could be 
quite important in search_offset_index, for example.

I don't think there's much that can be done for reducing computations 
for a single delta combination. Oh, we could optimise away some memcpy's 
and allocations, at the price of considerably more complicated 
structures, but that would lop off a few percent.

The trouble lies in the number of times svn_txdelta__compose_windows is 
called for a mere 536 revisions of rtl.h (the file is small enough to 
fit into a single window). If I read the code correctly, that's because 
the blame computation rebuilds the fulltext for each revision from 
scratch, instead of cacheing the previous fulltext and constructing the 
new one from it (using the delta that's often already in the 
repository). That's a bit tricky in the presence of skip deltas, but the 
main problem is that the FS doesn't have even a private API that would 
take a revision+text and return the text for revision+1, so blame 
currenly doesn't have much choice here.

If that part were optimised, that would reduce the time by about 20%. If 
blame were smart enough to not recompute deltas that it can get straight 
from the FS, we'd probably gain another 10%.

-- Brane


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org

Re: delta combiner sloooooooowwwwww

Posted by Daniel Berlin <db...@dberlin.org>.
I have placed a test repo dump file that should show this problem at
http://www.toolchain.org/~dberlin/dumpfile.bz2

It's a bit over 1 meg compressed

svn blame on trunk/combine.c or trunk/rtl.h show the problems below
(rtl.h is the faster of the two)


On Wed, 2005-02-09 at 16:57 -0500, Daniel Berlin wrote:
> So i went to determine where most of the time on gcc blames were being
> spent.
> 
> Out of 300 seconds taken on one file, roughly 65% of the time is being
> spent composing deltas:
> 
>  43.71    115.64   115.64 95953710     0.00     0.00
> search_offset_index
>  13.34    150.93    35.29  3119295     0.00     0.00  copy_source_ops
>   8.17    204.50    21.62     4746     0.00     0.04
> svn_txdelta__compose_windows



---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org