Posted to dev@subversion.apache.org by Johan Corveleyn <jc...@gmail.com> on 2011/10/01 01:15:00 UTC

Re: svn merge operation extremely slow

On Fri, Sep 30, 2011 at 3:29 PM, Kyle Leber <ky...@gmail.com> wrote:
> I've encountered what I think is a problem with subversion, but I'm not
> completely sure (and according to the online instructions I should bring it
> up here prior to filing a bug).

Actually, the instructions on
http://subversion.apache.org/issue-tracker.html say that you should
send your report to users@, not dev@. So I'm adding users@. Please
drop dev@ from any further replies.

> Basically, we're trying to merge a rather large collection of fixes back in
> our trunk.  I check out a fresh copy of the trunk, then use the merge
> syntax:
> svn merge https://path/to/my/branch .
>
> This generally churns along just fine, but we occasionally get hung up on
> medium sized binary files where the svn client jumps to 100% cpu usage and
> sits on it for 3+ hours before moving on to the next file.  These files are
> anywhere from 3-10MB in size, so not ridiculously huge.  We generally have
> these files marked as octet stream, but changing to text did not help the
> situation when we tried that.
>
> I did find an old forum discussion about a potential issue that could be
> related.  I was wondering if this was ever addressed and could it still be
> the same problem.  Link is here:
> http://www.svnforum.org/threads/36123-Slow-SVN-merge
>
> I'm using svn client 1.6.12.  I looked at the online change log up through
> the 1.7 alphas and didn't see any bug fixes that sounded relevant.

This could be a relevant change (listed in the 1.7 release notes, not
in the change log):
http://subversion.apache.org/docs/release-notes/1.7.html#diff-optimizations

Can you please try one of the 1.7 pre-release binaries, and see if it
helps? See http://subversion.apache.org/packages.html#pre-release

Cheers,
-- 
Johan

Re: svn merge operation extremely slow

Posted by David Chapman <dc...@acm.org>.
On 10/2/2011 2:08 PM, Kyle Leber wrote:
> I was able to capture a profile from svn (after remembering I have to
> link statically).  I compiled with "-pg -O0".  Here is the top of the file:
>
> Each sample counts as 0.01 seconds.
>   %   cumulative   self              self     total
>  time   seconds   seconds    calls   s/call   s/call  name
>  88.88    133.49   133.49 2002891836     0.00     0.00  svn_diff__snake
>   5.97    142.45     8.96       56     0.16     2.54  svn_diff(long, char, short)
>   1.98    145.42     2.97  4163001     0.00     0.00  MD5Transform
>   0.41    146.04     0.62  4163001     0.00     0.00  Decode
>
> Is it OK to attach the full report to this user list?  The resulting 
> text file is 1.3MB and I wasn't sure if the list would tolerate an 
> attachment of that size.
>

It's a weekend, so you might not get a lot of replies from people who 
know SVN source code, but it's likely that the full report won't be 
needed.  There are two billion (!) calls to svn_diff__snake(), and the 
question is why there are so many.  It might help the devs if you pasted 
in the entries for functions which directly called svn_diff__snake() 
(quite possibly svn_diff() only) and perhaps the functions which 
svn_diff__snake() called directly (none of any significance, if I read 
the above report correctly).  This should be only a few dozen lines of 
the report.  Note that you'll have to trace through the report (the 
top-level function is listed first, followed by its children, 
grandchildren, etc.) to find the entries for these functions.

I have a suspicion that one of the devs will be able to identify the 
issue from just the above report, but a little more information might 
turn out to be helpful.  They certainly won't need to see information 
for all of the zillion functions in SVN.
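
To put those call counts in perspective, the averages implied by the
flat profile can be worked out directly.  What follows is only a
back-of-the-envelope sketch in C: the two function names are made up
for illustration and are not part of SVN or gprof, and the inputs are
the figures quoted above (2002891836 calls to svn_diff__snake(), 56
calls to its caller, 133.49 seconds of self time).

```c
#include <assert.h>

/* Hypothetical helper (not part of SVN or gprof): average number of
 * child calls made per invocation of the parent function. */
double calls_per_invocation(unsigned long long child_calls,
                            unsigned long long parent_invocations)
{
    assert(parent_invocations > 0);
    return (double)child_calls / (double)parent_invocations;
}

/* Hypothetical helper: average self time of a single call, in
 * nanoseconds, from the "self seconds" and "calls" columns. */
double nanoseconds_per_call(double self_seconds, unsigned long long calls)
{
    assert(calls > 0);
    return self_seconds * 1e9 / (double)calls;
}
```

With the numbers above, calls_per_invocation(2002891836ULL, 56ULL)
comes to roughly 35.8 million snake calls per diff, and
nanoseconds_per_call(133.49, 2002891836ULL) to roughly 67 ns per call
(probably inflated by the "-pg -O0" build).  Each call is cheap; it is
the number of calls that costs the time.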

-- 
     David Chapman         dcchapman@acm.org
     Chapman Consulting -- San Jose, CA


Re: svn merge operation extremely slow

Posted by Daniel Shahaf <d....@daniel.shahaf.name>.
Kyle Leber wrote on Mon, Oct 03, 2011 at 08:16:53 -0400:
> On Mon, Oct 3, 2011 at 4:10 AM, Johan Corveleyn <jc...@gmail.com> wrote:
> >  (2) If the "base" of the merge target doesn't have a mime-type, look
> > if it has one on the "actual" node (the uncommitted local
> > modifications). If that's binary, then we'll go binary.
> >
> >  (3) Else: text merge

I stand corrected.

Re: svn merge operation extremely slow

Posted by Kyle Leber <ky...@gmail.com>.
On Mon, Oct 3, 2011 at 9:55 AM, Johan Corveleyn <jc...@gmail.com> wrote:

> On Mon, Oct 3, 2011 at 3:04 PM, Stefan Sperling <st...@elego.de> wrote:
> > On Mon, Oct 03, 2011 at 02:59:25PM +0200, Johan Corveleyn wrote:
> >> On Mon, Oct 3, 2011 at 2:35 PM, Johan Corveleyn <jc...@gmail.com> wrote:
> >> > On Mon, Oct 3, 2011 at 2:16 PM, Kyle Leber <ky...@gmail.com> wrote:
> >> >> I set the mime-type to "application/octet-stream" in the working copy prior
> >> >> to merge and this fixed the problem.  No more heavy CPU usage or excessive
> >> >> time spent on the file.
> >> >
> >> > I'm glad it helped. Apart from the performance, it's important that
> >> > svn does this merge the "binary way", because as you said line-based
> >> > merges are not correct for this file.
> >>
> >> It may also interest you (and other readers of this thread) that there
> >> is an open enhancement request for making text-merges take the same
> >> shortcut as binary-merges (if mine == merge-left then set merged :=
> >> merge-right), to avoid expensive diffing [1]. But that hasn't been
> >> addressed yet.
> >>
> >>
> >> [1] http://subversion.tigris.org/issues/show_bug.cgi?id=4009 : Big
> >> trivial text files merged MUCH slower than binary - pls optimize.
> >>
> >
> > I think we should also file an issue about the problem discussed
> > in this thread. svn should take properties on the left/right side of the
> > merge into account when determining whether to treat a file as binary.
> > I guess it should run the binary merge if any of left, right, or the
> > target are marked as binary.
>
> Yes, maybe you're right. I don't know the specifics / historics of
> this behavior (maybe there is a reason for this?). But on the surface
> it looks like it should indeed do a binary merge if either one of
> left, right or target is marked as binary.
>
> Even if #4009 would be addressed, it would still make a difference in
> the situation where the shortcut-condition (mine == merge-left)
> doesn't hold. In that case, I think the "binary-merge" would always
> flag a conflict (because it can't do a line-based merge). Is that also
> the behavior we want f.i. if only merge-left (or only merge-right)
> were marked as binary, and all the other "players" are marked as text?
> I guess it's the safest thing to do ...
>
> --
> Johan
>

Did someone already file this issue, or do you need me to?  I have never
done this before, but can certainly make an attempt if this is what's
needed.

- Kyle

Re: svn merge operation extremely slow

Posted by Johan Corveleyn <jc...@gmail.com>.
On Mon, Oct 3, 2011 at 3:04 PM, Stefan Sperling <st...@elego.de> wrote:
> On Mon, Oct 03, 2011 at 02:59:25PM +0200, Johan Corveleyn wrote:
>> On Mon, Oct 3, 2011 at 2:35 PM, Johan Corveleyn <jc...@gmail.com> wrote:
>> > On Mon, Oct 3, 2011 at 2:16 PM, Kyle Leber <ky...@gmail.com> wrote:
>> >> I set the mime-type to "application/octet-stream" in the working copy prior
>> >> to merge and this fixed the problem.  No more heavy CPU usage or excessive
>> >> time spent on the file.
>> >
>> > I'm glad it helped. Apart from the performance, it's important that
>> > svn does this merge the "binary way", because as you said line-based
>> > merges are not correct for this file.
>>
>> It may also interest you (and other readers of this thread) that there
>> is an open enhancement request for making text-merges take the same
>> shortcut as binary-merges (if mine == merge-left then set merged :=
>> merge-right), to avoid expensive diffing [1]. But that hasn't been
>> addressed yet.
>>
>>
>> [1] http://subversion.tigris.org/issues/show_bug.cgi?id=4009 : Big
>> trivial text files merged MUCH slower than binary - pls optimize.
>>
>
> I think we should also file an issue about the problem discussed
> in this thread. svn should take properties on the left/right side of the
> merge into account when determining whether to treat a file as binary.
> I guess it should run the binary merge if any of left, right, or the
> target are marked as binary.

Yes, maybe you're right. I don't know the specifics or the history of
this behavior (maybe there is a reason for it?). But on the surface
it looks like it should indeed do a binary merge if any one of
left, right or target is marked as binary.

Even if #4009 were addressed, this change would still make a difference
in the situation where the shortcut condition (mine == merge-left)
doesn't hold. In that case, I think the "binary-merge" would always
flag a conflict (because it can't do a line-based merge). Is that also
the behavior we want, for instance, if only merge-left (or only
merge-right) were marked as binary, and all the other "players" are
marked as text? I guess it's the safest thing to do ...

-- 
Johan

Re: svn merge operation extremely slow

Posted by Stefan Sperling <st...@elego.de>.
On Mon, Oct 03, 2011 at 02:59:25PM +0200, Johan Corveleyn wrote:
> On Mon, Oct 3, 2011 at 2:35 PM, Johan Corveleyn <jc...@gmail.com> wrote:
> > On Mon, Oct 3, 2011 at 2:16 PM, Kyle Leber <ky...@gmail.com> wrote:
> >> I set the mime-type to "application/octet-stream" in the working copy prior
> >> to merge and this fixed the problem.  No more heavy CPU usage or excessive
> >> time spent on the file.
> >
> > I'm glad it helped. Apart from the performance, it's important that
> > svn does this merge the "binary way", because as you said line-based
> > merges are not correct for this file.
> 
> It may also interest you (and other readers of this thread) that there
> is an open enhancement request for making text-merges take the same
> shortcut as binary-merges (if mine == merge-left then set merged :=
> merge-right), to avoid expensive diffing [1]. But that hasn't been
> addressed yet.
> 
> 
> [1] http://subversion.tigris.org/issues/show_bug.cgi?id=4009 : Big
> trivial text files merged MUCH slower than binary - pls optimize.
> 

I think we should also file an issue about the problem discussed
in this thread. svn should take properties on the left/right side of the
merge into account when determining whether to treat a file as binary.
I guess it should run the binary merge if any of left, right, or the
target are marked as binary.
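
A minimal sketch of the rule proposed here, in C.  This is not
existing Subversion code: struct merge_sides, is_binary_type() and
should_merge_as_binary() are hypothetical names, and the "text/"
prefix test is only a simplified stand-in for
svn_mime_type_is_binary().  A NULL field means that side has no
svn:mime-type property.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical container for the svn:mime-type value on each side of
 * the merge; NULL means the property is not set on that side. */
struct merge_sides {
    const char *left_mime;
    const char *right_mime;
    const char *target_mime;
};

/* Simplified stand-in for svn_mime_type_is_binary(): an unset type is
 * not binary, and any type outside "text/" is treated as binary. */
bool is_binary_type(const char *mime_type)
{
    if (mime_type == NULL)
        return false;
    return strncmp(mime_type, "text/", 5) != 0;
}

/* Proposed rule: take the binary route if ANY of the three sides is
 * marked as binary, not only the merge target. */
bool should_merge_as_binary(const struct merge_sides *sides)
{
    assert(sides != NULL);
    return is_binary_type(sides->left_mime)
        || is_binary_type(sides->right_mime)
        || is_binary_type(sides->target_mime);
}
```

With right_mime set to "application/octet-stream" and the other two
fields NULL, should_merge_as_binary() returns true, so a file that is
marked binary on only one side of the merge would no longer fall
through to the text merge.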

Re: svn merge operation extremely slow

Posted by Johan Corveleyn <jc...@gmail.com>.
On Mon, Oct 3, 2011 at 3:02 PM, Daniel Shahaf <d....@daniel.shahaf.name> wrote:
> Johan Corveleyn wrote on Mon, Oct 03, 2011 at 14:59:25 +0200:
>> It may also interest you (and other readers of this thread) that there
>> is an open enhancement request for making text-merges take the same
>> shortcut as binary-merges (if mine == merge-left then set merged :=
>> merge-right), to avoid expensive diffing [1]. But that hasn't been
>> addressed yet.
>>
>>
>> [1] http://subversion.tigris.org/issues/show_bug.cgi?id=4009 : Big
>> trivial text files merged MUCH slower than binary - pls optimize.
>>
>
> Isn't "Set the svn:mime-type property locally, and revert it before
> commit" a workaround for that?

Yes, it would seem so. Though it may not be very helpful in lots of
situations (because people only discover the problem after they have
waited out a merge of several hours). Still, it's useful information
to work around it (maybe people can detect the problem in some wrapper
scripts, ahead of merging), so maybe you should add it to the issue
tracker.

-- 
Johan

Re: svn merge operation extremely slow

Posted by Daniel Shahaf <d....@daniel.shahaf.name>.
Johan Corveleyn wrote on Mon, Oct 03, 2011 at 14:59:25 +0200:
> It may also interest you (and other readers of this thread) that there
> is an open enhancement request for making text-merges take the same
> shortcut as binary-merges (if mine == merge-left then set merged :=
> merge-right), to avoid expensive diffing [1]. But that hasn't been
> addressed yet.
> 
> 
> [1] http://subversion.tigris.org/issues/show_bug.cgi?id=4009 : Big
> trivial text files merged MUCH slower than binary - pls optimize.
> 

Isn't "Set the svn:mime-type property locally, and revert it before
commit" a workaround for that?

> -- 
> Johan

Re: svn merge operation extremely slow

Posted by Johan Corveleyn <jc...@gmail.com>.
On Mon, Oct 3, 2011 at 2:35 PM, Johan Corveleyn <jc...@gmail.com> wrote:
> On Mon, Oct 3, 2011 at 2:16 PM, Kyle Leber <ky...@gmail.com> wrote:
>>
>>
>> On Mon, Oct 3, 2011 at 4:10 AM, Johan Corveleyn <jc...@gmail.com> wrote:
>>>
>>> [ Again: please don't top-post on this list. I'm moving your reply to
>>> the bottom. More below ... ]
>>>
>>> On Mon, Oct 3, 2011 at 2:24 AM, Kyle Leber <ky...@gmail.com> wrote:
>>> > On Sun, Oct 2, 2011 at 8:10 PM, Daniel Shahaf <d....@daniel.shahaf.name>
>>> > wrote:
>>> >>
>>> >> Kyle Leber wrote on Sun, Oct 02, 2011 at 20:05:19 -0400:
>>> >> > Johan,
>>> >> >
>>> >> > I did a little more digging.  There were a few different places where svn
>>> >> > seems to get hung up so I ran the gprof report on just the first one (the
>>> >> > merge takes hours otherwise).  In this particular case, svn prints out that
>>> >> > it is merging from a small text file while it is hanging for more than a
>>> >> > minute @ 100% CPU.  When I examine "lsof", however, I see it actually has a
>>> >> > different file open.  This one is a large (15 MB) "binary" file.  It turns
>>> >> > out this binary file did not have a property in the trunk (which I think
>>> >> > means it's treated as text, right?).  But in the branch it was marked as
>>> >> > octet stream.   So perhaps svn is doing a text-based diff on this binary
>>> >> > file because it used to be incorrectly marked as text?
>>> >> >
>>> >>
>>> >> If either side is marked as binary then svn will defer to the "Use
>>> >> merge-right if merge-left == base, else conflict" algorithm.
>>> >>
>>> >> Could you share the value of 'svn proplist --verbose' on both files?
>>> >>
>>> > Yup, trunk version has empty properties
>>> > branch version has:
>>> >
>>> > svn:mime-type
>>> >     application/octet-stream
>>> >
>>>
>>> What is the merge target? Is it a trunk working copy (the one without
>>> mime-type), or a branch working copy (with
>>> svn:mime-type=application/octet-stream)?
>>>
>>> I think it's the mime-type of the merge target that determines if
>>> merge will take the "binary" route, or the "text" route. See this
>>> snippet from libsvn_wc/merge.c [1] (in the function
>>> svn_wc__internal_merge):
>>>
>>> [[[
>>>  /* Decide if the merge target is a text or binary file. */
>>>  if ((mimeprop = get_prop(&mt, SVN_PROP_MIME_TYPE))
>>>      && mimeprop->value)
>>>    is_binary = svn_mime_type_is_binary(mimeprop->value->data);
>>>  else
>>>    {
>>>      const char *value = svn_prop_get_value(mt.actual_props,
>>>                                             SVN_PROP_MIME_TYPE);
>>>
>>>      is_binary = value && svn_mime_type_is_binary(value);
>>>    }
>>> ]]]
>>>
>>> (mt is the merge target)
>>>
>>> I'm not terribly familiar with this part of the codebase. But on first
>>> sight, this seems to say:
>>>
>>>  (1) Look at the mime-type of the "base version" of the merge target.
>>> If that's binary, then we'll go binary.
>>>
>>>  (2) If the "base" of the merge target doesn't have a mime-type, look
>>> if it has one on the "actual" node (the uncommitted local
>>> modifications). If that's binary, then we'll go binary.
>>>
>>>  (3) Else: text merge
>>>
>>> So I'm guessing that you're merging to trunk, the target without
>>> mime-type property, which makes svn take the "text" route for merging.
>>> Is that correct?
>>>
>>> If that's the case, maybe you can simply set the mime-type on that
>>> binary file in your merge target, as a local modification (I don't
>>> think you need to even commit it). Can you try that?
>>>
>>> --
>>> Johan
>>>
>>> [1]
>>> http://svn.apache.org/repos/asf/subversion/trunk/subversion/libsvn_wc/merge.c
>>
>> Johan,
>>
>> Sorry for the top-post.  Hopefully this is better :)
>
> Much better, thank you :).
>
>> I set the mime-type to "application/octet-stream" in the working copy prior
>> to merge and this fixed the problem.  No more heavy CPU usage or excessive
>> time spent on the file.
>
> I'm glad it helped. Apart from the performance, it's important that
> svn does this merge the "binary way", because as you said line-based
> merges are not correct for this file.

It may also interest you (and other readers of this thread) that there
is an open enhancement request for making text-merges take the same
shortcut as binary-merges (if mine == merge-left then set merged :=
merge-right), to avoid expensive diffing [1]. But that hasn't been
addressed yet.


[1] http://subversion.tigris.org/issues/show_bug.cgi?id=4009 : Big
trivial text files merged MUCH slower than binary - pls optimize.
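
The shortcut mentioned above, sketched in C under stated assumptions.
struct blob, blobs_equal() and binary_merge() are hypothetical names
used for illustration, not the libsvn_wc implementation, and the
sketch compares whole in-memory buffers where a real client would work
on files.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical in-memory view of one version of a file. */
struct blob {
    const unsigned char *data;
    size_t len;
};

/* Byte-for-byte equality; the length check comes first so memcmp is
 * only called on buffers of the same, non-zero size. */
bool blobs_equal(const struct blob *a, const struct blob *b)
{
    if (a->len != b->len)
        return false;
    return a->len == 0 || memcmp(a->data, b->data, a->len) == 0;
}

/* "If mine == merge-left then merged := merge-right, else conflict."
 * Returns the blob to use as the merge result, or NULL on conflict. */
const struct blob *binary_merge(const struct blob *mine,
                                const struct blob *merge_left,
                                const struct blob *merge_right)
{
    assert(mine != NULL && merge_left != NULL && merge_right != NULL);
    if (blobs_equal(mine, merge_left))
        return merge_right;   /* no local changes: take incoming version */
    return NULL;              /* local changes present: flag a conflict */
}
```

The only work here is one comparison that is linear in the file size,
which is why the binary route avoids the expensive diffing entirely.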

-- 
Johan

Re: svn merge operation extremely slow

Posted by Johan Corveleyn <jc...@gmail.com>.
On Mon, Oct 3, 2011 at 2:16 PM, Kyle Leber <ky...@gmail.com> wrote:
>
>
> On Mon, Oct 3, 2011 at 4:10 AM, Johan Corveleyn <jc...@gmail.com> wrote:
>>
>> [ Again: please don't top-post on this list. I'm moving your reply to
>> the bottom. More below ... ]
>>
>> On Mon, Oct 3, 2011 at 2:24 AM, Kyle Leber <ky...@gmail.com> wrote:
>> > On Sun, Oct 2, 2011 at 8:10 PM, Daniel Shahaf <d....@daniel.shahaf.name>
>> > wrote:
>> >>
>> >> Kyle Leber wrote on Sun, Oct 02, 2011 at 20:05:19 -0400:
>> >> > Johan,
>> >> >
>> >> > I did a little more digging.  There were a few different places where svn
>> >> > seems to get hung up so I ran the gprof report on just the first one (the
>> >> > merge takes hours otherwise).  In this particular case, svn prints out that
>> >> > it is merging from a small text file while it is hanging for more than a
>> >> > minute @ 100% CPU.  When I examine "lsof", however, I see it actually has a
>> >> > different file open.  This one is a large (15 MB) "binary" file.  It turns
>> >> > out this binary file did not have a property in the trunk (which I think
>> >> > means it's treated as text, right?).  But in the branch it was marked as
>> >> > octet stream.   So perhaps svn is doing a text-based diff on this binary
>> >> > file because it used to be incorrectly marked as text?
>> >> >
>> >>
>> >> If either side is marked as binary then svn will defer to the "Use
>> >> merge-right if merge-left == base, else conflict" algorithm.
>> >>
>> >> Could you share the value of 'svn proplist --verbose' on both files?
>> >>
>> > Yup, trunk version has empty properties
>> > branch version has:
>> >
>> > svn:mime-type
>> >     application/octet-stream
>> >
>>
>> What is the merge target? Is it a trunk working copy (the one without
>> mime-type), or a branch working copy (with
>> svn:mime-type=application/octet-stream)?
>>
>> I think it's the mime-type of the merge target that determines if
>> merge will take the "binary" route, or the "text" route. See this
>> snippet from libsvn_wc/merge.c [1] (in the function
>> svn_wc__internal_merge):
>>
>> [[[
>>  /* Decide if the merge target is a text or binary file. */
>>  if ((mimeprop = get_prop(&mt, SVN_PROP_MIME_TYPE))
>>      && mimeprop->value)
>>    is_binary = svn_mime_type_is_binary(mimeprop->value->data);
>>  else
>>    {
>>      const char *value = svn_prop_get_value(mt.actual_props,
>>                                             SVN_PROP_MIME_TYPE);
>>
>>      is_binary = value && svn_mime_type_is_binary(value);
>>    }
>> ]]]
>>
>> (mt is the merge target)
>>
>> I'm not terribly familiar with this part of the codebase. But on first
>> sight, this seems to say:
>>
>>  (1) Look at the mime-type of the "base version" of the merge target.
>> If that's binary, then we'll go binary.
>>
>>  (2) If the "base" of the merge target doesn't have a mime-type, look
>> if it has one on the "actual" node (the uncommitted local
>> modifications). If that's binary, then we'll go binary.
>>
>>  (3) Else: text merge
>>
>> So I'm guessing that you're merging to trunk, the target without
>> mime-type property, which makes svn take the "text" route for merging.
>> Is that correct?
>>
>> If that's the case, maybe you can simply set the mime-type on that
>> binary file in your merge target, as a local modification (I don't
>> think you need to even commit it). Can you try that?
>>
>> --
>> Johan
>>
>> [1]
>> http://svn.apache.org/repos/asf/subversion/trunk/subversion/libsvn_wc/merge.c
>
> Johan,
>
> Sorry for the top-post.  Hopefully this is better :)

Much better, thank you :).

> I set the mime-type to "application/octet-stream" in the working copy prior
> to merge and this fixed the problem.  No more heavy CPU usage or excessive
> time spent on the file.

I'm glad it helped. Apart from the performance, it's important that
svn does this merge the "binary way", because as you said line-based
merges are not correct for this file.

-- 
Johan

Re: svn merge operation extremely slow

Posted by Kyle Leber <ky...@gmail.com>.
On Mon, Oct 3, 2011 at 4:10 AM, Johan Corveleyn <jc...@gmail.com> wrote:

> [ Again: please don't top-post on this list. I'm moving your reply to
> the bottom. More below ... ]
>
> On Mon, Oct 3, 2011 at 2:24 AM, Kyle Leber <ky...@gmail.com> wrote:
> > On Sun, Oct 2, 2011 at 8:10 PM, Daniel Shahaf <d....@daniel.shahaf.name>
> > wrote:
> >>
> >> Kyle Leber wrote on Sun, Oct 02, 2011 at 20:05:19 -0400:
> >> > Johan,
> >> >
> >> > I did a little more digging.  There were a few different places where svn
> >> > seems to get hung up so I ran the gprof report on just the first one (the
> >> > merge takes hours otherwise).  In this particular case, svn prints out that
> >> > it is merging from a small text file while it is hanging for more than a
> >> > minute @ 100% CPU.  When I examine "lsof", however, I see it actually has a
> >> > different file open.  This one is a large (15 MB) "binary" file.  It turns
> >> > out this binary file did not have a property in the trunk (which I think
> >> > means it's treated as text, right?).  But in the branch it was marked as
> >> > octet stream.   So perhaps svn is doing a text-based diff on this binary
> >> > file because it used to be incorrectly marked as text?
> >> >
> >>
> >> If either side is marked as binary then svn will defer to the "Use
> >> merge-right if merge-left == base, else conflict" algorithm.
> >>
> >> Could you share the value of 'svn proplist --verbose' on both files?
> >>
> > Yup, trunk version has empty properties
> > branch version has:
> >
> > svn:mime-type
> >     application/octet-stream
> >
>
> What is the merge target? Is it a trunk working copy (the one without
> mime-type), or a branch working copy (with
> svn:mime-type=application/octet-stream)?
>
> I think it's the mime-type of the merge target that determines if
> merge will take the "binary" route, or the "text" route. See this
> snippet from libsvn_wc/merge.c [1] (in the function
> svn_wc__internal_merge):
>
> [[[
>  /* Decide if the merge target is a text or binary file. */
>  if ((mimeprop = get_prop(&mt, SVN_PROP_MIME_TYPE))
>      && mimeprop->value)
>    is_binary = svn_mime_type_is_binary(mimeprop->value->data);
>  else
>    {
>      const char *value = svn_prop_get_value(mt.actual_props,
>                                             SVN_PROP_MIME_TYPE);
>
>      is_binary = value && svn_mime_type_is_binary(value);
>    }
> ]]]
>
> (mt is the merge target)
>
> I'm not terribly familiar with this part of the codebase. But on first
> sight, this seems to say:
>
>  (1) Look at the mime-type of the "base version" of the merge target.
> If that's binary, then we'll go binary.
>
>  (2) If the "base" of the merge target doesn't have a mime-type, look
> if it has one on the "actual" node (the uncommitted local
> modifications). If that's binary, then we'll go binary.
>
>  (3) Else: text merge
>
> So I'm guessing that you're merging to trunk, the target without
> mime-type property, which makes svn take the "text" route for merging.
> Is that correct?
>
> If that's the case, maybe you can simply set the mime-type on that
> binary file in your merge target, as a local modification (I don't
> think you need to even commit it). Can you try that?
>
> --
> Johan
>
> [1]
> http://svn.apache.org/repos/asf/subversion/trunk/subversion/libsvn_wc/merge.c
>

Johan,

Sorry for the top-post.  Hopefully this is better :)

I set the mime-type to "application/octet-stream" in the working copy prior
to merge and this fixed the problem.  No more heavy CPU usage or excessive
time spent on the file.

Kyle

Re: svn merge operation extremely slow

Posted by Johan Corveleyn <jc...@gmail.com>.
[ Again: please don't top-post on this list. I'm moving your reply to
the bottom. More below ... ]

On Mon, Oct 3, 2011 at 2:24 AM, Kyle Leber <ky...@gmail.com> wrote:
> On Sun, Oct 2, 2011 at 8:10 PM, Daniel Shahaf <d....@daniel.shahaf.name>
> wrote:
>>
>> Kyle Leber wrote on Sun, Oct 02, 2011 at 20:05:19 -0400:
>> > Johan,
>> >
>> > I did a little more digging.  There were a few different places where svn
>> > seems to get hung up so I ran the gprof report on just the first one (the
>> > merge takes hours otherwise).  In this particular case, svn prints out that
>> > it is merging from a small text file while it is hanging for more than a
>> > minute @ 100% CPU.  When I examine "lsof", however, I see it actually has a
>> > different file open.  This one is a large (15 MB) "binary" file.  It turns
>> > out this binary file did not have a property in the trunk (which I think
>> > means it's treated as text, right?).  But in the branch it was marked as
>> > octet stream.   So perhaps svn is doing a text-based diff on this binary
>> > file because it used to be incorrectly marked as text?
>> >
>>
>> If either side is marked as binary then svn will defer to the "Use
>> merge-right if merge-left == base, else conflict" algorithm.
>>
>> Could you share the value of 'svn proplist --verbose' on both files?
>>
> Yup, trunk version has empty properties
> branch version has:
>
> svn:mime-type
>     application/octet-stream
>

What is the merge target? Is it a trunk working copy (the one without
mime-type), or a branch working copy (with
svn:mime-type=application/octet-stream)?

I think it's the mime-type of the merge target that determines if
merge will take the "binary" route, or the "text" route. See this
snippet from libsvn_wc/merge.c [1] (in the function
svn_wc__internal_merge):

[[[
  /* Decide if the merge target is a text or binary file. */
  if ((mimeprop = get_prop(&mt, SVN_PROP_MIME_TYPE))
      && mimeprop->value)
    is_binary = svn_mime_type_is_binary(mimeprop->value->data);
  else
    {
      const char *value = svn_prop_get_value(mt.actual_props,
                                             SVN_PROP_MIME_TYPE);

      is_binary = value && svn_mime_type_is_binary(value);
    }
]]]

(mt is the merge target)

I'm not terribly familiar with this part of the codebase. But on first
sight, this seems to say:

  (1) Look at the mime-type of the "base version" of the merge target.
If that's binary, then we'll go binary.

  (2) If the "base" of the merge target doesn't have a mime-type, look
if it has one on the "actual" node (the uncommitted local
modifications). If that's binary, then we'll go binary.

  (3) Else: text merge
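
The three steps above, restated as a small self-contained C sketch.
It follows the reading given in this message, not necessarily what
libsvn_wc really does: struct merge_target, looks_binary() and
target_is_binary() are hypothetical, and the "text/" prefix test is a
simplified stand-in for svn_mime_type_is_binary().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical model of the merge target: the svn:mime-type of its
 * "base" version and of its "actual" (locally modified) node.
 * NULL means the property is not set. */
struct merge_target {
    const char *base_mime_type;
    const char *actual_mime_type;
};

/* Simplified stand-in for svn_mime_type_is_binary(): any type that
 * does not start with "text/" is treated as binary. */
bool looks_binary(const char *mime_type)
{
    assert(mime_type != NULL);
    return strncmp(mime_type, "text/", 5) != 0;
}

bool target_is_binary(const struct merge_target *mt)
{
    assert(mt != NULL);
    /* (1) A mime-type on the base version decides on its own. */
    if (mt->base_mime_type != NULL)
        return looks_binary(mt->base_mime_type);
    /* (2) Otherwise fall back to the uncommitted local property. */
    if (mt->actual_mime_type != NULL)
        return looks_binary(mt->actual_mime_type);
    /* (3) No mime-type anywhere on the target: text merge. */
    return false;
}
```

With both fields NULL (a target without any mime-type) the function
returns false, so the merge takes the text route; setting
actual_mime_type to "application/octet-stream", as a local property
change would, flips the result to true.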

So I'm guessing that you're merging to trunk, the target without
mime-type property, which makes svn take the "text" route for merging.
Is that correct?

If that's the case, maybe you can simply set the mime-type on that
binary file in your merge target, as a local modification (I don't
think you need to even commit it). Can you try that?

-- 
Johan

[1] http://svn.apache.org/repos/asf/subversion/trunk/subversion/libsvn_wc/merge.c

Re: svn merge operation extremely slow

Posted by Kyle Leber <ky...@gmail.com>.
Yup, trunk version has empty properties
branch version has:

svn:mime-type
    application/octet-stream

On Sun, Oct 2, 2011 at 8:10 PM, Daniel Shahaf <d....@daniel.shahaf.name> wrote:

> Kyle Leber wrote on Sun, Oct 02, 2011 at 20:05:19 -0400:
> > Johan,
> >
> > I did a little more digging.  There were a few different places where svn
> > seems to get hung up so I ran the gprof report on just the first one (the
> > merge takes hours otherwise).  In this particular case, svn prints out that
> > it is merging from a small text file while it is hanging for more than a
> > minute @ 100% CPU.  When I examine "lsof", however, I see it actually has a
> > different file open.  This one is a large (15 MB) "binary" file.  It turns
> > out this binary file did not have a property in the trunk (which I think
> > means it's treated as text, right?).  But in the branch it was marked as
> > octet stream.   So perhaps svn is doing a text-based diff on this binary
> > file because it used to be incorrectly marked as text?
> >
>
> If either side is marked as binary then svn will defer to the "Use
> merge-right if merge-left == base, else conflict" algorithm.
>
> Could you share the value of 'svn proplist --verbose' on both files?
>
> Thanks,
>
> Daniel
>

Re: svn merge operation extremely slow

Posted by Daniel Shahaf <d....@daniel.shahaf.name>.
Kyle Leber wrote on Sun, Oct 02, 2011 at 20:05:19 -0400:
> Johan,
> 
> I did a little more digging.  There were a few different places where svn
> seems to get hung up so I ran the gprof report on just the first one (the
> merge takes hours otherwise).  In this particular case, svn prints out that
> it is merging from a small text file while it is hanging for more than a
> minute @ 100% CPU.  When I examine "lsof", however, I see it actually has a
> different file open.  This one is a large (15 MB) "binary" file.  It turns
> out this binary file did not have a property in the trunk (which I think
> means it's treated as text, right?).  But in the branch it was marked as
> octet stream.   So perhaps svn is doing a text-based diff on this binary
> file because it used to be incorrectly marked as text?
> 

If either side is marked as binary then svn will defer to the "Use
merge-right if merge-left == base, else conflict" algorithm.

Could you share the value of 'svn proplist --verbose' on both files?

Thanks,

Daniel

Re: svn merge operation extremely slow

Posted by Kyle Leber <ky...@gmail.com>.
Johan,

I did a little more digging.  There were a few different places where svn
seems to get hung up so I ran the gprof report on just the first one (the
merge takes hours otherwise).  In this particular case, svn prints out that
it is merging from a small text file while it is hanging for more than a
minute @ 100% CPU.  When I examine "lsof", however, I see it actually has a
different file open.  This one is a large (15 MB) "binary" file.  It turns
out this binary file did not have a property in the trunk (which I think
means it's treated as text, right?).  But in the branch it was marked as
octet stream.   So perhaps svn is doing a text-based diff on this binary
file because it used to be incorrectly marked as text?

Side-note: The contents of this 15MB file are actually ASCII, but we do want
it treated as binary b/c line-based merges are never valid.

Another snippet from the same gprof report is below.

Cheers,
Kyle

-----------------------------------------------
                0.00  144.03      27/27          do_text_merge [10]
[11]    95.9    0.00  144.03      27         svn_diff_file_diff3_2 [11]
                0.01  144.02      27/27          svn_diff_diff3_2 [12]
                0.00    0.00      27/5723        apr_pool_destroy [833]
                0.00    0.00      27/6430        svn_pool_create_ex [1558]
-----------------------------------------------
                0.01  144.02      27/27          svn_diff_file_diff3_2 [11]
[12]    95.9    0.01  144.02      27         svn_diff_diff3_2 [12]
                8.64  128.73      54/56          svn_diff(long, char, short) [13]
                0.01    5.09   21014/21014       svn_diff__resolve_conflict [15]
                0.03    1.51      81/81          svn_diff__get_tokens [25]
                0.00    0.01      27/27          datasources_open [272]
                0.01    0.00      81/85          svn_diff__get_token_counts [284]
                0.00    0.00   42065/2476341     apr_palloc [136]
                0.00    0.00      27/27          svn_diff__tree_create [1235]
                0.00    0.00      54/5723        apr_pool_destroy [833]
                0.00    0.00      27/27          token_discard_all [1282]
                0.00    0.00      54/6430        svn_pool_create_ex [1558]
                0.00    0.00      27/27          svn_diff__get_node_count [1911]
-----------------------------------------------
                0.32    4.77       2/56          svn_diff__resolve_conflict [15]
                8.64  128.73      54/56          svn_diff_diff3_2 [12]
[13]    94.8    8.96  133.49      56         svn_diff(long, char, short) [13]
              133.49    0.00 2002891836/2002891836     svn_diff__snake [14]
                0.00    0.00     224/2476341     apr_palloc [136]
                0.00    0.00      64/64          prepend_lcs [1103]
                0.00    0.00      56/56          svn_diff__lcs_reverse [1875]
-----------------------------------------------
              133.49    0.00 2002891836/2002891836     svn_diff(long, char, short) [13]
[14]    88.9  133.49    0.00 2002891836         svn_diff__snake [14]
                0.00    0.00  168906/2476341     apr_palloc [136]

Re: svn merge operation extremely slow

Posted by Johan Corveleyn <jc...@gmail.com>.
On Sun, Oct 2, 2011 at 11:08 PM, Kyle Leber <ky...@gmail.com> wrote:
> I was able to capture a profile from svn (after remembering I have to link
> statically).  I compiled with "-pg -O0" Here is the top of the file:
>
> Each sample counts as 0.01 seconds.
>   %   cumulative   self              self     total
>  time   seconds   seconds    calls   s/call   s/call  name
>  88.88    133.49   133.49 2002891836     0.00     0.00  svn_diff__snake
>   5.97    142.45     8.96       56     0.16     2.54  svn_diff(long, char, short)
>   1.98    145.42     2.97  4163001     0.00     0.00  MD5Transform
>   0.41    146.04     0.62  4163001     0.00     0.00  Decode

What's it doing in svn_diff__snake (or svn_diff for that matter)? That
should only be hit when svn is doing textual merges (in which case it
must do rather expensive diff calculations --- I'm sure those
calculations can go ballistic when being confronted with a large
binary file, not consisting of text lines).
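
To illustrate why the snake count can explode, here is a minimal sketch of
the textbook greedy O(ND) diff loop (Myers' algorithm), written in Python.
It is not Subversion's lcs.c, and count_snakes is a hypothetical name; the
point is only that the number of "snake" steps grows roughly with the square
of the number of differences D, so two inputs that share few tokens are far
more expensive than two nearly identical ones.

```python
def count_snakes(a, b):
    """Return (edit_distance, snake_steps) for token sequences a and b.

    Textbook greedy O(ND) search; an illustration only, not svn's code.
    """
    n, m = len(a), len(b)
    v = {1: 0}          # furthest x reached on each diagonal k = x - y
    snakes = 0
    for d in range(n + m + 1):
        for k in range(-d, d + 1, 2):
            # Pick the better predecessor: step down from k+1 or right from k-1.
            if k == -d or (k != d and v[k - 1] < v[k + 1]):
                x = v[k + 1]
            else:
                x = v[k - 1] + 1
            y = x - k
            snakes += 1
            # Follow the "snake": a run of matching tokens along the diagonal.
            while x < n and y < m and a[x] == b[y]:
                x += 1
                y += 1
            v[k] = x
            if x >= n and y >= m:
                return d, snakes
    raise AssertionError("unreachable: d = n + m always suffices")


if __name__ == "__main__":
    same = ["line %d" % i for i in range(200)]
    other = ["other %d" % i for i in range(200)]
    print(count_snakes(same, same))    # identical inputs: one snake step
    print(count_snakes(same, other))   # nothing in common: many snake steps
```

With token lists the size of a multi-megabyte file that has changed almost
everywhere, a loop of this shape would plausibly account for call counts in
the billions.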

Are you sure those files were actually marked as binary (svn:mime-type
of application/octet-stream or something else non-texty)?
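
As a rough sketch of the rule being asked about (an assumption based on the
documented behaviour, not a copy of the real check): a file with no
svn:mime-type is treated as text, and a value that does not begin with
"text/" is generally treated as binary. The function name is hypothetical,
and the sketch ignores the few exceptions the documentation mentions.

```python
from typing import Optional


def treated_as_binary(mime_type: Optional[str]) -> bool:
    """Approximate the text/binary decision from an svn:mime-type value.

    Hypothetical helper for illustration; real Subversion has a few
    special cases that are deliberately left out here.
    """
    if mime_type is None:
        # No svn:mime-type property at all: handled as text.
        return False
    # Anything that is not text/* is assumed to be binary in this sketch.
    return not mime_type.strip().lower().startswith("text/")


if __name__ == "__main__":
    for value in (None, "text/plain", "application/octet-stream"):
        print(repr(value), "->", "binary" if treated_as_binary(value) else "text")
```

So a file that has the property on the branch but not on trunk would count
as text on one side and binary on the other, which is the situation worth
checking with 'svn proplist --verbose'.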

-- 
Johan

Re: svn merge operation extremely slow

Posted by Daniel Shahaf <d....@daniel.shahaf.name>.
Kyle Leber wrote on Sun, Oct 02, 2011 at 17:08:16 -0400:
> Is it OK to attach the full report to this user list?  The resulting text
> file is 1.3MB and I wasn't sure if the list would tolerate an attachment of
> that size.

It would be better to upload it somewhere and send a link to this list,
or to digest the report and post only the highlights to this list (as
you have done).

Re: svn merge operation extremely slow

Posted by Kyle Leber <ky...@gmail.com>.
I was able to capture a profile from svn (after remembering I have to link
statically).  I compiled with "-pg -O0".  Here is the top of the file:

Each sample counts as 0.01 seconds.
  %   cumulative   self              self     total
 time   seconds   seconds    calls   s/call   s/call  name
 88.88    133.49   133.49 2002891836     0.00     0.00  svn_diff__snake
  5.97    142.45     8.96       56     0.16     2.54  svn_diff(long, char, short)
  1.98    145.42     2.97  4163001     0.00     0.00  MD5Transform
  0.41    146.04     0.62  4163001     0.00     0.00  Decode
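
For scale, the figures in this flat profile can be turned into per-call
numbers with a few lines of Python. This is only arithmetic on the values
printed above (the variable names are mine), and the absolute timings are
inflated by "-pg -O0", so treat them as relative indicators.

```python
# Values copied from the gprof flat profile above.
snake_self_seconds = 133.49      # self time of svn_diff__snake
snake_calls = 2_002_891_836      # calls to svn_diff__snake
svn_diff_calls = 56              # calls to svn_diff
snake_percent = 88.88            # share of sampled time in svn_diff__snake

# Average cost of one svn_diff__snake call, in nanoseconds.
ns_per_snake_call = snake_self_seconds / snake_calls * 1e9

# Average number of snake calls made per svn_diff invocation.
snake_calls_per_diff = snake_calls / svn_diff_calls

# Total sampled time implied by the percentage column.
sampled_total_seconds = snake_self_seconds / (snake_percent / 100)

print(f"{ns_per_snake_call:.1f} ns per svn_diff__snake call")
print(f"{snake_calls_per_diff:,.0f} snake calls per svn_diff call")
print(f"{sampled_total_seconds:.1f} s of sampled time in total")
```

Each individual call is cheap (tens of nanoseconds); the time goes into
making tens of millions of them per diff.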

Is it OK to attach the full report to this user list?  The resulting text
file is 1.3MB and I wasn't sure if the list would tolerate an attachment of
that size.

Cheers,
Kyle

Re: svn merge operation extremely slow

Posted by Daniel Shahaf <d....@daniel.shahaf.name>.
gprof is what I'm familiar with (nutshell: compile with 'gcc -pg' and
read gmon.out).  There are no specific profiling docs for svn; if you
need more specific advice please post to the dev@ list.  Thanks!

Kyle Leber wrote on Sat, Oct 01, 2011 at 19:33:10 -0400:
> What method of profiling do you recommend?  I have used gprof previously
> (it's been a while) but am not familiar with the subversion project source
> code and build setup.  Is there an online guide or wiki describing the
> preferred setup for performing this?
> 
> Kyle
> 

Re: svn merge operation extremely slow

Posted by Kyle Leber <ky...@gmail.com>.
What method of profiling do you recommend?  I have used gprof previously
(it's been a while) but am not familiar with the subversion project source
code and build setup.  Is there an online guide or wiki describing the
preferred setup for performing this?

Kyle

Re: svn merge operation extremely slow

Posted by Daniel Shahaf <d....@daniel.shahaf.name>.
Johan Corveleyn wrote on Sat, Oct 01, 2011 at 20:47:29 +0200:
> [ Please do not top-post on this list, i.e. please put your reply
> below or inline. More below ... ]
> 
> On Sat, Oct 1, 2011 at 6:49 PM, Kyle Leber <ky...@gmail.com> wrote:
> > On Fri, Sep 30, 2011 at 7:15 PM, Johan Corveleyn <jc...@gmail.com> wrote:
> >>
> >> On Fri, Sep 30, 2011 at 3:29 PM, Kyle Leber <ky...@gmail.com> wrote:
> >> > I've encountered what I think is a problem with subversion, but I'm not
> >> > completely sure (and according to the online instructions I should bring
> >> > it
> >> > up here prior to filing a bug).
> >>
> >> Actually, the instructions on
> >> http://subversion.apache.org/issue-tracker.html say that you should
> >> send your report to users@, not dev@. So I'm adding users@. Please
> >> drop dev@ from any further replies.
> >>
> >> > Basically, we're trying to merge a rather large collection of
> >> > fixes back in our trunk.  I check out a fresh copy of the trunk,
> >> > then use the merge syntax: svn merge https://path/to/my/branch .
> >> >
> >> > This generally churns along just fine, but we occasionally get
> >> > hung up on medium sized binary files where the svn client jumps
> >> > to 100% cpu usage and sits on it for 3+ hours before moving on to
> >> > the next file.  These files are anywhere from 3-10MB in size, so
> >> > not ridiculously huge.  We generally have these files marked as
> >> > octet stream, but changing to text did not help the situation
> >> > when we tried that.
> >> >
> >> > I did find an old forum discussion about a potential issue that
> >> > could be related.  I was wondering if this was ever addressed and
> >> > could it still be the same problem.  Link is here:
> >> > http://www.svnforum.org/threads/36123-Slow-SVN-merge
> >> >
> >> > I'm using svn client 1.6.12.  I looked at the online change log
> >> > up through the 1.7 alphas and didn't see any bug fixes that
> >> > sounded relevant.
> >>
> >> This could be a relevant change (listed in the 1.7 release notes, not
> >> in the change log):
> >>
> >> http://subversion.apache.org/docs/release-notes/1.7.html#diff-optimizations
> >>
> >> Can you please try one of the 1.7 pre-release binaries, and see if it
> >> helps? See http://subversion.apache.org/packages.html#pre-release
> >>
> > Thanks, Johan.  I tested with 1.7rc4 and it did not make any perceptible
> > difference.  Anything else I can try?
> 
> Hm, that's unfortunate.
> 
> Actually, it was to be expected that this wouldn't help, because the
> diff-optimizations in 1.7 only play a role when merging text files
> (and diffing and blaming). And you said those
> "files-that-make-merge-hang" are generally marked as octet-stream, and
> changing them to text made no difference.
> 
> That seems to indicate that the 100% cpu usage on the client isn't
> spent in the diff code (unlike the forum thread that you linked to,
> where the poster tracked it down to libsvn_diff/lcs.c --- he would
> definitely have been helped by the 1.7 improvements).
> 

What does 'svn merge' do for binary files?  I checked svn_wc__merge()
a few months ago and for binary files all it knew to do was

(a) if mine == merge-left then set merged := merge-right
(b) invoke the configured diff3-cmd
(c) raise a conflict

but it didn't do any line-based merge (per Johan's second response).
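
The three steps above can be sketched roughly as follows. This is a Python
illustration of the decision order as I described it, not the actual
svn_wc__merge() code; merge_binary and the diff3_cmd callback are
hypothetical names, and the callback stands in for an externally configured
diff3-cmd that returns merged bytes or None on failure.

```python
from typing import Callable, Optional, Tuple

Diff3 = Callable[[bytes, bytes, bytes], Optional[bytes]]


def merge_binary(mine: bytes,
                 merge_left: bytes,
                 merge_right: bytes,
                 diff3_cmd: Optional[Diff3] = None) -> Tuple[str, Optional[bytes]]:
    """Return ("merged", content) or ("conflict", None) for a binary file."""
    # (a) The working file is unchanged relative to merge-left:
    #     simply take merge-right.
    if mine == merge_left:
        return "merged", merge_right
    # (b) Otherwise hand the three versions to the configured diff3-cmd, if any.
    if diff3_cmd is not None:
        result = diff3_cmd(mine, merge_left, merge_right)
        if result is not None:
            return "merged", result
    # (c) Nothing else to try: raise a conflict.
    return "conflict", None


if __name__ == "__main__":
    print(merge_binary(b"old", b"old", b"new"))
    print(merge_binary(b"local edit", b"old", b"new"))
```

Note that no step compares the contents line by line, so none of this would
be expected to end up in the diff code.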

> So there's another reason. Maybe it has something to do with (lots of)
> subtree mergeinfo? Can you verify if there is a lot of svn:mergeinfo
> on directories and files all over the place?
> 
> Also: can you tell us what version is running on the server?
> 
> Maybe other people on this list have had similar experiences, and can
> give some suggestions?
> 

Another line of thought: the algorithm for computing binary deltas
changed a few years ago, and I recall reading (on old bug reports?)
about some cases in which the delta combiner would be inefficient for
deltas generated by old servers --- i.e., it would be expensive to 'svn
cat' files that were committed to old servers in repositories that
haven't been dumped/loaded by a newer server.

In any case: can you run the merge under a profiler and tell us in what
function(s) time is spent?

Daniel

> -- 
> Johan

Re: svn merge operation extremely slow

Posted by Johan Corveleyn <jc...@gmail.com>.
[ Please do not top-post on this list, i.e. please put your reply
below or inline. More below ... ]

On Sat, Oct 1, 2011 at 6:49 PM, Kyle Leber <ky...@gmail.com> wrote:
> On Fri, Sep 30, 2011 at 7:15 PM, Johan Corveleyn <jc...@gmail.com> wrote:
>>
>> On Fri, Sep 30, 2011 at 3:29 PM, Kyle Leber <ky...@gmail.com> wrote:
>> > I've encountered what I think is a problem with subversion, but I'm not
>> > completely sure (and according to the online instructions I should bring
>> > it
>> > up here prior to filing a bug).
>>
>> Actually, the instructions on
>> http://subversion.apache.org/issue-tracker.html say that you should
>> send your report to users@, not dev@. So I'm adding users@. Please
>> drop dev@ from any further replies.
>>
>> > Basically, we're trying to merge a rather large collection of fixes back
>> > in
>> > our trunk.  I check out a fresh copy of the trunk, then use the merge
>> > syntax:
>> > svn merge https://path/to/my/branch .
>> >
>> > This generally churns along just fine, but we occasionally get hung up
>> > on
>> > medium sized binary files where the svn client jumps to 100% cpu usage
>> > and
>> > sits on it for 3+ hours before moving on to the next file.  These files
>> > are
>> > anywhere from 3-10MB in size, so not ridiculously huge.  We generally
>> > have
>> > these files marked as octet stream, but changing to text did not help
>> > the
>> > situation when we tried that.
>> >
>> > I did find an old forum discussion about a potential issue that could be
>> > related.  I was wondering if this was ever addressed and could it still
>> > be
>> > the same problem.  Link is here:
>> > http://www.svnforum.org/threads/36123-Slow-SVN-merge
>> >
>> > I'm using svn client 1.6.12.  I looked at the online change log up
>> > through
>> > the 1.7 alphas and didn't see any bug fixes that sounded relevant.
>>
>> This could be a relevant change (listed in the 1.7 release notes, not
>> in the change log):
>>
>> http://subversion.apache.org/docs/release-notes/1.7.html#diff-optimizations
>>
>> Can you please try one of the 1.7 pre-release binaries, and see if it
>> helps? See http://subversion.apache.org/packages.html#pre-release
>>
> Thanks, Johan.  I tested with 1.7rc4 and it did not make any perceptible
> difference.  Anything else I can try?

Hm, that's unfortunate.

Actually, it was to be expected that this wouldn't help, because the
diff-optimizations in 1.7 only play a role when merging text files
(and diffing and blaming). And you said those
"files-that-make-merge-hang" are generally marked as octet-stream, and
changing them to text made no difference.

That seems to indicate that the 100% cpu usage on the client isn't
spent in the diff code (unlike the forum thread that you linked to,
where the poster tracked it down to libsvn_diff/lcs.c --- he would
definitely have been helped by the 1.7 improvements).

So there's another reason. Maybe it has something to do with (lots of)
subtree mergeinfo? Can you verify if there is a lot of svn:mergeinfo
on directories and files all over the place?

Also: can you tell us what version is running on the server?

Maybe other people on this list have had similar experiences, and can
give some suggestions?

-- 
Johan

Re: svn merge operation extremely slow

Posted by Kyle Leber <ky...@gmail.com>.
Thanks, Johan.  I tested with 1.7rc4 and it did not make any perceptible
difference.  Anything else I can try?
