You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@subversion.apache.org by Kamesh Jayachandran <ka...@collab.net> on 2006/12/23 19:54:05 UTC

[PATCH][merge-tracking]Fix repeat merge problem while subtree of merge target having overlapping merge already

Hi All,

Problem: (If this description is boring jump to 'Story' section.)

   "Sub tree 'S' of a parent tree 'P' has already been merged of certain
    revisions say rN-M of some path 'U/corresponding_dir_of_S', Where N < M.
    We try to merge a working copy of 'P' for rA-B of 'U', where 'A-B'
    intersects with 'N-M'
    This intersection could result in conflicting merges."
  i.e
   P = WC of http://localhost/svn/repos/branches/b1 (assume to be latest)
   S = WC of http://localhost/svn/repos/branches/b1/src (assume to be latest)
   U = http://localhost/svn/repos/trunk
   N = 25
   M = 30
   Merged http://localhost/svn/repos/trunk/src -r25:30 to WC of /branches/b1/src(S).
   Merging http://localhost/svn/repos/trunk -r27:35 to /branches/b1(P)
   could give conflict because we already merged few revisions for 'src'.

  This scenario is captured in 'avoid_repeated_merge_on_subtree_with_merge_info' of,
   http://svn.collab.net/repos/svn/branches/merge-tracking/subversion/tests/cmdline/merge_tests.py
--------------------------------------------------------------------------

Story: Ignore if you want.
       Just would like to describe it to record how I solved this problem.

   Four months back I took up the responsiblity to address repeat merge
   problem if subtree of the merge target has some overlapping with that of the
   current merge parent target.

   I tried solving the problem in a complex way (cut the parent tree multiple
   subtrees and run merge on each subtree.), few of my trials in this
   direction was not successful at all.

   In this connection I had a call with Mike. P and Dan to see how better we
   can solve this problem.

   I asked how mixed revision 'svn update' works.
   Mike has responded 'Some set_path... Beyond which I did not understand :)'

   Finally we came up with three abstract approaches,
   1. Cutting the trees to multiple subtrees, running the merge on each tree.
   2. Describe about our working copy subtree mergeinfos to update-report. And
      make the 'update-report' handle this case.
   3. Use ra_replay for each revision extract we want.

  I was trying to summarize the above 3 three approaches and send a mail to
  list to see what the dev@subversion think.

  I first sent my draft to Mike and Dan to see if at all I missed anything.

  Mike was frank enough to disclose that draft was bit boring and needs to be
  more lively.
  He has given valuable clues on how to improve it.


  I reworked on the draft, while explaining solution 2 (svn update on a mixed
revison like ....), I have decided to understand what that magical 'set_path
and Co.' calls are for(Mike was saying something on the call!)?

  I looked at the reporter docs for all the hooks and went to bed.

  All of a sudden before falling asleep The following simple idea striked.....

--------------------------------------------------------------------------

Simple Idea:
  Step 1. Get all nodes having 'svn:mergeinfo' using svn_client__get_prop_from_wc.
  Step 2. In the diff_editor of the parent merge call delete_path on those
          overlapping subtress obtained from the Step 1.


--------------------------------------------------------------------------

Actual implementation:
  In reality delete_path approach did not work. When I called 'delete_path' I
  got 'Skipping ...' notification, still my problematic subtrees
  getting merged and causing conflict.

  One idea sparked, why not call 'set_path' on a reporter by giving the
  'start_revision' same as 'end_revision', This idea has helped me in
  making the 'subtree merge' a 'no op' which is half of my problem.

  Other half is to run a merge on this excluded subtrees.
  Got 'Working copy locked' kind of messages.

  Finally came up with the following solution which solved all my problem.

  * Implemented 'do_subtree_merges' which would call 'do_merge' or
    'do_single_file_merge' on excluded subtrees identified by
    'svn_client__get_prop_from_wc'.
    This function would set the 'subtrees_having_mergeinfo' with all such
    excluded paths so that subsequent do_merge on a parent path would call
    'no op set_path' on these paths.
  * Modified do_merge to accept 'subtrees_having_mergeinfo' to exclude the
    subtrees having overlapping merges using 'no op set_path' on them.
  * Modified 'svn_client_merge3' and 'svn_client_merge_peg3' to first do a
    merge on each 'subtrees_having_mergeinfo'

Ran make check and davautocheck all the tests were passing.

Refer the attached patch for details.
--------------------------------------------------------------------------

Small concern with the patch:
I use the same 'merge_cmd_baton' of 'svn_client_merge3' and
'svn_client_merge_peg3' in each subtree merges.
I had a concern with this approach as I felt
'added_path, target, url, path, dry_run_deletions' of merge_cmd_baton
can not be shared by different 'merges'.

I tried the following to see what happens with this approach.
Reused merge_tests.py testcase 40 to create the basic repeat merge problem.

Re: [PATCH][merge-tracking]Fix repeat merge problem while subtree of merge target having overlapping merge already

Posted by Kamesh Jayachandran <ka...@collab.net>.
Daniel Rall wrote:
> On Thu, 25 Jan 2007, Kamesh Jayachandran wrote:
> ...
>   
>>>>> @@ -3547,6 +3692,25 @@
>>>>>     recursive diff-editor thing. */
>>>>>  else if (entry->kind == svn_node_dir)
>>>>>    {
>>>>> +      if (strcmp(URL1, URL2) == 0)
>>>>>
>>>>> Inside of do_merge(), we perform some calculations on the two URLs
>>>>> before using them to set the merge_baton.same_urls field.  The
>>>>> same_urls field is effectively what you're trying to use here to get
>>>>> do_subtree_merges().  Why the difference?  Is it important?
>>>>>           
> ...
>   
>> Then we may need to change the do_merge, do_single_file_merge, to accept 
>> the 'notification_receiver_baton_t *'. So that we can do this check at 
>> single place. Am I correct in understanding this?
>>     
>
> The key here is getting the final "URL1" and "URL2" from the
> "initial_URL1" and "initial_URL2" parameters (which may change after
> being passed on to the do_*merge implementation), so that the
> "same_urls" boolean is set correctly.
>
> To do this outside of the do_*merge implementation looks like it might
> be messy, which is why I was asking if the difference I describe in
> the previous paragraph is important.
>
>   
The check is exactly same as do_*_merge.(URL1=initial_URL1, 
URL2=initial_URL2 no changes at all since 
peg_revision.kind=svn_opt_revision_unspecified).

The check is important as we need to make a call to 
'discover_and_merge_children' only if both the URLs are same.
>   
>>>>> In do_merge() and do_single_file_merge(), we also have:
>>>>>
>>>>>     /* When only recording merge info, we don't perform an actual
>>>>>        merge for the specified range. */
>>>>>     if (merge_b->record_only)
>>>>>       {
>>>>>           
> ...
>   
>>>>>         /* ### Handle WC-local reverts which have modified our merge
>>>>>            ### info. */
>>>>>         apr_hash_t *merges;
>>>>>         SVN_ERR(determine_merges_performed(&merges, target_wcpath, 
>>>>>                                            &range, &notify_b, pool));
>>>>>         return update_wc_merge_info(target_wcpath, entry, rel_path,
>>>>>                                     merges, is_revert, ra_session,
>>>>>                                     adm_access, ctx, pool);
>>>>>       }
>>>>>           
> ...
>   
>>> I believe it's about an 'svn merge -r N:N-1'
>>> which reverted changes in the WC which had previously been merged in.
>>> In such a case, we need to remove merge info from the WC.  I'm not
>>> clear on whether determine_merges_performed() and
>>> update_wc_merge_info() handles that case.  It rather looks like it
>>> does, but this is another area that we could use a test case for.
>>>       
> ...
>   
>> I am still not clear here.
>> <comment inside --record-only block>
>> /* ### Handle WC-local reverts which have modified our merge
>>    ### info. */
>> </comment>
>>
>> Why bother about reverts? That too in record-only block! The 
>> 'do_(merge|single_file_merge)' 's job is to append(with mergeinfo 
>> algebra) the merge range to existing mergeinfo. Why it should bother 
>> about earlier revert attempts?
>>     
>
> 'svn merge --record-only -c -N' can be used to revert the record of a
> merge.
>
>   
Only if you have merged(or atleast recorded) 'N' already.
First merge being reverse is not supported.

Refer the thread at,
http://subversion.tigris.org/servlets/ReadMsg?list=dev&msgNo=117551
>> Do you mean the scenario like the following
>> * pristine WC with svn:mergeinfo '/branches/b1:87-96'
>> * svn merge -r 88:87 url_to_branches/b1
>> Then  svn:mergeinfo will be '/branches/b1:87, 89-96'
>> I fail to see the connection to above comment and the above scenario.
>>     
>
> This scenario needs to be handled.  Another scenario which needs to be
> handled is that any revert(s) which occurred *before* the currently
> running 'merge' operation should not be clobbered by the current
> 'merge'.  The "is_revert" flag I added to update_wc_merge_info() looks
> like it may already be doing the trick for the first case; I'm less
> sure about the second.
>
> We should add a test which covers both these cases (or enhance an
> existing test).
>
> - Dan
>   

Will post a patch.

With regards
Kamesh Jayachandran

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org

Re: [PATCH][merge-tracking]Fix repeat merge problem while subtree of merge target having overlapping merge already

Posted by Daniel Rall <dl...@collab.net>.
On Thu, 25 Jan 2007, Kamesh Jayachandran wrote:
...
> >>>@@ -3547,6 +3692,25 @@
> >>>     recursive diff-editor thing. */
> >>>  else if (entry->kind == svn_node_dir)
> >>>    {
> >>>+      if (strcmp(URL1, URL2) == 0)
> >>>
> >>>Inside of do_merge(), we perform some calculations on the two URLs
> >>>before using them to set the merge_baton.same_urls field.  The
> >>>same_urls field is effectively what you're trying to use here to get
> >>>do_subtree_merges().  Why the difference?  Is it important?
...
> Then we may need to change the do_merge, do_single_file_merge, to accept 
> the 'notification_receiver_baton_t *'. So that we can do this check at 
> single place. Am I correct in understanding this?

The key here is getting the final "URL1" and "URL2" from the
"initial_URL1" and "initial_URL2" parameters (which may change after
being passed on to the do_*merge implementation), so that the
"same_urls" boolean is set correctly.

To do this outside of the do_*merge implementation looks like it might
be messy, which is why I was asking if the difference I describe in
the previous paragraph is important.


> >>>In do_merge() and do_single_file_merge(), we also have:
> >>>
> >>>     /* When only recording merge info, we don't perform an actual
> >>>        merge for the specified range. */
> >>>     if (merge_b->record_only)
> >>>       {
...
> >>>         /* ### Handle WC-local reverts which have modified our merge
> >>>            ### info. */
> >>>         apr_hash_t *merges;
> >>>         SVN_ERR(determine_merges_performed(&merges, target_wcpath, 
> >>>                                            &range, &notify_b, pool));
> >>>         return update_wc_merge_info(target_wcpath, entry, rel_path,
> >>>                                     merges, is_revert, ra_session,
> >>>                                     adm_access, ctx, pool);
> >>>       }
...
> >I believe it's about an 'svn merge -r N:N-1'
> >which reverted changes in the WC which had previously been merged in.
> >In such a case, we need to remove merge info from the WC.  I'm not
> >clear on whether determine_merges_performed() and
> >update_wc_merge_info() handles that case.  It rather looks like it
> >does, but this is another area that we could use a test case for.
...
> I am still not clear here.
> <comment inside --record-only block>
> /* ### Handle WC-local reverts which have modified our merge
>    ### info. */
> </comment>
> 
> Why bother about reverts? That too in record-only block! The 
> 'do_(merge|single_file_merge)' 's job is to append(with mergeinfo 
> algebra) the merge range to existing mergeinfo. Why it should bother 
> about earlier revert attempts?

'svn merge --record-only -c -N' can be used to revert the record of a
merge.

> Do you mean the scenario like the following
> * pristine WC with svn:mergeinfo '/branches/b1:87-96'
> * svn merge -r 88:87 url_to_branches/b1
> Then  svn:mergeinfo will be '/branches/b1:87, 89-96'
> I fail to see the connection to above comment and the above scenario.

This scenario needs to be handled.  Another scenario which needs to be
handled is that any revert(s) which occurred *before* the currently
running 'merge' operation should not be clobbered by the current
'merge'.  The "is_revert" flag I added to update_wc_merge_info() looks
like it may already be doing the trick for the first case; I'm less
sure about the second.

We should add a test which covers both these cases (or enhance an
existing test).

- Dan

Re: [PATCH][merge-tracking]Fix repeat merge problem while subtree of merge target having overlapping merge already

Posted by Kamesh Jayachandran <ka...@collab.net>.
Hi Dan,

Sorry for responding late on this.

> Will you think about the best way to go, and send a doc or naming
> patch for this one?
>   

Sent.

> ...
>   
>>> @@ -3547,6 +3692,25 @@
>>>      recursive diff-editor thing. */
>>>   else if (entry->kind == svn_node_dir)
>>>     {
>>> +      if (strcmp(URL1, URL2) == 0)
>>>
>>> Inside of do_merge(), we perform some calculations on the two URLs
>>> before using them to set the merge_baton.same_urls field.  The
>>> same_urls field is effectively what you're trying to use here to get
>>> do_subtree_merges().  Why the difference?  Is it important?
>>>       
>> It is just to do away with Three merge scenarios where these 'Subtree 
>> exclusion' does not make sense atleast currently.
>>     
>
> Sure, the reasoning for the checks is the same in both places.  But
> the code used to implement the checks is different -- why?  Is it
> important that it is different?
>   

Then we may need to change the do_merge, do_single_file_merge, to accept 
the 'notification_receiver_baton_t *'. So that we can do this check at 
single place. Am I correct in understanding this?

>
>>> In do_merge() and do_single_file_merge(), we also have:
>>>
>>>      /* When only recording merge info, we don't perform an actual
>>>         merge for the specified range. */
>>>      if (merge_b->record_only)
>>>        {
>>>          /* ### TODO: Support sub-tree merge info. */
>>>          /* ### Handle WC-local reverts which have modified our merge
>>>             ### info. */
>>>          apr_hash_t *merges;
>>>          SVN_ERR(determine_merges_performed(&merges, target_wcpath, 
>>>          &range,
>>>                                             &notify_b, pool));
>>>          return update_wc_merge_info(target_wcpath, entry, rel_path,
>>>                                      merges, is_revert, ra_session,
>>>                                      adm_access, ctx, pool);
>>>        }
>>>
>>> Is this TODO comment about sub-tree merge info still relevant?  If so,
>>> what needs to be changed?
>>>       
>> Yes it can be removed.
>>     
>
> Okay, include that in the next patch.  Can you also include another
> test for sub-tree merge info, especially tests in which one of the
> "internal" merges on a child path adds and/or substracts additional
> merge info.
>   

Will do.

>   
>> I have a less clue about what that second comment line means.
>>     
>
> It's a separate TODO comment.  I believe it's about an 'svn merge -r N:N-1'
>
> which reverted changes in the WC which had previous been merged in.
> In such a case, we need to remove merge info from the WC.  I'm not
> clear on whether determine_merges_performed() and
> update_wc_merge_info() handles that case.  It rather looks like it
> does, but this is another area that we could use a test case for.
>
> - Dan
>   
I am still not clear here.
<comment inside --record-only block>
/* ### Handle WC-local reverts which have modified our merge
                 ### info. */
</comment>

Why bother about reverts? That too in record-only block! . The 
'do_(merge|single_file_merge)' 's job is to append(with mergeinfo 
algebra) the merge range to existing mergeinfo. Why it should bother 
about earlier revert attempts?

Do you mean the scenario like the following
* pristine WC with svn:mergeinfo '/branches/b1:87-96'
* svn merge -r 88:87 url_to_branches/b1
Then  svn:mergeinfo will be '/branches/b1:87, 89-96'
I fail to see the connection to above comment and the above scenario.


With regards
Kamesh Jayachandran

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org

Re: [PATCH][merge-tracking]Fix repeat merge problem while subtree of merge target having overlapping merge already

Posted by Daniel Rall <dl...@collab.net>.
On Fri, 19 Jan 2007, Kamesh Jayachandran wrote:
...
> >+   SUBTREES_HAVING_MERGEINFO might have the subpaths which might have
> >
> >Avoid repeating words like "might" twice in the same sentence.
>
> Sure will correct it.

I already got this one during a rewrite of this doc string.

...
> >We don't only want to consider sub-trees here; we want to consider any
> >child of the merge target which has merge info.  Files can have their
> >own merge info, which overrides any merge info inherited from their
> >(grand-)parent directory.
> >
> >Let's name this parameter CHILDREN_WITH_MERGEINFO instead.  I've tried
> >to make the doc string more clear, too.
> 
> Fine. I was under the thought that 'SUBTREE' includes both sub-sub 
> directories and files(tree with no child at all). CHILDREN looks clear 
> though.

Yeah, arguably.  I too feel that children is more clear.

...
> >+          apr_ssize_t target_wcpath_len = strlen(target_wcpath);
> >
> >Why use apr_ssize_t instead of apr_size_t?
> 
> Copy Paste mistake :).

Okay.  Already got this one.

> > 
> >+          for (hi = apr_hash_first(pool, subtrees_having_mergeinfo);
> >
> >The iterator over this hash doesn't need to be thread-safe, so we
> >could avoid passing in pool to apr_hash_first().
>
> What if in a hypothetical case where two simultaneous merges from the 
> same application context on two different threads exercising the same 
> portion of code?(A GUI built on top of JAVAHL?)

It won't matter, because they'll be using different apr_hash_t structs
(in memory).  A pool can be provided to perform concurrent iterations
over the same hash struct (which we're not doing here).  If a pool is
not provided, the hash's own iterator field is used, rather than
incurring an extra allocation from the pool parameter.

APR_DECLARE(apr_hash_index_t *) apr_hash_first(apr_pool_t *p, apr_hash_t *ht)
{
    apr_hash_index_t *hi;
    if (p)
        hi = apr_palloc(p, sizeof(*hi));
    else
        hi = &ht->iterator;

    ...
}

...
> >+              //set no-op set-path only on subset.
> >+              if ((klen < target_wcpath_len) ||
> >+                  (strcmp(subtree_wc_target,
> >+                          target_wcpath) == 0) ||
> >+                  (strncmp(subtree_wc_target,
> >+                           target_wcpath,
> >+                           target_wcpath_len) != 0))
> >
> >Can't we use something from svn_path.h for this complex test?
> >svn_path_is_child() looks like it might work.
>
> It might work. But I don't favour it here. Because it does extra things 
> which I am not interested in (Giving a tail portions since match.). 
> Given that I know about my input here('subtree_wc_target', 
> 'target_wcpath'), I don't want it to undergo the rigorous ':' checks for 
> windows.

Okay.  I cut it down to 3 lines, which I suppose is simple enough.

> You seemed to have removed this comment prior to this check '
> 
> +              //set no-op set-path only on subset.'
> 
> May be it could be made as
> 
> +              //set no-op set-path only on child of target_wc_path not on 
> itself and on non-childs.'
>
> >+                {
> >+                  continue;
> >+                }
> >+
> >+              subtree_repospath = key + target_wcpath_len + 1;
> >+
> >+              SVN_ERR(reporter->set_path(report_baton, subtree_repospath,
> >+                                         is_revert ? r->end - 1 : r->end,
> >+                                         FALSE, NULL, pool));
> >+            }
> >+        }
> >
> >This entire block could use an inline comment indicating what it's
> >intended to do.
> 
> Yes would do.

All of the above is handled by the comments I've added to the code
block.  I find the phrasing of the "set no-op set-path" comment to be
somewhat nonsensical, but more importantly, it talks only about the
"what", but doesn't really get into the "why".

> >...
> >+/*
> >+   Fill SUBTREES_HAVING_MERGEINFO with subpaths which might have
> >+   intersecting merges.
> >+   SUBTREES_HAVING_MERGEINFO is assumed to be non-NULL.
> >+   For each such subtree calls do_merge or do_single_file_merge
> >+   based on the kind of subtree with the appropriate arguments.
> >
> >This API is slightly incohesive -- it serves two purposes:
> >
> >1) Gets the list of children with merge info.
> >
> >2) Performs the appropriate merges for each child.
> >
> >Typically, this isn't a good way to go.  However, we can consider #1
> >to be a return value listing "what was changed", so I think we can let
> >this one slide.  We might want to be a little more explicit about this
> >in the doc string, though, or name the API to reflect this
> >(e.g. discover_and_do_child_merges?).
> 
> Sounds good.

Will you think about the best way to go, and send a doc or naming
patch for this one?

...
> >@@ -3547,6 +3692,25 @@
> >      recursive diff-editor thing. */
> >   else if (entry->kind == svn_node_dir)
> >     {
> >+      if (strcmp(URL1, URL2) == 0)
> >
> >Inside of do_merge(), we perform some calculations on the two URLs
> >before using them to set the merge_baton.same_urls field.  The
> >same_urls field is effectively what you're trying to use here to get
> >do_subtree_merges().  Why the difference?  Is it important?
> 
> It is just to do away with Three merge scenarios where these 'Subtree 
> exclusion' does not make sense atleast currently.

Sure, the reasoning for the checks is the same in both places.  But
the code used to implement the checks is different -- why?  Is it
important that it is different?

...
> >@@ -3696,6 +3862,22 @@
> >      recursive diff-editor thing. */
> >   else if (entry->kind == svn_node_dir)
> >     {
> >+      SVN_ERR(do_subtree_merges(subtrees_having_mergeinfo,
> >+                                entry,
> >+                                URL,
> >+                                path,
> >+                                revision1,
> >+                                path,
> >+                                revision2,
> >+                                peg_revision,
> >+                                recurse,
> >+                                ignore_ancestry,
> >+                                adm_access,
> >+                                &merge_cmd_baton,
> >+                                pool));
> >
> >And why does svn_client_merge_peg3() not gate do_subtree_merges() in
> >the same style of "if (same_urls)" conditional?
> 
> We have only one URL in case of svn_client_merge_peg3() (Not sure why, 
> but that is the way it is).

Okay.  I guess we never run into the 3-URL case here.

...
> >In do_merge() and do_single_file_merge(), we also have:
> >
> >      /* When only recording merge info, we don't perform an actual
> >         merge for the specified range. */
> >      if (merge_b->record_only)
> >        {
> >          /* ### TODO: Support sub-tree merge info. */
> >          /* ### Handle WC-local reverts which have modified our merge
> >             ### info. */
> >          apr_hash_t *merges;
> >          SVN_ERR(determine_merges_performed(&merges, target_wcpath, 
> >          &range,
> >                                             &notify_b, pool));
> >          return update_wc_merge_info(target_wcpath, entry, rel_path,
> >                                      merges, is_revert, ra_session,
> >                                      adm_access, ctx, pool);
> >        }
> >
> >Is this TODO comment about sub-tree merge info still relevant?  If so,
> >what needs to be changed?
>
> Yes it can be removed.

Okay, include that in the next patch.  Can you also include another
test for sub-tree merge info, especially tests in which one of the
"internal" merges on a child path adds and/or substracts additional
merge info.

> I have a less clue about what that second comment line means.

It's a separate TODO comment.  I believe it's about an 'svn merge -r N:N-1'

which reverted changes in the WC which had previous been merged in.
In such a case, we need to remove merge info from the WC.  I'm not
clear on whether determine_merges_performed() and
update_wc_merge_info() handles that case.  It rather looks like it
does, but this is another area that we could use a test case for.

- Dan

Re: [PATCH][merge-tracking]Fix repeat merge problem while subtree of merge target having overlapping merge already

Posted by Kamesh Jayachandran <ka...@collab.net>.
Dan,

Thanks for the review.

My comments are inline.

Daniel Rall wrote:
> Kamesh, here's a brief review of your approach for dealing with
> children of merge targets which have merge info which differs from
> that of the parent.
>
> Index: subversion/libsvn_client/diff.c
> ===================================================================
> --- subversion/libsvn_client/diff.c (revision 22779)
> +++ subversion/libsvn_client/diff.c (working copy)
> ...
> +   SUBTREES_HAVING_MERGEINFO might have the subpaths which might have
>
> Avoid repeating words like "might" twice in the same sentence.
>   
Sure will correct it.

> +   intersecting merges, so we run the diff editor in such a way that
> +   it leaves off those subpaths.
> +   i.e reporter->set_path(report_baton, subtree_repospath,
> +                          same_rev_given_to_svn_ra_do_diff2, FALSE,
> +                          NULL, pool));
> +   SUBTREES_HAVING_MERGEINFO is assumed to be non-NULL.
>
> We don't only want to consider sub-trees here; we want to consider any
> child of the merge target which has merge info.  Files can have their
> own merge info, which overrides any merge info inherited from their
> (grand-)parent directory.
>
> Let's name this parameter CHILDREN_WITH_MERGEINFO instead.  I've tried
> to make the doc string more clear, too.
>   

Fine. I was under the thought that 'SUBTREE' includes both sub-sub 
directories and files(tree with no child at all). CHILDREN looks clear 
though.

>  */
>  static svn_error_t *
>  do_merge(const char *initial_URL1,
> ...
> @@ -2400,7 +2410,41 @@
>        SVN_ERR(reporter->set_path(report_baton, "",
>                                   is_revert ? r->start : r->start - 1,
>                                   FALSE, NULL, pool));
> +      if (notify_b.same_urls)
> +        {
> +          apr_ssize_t target_wcpath_len = strlen(target_wcpath);
>
> Why use apr_ssize_t instead of apr_size_t?
>   

Copy Paste mistake :).

>  
> +          for (hi = apr_hash_first(pool, subtrees_having_mergeinfo);
>
> The iterator over this hash doesn't need to be thread-safe, so we
> could avoid passing in pool to apr_hash_first().
>   
What if in a hypothetical case where two simultaneous merges from the 
same application context on two different threads exercising the same 
portion of code?(A GUI built on top of JAVAHL?)
> +               hi;
> +               hi = apr_hash_next(hi))
> +            {
> +              const void *key;
> +              apr_ssize_t klen;
> +              const char *subtree_repospath;
> +              const char *subtree_wc_target;
> +
> +              apr_hash_this(hi, &key, &klen, NULL);
> +
> +              subtree_wc_target = key;
> +              //set no-op set-path only on subset.
> +              if ((klen < target_wcpath_len) ||
> +                  (strcmp(subtree_wc_target,
> +                          target_wcpath) == 0) ||
> +                  (strncmp(subtree_wc_target,
> +                           target_wcpath,
> +                           target_wcpath_len) != 0))
>
> Can't we use something from svn_path.h for this complex test?
> svn_path_is_child() looks like it might work.
>   
It might work. But I don't favour it here. Because it does extra things 
which I am not interested in (Giving a tail portions since match.). 
Given that I know about my input here('subtree_wc_target', 
'target_wcpath'), I don't want it to undergo the rigorous ':' checks for 
windows.

You seemed to have removed this comment prior to this check '

+              //set no-op set-path only on subset.'


May be it could be made as

+              //set no-op set-path only on child of target_wc_path not on itself and on non-childs.'


> +                {
> +                  continue;
> +                }
> +
> +              subtree_repospath = key + target_wcpath_len + 1;
> +
> +              SVN_ERR(reporter->set_path(report_baton, subtree_repospath,
> +                                         is_revert ? r->end - 1 : r->end,
> +                                         FALSE, NULL, pool));
> +            }
> +        }
>
> This entire block could use an inline comment indicating what it's
> intended to do.
>
>   

Yes would do.

> ...
> +/*
> +   Fill SUBTREES_HAVING_MERGEINFO with subpaths which might have
> +   intersecting merges.
> +   SUBTREES_HAVING_MERGEINFO is assumed to be non-NULL.
> +   For each such subtree calls do_merge or do_single_file_merge
> +   based on the kind of subtree with the appropriate arguments.
>
> This API is slightly incohesive -- it serves two purposes:
>
> 1) Gets the list of children with merge info.
>
> 2) Performs the appropriate merges for each child.
>
> Typically, this isn't a good way to go.  However, we can consider #1
> to be a return value listing "what was changed", so I think we can let
> this one slide.  We might want to be a little more explicit about this
> in the doc string, though, or name the API to reflect this
> (e.g. discover_and_do_child_merges?).
>   

Sounds good.

> +   Uses PARENT_WC_ENTRY and ADM_ACCESS to fill the
> +   'SUBTREES_HAVING_MERGEINFO'.
> +   Uses the same MERGE_CMD_BATON as the caller(svn_client_merge3
> +   and svn_client_merge_peg3).
> +   Receives PARENT_WC_URL, INITIAL_PATH1, REVISION1, INITIAL_PATH2,
> +   REVISION2, PEG_REVISION, RECURSE, IGNORE_ANCESTRY, ADM_ACCESS,
> +   MERGE_CMD_BATON to cascade it to do_merge and do_single_file_merge.
> +   All allocation occurs in POOL.
> +*/
> +static svn_error_t *
> +do_subtree_merges(apr_hash_t *subtrees_having_mergeinfo,
> +                  const svn_wc_entry_t *parent_wc_entry,
> +                  const char *parent_wc_url,
> +                  const char *initial_path1,
> +                  const svn_opt_revision_t *revision1,
> +                  const char *initial_path2,
> +                  const svn_opt_revision_t *revision2,
> +                  const svn_opt_revision_t *peg_revision,
> +                  svn_boolean_t recurse,
> +                  svn_boolean_t ignore_ancestry,
> +                  svn_wc_adm_access_t *adm_access,
> +                  struct merge_cmd_baton *merge_cmd_baton,
> +                  apr_pool_t *pool)
> +{
> +  apr_hash_index_t *hi;
> +  const svn_wc_entry_t *subtree_entry;
> +
> +  SVN_ERR(svn_client__get_prop_from_wc(subtrees_having_mergeinfo,
> +                                       SVN_PROP_MERGE_INFO,
> +                                       merge_cmd_baton->target,
> +                                       FALSE,
> +                                       parent_wc_entry,
> +                                       adm_access,
> +                                       TRUE,
> +                                       merge_cmd_baton->ctx,
> +                                       pool));
> +  for (hi = apr_hash_first(pool, subtrees_having_mergeinfo);
> +       hi;
> +       hi = apr_hash_next(hi))
> +    {
> +      const void *key;
> +      const char *subtree_repospath;
> +      const char *subtree_wc_target;
> +      const char *subtree_URL;
> +
> +      apr_hash_this(hi, &key, NULL, NULL);
> +
> +      subtree_wc_target = key;
> +      if (strcmp(subtree_wc_target, merge_cmd_baton->target) == 0)
> +        {
> +          continue;
> +        }
> +      SVN_ERR(svn_wc_entry(&subtree_entry, subtree_wc_target,
> +                           adm_access, FALSE, pool));
> +      if (subtree_entry == NULL)
> +        return svn_error_createf(SVN_ERR_UNVERSIONED_RESOURCE, NULL,
> +                                 _("'%s' is not under version control"),
> +                                 svn_path_local_style(subtree_wc_target, pool));
> +
> +      subtree_repospath = key + strlen(merge_cmd_baton->target) + 1;
> +      subtree_URL = svn_path_join(parent_wc_url, subtree_repospath, pool);
> +      if (subtree_entry->kind == svn_node_file)
> +        {
> +          SVN_ERR(do_single_file_merge(subtree_URL, initial_path1, revision1,
> +                                       subtree_URL, initial_path2, revision2,
> +                                       peg_revision,
> +                                       subtree_wc_target,
> +                                       adm_access,
> +                                       merge_cmd_baton,
> +                                       pool));
> +        }
> +      else if (subtree_entry->kind == svn_node_dir)
> +        {
> +          SVN_ERR(do_merge(subtree_URL,
> +                           initial_path1,
> +                           revision1,
> +                           subtree_URL,
> +                           initial_path2,
> +                           revision2,
> +                           peg_revision,
> +                           subtree_wc_target,
> +                           adm_access,
> +                           recurse,
> +                           ignore_ancestry,
> +                           &merge_callbacks,
> +                           merge_cmd_baton,
> +                           subtrees_having_mergeinfo,
> +                           pool));
> +        }
> +    }
> +  return SVN_NO_ERROR;
> +}
> +
>  svn_error_t *
>  svn_client_merge3(const char *source1,
>                    const svn_opt_revision_t *revision1,
> @@ -3461,6 +3605,7 @@
>    const char *URL1, *URL2;
>    const char *path1, *path2;
>    svn_opt_revision_t peg_revision;
> +  apr_hash_t *subtrees_having_mergeinfo = apr_hash_make(pool);
>  
>    /* This is not a pegged merge. */
>    peg_revision.kind = svn_opt_revision_unspecified;
> @@ -3547,6 +3692,25 @@
>       recursive diff-editor thing. */
>    else if (entry->kind == svn_node_dir)
>      {
> +      if (strcmp(URL1, URL2) == 0)
>
> Inside of do_merge(), we perform some calculations on the two URLs
> before using them to set the merge_baton.same_urls field.  The
> same_urls field is effectively what you're trying to use here to get
> do_subtree_merges().  Why the difference?  Is it important?
>   

It is just to do away with Three merge scenarios where these 'Subtree 
exclusion' does not make sense atleast currently.


> +        {
> +          SVN_ERR(do_subtree_merges(subtrees_having_mergeinfo,
> ...
> +        }
> +
> +      /* Merge of a actual target.*/
> +
>        SVN_ERR(do_merge(URL1,
> ...
> @@ -3696,6 +3862,22 @@
>       recursive diff-editor thing. */
>    else if (entry->kind == svn_node_dir)
>      {
> +      SVN_ERR(do_subtree_merges(subtrees_having_mergeinfo,
> +                                entry,
> +                                URL,
> +                                path,
> +                                revision1,
> +                                path,
> +                                revision2,
> +                                peg_revision,
> +                                recurse,
> +                                ignore_ancestry,
> +                                adm_access,
> +                                &merge_cmd_baton,
> +                                pool));
>
> And why does svn_client_merge_peg3() not gate do_subtree_merges() in
> the same style of "if (same_urls)" conditional?
>   

We have only one URL in case of svn_client_merge_peg3() (Not sure why, 
but that is the way it is).
> +      /* Merge of a actual target.*/
> +
>        SVN_ERR(do_merge(URL,
> ...
>
>
>
> In do_merge() and do_single_file_merge(), we also have:
>
>       /* When only recording merge info, we don't perform an actual
>          merge for the specified range. */
>       if (merge_b->record_only)
>         {
>           /* ### TODO: Support sub-tree merge info. */
>           /* ### Handle WC-local reverts which have modified our merge
>              ### info. */
>           apr_hash_t *merges;
>           SVN_ERR(determine_merges_performed(&merges, target_wcpath, &range,
>                                              &notify_b, pool));
>           return update_wc_merge_info(target_wcpath, entry, rel_path,
>                                       merges, is_revert, ra_session,
>                                       adm_access, ctx, pool);
>         }
>
> Is this TODO comment about sub-tree merge info still relevant?  If so,
> what needs to be changed?
>   
Yes it can be removed.
I have a less clue about what that second comment line means.


Thanks
With regards
Kamesh Jayachandran

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org

Re: [PATCH][merge-tracking]Fix repeat merge problem while subtree of merge target having overlapping merge already

Posted by Daniel Rall <dl...@collab.net>.
Kamesh, here's a brief review of your approach for dealing with
children of merge targets which have merge info which differs from
that of the parent.

Index: subversion/libsvn_client/diff.c
===================================================================
--- subversion/libsvn_client/diff.c (revision 22779)
+++ subversion/libsvn_client/diff.c (working copy)
...
+   SUBTREES_HAVING_MERGEINFO might have the subpaths which might have

Avoid repeating words like "might" twice in the same sentence.

+   intersecting merges, so we run the diff editor in such a way that
+   it leaves off those subpaths.
+   i.e reporter->set_path(report_baton, subtree_repospath,
+                          same_rev_given_to_svn_ra_do_diff2, FALSE,
+                          NULL, pool));
+   SUBTREES_HAVING_MERGEINFO is assumed to be non-NULL.

We don't only want to consider sub-trees here; we want to consider any
child of the merge target which has merge info.  Files can have their
own merge info, which overrides any merge info inherited from their
(grand-)parent directory.

Let's name this parameter CHILDREN_WITH_MERGEINFO instead.  I've tried
to make the doc string more clear, too.

 */
 static svn_error_t *
 do_merge(const char *initial_URL1,
...
@@ -2400,7 +2410,41 @@
       SVN_ERR(reporter->set_path(report_baton, "",
                                  is_revert ? r->start : r->start - 1,
                                  FALSE, NULL, pool));
+      if (notify_b.same_urls)
+        {
+          apr_ssize_t target_wcpath_len = strlen(target_wcpath);

Why use apr_ssize_t instead of apr_size_t?
 
+          for (hi = apr_hash_first(pool, subtrees_having_mergeinfo);

The iterator over this hash doesn't need to be thread-safe, so we
could avoid passing in pool to apr_hash_first().

+               hi;
+               hi = apr_hash_next(hi))
+            {
+              const void *key;
+              apr_ssize_t klen;
+              const char *subtree_repospath;
+              const char *subtree_wc_target;
+
+              apr_hash_this(hi, &key, &klen, NULL);
+
+              subtree_wc_target = key;
+              //set no-op set-path only on subset.
+              if ((klen < target_wcpath_len) ||
+                  (strcmp(subtree_wc_target,
+                          target_wcpath) == 0) ||
+                  (strncmp(subtree_wc_target,
+                           target_wcpath,
+                           target_wcpath_len) != 0))

Can't we use something from svn_path.h for this complex test?
svn_path_is_child() looks like it might work.

+                {
+                  continue;
+                }
+
+              subtree_repospath = key + target_wcpath_len + 1;
+
+              SVN_ERR(reporter->set_path(report_baton, subtree_repospath,
+                                         is_revert ? r->end - 1 : r->end,
+                                         FALSE, NULL, pool));
+            }
+        }

This entire block could use an inline comment indicating what it's
intended to do.

...
+/*
+   Fill SUBTREES_HAVING_MERGEINFO with subpaths which might have
+   intersecting merges.
+   SUBTREES_HAVING_MERGEINFO is assumed to be non-NULL.
+   For each such subtree calls do_merge or do_single_file_merge
+   based on the kind of subtree with the appropriate arguments.

This API is slightly incohesive -- it serves two purposes:

1) Gets the list of children with merge info.

2) Performs the appropriate merges for each child.

Typically, this isn't a good way to go.  However, we can consider #1
to be a return value listing "what was changed", so I think we can let
this one slide.  We might want to be a little more explicit about this
in the doc string, though, or name the API to reflect this
(e.g. discover_and_do_child_merges?).

+   Uses PARENT_WC_ENTRY and ADM_ACCESS to fill the
+   'SUBTREES_HAVING_MERGEINFO'.
+   Uses the same MERGE_CMD_BATON as the caller(svn_client_merge3
+   and svn_client_merge_peg3).
+   Receives PARENT_WC_URL, INITIAL_PATH1, REVISION1, INITIAL_PATH2,
+   REVISION2, PEG_REVISION, RECURSE, IGNORE_ANCESTRY, ADM_ACCESS,
+   MERGE_CMD_BATON to cascade it to do_merge and do_single_file_merge.
+   All allocation occurs in POOL.
+*/
+static svn_error_t *
+do_subtree_merges(apr_hash_t *subtrees_having_mergeinfo,
+                  const svn_wc_entry_t *parent_wc_entry,
+                  const char *parent_wc_url,
+                  const char *initial_path1,
+                  const svn_opt_revision_t *revision1,
+                  const char *initial_path2,
+                  const svn_opt_revision_t *revision2,
+                  const svn_opt_revision_t *peg_revision,
+                  svn_boolean_t recurse,
+                  svn_boolean_t ignore_ancestry,
+                  svn_wc_adm_access_t *adm_access,
+                  struct merge_cmd_baton *merge_cmd_baton,
+                  apr_pool_t *pool)
+{
+  apr_hash_index_t *hi;
+  const svn_wc_entry_t *subtree_entry;
+
+  SVN_ERR(svn_client__get_prop_from_wc(subtrees_having_mergeinfo,
+                                       SVN_PROP_MERGE_INFO,
+                                       merge_cmd_baton->target,
+                                       FALSE,
+                                       parent_wc_entry,
+                                       adm_access,
+                                       TRUE,
+                                       merge_cmd_baton->ctx,
+                                       pool));
+  for (hi = apr_hash_first(pool, subtrees_having_mergeinfo);
+       hi;
+       hi = apr_hash_next(hi))
+    {
+      const void *key;
+      const char *subtree_repospath;
+      const char *subtree_wc_target;
+      const char *subtree_URL;
+
+      apr_hash_this(hi, &key, NULL, NULL);
+
+      subtree_wc_target = key;
+      if (strcmp(subtree_wc_target, merge_cmd_baton->target) == 0)
+        {
+          continue;
+        }
+      SVN_ERR(svn_wc_entry(&subtree_entry, subtree_wc_target,
+                           adm_access, FALSE, pool));
+      if (subtree_entry == NULL)
+        return svn_error_createf(SVN_ERR_UNVERSIONED_RESOURCE, NULL,
+                                 _("'%s' is not under version control"),
+                                 svn_path_local_style(subtree_wc_target, pool));
+
+      subtree_repospath = key + strlen(merge_cmd_baton->target) + 1;
+      subtree_URL = svn_path_join(parent_wc_url, subtree_repospath, pool);
+      if (subtree_entry->kind == svn_node_file)
+        {
+          SVN_ERR(do_single_file_merge(subtree_URL, initial_path1, revision1,
+                                       subtree_URL, initial_path2, revision2,
+                                       peg_revision,
+                                       subtree_wc_target,
+                                       adm_access,
+                                       merge_cmd_baton,
+                                       pool));
+        }
+      else if (subtree_entry->kind == svn_node_dir)
+        {
+          SVN_ERR(do_merge(subtree_URL,
+                           initial_path1,
+                           revision1,
+                           subtree_URL,
+                           initial_path2,
+                           revision2,
+                           peg_revision,
+                           subtree_wc_target,
+                           adm_access,
+                           recurse,
+                           ignore_ancestry,
+                           &merge_callbacks,
+                           merge_cmd_baton,
+                           subtrees_having_mergeinfo,
+                           pool));
+        }
+    }
+  return SVN_NO_ERROR;
+}
+
 svn_error_t *
 svn_client_merge3(const char *source1,
                   const svn_opt_revision_t *revision1,
@@ -3461,6 +3605,7 @@
   const char *URL1, *URL2;
   const char *path1, *path2;
   svn_opt_revision_t peg_revision;
+  apr_hash_t *subtrees_having_mergeinfo = apr_hash_make(pool);
 
   /* This is not a pegged merge. */
   peg_revision.kind = svn_opt_revision_unspecified;
@@ -3547,6 +3692,25 @@
      recursive diff-editor thing. */
   else if (entry->kind == svn_node_dir)
     {
+      if (strcmp(URL1, URL2) == 0)

Inside of do_merge(), we perform some calculations on the two URLs
before using them to set the merge_baton.same_urls field.  The
same_urls field is effectively what you're trying to use here to get
do_subtree_merges().  Why the difference?  Is it important?

+        {
+          SVN_ERR(do_subtree_merges(subtrees_having_mergeinfo,
...
+        }
+
+      /* Merge of a actual target.*/
+
       SVN_ERR(do_merge(URL1,
...
@@ -3696,6 +3862,22 @@
      recursive diff-editor thing. */
   else if (entry->kind == svn_node_dir)
     {
+      SVN_ERR(do_subtree_merges(subtrees_having_mergeinfo,
+                                entry,
+                                URL,
+                                path,
+                                revision1,
+                                path,
+                                revision2,
+                                peg_revision,
+                                recurse,
+                                ignore_ancestry,
+                                adm_access,
+                                &merge_cmd_baton,
+                                pool));

And why does svn_client_merge_peg3() not gate do_subtree_merges() in
the same style of "if (same_urls)" conditional?

+      /* Merge of a actual target.*/
+
       SVN_ERR(do_merge(URL,
...



In do_merge() and do_single_file_merge(), we also have:

      /* When only recording merge info, we don't perform an actual
         merge for the specified range. */
      if (merge_b->record_only)
        {
          /* ### TODO: Support sub-tree merge info. */
          /* ### Handle WC-local reverts which have modified our merge
             ### info. */
          apr_hash_t *merges;
          SVN_ERR(determine_merges_performed(&merges, target_wcpath, &range,
                                             &notify_b, pool));
          return update_wc_merge_info(target_wcpath, entry, rel_path,
                                      merges, is_revert, ra_session,
                                      adm_access, ctx, pool);
        }

Is this TODO comment about sub-tree merge info still relevant?  If so,
what needs to be changed?


I've committed a very similar version of your patch to the
merge-tracking branch in r23049 (addressing some of the items I noted
above).  We can continue refinements to it there.


- Dan



On Sun, 07 Jan 2007, Kamesh Jayachandran wrote:

> Hi,
> 
> Michael Pilato while discussing over phone about this patch raised a concern what if the subtree does not exist in the repo for the rev with which we call set_path.(no-op set-path)
> 
> I just experimented in my $BUILDDIR/subversion/tests/cmdline/svn-test-work/working-copies/merge_tests-40.(at r6)
> 
> a)Deleted the file:///path/to/repositories/merge_tests-40/A/copy-of-B/F/E resulted in r7.
> b)merged r3:7 of file:///path/to/repositories/merge_tests-40/A/B to A/copy-of-B worked well with this patch without complaining anything while set_path on copy_of_B/F/E for r7.
> 
> And hence no issues.
...
> -----Original Message-----
> From: Kamesh Jayachandran [mailto:kamesh@collab.net]
> Sent: Sun 12/24/2006 1:24 AM
> To: dev@subversion.tigris.org
> Subject:  [PATCH][merge-tracking]Fix repeat merge problem while subtree of merge target having overlapping merge already
>  
> Hi All,
> 
> Problem: (If this description is boring jump to 'Story' section.)
> 
>    "Sub tree 'S' of a parent tree 'P' has already been merged of certain
>     revisions say rN-M of some path 'U/corresponding_dir_of_S', Where N < M.
>     We try to merge a working copy of 'P' for rA-B of 'U', where 'A-B'
>     intersects with 'N-M'
>     This intersection could result in conflicting merges."
>   i.e
>    P = WC of http://localhost/svn/repos/branches/b1 (assume to be latest)
>    S = WC of http://localhost/svn/repos/branches/b1/src (assume to be latest)
>    U = http://localhost/svn/repos/trunk
>    N = 25
>    M = 30
>    Merged http://localhost/svn/repos/trunk/src -r25:30 to WC of /branches/b1/src(S).
>    Merging http://localhost/svn/repos/trunk -r27:35 to /branches/b1(P)
>    could give conflict because we already merged few revisions for 'src'.
> 
>   This scenario is captured in 'avoid_repeated_merge_on_subtree_with_merge_info' of,
>    http://svn.collab.net/repos/svn/branches/merge-tracking/subversion/tests/cmdline/merge_tests.py
> --------------------------------------------------------------------------
> 
> Story: Ignore if you want.
>        Just would like to describe it to record how I solved this problem.
> 
>    Four months back I took up the responsiblity to address repeat merge
>    problem if subtree of the merge target has some overlapping with that of the
>    current merge parent target.
> 
>    I tried solving the problem in a complex way (cut the parent tree multiple
>    subtrees and run merge on each subtree.), few of my trials in this
>    direction was not successful at all.
> 
>    In this connection I had a call with Mike. P and Dan to see how better we
>    can solve this problem.
> 
>    I asked how mixed revision 'svn update' works.
>    Mike has responded 'Some set_path... Beyond which I did not understand :)'
> 
>    Finally we came up with three abstract approaches,
>    1. Cutting the trees to multiple subtrees, running the merge on each tree.
>    2. Describe about our working copy subtree mergeinfos to update-report. And
>       make the 'update-report' handle this case.
>    3. Use ra_replay for each revision extract we want.
> 
>   I was trying to summarize the above 3 three approaches and send a mail to
>   list to see what the dev@subversion think.
> 
>   I first sent my draft to Mike and Dan to see if at all I missed anything.
> 
>   Mike was frank enough to disclose that draft was bit boring and needs to be
>   more lively.
>   He has given valuable clues on how to improve it.
> 
> 
>   I reworked on the draft, while explaining solution 2 (svn update on a mixed
> revison like ....), I have decided to understand what that magical 'set_path
> and Co.' calls are for(Mike was saying something on the call!)?
> 
>   I looked at the reporter docs for all the hooks and went to bed.
> 
>   All of a sudden before falling asleep The following simple idea striked.....
> 
> --------------------------------------------------------------------------
> 
> Simple Idea:
>   Step 1. Get all nodes having 'svn:mergeinfo' using svn_client__get_prop_from_wc.
>   Step 2. In the diff_editor of the parent merge call delete_path on those
>           overlapping subtress obtained from the Step 1.
> 
> 
> --------------------------------------------------------------------------
> 
> Actual implementation:
>   In reality delete_path approach did not work. When I called 'delete_path' I
>   got 'Skipping ...' notification, still my problematic subtrees
>   getting merged and causing conflict.
> 
>   One idea sparked, why not call 'set_path' on a reporter by giving the
>   'start_revision' same as 'end_revision', This idea has helped me in
>   making the 'subtree merge' a 'no op' which is half of my problem.
> 
>   Other half is to run a merge on this excluded subtrees.
>   Got 'Working copy locked' kind of messages.
> 
>   Finally came up with the following solution which solved all my problem.
> 
>   * Implemented 'do_subtree_merges' which would call 'do_merge' or
>     'do_single_file_merge' on excluded subtrees identified by
>     'svn_client__get_prop_from_wc'.
>     This function would set the 'subtrees_having_mergeinfo' with all such
>     excluded paths so that subsequent do_merge on a parent path would call
>     'no op set_path' on these paths.
>   * Modified do_merge to accept 'subtrees_having_mergeinfo' to exclude the
>     subtrees having overlapping merges using 'no op set_path' on them.
>   * Modified 'svn_client_merge3' and 'svn_client_merge_peg3' to first do a
>     merge on each 'subtrees_having_mergeinfo'
> 
> Ran make check and davautocheck all the tests were passing.
> 
> Refer the attached patch for details.
> --------------------------------------------------------------------------
> 
> Small concern with the patch:
> I use the same 'merge_cmd_baton' of 'svn_client_merge3' and
> 'svn_client_merge_peg3' in each subtree merges.
> I had a concern with this approach as I felt
> 'added_path, target, url, path, dry_run_deletions' of merge_cmd_baton
> can not be shared by different 'merges'.
> 
> I tried the following to see what happens with this approach.
> Reused merge_tests.py testcase 40 to create the basic repeat merge problem.
> >From some other Working copy added a file 'copy_of_B/F/E/gamma' and
> dir 'copy_of_B/F/E/mydir' in revisions 7 and 8 respectively.
> In revision 9 deleted copy_of_B/F/E/alpha.
> 
> I merged 3:9 of /A/B to copy_of_B(dry-run and actual run),
> I got the proper results and proper output.
> 
> I duplicated merge_cmd_baton on each merge with custom modification to
> 'target, url' reflecting the subtree merge.
> 
> Got the same results with this duplication.
> 
> So decided to use the same merge_cmd_baton on each merges.

RE: [PATCH][merge-tracking]Fix repeat merge problem while subtree of merge target having overlapping merge already

Posted by Kamesh Jayachandran <ka...@collab.net>.
Hi,

Michael Pilato while discussing over phone about this patch raised a concern what if the subtree does not exist in the repo for the rev with which we call set_path.(no-op set-path)

I just experimented in my $BUILDDIR/subversion/tests/cmdline/svn-test-work/working-copies/merge_tests-40.(at r6)

a)Deleted the file:///path/to/repositories/merge_tests-40/A/copy-of-B/F/E resulted in r7.
b)merged r3:7 of file:///path/to/repositories/merge_tests-40/A/B to A/copy-of-B worked well with this patch without complaining anything while set_path on copy_of_B/F/E for r7.

And hence no issues.

Thanks
With regards
Kamesh Jayachandran


-----Original Message-----
From: Kamesh Jayachandran [mailto:kamesh@collab.net]
Sent: Sun 12/24/2006 1:24 AM
To: dev@subversion.tigris.org
Subject:  [PATCH][merge-tracking]Fix repeat merge problem while subtree of merge target having overlapping merge already
 
Hi All,

Problem: (If this description is boring jump to 'Story' section.)

   "Sub tree 'S' of a parent tree 'P' has already been merged of certain
    revisions say rN-M of some path 'U/corresponding_dir_of_S', Where N < M.
    We try to merge a working copy of 'P' for rA-B of 'U', where 'A-B'
    intersects with 'N-M'
    This intersection could result in conflicting merges."
  i.e
   P = WC of http://localhost/svn/repos/branches/b1 (assume to be latest)
   S = WC of http://localhost/svn/repos/branches/b1/src (assume to be latest)
   U = http://localhost/svn/repos/trunk
   N = 25
   M = 30
   Merged http://localhost/svn/repos/trunk/src -r25:30 to WC of /branches/b1/src(S).
   Merging http://localhost/svn/repos/trunk -r27:35 to /branches/b1(P)
   could give conflict because we already merged few revisions for 'src'.

  This scenario is captured in 'avoid_repeated_merge_on_subtree_with_merge_info' of,
   http://svn.collab.net/repos/svn/branches/merge-tracking/subversion/tests/cmdline/merge_tests.py
--------------------------------------------------------------------------

Story: Ignore if you want.
       Just would like to describe it to record how I solved this problem.

   Four months back I took up the responsiblity to address repeat merge
   problem if subtree of the merge target has some overlapping with that of the
   current merge parent target.

   I tried solving the problem in a complex way (cut the parent tree multiple
   subtrees and run merge on each subtree.), few of my trials in this
   direction was not successful at all.

   In this connection I had a call with Mike. P and Dan to see how better we
   can solve this problem.

   I asked how mixed revision 'svn update' works.
   Mike has responded 'Some set_path... Beyond which I did not understand :)'

   Finally we came up with three abstract approaches,
   1. Cutting the trees to multiple subtrees, running the merge on each tree.
   2. Describe about our working copy subtree mergeinfos to update-report. And
      make the 'update-report' handle this case.
   3. Use ra_replay for each revision extract we want.

  I was trying to summarize the above 3 three approaches and send a mail to
  list to see what the dev@subversion think.

  I first sent my draft to Mike and Dan to see if at all I missed anything.

  Mike was frank enough to disclose that draft was bit boring and needs to be
  more lively.
  He has given valuable clues on how to improve it.


  I reworked on the draft, while explaining solution 2 (svn update on a mixed
revison like ....), I have decided to understand what that magical 'set_path
and Co.' calls are for(Mike was saying something on the call!)?

  I looked at the reporter docs for all the hooks and went to bed.

  All of a sudden before falling asleep The following simple idea striked.....

--------------------------------------------------------------------------

Simple Idea:
  Step 1. Get all nodes having 'svn:mergeinfo' using svn_client__get_prop_from_wc.
  Step 2. In the diff_editor of the parent merge call delete_path on those
          overlapping subtress obtained from the Step 1.


--------------------------------------------------------------------------

Actual implementation:
  In reality delete_path approach did not work. When I called 'delete_path' I
  got 'Skipping ...' notification, still my problematic subtrees
  getting merged and causing conflict.

  One idea sparked, why not call 'set_path' on a reporter by giving the
  'start_revision' same as 'end_revision', This idea has helped me in
  making the 'subtree merge' a 'no op' which is half of my problem.

  Other half is to run a merge on this excluded subtrees.
  Got 'Working copy locked' kind of messages.

  Finally came up with the following solution which solved all my problem.

  * Implemented 'do_subtree_merges' which would call 'do_merge' or
    'do_single_file_merge' on excluded subtrees identified by
    'svn_client__get_prop_from_wc'.
    This function would set the 'subtrees_having_mergeinfo' with all such
    excluded paths so that subsequent do_merge on a parent path would call
    'no op set_path' on these paths.
  * Modified do_merge to accept 'subtrees_having_mergeinfo' to exclude the
    subtrees having overlapping merges using 'no op set_path' on them.
  * Modified 'svn_client_merge3' and 'svn_client_merge_peg3' to first do a
    merge on each 'subtrees_having_mergeinfo'

Ran make check and davautocheck all the tests were passing.

Refer the attached patch for details.
--------------------------------------------------------------------------

Small concern with the patch:
I use the same 'merge_cmd_baton' of 'svn_client_merge3' and
'svn_client_merge_peg3' in each subtree merges.
I had a concern with this approach as I felt
'added_path, target, url, path, dry_run_deletions' of merge_cmd_baton
can not be shared by different 'merges'.

I tried the following to see what happens with this approach.
Reused merge_tests.py testcase 40 to create the basic repeat merge problem.