You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@subversion.apache.org by Julian Foad <ju...@btopenworld.com> on 2011/12/11 20:35:04 UTC

Re: [PATCH] Split up the reintegrate merge API: first find what to do, then do it - v2

This version of the patch is much shorter and gets back to the essentials.  I found that yes the existing 'svn_client_merge4()' API can be called as the final step, so all I am doing here now is providing a simplified reintegrate API that just checks the target WC is suitable and then finds and returns the two URLs and revisions for the two-URL merge.  The caller then calls svn_client_merge4() with the two URLs and revisions that we found.  Like this:

  The svn client reintegrate merge code calls:

    svn_client_find_reintegrate_merge(&url1, &rev1, &url2, &rev2, ...);
    svn_client_merge4(url1, rev1, url2, rev2, ...);

In between those two calls, I print a message telling the user what the equivalent two-URL merge command is.

I cut out the stuff about getting a WC write lock around the whole caboodle, becausealthough it's theoretically correct to do soI don't think it's particularly important, given that there are so many other ways and means by which a client can get into the same kind of scenario.

I cut out the code for pretty-printing the URLs in "^/repos-relpath" style because that was not fundamental but just an additional nicety.


If no objections I plan to commit this, without the extra messages about two-url-command (which I'll come back to later).

- Julian


I (Julian Foad) wrote:> I want them to be reusable, and I expect that there should be
> opportunities for re-use sooner or later.  For example, the client
> should be able to use ensure_wc_is_suitable_...() before other kinds
> of merge, and even in other non-merge contexts.  The second stage,
> find_reintegrate_merge() obviously has a purpose that is specific to
> a reintegrate merge, but still it should as far as possible return
> results that have some meaning to the caller rather than an opaque
> set of outputs that are only suitable for feeding into
> do_reintegrate_merge().  And I'll investigate now whether and how
> the third stage can call the existing two-URL merge API instead of
> introducing a reintegrate-specific API function.

Re: [PATCH] Split up the reintegrate merge API: first find what to do, then do it - v2

Posted by Julian Foad <ju...@btopenworld.com>.
Paul Burba wrote:
> Julian Foad wrote:

>>  If step 2 could repeat the same check as step 1, that would be useful
>> for  your scenario, but otherwise there's little point in step 2 doing
>> a partial  check.  So for now I'll take the simple solution which is to 
>> call svn_client_merge4(allow_mixed_rev=TRUE) to bypass the check.
>> 
>>  If and when we want to perform the full check in step 2, then we'll 
>> want to alter svn_client_merge4() in some way to provide more control 
>> over those checks, or else expose the check function separately.  The 
>> best way might become clearer with further work on the APIs.
> 
> Agreed, this can be dealt with later (if it needs to be dealt with at all).
> 
> Overall I think this patch looks good.  The only problem I see is:
[...]
> svn_client_find_reintegrate_merge() takes an absolute WC path, but we
> might be passing it a relative path here.  That will ultimately cause
> an assertion in wc_db.c:

Yup, I found that too.

I also avoided the work duplication in the reimplementation of svn_client_merge_reintegrate(): I made the internal find_reintegrate_merge() function return the sessions and youngest common ancestor revision, so that merge_reintegrate_locked() can call merge_cousins_...() directly instead of calling the higher-level merge_locked().

Committed in this form in r1213349.

Thanks for looking.

- Julian

Re: [PATCH] Split up the reintegrate merge API: first find what to do, then do it - v2

Posted by Paul Burba <pt...@gmail.com>.
On Mon, Dec 12, 2011 at 5:51 AM, Julian Foad <ju...@btopenworld.com> wrote:
> Responses in-line, but first a reminder of why I'm doing this.
>
> It's because I've been looking at making 'svn merge' give better diagnostics and confirmations.

+1 to the spirit of that!

> Things like issuing an error or warning if the source and target are in fact the same 'branch', or if the user is trying to reintegrate in the wrong direction (the same direction as some sync merges have been done).  To do this, the client needs to ask questions like 'what branches and revisions would this merge affect?' before executing the merge.  Similar issues apply to all kinds of merges, but I happen to have taken the reintegrate first, since it's known to be a wrapper for a two-URL merge.
>
> I expect kind of control this would be especially valuable for GUIs, and I would appreciate feedback on this matter.
>
> I don't suggest that every client should be required to execute the steps separately.  I think we should keep the 'svn_client_merge_reintegrate' API as a shortcut, and not deprecate it.

+1

> Opinions welcome.
>
> Daniel Shahaf wrote:
>
>> OK.  I can't find reintegrate_merge_locked(),
>
>
> Oops, I meant 'merge_reintegrate_locked'.
>
> And:
>> Julian Foad wrote:
>>>  >
>>>  >     svn_client_find_reintegrate_merge(&url1, &rev1, &url2, &rev2, ...);
>>>  >     svn_client_merge4(url1, rev1, url2, rev2, ...);
>>>
>>>  One issue is that there is some duplication of work in this
>>>  formulation.  Both of these functions check that the target WC is
>>>  suitable for merging into.  Both open up RA sessions.  Both determine
>>>  the youngest common ancestor.
>>
>> I don't think "opening an RA session" is a problem for a function
>> that's about to do a merge operation :)
>
> Yes, even for a small merge the cost of that can't be of much importance.
>
>> Checking the WC's sanity, presumably that's necessary since the WC may
>> be modified after svn_client_find_reintegrate_merge() but before
>> svn_client_merge4()?  Especially if the user is prompted to +1 the
>> output of svn_client_find_reintegrate_merge() before the actual merge
>> is done.  (and then goes to lunch, comes back, makes some commits, finds
>> a loose TSVN dialog and OKs it)
>
> Good thoughts.  The current situation in this patch (v2) is that step 1 (_find_reintegrate_merge) checks the WC is:
>
>   * single-revision
>   * unmodified
>   * unswitched
>
> and step 2 (_merge4) checks the WC is:
>
>   * single-revision
>
> because that's all that a normal (non-reintegrate) invocation of svn_client_merge4() needs.
>
> If step 2 could repeat the same check as step 1, that would be useful for your scenario, but otherwise there's little point in step 2 doing a partial check.  So for now I'll take the simple solution which is to call svn_client_merge4(allow_mixed_rev=TRUE) to bypass the check.
>
> If and when we want to perform the full check in step 2, then we'll want to alter svn_client_merge4() in some way to provide more control over those checks, or else expose the check function separately.  The best way might become clearer with further work on the APIs.

Agreed, this can be dealt with later (if it needs to be dealt with at all).

Overall I think this patch looks good.  The only problem I see is:

> Index: subversion/svn/merge-cmd.c
> ===================================================================
> --- subversion/svn/merge-cmd.c  (revision 1213025)
> +++ subversion/svn/merge-cmd.c  (working copy)
> @@ -39,6 +39,59 @@
>
>  /*** Code. ***/
>
> +/* Do a reintegrate merge from SOURCE_PATH_OR_URL@SOURCE_PEG_REVISION into
> + * TARGET_WCPATH.  Do it with a WC write lock unless DRY_RUN is true. */
> +static svn_error_t *
> +merge_reintegrate(const char *source_path_or_url,
> +                  const svn_opt_revision_t *source_peg_revision,
> +                  const char *target_wcpath,
> +                  svn_boolean_t dry_run,
> +                  svn_boolean_t quiet,
> +                  const apr_array_header_t *merge_options,
> +                  svn_client_ctx_t *ctx,
> +                  apr_pool_t *scratch_pool)
> +{
> +  const char *url1, *url2;
> +  svn_revnum_t rev1, rev2;
> +
> +  SVN_ERR(svn_client_find_reintegrate_merge(
> +            &url1, &rev1, &url2, &rev2,
> +            source_path_or_url, source_peg_revision, target_wcpath,
> +            ctx, scratch_pool, scratch_pool));

svn_client_find_reintegrate_merge() takes an absolute WC path, but we
might be passing it a relative path here.  That will ultimately cause
an assertion in wc_db.c:

svn: E235000: In file '..\..\..\subversion\libsvn_wc\wc_db.c' line
7173: assertion failed (svn_dirent_is_absolute(local_abspath))

Paul

Re: [PATCH] Split up the reintegrate merge API: first find what to do, then do it - v2

Posted by Julian Foad <ju...@btopenworld.com>.
Responses in-line, but first a reminder of why I'm doing this.

It's because I've been looking at making 'svn merge' give better diagnostics and confirmations.  Things like issuing an error or warning if the source and target are in fact the same 'branch', or if the user is trying to reintegrate in the wrong direction (the same direction as some sync merges have been done).  To do this, the client needs to ask questions like 'what branches and revisions would this merge affect?' before executing the merge.  Similar issues apply to all kinds of merges, but I happen to have taken the reintegrate first, since it's known to be a wrapper for a two-URL merge.

I expect kind of control this would be especially valuable for GUIs, and I would appreciate feedback on this matter.

I don't suggest that every client should be required to execute the steps separately.  I think we should keep the 'svn_client_merge_reintegrate' API as a shortcut, and not deprecate it.  Opinions welcome.


Daniel Shahaf wrote:

> OK.  I can't find reintegrate_merge_locked(),


Oops, I meant 'merge_reintegrate_locked'.

And:
> Julian Foad wrote:
>>  > 
>>  >     svn_client_find_reintegrate_merge(&url1, &rev1, &url2, &rev2, ...);
>>  >     svn_client_merge4(url1, rev1, url2, rev2, ...);
>> 
>>  One issue is that there is some duplication of work in this
>>  formulation.  Both of these functions check that the target WC is
>>  suitable for merging into.  Both open up RA sessions.  Both determine
>>  the youngest common ancestor.
> 
> I don't think "opening an RA session" is a problem for a function 
> that's about to do a merge operation :)

Yes, even for a small merge the cost of that can't be of much importance.

> Checking the WC's sanity, presumably that's necessary since the WC may
> be modified after svn_client_find_reintegrate_merge() but before
> svn_client_merge4()?  Especially if the user is prompted to +1 the
> output of svn_client_find_reintegrate_merge() before the actual merge
> is done.  (and then goes to lunch, comes back, makes some commits, finds
> a loose TSVN dialog and OKs it)

Good thoughts.  The current situation in this patch (v2) is that step 1 (_find_reintegrate_merge) checks the WC is:

  * single-revision
  * unmodified
  * unswitched

and step 2 (_merge4) checks the WC is:

  * single-revision

because that's all that a normal (non-reintegrate) invocation of svn_client_merge4() needs.

If step 2 could repeat the same check as step 1, that would be useful for your scenario, but otherwise there's little point in step 2 doing a partial check.  So for now I'll take the simple solution which is to call svn_client_merge4(allow_mixed_rev=TRUE) to bypass the check.

If and when we want to perform the full check in step 2, then we'll want to alter svn_client_merge4() in some way to provide more control over those checks, or else expose the check function separately.  The best way might become clearer with further work on the APIs.

- Julian

Re: [PATCH] Split up the reintegrate merge API: first find what to do, then do it - v2

Posted by DanielShahaf <da...@elego.de>.
Julian Foad wrote on Sun, Dec 11, 2011 at 21:25:55 +0000:
> I (Julian Foad) wrote:
> 
> >   The svn client reintegrate merge code calls:
> > 
> >     svn_client_find_reintegrate_merge(&url1, &rev1, &url2, &rev2, ...);
> >     svn_client_merge4(url1, rev1, url2, rev2, ...);
> 
> One issue is that there is some duplication of work in this
> formulation.  Both of these functions check that the target WC is
> suitable for merging into.  Both open up RA sessions.  Both determine
> the youngest common ancestor.
> 
> It looks like the amount of work duplicated is fairly constant,
> suggesting it would not affect large merges very much, but I have yet
> to measure this.
> 

I don't think "opening an RA session" is a problem for a function that's
about to do a merge operation :)

Checking the WC's sanity, presumably that's necessary since the WC may
be modified after svn_client_find_reintegrate_merge() but before
svn_client_merge4()?  Especially if the user is prompted to +1 the
output of svn_client_find_reintegrate_merge() before the actual merge
is done.  (and then goes to lunch, comes back, makes some commits, finds
a loose TSVN dialog and OKs it)

> We could avoid any or all of this duplication at the cost of making
> the API more complex, for example by exposing a cut-down merge
> function that only does part of what svn_client_merge4() does.  The
> 'svn_client_do_reintegrate_merge' that I showed in the previous
> version of this patch was one such option.  That was so cut-down that
> it was perhaps only suitable for doing a reintegrate merge.
> 
> Another way to address part of the issue is to expose the "check that
> the target WC is suitable for merging into" functionality as
> a separate API (like in v1 of this patch), which would be called by
> the client before doing any kind of merge (reintegrate or otherwise)
> instead of making each merge function do that check internally. 
> Another part of the issue could be addressed by pooling and reusing
> the client's RA connections (which is a totally separate project). 
> There are probably other ways of refactoring the merge APIs too.
> 
> I'll think on this a bit more.  I can say though that I'm more
> concerned with the large-scale shape of the merge APIs, and I'd
> strongly prefer general solutions such as RA session pooling over
> special-casing the APIs.
> 
> - Julian

Re: [PATCH] Split up the reintegrate merge API: first find what to do, then do it - v2

Posted by Julian Foad <ju...@btopenworld.com>.
I (Julian Foad) wrote:

>   The svn client reintegrate merge code calls:
> 
>     svn_client_find_reintegrate_merge(&url1, &rev1, &url2, &rev2, ...);
>     svn_client_merge4(url1, rev1, url2, rev2, ...);

One issue is that there is some duplication of work in this formulation.  Both of these functions check that the target WC is suitable for merging into.  Both open up RA sessions.  Both determine the youngest common ancestor.

It looks like the amount of work duplicated is fairly constant, 
suggesting it would not affect large merges very much, but I have yet to
 measure this.

We could avoid any or all of this duplication at the cost of making the API more complex, for example by exposing a cut-down merge function that only does part of what svn_client_merge4() does.  The 'svn_client_do_reintegrate_merge' that I showed in the previous version of this patch was one such option.  That was so cut-down that it was perhaps only suitable for doing a reintegrate merge.

Another way to address part of the issue is to expose the "check that the target WC is suitable for merging into" functionality as a separate API (like in v1 of this patch), which would be called by the client before doing any kind of merge (reintegrate or otherwise) instead of making each merge function do that check internally.  Another part of the issue could be addressed by pooling and reusing the client's RA connections (which is a totally separate project).  There are probably other ways of refactoring the merge APIs too.

I'll think on this a bit more.  I can say though that I'm more concerned with the large-scale shape of the merge APIs, and I'd strongly prefer general solutions such as RA session pooling over special-casing the APIs.

- Julian

Re: [PATCH] Split up the reintegrate merge API: first find what to do, then do it - v2

Posted by DanielShahaf <da...@elego.de>.
Julian Foad wrote on Sun, Dec 11, 2011 at 20:53:26 +0000:
> DanielShahaf wrote:
> > Julian Foad wrote on Sun, Dec 11, 2011 at 19:35:04 +0000:
> >>  +  if (source->url1)
> >>  +    {
> >>  +      svn_opt_revision_t revision1
> >>  +        = { svn_opt_revision_number, { source->rev1 } };
> >>  +      svn_opt_revision_t revision2
> >>  +        = { svn_opt_revision_number, { source->rev2 } };
> > 
> > ISTR we had trouble in the past with some compilers not allowing these
> > non-constant initializers.  (Fix would be to unroll the initialization
> > into separate lines of code.)
> 
> AFAIK we've had this kind of initialization in the Subversion source
> for a long time now, so I'm treating it as de-facto acceptable even
> though not C'89.  I've been writing quite a few of these recently. 
> I can change them all to the long-winded alternative if proven
> necessary, but I hope it's not necessary because I really like the
> brevity.

I don't think it's necessary to change from the style used in the patch
if it works in practice.

Re: [PATCH] Split up the reintegrate merge API: first find what to do, then do it - v2

Posted by DanielShahaf <da...@elego.de>.
Julian Foad wrote on Sun, Dec 11, 2011 at 20:53:26 +0000:
> Look at the following patch hunk, and ideally look also at the earlier
> hunk where we see these changes are inside the function that was
> called 'reintegrate_merge_locked' but is now a cut-down function
> called 'find_reintegrate_merge'.  So ...
> 
> >>  @@ -10628,31 +10631,76 @@ merge_reintegrate_locked(const char *sou
> >>   
> >>     /* Left side: trunk@youngest-trunk-rev-merged-to-branch-at-specified-peg-rev
> >>      * Right side: branch@specified-peg-revision */
> >>  +  *source_p = apr_pmemdup(result_pool, &source, sizeof(source));
> >>  +  return SVN_NO_ERROR;
> >>  +}
> >>   
> >>  -  /* Do the real merge! */
> >>  -  /* ### TODO(reint): Make sure that one isn't the same line ancestor
> >>  -     ### of the other (what's erroneously referred to as "ancestrally
> >>  -     ### related" in this source file).  We can merge to trunk without
> >>  -     ### implementing this. */
> >>  -  err = merge_cousins_and_supplement_mergeinfo(target_abspath,
> >>  -                                         target_ra_session,
> >>  -                                         source_ra_session,
> >>  -                                         &source, yc_ancestor_rev,
> >>  -                                         TRUE /* same_repos */,
> >>  -                                         svn_depth_infinity,
> >>  -                                         FALSE /* ignore_ancestry */,
> >>  -                                         FALSE /* force */,
> >>  -                                         FALSE /* record_only */,
> >>  -                                         dry_run,
> > 
> > Could you clarify why this is removed?  I don't see it added elsewhere
> > in the patch.  Is it a functional change?  Or do the diff hunks form an
> > optical illusion here?
> 
> Instead of performing the merge, the (renamed) function now only finds
> the URLs and returns them.  Then, later on (in
> merge_reintegrate_locked), instead of calling this 'merge_cousins'
> function directly, we instead call 'merge_locked' (which is the guts
> of svn_client_merge4()) which calls 'merge_cousins'.
> 

OK.  I can't find reintegrate_merge_locked(), but overall it sounds like
an intentional side-effect of gutting the actual merge out of
svn_client_find_reintegrate_merge().  Thanks for the explanation.

Re: [PATCH] Split up the reintegrate merge API: first find what to do, then do it - v2

Posted by Julian Foad <ju...@btopenworld.com>.
DanielShahaf wrote:

> [ Caveat: I'm not at all familiar with merge.c ]

Thanks for casting your eyes over it anyway.

> Julian Foad wrote on Sun, Dec 11, 2011 at 19:35:04 +0000:
>>    The svn client reintegrate merge code calls:
>> 
>>      svn_client_find_reintegrate_merge(&url1, &rev1, &url2, 
> &rev2,
>>  ...);     svn_client_merge4(url1, rev1, url2, rev2, ...);
> ...

> Looks good.

Thanks.

Look at the following patch hunk, and ideally look also at the earlier hunk where we see these changes are inside the function that was called 'reintegrate_merge_locked' but is now a cut-down function called 'find_reintegrate_merge'.  So ...

>>  @@ -10628,31 +10631,76 @@ merge_reintegrate_locked(const char *sou
>>   
>>     /* Left side: trunk@youngest-trunk-rev-merged-to-branch-at-specified-peg-rev
>>      * Right side: branch@specified-peg-revision */
>>  +  *source_p = apr_pmemdup(result_pool, &source, sizeof(source));
>>  +  return SVN_NO_ERROR;
>>  +}
>>   
>>  -  /* Do the real merge! */
>>  -  /* ### TODO(reint): Make sure that one isn't the same line ancestor
>>  -     ### of the other (what's erroneously referred to as "ancestrally
>>  -     ### related" in this source file).  We can merge to trunk without
>>  -     ### implementing this. */
>>  -  err = merge_cousins_and_supplement_mergeinfo(target_abspath,
>>  -                                         target_ra_session,
>>  -                                         source_ra_session,
>>  -                                         &source, yc_ancestor_rev,
>>  -                                         TRUE /* same_repos */,
>>  -                                         svn_depth_infinity,
>>  -                                         FALSE /* ignore_ancestry */,
>>  -                                         FALSE /* force */,
>>  -                                         FALSE /* record_only */,
>>  -                                         dry_run,
> 
> Could you clarify why this is removed?  I don't see it added elsewhere
> in the patch.  Is it a functional change?  Or do the diff hunks form an
> optical illusion here?

Instead of performing the merge, the (renamed) function now only finds the URLs and returns them.  Then, later on (in merge_reintegrate_locked), instead of calling this 'merge_cousins' function directly, we instead call 'merge_locked' (which is the guts of svn_client_merge4()) which calls 'merge_cousins'.

> 
>>  +merge_reintegrate_locked(const char *source_path_or_url,
>>  +                         const svn_opt_revision_t *source_peg_revision,
>>  +                         const char *target_abspath,
>>  +                         svn_boolean_t dry_run,
>>  +                         const apr_array_header_t *merge_options,
>>  +                         svn_client_ctx_t *ctx,
>>  +                         apr_pool_t *scratch_pool)
>>  +{
>>  +  if (source->url1)
>>  +    {
>>  +      svn_opt_revision_t revision1
>>  +        = { svn_opt_revision_number, { source->rev1 } };
>>  +      svn_opt_revision_t revision2
>>  +        = { svn_opt_revision_number, { source->rev2 } };
> 
> ISTR we had trouble in the past with some compilers not allowing these
> non-constant initializers.  (Fix would be to unroll the initialization
> into separate lines of code.)

AFAIK we've had this kind of initialization in the Subversion source for a long time now, so I'm treating it as de-facto acceptable even though not C'89.  I've been writing quite a few of these recently.  I can change them all to the long-winded alternative if proven necessary, but I hope it's not necessary because I really like the brevity.

- Julian

Re: [PATCH] Split up the reintegrate merge API: first find what to do, then do it - v2

Posted by DanielShahaf <da...@elego.de>.
[ Caveat: I'm not at all familiar with merge.c ]

Julian Foad wrote on Sun, Dec 11, 2011 at 19:35:04 +0000:
>   The svn client reintegrate merge code calls:
> 
>     svn_client_find_reintegrate_merge(&url1, &rev1, &url2, &rev2,
> ...);     svn_client_merge4(url1, rev1, url2, rev2, ...);
...
> If no objections I plan to commit this, without the extra messages
> about two-url-command (which I'll come back to later).

Looks good.

> @@ -10628,31 +10631,76 @@ merge_reintegrate_locked(const char *sou
>  
>    /* Left side: trunk@youngest-trunk-rev-merged-to-branch-at-specified-peg-rev
>     * Right side: branch@specified-peg-revision */
> +  *source_p = apr_pmemdup(result_pool, &source, sizeof(source));
> +  return SVN_NO_ERROR;
> +}
>  
> -  /* Do the real merge! */
> -  /* ### TODO(reint): Make sure that one isn't the same line ancestor
> -     ### of the other (what's erroneously referred to as "ancestrally
> -     ### related" in this source file).  We can merge to trunk without
> -     ### implementing this. */
> -  err = merge_cousins_and_supplement_mergeinfo(target_abspath,
> -                                               target_ra_session,
> -                                               source_ra_session,
> -                                               &source, yc_ancestor_rev,
> -                                               TRUE /* same_repos */,
> -                                               svn_depth_infinity,
> -                                               FALSE /* ignore_ancestry */,
> -                                               FALSE /* force */,
> -                                               FALSE /* record_only */,
> -                                               dry_run,

Could you clarify why this is removed?  I don't see it added elsewhere
in the patch.  Is it a functional change?  Or do the diff hunks form an
optical illusion here?

> +merge_reintegrate_locked(const char *source_path_or_url,
> +                         const svn_opt_revision_t *source_peg_revision,
> +                         const char *target_abspath,
> +                         svn_boolean_t dry_run,
> +                         const apr_array_header_t *merge_options,
> +                         svn_client_ctx_t *ctx,
> +                         apr_pool_t *scratch_pool)
> +{
> +  if (source->url1)
> +    {
> +      svn_opt_revision_t revision1
> +        = { svn_opt_revision_number, { source->rev1 } };
> +      svn_opt_revision_t revision2
> +        = { svn_opt_revision_number, { source->rev2 } };

ISTR we had trouble in the past with some compilers not allowing these
non-constant initializers.  (Fix would be to unroll the initialization
into separate lines of code.)

> 

Thanks!

Daniel