You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@subversion.apache.org by Stefan Sperling <st...@elego.de> on 2009/08/03 17:23:51 UTC

Re: [PATCH] issue 1493 - use libsvn_diff for diffing props

On Mon, Jul 27, 2009 at 12:48:27PM +0200, Daniel Näslund wrote:
> Attaching my new patch with the proposed changes from stsp.

Hi Daniel,

I've attached an updated patch this time which includes my suggested
changes. Please check my version of the patch to see if you are happy
with it.

Every change I made is documented in the comments below.

> Index: subversion/libsvn_diff/diff_memory.c
> ==================================================================--- subversion/libsvn_diff/diff_memory.c	(revision 38477)
> +++ subversion/libsvn_diff/diff_memory.c	(arbetskopia)
> @@ -426,7 +426,8 @@
>  /* Flush the hunk currently built up in BATON
>     into the baton's output_stream */

Should also update the doc string to mention the new parameter.

>  static svn_error_t *
> -output_unified_flush_hunk(output_baton_t *baton)
> +output_unified_flush_hunk(output_baton_t *baton,
> +                           const char *hunk_delimiter)

Wrong indentation here. Please align all parameters at the same column.

>  {
>    apr_off_t target_token;
>    apr_size_t hunk_len;
> @@ -453,23 +454,37 @@
>             /* Hunk length 1 is implied, don't show the
>                length field if we have a hunk that long */
>             (baton->hunk_length[0] == 1)

For functions, we use no space (including newlines) before parentheses.
This seems to be old code which hadn't been updated to this convention.

> -           ? ("@@ -%" APR_OFF_T_FMT)
> -           : ("@@ -%" APR_OFF_T_FMT ",%" APR_OFF_T_FMT),
> +           ? ("%s -%" APR_OFF_T_FMT)
> +           : ("%s -%" APR_OFF_T_FMT ",%" APR_OFF_T_FMT),
> +           hunk_delimiter,
>             baton->hunk_start[0], baton->hunk_length[0]));
> 
>    if (baton->hunk_length[1] > 0)
>      /* Convert our 0-based line numbers into unidiff 1-based numbers */
>      baton->hunk_start[1]++;
> -  SVN_ERR(svn_stream_printf_from_utf8
> +

The function should fall back to the default hunk delimiter
if none is supplied.

> +
> +  /* Hunk length 1 is implied, don't show the
> +     length field if we have a hunk that long */
> +  if (baton->hunk_length[1] == 1)
> +    {
> +      SVN_ERR(svn_stream_printf_from_utf8

Again, no space before paren.

>            (baton->output_stream, baton->header_encoding,
>             baton->pool,
> -           /* Hunk length 1 is implied, don't show the
> -              length field if we have a hunk that long */
> -           (baton->hunk_length[1] == 1)
> -           ? (" +%" APR_OFF_T_FMT " @@" APR_EOL_STR)
> -           : (" +%" APR_OFF_T_FMT ",%" APR_OFF_T_FMT " @@" APR_EOL_STR),
> -           baton->hunk_start[1], baton->hunk_length[1]));
> +           " +%" APR_OFF_T_FMT " %s" APR_EOL_STR,
> +           baton->hunk_start[1], hunk_delimiter));
> +    }
> +  else
> +    {
> +      SVN_ERR(svn_stream_printf_from_utf8

Same.

> +          (baton->output_stream, baton->header_encoding,
> +           baton->pool,
> +           " +%" APR_OFF_T_FMT ",%" APR_OFF_T_FMT " %s" APR_EOL_STR,
> +           baton->hunk_start[1], baton->hunk_length[1],
> +           hunk_delimiter));
> 
> +    }
> +
>    hunk_len = baton->hunk->len;
>    SVN_ERR(svn_stream_write(baton->output_stream,
>                             baton->hunk->data, &hunk_len));
> @@ -498,7 +513,7 @@
>    targ_mod = modified_start;
> 
>    if (btn->next_token + SVN_DIFF__UNIFIED_CONTEXT_SIZE < targ_orig)
> -    SVN_ERR(output_unified_flush_hunk(btn));
> +    SVN_ERR(output_unified_flush_hunk(btn, "@@"));

With output_unified_flush_hunk falling back to a reasonable default,
we can now just pass NULL instead of "@@" here. The default is now
set at only one location in the code.

> 
>    if (btn->hunk_length[0] == 0
>        && btn->hunk_length[1] == 0)
> @@ -530,8 +545,10 @@
> 
> 
>  svn_error_t *
> -svn_diff_mem_string_output_unified(svn_stream_t *output_stream,
> +svn_diff_mem_string_output_unified2(svn_stream_t *output_stream,
>                                     svn_diff_t *diff,
> +                                   svn_boolean_t with_diff_header,

Parameters should follow the indentation change.

> +                                   const char *hunk_delimiter,
>                                     const char *original_header,
>                                     const char *modified_header,
>                                     const char *header_encoding,
> @@ -563,23 +580,50 @@
>        fill_source_tokens(&baton.sources[0], original, pool);
>        fill_source_tokens(&baton.sources[1], modified, pool);
> 
> -      SVN_ERR(svn_stream_printf_from_utf8
> -              (output_stream, header_encoding, pool,
> -               "--- %s" APR_EOL_STR
> -               "+++ %s" APR_EOL_STR,
> -               original_header, modified_header));
> +      if (with_diff_header) {

The indentation here does not match our coding style guidelines.

> +        SVN_ERR(svn_stream_printf_from_utf8

No space before paren. I've also tweaked the indentation a bit here.

> +                (output_stream, header_encoding, pool,
> +                 "--- %s" APR_EOL_STR
> +                 "+++ %s" APR_EOL_STR,
> +                 original_header, modified_header));
> +      }
> 
>        SVN_ERR(svn_diff_output(diff, &baton,
>                                &mem_output_unified_vtable));
> -      SVN_ERR(output_unified_flush_hunk(&baton));
> 
> +      SVN_ERR(output_unified_flush_hunk(&baton, hunk_delimiter));
> +
>        svn_pool_destroy(baton.pool);
>      }
> 
>    return SVN_NO_ERROR;
>  }
> 
> +svn_error_t *
> +svn_diff_mem_string_output_unified(svn_stream_t *output_stream,
> +                                   svn_diff_t *diff,
> +                                   const char *original_header,
> +                                   const char *modified_header,
> +                                   const char *header_encoding,
> +                                   const svn_string_t *original,
> +                                   const svn_string_t *modified,
> +                                   apr_pool_t *pool)
> +{
> +  SVN_ERR(svn_diff_mem_string_output_unified2(
> +      output_stream,

Please align all parameters on the same column where the function
name ends, wherever possible, like so:

 SVN_ERR(svn_diff_mem_string_output_unified2(output_stream,
                                             diff,
					     ...

> +      diff,
> +      TRUE,
> +      "@@",

Again, now passing NULL.

> +      original_header,
> +      modified_header,
> +      header_encoding,
> +      original,
> +      modified,
> +      pool));
> +  return SVN_NO_ERROR;
> +}
> 
> +
>  
>  /* diff3 merge output */
> 
> Index: subversion/include/svn_diff.h
> ==================================================================--- subversion/include/svn_diff.h	(revision 38477)
> +++ subversion/include/svn_diff.h	(arbetskopia)
> @@ -667,15 +667,36 @@
>                            const svn_diff_file_options_t *options,
>                            apr_pool_t *pool);
> 
> -
>  /** Outputs the @a diff object generated by svn_diff_mem_string_diff()
>   * in unified diff format on @a output_stream, using @a original
>   * and @a modified for the text in the output.
> - * Outputs the header and markers in @a header_encoding.
> + * Only if @a with_diff_header is TRUE it outputs the header. Useful for
> + * property diffs.

I've tweaked this slightly. There's no need to mention one particular
case of application. You wouldn't say "useful for checking if the
string 'abc' matches the string 'abc'" in documentation of strcmp() either.

>   *
> + * Outputs the header and markers in @a header_encoding with the specified
> + * @a hunk_delimiters.
> + *
>   * @a original_header and @a modified header are
>   * used to fill the field after the "---" and "+++" header markers.
>   *
> + * @since New in 1.7.
> + */
> +svn_error_t *
> +svn_diff_mem_string_output_unified2(svn_stream_t *output_stream,
> +                                   svn_diff_t *diff,
> +                                   svn_boolean_t with_diff_header,
> +                                   const char *hunk_delimiter,
> +                                   const char *original_header,
> +                                   const char *modified_header,

Parameters should follow the indentation change.

> +                                   const char *header_encoding,
> +                                   const svn_string_t *original,
> +                                   const svn_string_t *modified,
> +                                   apr_pool_t *pool);
> +
> +/** Similar to svn_diff_mem_string_output_unified2() but with @a
> + * with_diff_headers always set to TRUE and @a hunk_delimiter always
> + * set to "@@".

Docstring slightly tweaked, see attached patch.

> + *
>   * @since New in 1.5.
>   */
>  svn_error_t *
> Index: subversion/libsvn_client/diff.c
> ==================================================================--- subversion/libsvn_client/diff.c	(revision 38477)
> +++ subversion/libsvn_client/diff.c	(arbetskopia)
> @@ -49,6 +49,7 @@
>  #include "svn_time.h"
>  #include "svn_sorts.h"
>  #include "svn_base64.h"
> +#include "svn_subst.h"
>  #include "client.h"
> 
>  #include "private/svn_wc_private.h"
> @@ -167,6 +168,34 @@
>                            _("Path '%s' must be an immediate child of " \
>                              "the directory '%s'"), path, relative_to_dir)
> 
> +/* A helper func used by display_prop_diffs. It appends the systems default
> +   eol character if there is none in the string.
> +   token is the string holding the property value. The return value is allocated
> +   from pool.*/

I've renamed this function to maybe_append_eol() and rewritten its
docstring. I think the docstring is now more accurate and describes
the behaviour in more detail. Do you think my description of the
function is correct?

> +static const svn_string_t *
> +append_eol(const svn_string_t *token, apr_pool_t *pool)
> +{
> +  const char *curp;
> +
> +  if (token->len == 0)
> +    return token;
> +
> +  curp = token->data + token->len - 1;
> +  if (*curp == '\r')
> +  {

The indentation here does not conform to our coding style guidelines.

> +    return token;
> +  }
> +  else if (*curp != '\n')
> +  {
> +    const char *tmp = apr_psprintf(pool, "%s%s", token->data, APR_EOL_STR);
> +    return svn_string_create(tmp, pool);

We can avoid using a temporary variable by using svn_string_createf().

I would like to avoid manual parsing of EOL strings here.
But we need a good set of APIs for that first.
Edmund Wong is looking into that.

> +  }
> +  else
> +  {
> +    return token;
> +  }
> +}
> +
>  /* A helper func that writes out verbal descriptions of property diffs
>     to FILE.   Of course, the apr_file_t will probably be the 'outfile'
>     passed to svn_client_diff5, which is probably stdout. */
> @@ -242,53 +271,39 @@
>            continue;
>          }
> 
> -      /* For now, we have a rather simple heuristic: if this is an
> -         "svn:" property, then assume the value is UTF-8 and must
> -         therefore be converted before printing.  Otherwise, just
> -         print whatever's there and hope for the best. */
>        {
> -        svn_boolean_t val_is_utf8 = svn_prop_is_svn_prop(propchange->name);
> +        /* Use libsvn_diff for output.*/

Superfluous comment.

> +        svn_stream_t *os = svn_stream_from_aprfile2(file, TRUE, pool);
> +        svn_diff_t *diff;
> +        svn_diff_file_options_t options;
> +        const svn_string_t *tmp;
> 
> -        if (original_value != NULL)
> -          {
> -            if (val_is_utf8)
> -              {
> -                SVN_ERR(file_printf_from_utf8
> -                        (file, encoding,
> -                         "   - %s" APR_EOL_STR, original_value->data));
> -              }
> -            else
> -              {
> -                /* ### todo: check for error? */
> -                apr_file_printf
> -                  (file, "   - %s" APR_EOL_STR, original_value->data);
> -              }
> -          }
> +        /* The last character in a property is often not a newline.
> +           Since the diff is not useful anyway for patching properties an
> +           eol character is appended when needed to remove those pescious
> +           ' \ No newline at end of file' lines. */
> +        tmp = original_value ?  original_value : svn_string_create("", pool);
> 
> -        if (propchange->value != NULL)
> -          {
> -            if (val_is_utf8)
> -              {
> -                SVN_ERR(file_printf_from_utf8
> -                        (file, encoding, "   + %s" APR_EOL_STR,
> -                         propchange->value->data));
> -              }
> -            else
> -              {
> -                /* ### todo: check for error? */
> -                apr_file_printf(file, "   + %s" APR_EOL_STR,
> -                                propchange->value->data);
> -              }
> -          }
> +        const svn_string_t *orig = append_eol(tmp, pool);

We can't declare variables in the middle of a scope.
This probably won't compile on Windows.

> +
> +        tmp = propchange->value ?  propchange->value :

Please use a single space after '?'.

> +                                   svn_string_create("", pool);
> +
> +        const svn_string_t *val = append_eol(tmp, pool);

Can't declare variables in the middle of a scope.

> +
> +        SVN_ERR(svn_diff_mem_string_diff(&diff, orig, val, &options, pool));
> +
> +        /* The diff uses '##' as delimiters for the hunk sections and does not
> +         * display a diff header. */
> +        SVN_ERR(svn_diff_mem_string_output_unified2(os, diff, TRUE, "##",

This comment does not explain why we are using '##' instead of the default.
I've changed this to explain the reason.

Also, your code does not match the comment, because with_diff_header is
set to TRUE.

> +                                           svn_dirent_local_style(path, pool),
> +                                           svn_dirent_local_style(path, pool),
> +                                           encoding, orig, val, pool));
> +        SVN_ERR(svn_stream_close(os));
> +
>        }
>      }
> 
> -  /* ### todo [issue #1533]: Use file_printf_from_utf8() to convert this
> -     to native encoding, at least conditionally?  Or is it better to
> -     have under_string always output the same eol, so programs can
> -     find it consistently?  Also, what about checking for error? */
> -  apr_file_printf(file, APR_EOL_STR);
> -
>    return SVN_NO_ERROR;
>  }
> 

I've slightly tested the attached patch now and produces splendid
diffs for multi-line property values. Thanks a lot! :)

In the future, please take more care about indentation. It takes
a lot of time to fix it up, and the less time your patches take
to process, the quicker they will get committed.

Before we commit this, as a final verification step, can you take the
time to run 'make check' on the patch and clean up any fallout in the
test suite? Or have you run the regression tests on this already?

Thanks,
Stefan

Re: [PATCH v3] issue 1493 - use libsvndiff for diffing properties

Posted by Daniel Shahaf <d....@daniel.shahaf.name>.
Daniel Näslund wrote on Tue, 4 Aug 2009 at 19:55 +0200:
> > In the future, please take more care about indentation. It takes
> > a lot of time to fix it up, and the less time your patches take
> > to process, the quicker they will get committed.
> 
> Sorry about all the indentations and other style fixes you had to do.
> I'm a beginnner and as such enthusiastic but sometimes careless. I will
> try hard to save time for my tutors in my work to come. Perhaps there is
> some script to use for checking style in the code? I use the c.vim file
> from the tools directory but unfortunately it has its flaws. can not
> just do a ggvG= and everything is adjusted to meet the GNU standard.
>  

I've posted about my configuration before[1]; the relevant line is

    setlocal textwidth=79 tabstop=2 shiftwidth=2 expandtab cindent cinoptions=>2sn-s{s^-s:s

though I never tested it with gg=G; I'm only sure that it takes care of
formatting correctly the code as it's written.


(It's not perfect.. it probably wants +=t0(0, and maybe others)


[1] http://article.gmane.org/gmane.comp.version-control.subversion.devel/105238

------------------------------------------------------
http://subversion.tigris.org/ds/viewMessage.do?dsForumId=462&dsMessageId=2380238

[PATCH v3] issue 1493 - use libsvndiff for diffing properties

Posted by Daniel Näslund <da...@longitudo.com>.
> Hi Daniel,
> 
> I've attached an updated patch this time which includes my suggested
> changes. Please check my version of the patch to see if you are happy
> with it.

I have no objections to your changes, Stefan. 

> Every change I made is documented in the comments below.
> 
> I've slightly tested the attached patch now and produces splendid
> diffs for multi-line property values. Thanks a lot! :)

You're welcome, but it's more the other way around. Thanks for taking
time enough to write at least two patches of this size on your own.
 
> In the future, please take more care about indentation. It takes
> a lot of time to fix it up, and the less time your patches take
> to process, the quicker they will get committed.

Sorry about all the indentations and other style fixes you had to do.
I'm a beginnner and as such enthusiastic but sometimes careless. I will
try hard to save time for my tutors in my work to come. Perhaps there is
some script to use for checking style in the code? I use the c.vim file
from the tools directory but unfortunately it has its flaws. can not
just do a ggvG= and everything is adjusted to meet the GNU standard.
 
> Before we commit this, as a final verification step, can you take the
> time to run 'make check' on the patch and clean up any fallout in the
> test suite? Or have you run the regression tests on this already?

There were 12 errors. Two required me to change the order of diff
targets. The --depth option for some reasson behaves differently now but
if you are correct that there is no standard order then we're fine. 

The old diff format always added a newline at the end of each target.
The new format does not.

Two tests failed for svnlook. I set them to XFail and plan to implement
the new diff format in svnlook as soon as this patch is committed.

Now make check detects no failures.

> Thanks,
> Stefan

Due to bug in tigris my patch propable will have the first diff header
on the same line as the '=====':s. A manual edit should fix it.


[[[
Fix part of issue 1493: Property diffs/merge should use libsvn_diff.
For properties I have removed the diff headers and replaced the @@-characters
with ##.

*subversion/libsvn_diff/diff_memory.c
(output_unified_flush_hunk2): Added parameter for choosing delimiter, for
instance '##' instead of '@@'.

*subversion/libsvn_diff/diff_memory.c
(svn_diff_mem_string_output_unified2): Added parameter for choosing if the diff
should be formatted as a property diff, '##' as delimiters.

*subversion/include/svn_diff.h
(output_unified_flush_hunk2): Declaration
(svn_diff_mem_string_output_unified2): Declaration

*subversion/libsvn_client/diff.c
(maybe_append_eol): Appends an end of line character if missing.

*subversion/libsvn_client/diff.c
(display_prop_diffs): Use libsvn_diff for diffing properties.

*subversion/tests/cmdline/diff_tests.py
(diff_only_property_change): Use the new diff format.
(diff_prop_change_local_edit): Use the new diff format.
(diff_prop_on_named_dir): Use the new diff format.
(diff_property_changes_to_base): Use the new diff format.
(diff_prop_change_local_propmod): Use the new diff format.
(diff_repos_wc_add_with_props): Use the new diff format.
(diff_with_depth): Use the new diff format. Change order of targets.

*subversion/tests/cmdline/depth_tests.py
(diff_in_depthy_wc): Use the new diff format. Change order of targets.

*subversion/tests/cmdline/merge_tests.py
(merge_in_new_file_and_diff): Use the new diff format.

*subversion/tests/cmdline/special_tests.py
(diff_symlink_to_dir): Use the new diff format.

*subversion/tests/cmdline/svnlook_tests.py
(test_print_property_diffs): Set to XFAIL until prop diffs is
  implemented in svnlook.
(diff_ignore_eolstyle): Set to XFAIL until prop diffs is implemented in
  svnlook.
]]]

Re: Changed order when diffing with --depth option

Posted by Stefan Sperling <st...@elego.de>.
On Tue, Aug 04, 2009 at 01:00:51PM +0200, Daniel Näslund wrote:
> > Hi Daniel,
> > 
> > I've attached an updated patch this time which includes my suggested
> > changes. Please check my version of the patch to see if you are happy
> > with it.
> > 
> > 
> > I've slightly tested the attached patch now and produces splendid
> > diffs for multi-line property values. Thanks a lot! :)
> > 
> > Before we commit this, as a final verification step, can you take the
> > time to run 'make check' on the patch and clean up any fallout in the
> > test suite? Or have you run the regression tests on this already?
> > 
> > Thanks,
> > Stefan
> 
> When running depth_tests.py 19 and diff_tests.py 42 I get failures.
> Theese two test uses the --depth option for diffing. The tests display
> the diffs for each target from the top of the tree and downward. That
> is:
> 
> top_dir -
>         | file
>         |--subdir
> 
> will be displayed in order top_dir-file-subdir if depth is set to
> infinity. But after applying the patch for issue 1493 the order is
> subdir-file-top_dir. How can that be?  AFAIK I haven't touch anything
> that handles the depth. Using other the other depth options (empty,
> files, immediates) produces the same result. The targets are displayed
> from bottom up instead of up to down.

If I remember correctly the order which 'svn diff' uses for its
files is undefined. Julian wanted to fix this ages ago but for some
reason his patch wasn't applied.

So if the test assumes a fixed order, it's broken.

Stefan

------------------------------------------------------
http://subversion.tigris.org/ds/viewMessage.do?dsForumId=462&dsMessageId=2379965


Changed order when diffing with --depth option

Posted by Daniel Näslund <da...@longitudo.com>.
> Hi Daniel,
> 
> I've attached an updated patch this time which includes my suggested
> changes. Please check my version of the patch to see if you are happy
> with it.
> 
> 
> I've slightly tested the attached patch now and produces splendid
> diffs for multi-line property values. Thanks a lot! :)
> 
> Before we commit this, as a final verification step, can you take the
> time to run 'make check' on the patch and clean up any fallout in the
> test suite? Or have you run the regression tests on this already?
> 
> Thanks,
> Stefan

When running depth_tests.py 19 and diff_tests.py 42 I get failures.
Theese two test uses the --depth option for diffing. The tests display
the diffs for each target from the top of the tree and downward. That
is:

top_dir -
        | file
        |--subdir

will be displayed in order top_dir-file-subdir if depth is set to
infinity. But after applying the patch for issue 1493 the order is
subdir-file-top_dir. How can that be?  AFAIK I haven't touch anything
that handles the depth. Using other the other depth options (empty,
files, immediates) produces the same result. The targets are displayed
from bottom up instead of up to down.

Mvh
Daniel