You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@subversion.apache.org by Julian Foad <ju...@btopenworld.com> on 2008/03/10 12:05:36 UTC

Re: Property diffs as unidiff (Issue #1493)

Daniel Shahaf wrote:
> Part of bite-sized issue #1493 ("Property Diffs/Merges Should Use
> libsvn_diff"; http://subversion.tigris.org/issues/show_bug.cgi?id=1493)
> calls for property diffs to be output as unified diffs.  I'm interested
> in implementing it.
> 
> The current propdiff output is generated by display_prop_diffs() in
> libsvn_client/diff.c.  That function has enough context to generate
> a unidiff (if it called the memory-diff APIs), but if it generated
> a plain unidiff, patch(1) would choke trying (failing) to apply the
> propdiff hunks to a file.
> 
> 
> The output should be sufficiently different from unidiff that patch(1)
> wouldn't choke, but similar enough that people accustomed to reading
> unified diffs can read it easily.  (The change could be as simple as
> s/@@/@@@/g or changing the '---'/'+++' headers; anything patch(1) will 
> ignore is good.)
> 
> The output format has come up before, and formats were suggested, but
> I didn't see a decision reached.  (I could summarise the ideas I found
> in the archives if requested.)  However, I want to implement it, and
> I can't complete that unless I know how to format the output.
> 
> 
> How should property diffs be formatted, in order to be human-readable,
> unified diffs, and ignored by patch(1)?
> 
> 
> Once the answer is agreed upon, I'll be happy to make a patch to
> implement it.  Ideas, comments, pointers from anyone are welcome.

Daniel,

Thanks for your interest and apologies that you didn't get an answer to this 
email last month. That would have been just because nobody had an answer to give.

Actually the most useful thing you could do is to choose and propose a format, 
showing us how it satisfies these aims. (You'll get a much better response this 
way, as it's much easier for several busy developers to quickly read your 
proposal and say "Looks good" or "Doesn't work in such-and-such a case" than 
for any of them to spend the time needed to research and write up a proposal.)

Look in the email archives for last year's Summer-Of-Code project on creating a 
Tree-Aware Diff/Patch Format that can represent any Subversion change including 
properties, and see if a format for property diffs was proposed there. If so, 
it might be a good choice, or if no conclusion was reached the discussion might 
provide some hints.

Thanks.
- Julian

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org

Re: Property diffs as unidiff (Issue #1493)

Posted by Daniel Shahaf <d....@daniel.shahaf.co.il>.
Julian Foad wrote on Mon, 10 Mar 2008 at 20:55 -0000:
> Hi Daniel.
>
> I assume you intended this to go to <de...@s.t.o>; it looks like a public 
> response but for some reason I had to re-add the list address manually.
>

When I send directly to dev@s.t.o, it takes half a day to appear. 
Sometimes less, if I'm subscribed at the time of sending.  I sent the 
previous message (and this one too) through gmane to see if it appears 
faster this way.  They sent me the authorization autoreply after 70 
minutes, and the message appeared in the archives shortly after I replied 
to gmane's autoreply.

>
> Daniel Shahaf wrote:
>> Julian, all,
>> 
>> I am sorry if I appeared to be asking everyone to do my research or 
> [...]
>> Hence I chose the middle way (not post the research right away, but 
>> indicate that I had already done it and was willing to post it).  I'll know 
>> what to do next time, however -- thanks for correcting me.
>
> No, no, I didn't think that. I must have missed the implication that you'd 
> already researched, though. Whatever, it's no problem!
>

Okay,

>
>> As Charles says, the new patch format will not include human-readable
>> property diffs.  As you (Julian) suggest, I list below several proposed
>> formats, some mine and some from the archives.
>
> My first thought on your list below is: can you eliminate some of them 
> already with a few tests, deductions or Googling? All the ones that cause 
> "patch" to throw an error are probably non-starters. For example, can you 
> find out whether any common implementations of "patch" exit as soon as they 
> try to patch a non-existent file? If any do, you can cross option (1) off the 
> list. You already said you can't see option (4) working; I haven't looked at 
> the details but assuming you're right you can cross that off the list too. 
> How about (2) and (3)?
>

I'm sure that every reasonable implementation of patch will at least warn 
on option (1), since (syntactically) it is a valid unidiff.  I'm skipping 
option (4) as you say.

Regarding (2) and (3): I looked at GNU patch, Sun OS's patch, and 
OpenBSD's patch.  (I have access to the first two, and read the source of 
the last.)  They all require unified patches' headers to start with 
exactly three '-' or '+' signs (respectively), and they all insist on 
(exactly) two @-signs at the start of a hunk header.  Except Sun OS's, the 
other two also look for a "Index:" or "Prereq:" line in the patch.

GNU patch skips (without user interaction) if the hunk header is changed, 
but prompts the user if the target file does not exist or if the diff 
header was changed.  Sun OS's patch prompts in the first three cases but 
not in the fourth (diff header changed).  However, if both the diff header 
and hunk header are changed:

     ---- foo1       2008-03-10 23:02:48.975179000 +0200
     ++++ foo2       2008-03-10 18:29:17.533954000 +0200
     %% -1 +1 %%
     -foo1
     +foo2

then both GNU's and Sun's patch(1)s skip the patch uninteractively (each 
for a different reason).  (They both print a message that no patch was 
found in the input.)

I won't be surprised if some patch(1) out there will parse two or four 
+-signs as a legitimate unidiff header.  (I didn't search to see if such a 
patch(1) implementation exists.)  I also did not consider other patch(1) 
implementations.

I guess all patch(1)s will ignore a "Property:" header.

Daniel


> Thanks.
> - Julian
>
>
> [...]
>> [ not in any particular order ]
>> 
>> 
>> 1. Using bogus filenames.
>>
>>      Index: some/path/trunk/svn:ignore
>>      --- some/path/trunk/svn:ignore
>>      +++ some/path/trunk/svn:ignore
>>      @@ -1,2 +1,2 @@
>>       Makefile
>>      -*.O
>>      +*.o
>> 
>> This format would mislead patch(1) into attempting to patch nonexisting 
>> files.  No harm done, but being ignored by patch(1) would be better.
>> 
>> 
>> 2. Change the diff header:
>> Drop the '---'/'+++' headers and rename the 'Index:' header.  [1]
>>
>>      Path: some/path/trunk
>>      Property: svn:ignore
>>      @@ -1,2 +1,2 @@
>>       Makefile
>>      -*.O
>>      +*.o
>> 
>> The hunks are unchanged, only the diff header is.  Further tweaks to the
>> header are possible, for example:
>>
>>      Property: svn:ignore (on 'some/path/trunk/')
>>      ----- svn:ignore    some/path/trunk/
>>      +++++ svn:ignore    some/path/trunk/
>>      @@ -1,2 +1,2 @@
>>       Makefile
>>      -*.O
>>      +*.o
>> 
>> The format originally proposed in [1] is:
>>
>>      Index: foo/bar.c (property 'svn:keywords' changed)
>>      ======================================================
>>      @@ -0,3 +0,3 @@
>>       Date
>>      -Revision
>>      +Id
>>       Author
>> 
>> (In my testing, patch(1) happily applied patches with 'Index:' but
>> without '---' and '+++' lines.  It seems necessary to tweak all
>> three indicators.)
>> 
>> 
>> 3. Change the hunk headers:
>> Use '@@@' instead of '@@'.
>>
>>      Property: svn:ignore
>>      --- some/path/trunk/
>>      +++ some/path/trunk/
>>      @@@ -1,2 +1,2 @@@
>>       Makefile
>>      -*.O
>>      +*.o
>> 
>> The reverse of the previous option.  My patch(1) ignores it:
>>
>>      % diff -u foo1 foo2 | sed s/@@/@@@/g | patch
>>      patch: **** Only garbage was found in the patch input.
>>      % diff -u foo1 foo2 | patch
>>      patching file foo1
>>      %
>> 
>> Using a different character than '@' should have the same effect:
>>
>>      Property: svn:ignore
>>      --- some/path/trunk/
>>      +++ some/path/trunk/
>>      %% -1,2 +1,2 %%
>>       Makefile
>>      -*.O
>>      +*.o
>> 
>> 
>> 4. Tez Kamihira's draft changeset format [2] includes property diffs
>> in a hunk titled "@@ properties: @@".
>> 
>> I do not see how this format scales to multi-line properties.  It gives
>> the following example, which seems to assume that all properties are
>> one-lined.  (For example, it does not allow one property's diff to have
>> multiple hunks.)
>> 
>> However, I have not read the proposal fully.
>>
>>       ---- ./foo.txt 2005-05-23 10:58:17.583015024 +0900
>>       ++++ ./bar.txt 2005-05-23 10:59:12.976593920 +0900
>>       @@ id: i_abc @@
>>       @@ -1,3 +1,4 @@
>>        aaaaaaaaaa
>>       +bbbbbbbbbb
>>        cccccccccc
>>        dddddddddd
>>       @@ properties: @@
>>       -zzz: xxx
>>       +zzz: yyy
>> 
>> [end of proposals]
>> 
>> 
>> Most of the ideas in options (2) and (3) are orthogonal, and may be used 
>> independently of each other.
>> 
>> 
>> Daniel
>> 
>> 
>> [1] http://svn.haxx.se/dev/archive-2006-05/0884.shtml and thread
>> [2] http://svn.haxx.se/dev/archive-2005-05/1255.shtml
>> 
>


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org

Re: Property diffs as unidiff (Issue #1493)

Posted by Julian Foad <ju...@btopenworld.com>.
Hi Daniel.

I assume you intended this to go to <de...@s.t.o>; it looks like a public 
response but for some reason I had to re-add the list address manually.


Daniel Shahaf wrote:
> Julian, all,
> 
> I am sorry if I appeared to be asking everyone to do my research or 
[...]
> Hence I chose the middle way (not post the research right away, but 
> indicate that I had already done it and was willing to post it).  I'll 
> know what to do next time, however -- thanks for correcting me.

No, no, I didn't think that. I must have missed the implication that you'd 
already researched, though. Whatever, it's no problem!


> As Charles says, the new patch format will not include human-readable
> property diffs.  As you (Julian) suggest, I list below several proposed
> formats, some mine and some from the archives.

My first thought on your list below is: can you eliminate some of them already 
with a few tests, deductions or Googling? All the ones that cause "patch" to 
throw an error are probably non-starters. For example, can you find out whether 
any common implementations of "patch" exit as soon as they try to patch a 
non-existent file? If any do, you can cross option (1) off the list. You 
already said you can't see option (4) working; I haven't looked at the details 
but assuming you're right you can cross that off the list too. How about (2) 
and (3)?

Thanks.
- Julian


[...]
> [ not in any particular order ]
> 
> 
> 1. Using bogus filenames.
> 
>      Index: some/path/trunk/svn:ignore
>      --- some/path/trunk/svn:ignore
>      +++ some/path/trunk/svn:ignore
>      @@ -1,2 +1,2 @@
>       Makefile
>      -*.O
>      +*.o
> 
> This format would mislead patch(1) into attempting to patch nonexisting 
> files.  No harm done, but being ignored by patch(1) would be better.
> 
> 
> 2. Change the diff header:
> Drop the '---'/'+++' headers and rename the 'Index:' header.  [1]
> 
>      Path: some/path/trunk
>      Property: svn:ignore
>      @@ -1,2 +1,2 @@
>       Makefile
>      -*.O
>      +*.o
> 
> The hunks are unchanged, only the diff header is.  Further tweaks to the
> header are possible, for example:
> 
>      Property: svn:ignore (on 'some/path/trunk/')
>      ----- svn:ignore    some/path/trunk/
>      +++++ svn:ignore    some/path/trunk/
>      @@ -1,2 +1,2 @@
>       Makefile
>      -*.O
>      +*.o
> 
> The format originally proposed in [1] is:
> 
>      Index: foo/bar.c (property 'svn:keywords' changed)
>      ======================================================
>      @@ -0,3 +0,3 @@
>       Date
>      -Revision
>      +Id
>       Author
> 
> (In my testing, patch(1) happily applied patches with 'Index:' but
> without '---' and '+++' lines.  It seems necessary to tweak all
> three indicators.)
> 
> 
> 3. Change the hunk headers:
> Use '@@@' instead of '@@'.
> 
>      Property: svn:ignore
>      --- some/path/trunk/
>      +++ some/path/trunk/
>      @@@ -1,2 +1,2 @@@
>       Makefile
>      -*.O
>      +*.o
> 
> The reverse of the previous option.  My patch(1) ignores it:
> 
>      % diff -u foo1 foo2 | sed s/@@/@@@/g | patch
>      patch: **** Only garbage was found in the patch input.
>      % diff -u foo1 foo2 | patch
>      patching file foo1
>      %
> 
> Using a different character than '@' should have the same effect:
> 
>      Property: svn:ignore
>      --- some/path/trunk/
>      +++ some/path/trunk/
>      %% -1,2 +1,2 %%
>       Makefile
>      -*.O
>      +*.o
> 
> 
> 4. Tez Kamihira's draft changeset format [2] includes property diffs
> in a hunk titled "@@ properties: @@".
> 
> I do not see how this format scales to multi-line properties.  It gives
> the following example, which seems to assume that all properties are
> one-lined.  (For example, it does not allow one property's diff to have
> multiple hunks.)
> 
> However, I have not read the proposal fully.
> 
>       ---- ./foo.txt 2005-05-23 10:58:17.583015024 +0900
>       ++++ ./bar.txt 2005-05-23 10:59:12.976593920 +0900
>       @@ id: i_abc @@
>       @@ -1,3 +1,4 @@
>        aaaaaaaaaa
>       +bbbbbbbbbb
>        cccccccccc
>        dddddddddd
>       @@ properties: @@
>       -zzz: xxx
>       +zzz: yyy
> 
> [end of proposals]
> 
> 
> Most of the ideas in options (2) and (3) are orthogonal, and may be used 
> independently of each other.
> 
> 
> Daniel
> 
> 
> [1] http://svn.haxx.se/dev/archive-2006-05/0884.shtml and thread
> [2] http://svn.haxx.se/dev/archive-2005-05/1255.shtml
> 


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org

Re: Property diffs as unidiff (Issue #1493)

Posted by Daniel Shahaf <d....@daniel.shahaf.co.il>.
Julian, all,

I am sorry if I appeared to be asking everyone to do my research or write 
up my proposal.  That was not my intention, and I had done the research 
(and most of the proposal) before I posted.  I said in the original email 
that I had already done the research (albeit it was hidden in masses of 
text).

On the other hand, I assumed an attitude of "Hi. You've never met me 
before, but this is how I think this bug which has been open for years 
must be fixed." would not yield a positive response.  (And I was right 
expecting a miscommunication ;)

Hence I chose the middle way (not post the research right away, but 
indicate that I had already done it and was willing to post it).  I'll 
know what to do next time, however -- thanks for correcting me.


Back to technical topics, now.

Charles Acknin wrote on Mon, 10 Mar 2008 at 14:40 +0100:
> On Mon, Mar 10, 2008 at 1:05 PM, Julian Foad <ju...@btopenworld.com> wrote:
>>  Look in the email archives for last year's Summer-Of-Code project on creating a
>>  Tree-Aware Diff/Patch Format that can represent any Subversion change including
>>  properties, and see if a format for property diffs was proposed there. If so,
>>  it might be a good choice, or if no conclusion was reached the discussion might
>>  provide some hints.

As Charles says, the new patch format will not include human-readable
property diffs.  As you (Julian) suggest, I list below several proposed
formats, some mine and some from the archives.

>
> Well I can tell right away that the work that's been done wrt the new
> patch format during last year's GSoC is not what Daniel is looking
> for.  Indeed, the new patch format was designed to serialize any
> change of a tree and make it *transportable* between working copies
> (as opposed to visually readable for code review) for later patch
> application.
>

[ not in any particular order ]


1. Using bogus filenames.

      Index: some/path/trunk/svn:ignore
      --- some/path/trunk/svn:ignore
      +++ some/path/trunk/svn:ignore
      @@ -1,2 +1,2 @@
       Makefile
      -*.O
      +*.o

This format would mislead patch(1) into attempting to patch nonexisting 
files.  No harm done, but being ignored by patch(1) would be better.


2. Change the diff header:
Drop the '---'/'+++' headers and rename the 'Index:' header.  [1]

      Path: some/path/trunk
      Property: svn:ignore
      @@ -1,2 +1,2 @@
       Makefile
      -*.O
      +*.o

The hunks are unchanged, only the diff header is.  Further tweaks to the
header are possible, for example:

      Property: svn:ignore (on 'some/path/trunk/')
      ----- svn:ignore	some/path/trunk/
      +++++ svn:ignore	some/path/trunk/
      @@ -1,2 +1,2 @@
       Makefile
      -*.O
      +*.o

The format originally proposed in [1] is:

      Index: foo/bar.c (property 'svn:keywords' changed)
      ======================================================
      @@ -0,3 +0,3 @@
       Date
      -Revision
      +Id
       Author

(In my testing, patch(1) happily applied patches with 'Index:' but
without '---' and '+++' lines.  It seems necessary to tweak all
three indicators.)


3. Change the hunk headers:
Use '@@@' instead of '@@'.

      Property: svn:ignore
      --- some/path/trunk/
      +++ some/path/trunk/
      @@@ -1,2 +1,2 @@@
       Makefile
      -*.O
      +*.o

The reverse of the previous option.  My patch(1) ignores it:

      % diff -u foo1 foo2 | sed s/@@/@@@/g | patch
      patch: **** Only garbage was found in the patch input.
      % diff -u foo1 foo2 | patch
      patching file foo1
      %

Using a different character than '@' should have the same effect:

      Property: svn:ignore
      --- some/path/trunk/
      +++ some/path/trunk/
      %% -1,2 +1,2 %%
       Makefile
      -*.O
      +*.o


4. Tez Kamihira's draft changeset format [2] includes property diffs
in a hunk titled "@@ properties: @@".

I do not see how this format scales to multi-line properties.  It gives
the following example, which seems to assume that all properties are
one-lined.  (For example, it does not allow one property's diff to have
multiple hunks.)

However, I have not read the proposal fully.

       ---- ./foo.txt 2005-05-23 10:58:17.583015024 +0900
       ++++ ./bar.txt 2005-05-23 10:59:12.976593920 +0900
       @@ id: i_abc @@
       @@ -1,3 +1,4 @@
        aaaaaaaaaa
       +bbbbbbbbbb
        cccccccccc
        dddddddddd
       @@ properties: @@
       -zzz: xxx
       +zzz: yyy

[end of proposals]


Most of the ideas in options (2) and (3) are orthogonal, and may be used 
independently of each other.


Daniel


[1] http://svn.haxx.se/dev/archive-2006-05/0884.shtml and thread
[2] http://svn.haxx.se/dev/archive-2005-05/1255.shtml


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org

Re: Property diffs as unidiff (Issue #1493)

Posted by Charles Acknin <ch...@gmail.com>.
On Mon, Mar 10, 2008 at 1:05 PM, Julian Foad <ju...@btopenworld.com> wrote:
>  Look in the email archives for last year's Summer-Of-Code project on creating a
>  Tree-Aware Diff/Patch Format that can represent any Subversion change including
>  properties, and see if a format for property diffs was proposed there. If so,
>  it might be a good choice, or if no conclusion was reached the discussion might
>  provide some hints.

Well I can tell right away that the work that's been done wrt the new
patch format during last year's GSoC is not what Daniel is looking
for.  Indeed, the new patch format was designed to serialize any
change of a tree and make it *transportable* between working copies
(as opposed to visually readable for code review) for later patch
application.

[what it does is that while traversing (i.e. during 'svn diff') the
working copy for changes (a) text-changes are left as unidiff and (b)
other non-unidiff-able changes such as {property, binary and tree}
changes are serialized into basically ra_svn protocol bytes before it
all get dumped to the output along with the unidiff.  This way, 'svn
patch' can tell apart from the unidiff and the svnpatch bytes and
dispatch the tasks accordingly in order to apply the patch (the
unidiff with text-changes is forwarded to patch(1) and the svnpatch
chunk is processed internally).]

Charles

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org