Posted to dev@subversion.apache.org by Ben Reser <be...@reser.org> on 2004/09/16 00:49:18 UTC

svn diff output

Jack recently gave a suggestion for how to fix a problem he has but I
think it's the wrong way to go.  Not because I think he has a bad
demand, but because I think it fails to meet a lot of other needs and
seems like a kludge.

Part of the problem as he observes is that different people have
different expectations of what they want from svn diff.  Some people
want something that they can use to view their changes.  Others want
output that is usable to feed into patch on someone else's computer.

Both of these are reasonable uses of svn diff.

Ultimately, though, we have the following additional problem.  svn diff
does not produce output that patch can use to reconstruct the exact
operations done.  You can't make a set of changes, decide not
to commit at that time, produce a svn diff, and then throw away the
working copy and then expect that the svn diff output will be usable to
reproduce the state of that working copy.

I remember talking with Karl about this and he didn't think it was this
project's job to come up with a format that achieves that.  But I think
it is.  Here's why:

a) First of all we're an open source project.  We receive submissions
from a lot of people who don't have direct commit access.  Patches are
a key method of our operation.  However, people can't submit patches
that copy files, or apply modifications to properties.  Yet these are
important parts of our functionality.

For example, we have people translating the book now.  They've
submitted patches against the English version of the book.  However,
their patches don't represent a copy from the version of the book they
are working against.  The person committing their changes must know what
rev they were working against, copy the English book at that rev to the
new translation location, and then apply the patch.  The patch comments
help with this, but I don't think it should be necessary for them to do
this by hand.

b) If not us then who?  If we won't create a patch format that supports
describing all the changes our repositories can version then exactly
who will?  Why would someone set out to do this?

The unified patch format (which only supports file contents) is
generally applicable to all SCM tools, but something that does more than
it can't really be that way.  We have properties and other versioned
meta-data.  We version trees.  We use global revision numbers.  I'm not
sure it makes sense or is possible to have a general purpose format that
can describe everything we support.  

But I do believe it is desirable for us to have a format to support all
of our features.

So this is my alternative proposal to the problems we face.

1) Develop a new format that supports our features:

  * Copies
  * Deletes
  * Moves (We don't support these directly now but we intend to in the
           future, so we should allow for this)
  * Add without history
  * Property Modifications
  * File content modifications

  Make svn diff output this format if you pass it a flag.  I'm not clear
  on what this flag should be called.  But I'm thinking something along
  the lines of how cvs handled unified diff.

2) Make svn diff without this flag produce output that is usable to
   reproduce the file changes with the standard patch command.  I.E.
   svn diff would essentially default to --show-copies in Jack's
   proposal.

3) At 2.0.0 change the default to our diff format and require --unified
   or -u to get the current output.  If we don't develop this new format
   before 2.0.0 then so be it.
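
As an illustration of item 1, the operations such a format would have to
carry could be modelled roughly as below.  This is only a sketch in
Python: the class names, field names and summary wording are all
hypothetical, and no on-disk syntax is implied.

```python
from dataclasses import dataclass
from typing import Optional, Union


@dataclass(frozen=True)
class Copy:
    src_path: str
    src_rev: int
    dst_path: str


@dataclass(frozen=True)
class Delete:
    path: str


@dataclass(frozen=True)
class Move:
    src_path: str
    dst_path: str


@dataclass(frozen=True)
class Add:  # add without history
    path: str
    content: str


@dataclass(frozen=True)
class PropMod:
    path: str
    name: str
    value: Optional[str]  # None means the property is deleted


@dataclass(frozen=True)
class TextMod:
    path: str
    unified_diff: str


Action = Union[Copy, Delete, Move, Add, PropMod, TextMod]


def summarize(action: Action) -> str:
    """Return a one-line, human-readable description of an action."""
    if isinstance(action, Copy):
        return f"copy {action.src_path}@{action.src_rev} -> {action.dst_path}"
    if isinstance(action, Delete):
        return f"delete {action.path}"
    if isinstance(action, Move):
        return f"move {action.src_path} -> {action.dst_path}"
    if isinstance(action, Add):
        return f"add {action.path} ({len(action.content)} chars, no history)"
    if isinstance(action, PropMod):
        if action.value is None:
            return f"propdel {action.name} on {action.path}"
        return f"propset {action.name}={action.value!r} on {action.path}"
    if isinstance(action, TextMod):
        return f"modify {action.path}"
    raise TypeError(f"unknown action: {action!r}")
```

With a model like this, the book translation case would be one Copy
(English book at a known rev to the new location) followed by TextMod
entries, rather than instructions in the patch comments.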

How does this resolve Jack's problems?  Once the new format is the
default the output will be in the context of just using SVN.  I.E.
copies will show as just a succinct representation of a copy.  It
wouldn't include the content of the file you copied.  It would be a
description of what SVN would do to the repo if you committed right now.
This is likely what most of our users want most of the time.

However, sometimes our users have to interact with people using other SCM
systems or no SCM system and just patch.  They then want a diff output
that is verbose enough to apply the changes they made.
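
The two views of a copy can be sketched as follows.  The succinct line
is a made-up notation (not a proposed syntax); the verbose form uses
Python's difflib to expand the copied file into an ordinary unified diff
against /dev/null, which is what plain patch would need.

```python
import difflib


def render_copy_succinct(src_path: str, src_rev: int, dst_path: str) -> str:
    """Describe the copy the way an svn-only reader would want to see it."""
    return f"Copied: {dst_path} (from {src_path}@{src_rev})\n"


def render_copy_verbose(dst_path: str, content: str) -> str:
    """Expand the copy into a unified diff that adds the whole file.

    Assumes `content` ends with a newline.
    """
    lines = content.splitlines(keepends=True)
    diff = difflib.unified_diff([], lines, fromfile="/dev/null", tofile=dst_path)
    return "".join(diff)


if __name__ == "__main__":
    print(render_copy_succinct("trunk/a.txt", 42, "trunk/b.txt"), end="")
    print(render_copy_verbose("trunk/b.txt", "hello\nworld\n"), end="")
```

The verbose output grows with the size of the copied file, which is the
cost of staying compatible with people who only have patch.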

If people still want a unified diff option that ignores copied files,
then we could I suppose add an option for that.  But I suspect this
covers the real use cases people have.  People that don't want to see
copies are going to be content with using our format and that format can
serve the double purpose of being supported by a svn patch command that
reads it and applies the changes to the working copy (or potentially
directly to the repo without a wc).  

Thoughts?

-- 
Ben Reser <be...@reser.org>
http://ben.reser.org

"Conscience is the inner voice which warns us somebody may be looking."
- H.L. Mencken

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org

Re: svn diff output

Posted by "Eric S. Raymond" <es...@thyrsus.com>.
Russ Allbery <rr...@stanford.edu>:
> This sounds a lot like the thread of conversation that Tom Lord tried to
> start, way back when, when it was really too early to think about this.  I
> think arch has since come up with a diff format that meets these
> requirements; maybe that would be a good place to start looking?

Yes, agreed.  Googling is not finding me a description of it, however.

I was actually thinking of arch as the other SCM (besides Bitkeeper) that
this would put Subversion head-to-head with.

My opinion (and not a spur-of-the-moment one; I thought the possibilities
through months ago) is that this is the most natural next direction of 
evolution for Subversion once the "better CVS" objective is met.

As, I judge, it now is.  At a design level the move to fsfs is cleanup.
It appears to me (admittedly, a semi-outsider) that now is *exactly* the
right time to begin thinking about action diff and repo sync.

Consider yourselves nudged.
-- 
		<a href="http://www.catb.org/~esr/">Eric S. Raymond</a>


Re: svn diff output

Posted by Florian Weimer <fw...@deneb.enyo.de>.
* Rob Weir:

>> The arch changeset format is rather strongly tied to arch semantics.
>
> Well, it's tied to a particular way of looking at changesets, yeah, and
> I'd be interested in what other semantics you'd like?

Special files are stored directly in the tree (mostly .arch-ids).

The namespace of intra-file IDs (arch-tag: lines in files) and
.arch-ids tags is separate.  This might mean that this distinction
must be preserved even by non-arch users.

Binary files are not handled efficiently.

It's not possible to express most metadata used by other systems in
arch changesets.

Creation of arch changesets seems to require a working copy model
based on implicit tagging (at least for some project trees).  Implicit
tagging seems to be unique to arch.  A particular regular expression
library is required to create changesets (at least if you want to deal
with all ).

Other systems rely on fast application of changesets, and so far,
nobody has demonstrated that it's possible to write something that can
apply dozens of changesets per second, even for fairly small trees.

Some systems are not able to extract arch changesets as a file system
tree (because of path length issues or file name character set
problems).

There is virtually no documentation of the format, so it's hard to
tell what's an implementation artifact and what's a property of the
format itself.

>> This means that it doesn't handle copies, just renames, and you need
>> some kind of file ID.  (You could tack copy support onto it, but you'd
>> lose arch compatibility to some extent.)
>
> Hm, what do you want copies to mean in the context of a changeset?

Well, something similar to a rename operation, but without the
deletion of the original file.

>> It's a binary format, and this is the real obstacle IMHO.  
>
> Erm, no it's not.  It's a directory structure, which is often tar'd and
> gzip'd.

I don't know of any human-readable serialization, the .tar.gz variant
is all we have at this point.

>> The curious result is that nobody actually sends arch changesets to
>> mailing lists, even though they are considered tremendously important
>> by the arch community.
>
> No, that the changesets are in separate files is irrelevant.  The
> essential point is that they are *conceptually* separate; you can take
> one changeset out of a branch and apply it to others easily.  You can
> serialise it, compose it, rip it apart and make new ones.

But this is not the way tla is usually used.  tla lacks several
fundamental commands to construct changesets from named revisions.  In
order to use the patch queue managers for tla that have been proposed
so far, you don't submit a changeset for application to some version,
but you provide a name of a revision to merge with.  No changeset is
explicitly transferred, even in this case.

This is what I mean when I claim that arch changesets are actually an
implementation detail.


Re: svn diff output

Posted by Rob Weir <rw...@ertius.org>.
[sorry for the late reply]

On Thu, Sep 16, 2004 at 08:30:46AM +0200, Florian Weimer said
> * Russ Allbery:
> 
> > This sounds a lot like the thread of conversation that Tom Lord tried to
> > start, way back when, when it was really too early to think about this.  I
> > think arch has since come up with a diff format that meets these
> > requirements; maybe that would be a good place to start looking?
> 
> The arch changeset format is rather strongly tied to arch semantics.

Well, it's tied to a particular way of looking at changesets, yeah, and
I'd be interested in what other semantics you'd like?

> This means that it doesn't handle copies, just renames, and you need
> some kind of file ID.  (You could tack copy support onto it, but you'd
> lose arch compatibility to some extent.)

Hm, what do you want copies to mean in the context of a changeset?

> It's a binary format, and this is the real obstacle IMHO.  

Erm, no it's not.  It's a directory structure, which is often tar'd and
gzip'd.

> The curious result is that nobody actually sends arch changesets to
> mailing lists, even though they are considered tremendously important
> by the arch community.

No, that the changesets are in separate files is irrelevant.  The
essential point is that they are *conceptually* separate; you can take
one changeset out of a branch and apply it to others easily.  You can
serialise it, compose it, rip it apart and make new ones.

-rob

-- 
Words of the day:    Craig Livingstone covert video Oil deals Manfurov Verisign


Re: svn diff output

Posted by Florian Weimer <fw...@deneb.enyo.de>.
* Russ Allbery:

> This sounds a lot like the thread of conversation that Tom Lord tried to
> start, way back when, when it was really too early to think about this.  I
> think arch has since come up with a diff format that meets these
> requirements; maybe that would be a good place to start looking?

The arch changeset format is rather strongly tied to arch semantics.
This means that it doesn't handle copies, just renames, and you need
some kind of file ID.  (You could tack copy support onto it, but you'd
lose arch compatibility to some extent.)

It's a binary format, and this is the real obstacle IMHO.  The curious
result is that nobody actually sends arch changesets to mailing lists,
even though they are considered tremendously important by the arch
community.


Re: svn diff output

Posted by Russ Allbery <rr...@stanford.edu>.
Eric S Raymond <es...@thyrsus.com> writes:

> I've been thinking about action diffs for a different reason. Let's
> suppose we have two primitive operations: "generate action diff between
> revisions M and N of repository X" and "apply action diff to
> repository".

> With these and a very little state information ("I'm a daughter
> repository of X, forked at commit T"), we could start doing
> Bitkeeper-like repository sync operations.  That's the possibility that
> interests me -- of supporting hierarchies of repositories in which
> action diffs are used to support shipping changes up towards the root.

This sounds a lot like the thread of conversation that Tom Lord tried to
start, way back when, when it was really too early to think about this.  I
think arch has since come up with a diff format that meets these
requirements; maybe that would be a good place to start looking?

-- 
Russ Allbery (rra@stanford.edu)             <http://www.eyrie.org/~eagle/>


Re: svn diff output

Posted by "C. Michael Pilato" <cm...@collab.net>.
Ben Collins-Sussman <su...@collab.net> writes:

> We also already have a patch-like representation of these tree deltas: 
> it's our dump format.  The dump format is meant to be portable, and has
> an RFC-822 "look and feel".  Take a look at it.  It may be 90% of what
> we need for defining this new patch format.

Bah.  We had a much better representation of this a loooong time ago.
We called it "the XML editor".  Back before Subversion could talk to a
repository or a database, it would checkout, update, and commit from
XML files.  These files were XML representations of our editor drives,
*exactly* denoting how to transfer one tree into another, binary diffs
and propchanges and everything.


Re: svn diff output

Posted by Marc Haisenko <ha...@webport.de>.
On Thursday 16 September 2004 07:43, Ben Reser wrote:
> Basically I was thinking of something similar in feel to traditional
> diffs, but that supported representing our full feature set.  I think
> the dump format doesn't really fit that bill.  It's a fine format for
> dumps and seems to be optimized to make a very fast parser (important
> when loading large dumps).
>
> We need something reasonably compact (even at the cost of parsing
> difficulty), human readable (so it can serve the dual purpose of
> displaying details on the changes and how to apply a changeset to the
> wc/repo), and without binary diffs unless the file is clearly binary.
>
> Binary diffs should probably be encoded in some format that is friendly
> to email and text editors.  Maybe BASE64 our svndiff format for them.
>
> I do think the dump format points out exactly what functionality we need
> to support.

First of all, I'm just an interested user who has been following this list
for months and would like to comment a bit :-)

The BASE64 part/being friendly to email is a very important aspect, I think.
We've all seen diffs in emails, haven't we?  And we'd surely see svn diffs in
emails as well (or snippets from diffs), so playing nice with email is a must,
IMHO.

For me as a user it would be really nice to have two options: either make a 
"normal" diff to send someone else who can then apply the patch to his 
(unversioned) source or have a "full-blown" diff just for svn.

The "normal" diff would not work with binaries, of course, so the user should 
be warned if there are changes in binary data.

My first thought was that maybe the "full-blown" diff could use the unified 
format as well and extend it in a way so that the produced patches still get 
accepted by the normal patch command (it seems to ignore stuff that is not 
part of what it should be in the patch).

Then again this could cause quite a lot of problems when binary files are 
involved but patch only changes text files and leaves the binary files 
unchanged. This is unexpected by the user and would surely cause confusion. 
So the format of the "full-blown" svn diff should be chosen so that the 
normal patch command rejects it, IMHO.

-- 
Marc Haisenko
Systemspezialist
Webport IT-Services GmbH
mailto: haisenko@webport.de


Re: svn diff output

Posted by Ben Reser <be...@reser.org>.
On Wed, Sep 15, 2004 at 08:51:48PM -0500, Ben Collins-Sussman wrote:
> We also already have a patch-like representation of these tree deltas: 
> it's our dump format.  The dump format is meant to be portable, and has
> an RFC-822 "look and feel".  Take a look at it.  It may be 90% of what
> we need for defining this new patch format.
> 
> It's documented here:
> 
>   http://svn.collab.net/repos/svn/trunk/notes/fs_dumprestore.txt
> 
> The original versions 1 and 2 of this format always included full-texts
> of files.  But ghudson introduced format version 3 in svn 1.1, which
> allows binary diffs instead of full-texts.  That's a huge win.
> 
> Is this a good starting point?  Doesn't it seem like we're mostly there?

Basically I was thinking of something similar in feel to traditional
diffs, but that supported representing our full feature set.  I think
the dump format doesn't really fit that bill.  It's a fine format for
dumps and seems to be optimized to make a very fast parser (important
when loading large dumps).  

We need something reasonably compact (even at the cost of parsing
difficulty), human readable (so it can serve the dual purpose of
displaying details on the changes and how to apply a changeset to the
wc/repo), and without binary diffs unless the file is clearly binary.

Binary diffs should probably be encoded in some format that is friendly
to email and text editors.  Maybe BASE64 our svndiff format for them.
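
A minimal sketch of the encoding step, using Python's standard base64
module.  The payload here is arbitrary bytes standing in for binary diff
data, not real svndiff output, and the function names are hypothetical.

```python
import base64


def encode_binary_delta(delta: bytes) -> str:
    """Encode binary delta bytes as ASCII text, wrapped in short lines."""
    # encodebytes() inserts a newline after every 76 output characters,
    # which keeps the result friendly to mail transports and editors.
    return base64.encodebytes(delta).decode("ascii")


def decode_binary_delta(text: str) -> bytes:
    """Recover the original bytes from the wrapped base64 text."""
    return base64.decodebytes(text.encode("ascii"))


if __name__ == "__main__":
    payload = bytes(range(256))  # stand-in for real binary diff data
    text = encode_binary_delta(payload)
    assert decode_binary_delta(text) == payload
    print(text, end="")
```

The encoded form is roughly a third larger than the raw bytes, which
bears on the "reasonably compact" goal above.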

I do think the dump format points out exactly what functionality we need
to support.

-- 
Ben Reser <be...@reser.org>
http://ben.reser.org

"Conscience is the inner voice which warns us somebody may be looking."
- H.L. Mencken


Re: svn diff output

Posted by "Eric S. Raymond" <es...@thyrsus.com>.
Ben Collins-Sussman <su...@collab.net>:
> We also already have a patch-like representation of these tree deltas: 
> it's our dump format.  The dump format is meant to be portable, and has
> an RFC-822 "look and feel".  Take a look at it.  It may be 90% of what
> we need for defining this new patch format.

That is an excellent, *excellent* point.  It actually passed through my
head when I was meditating on action diffs a few months back, but I
managed to forget it until you raised it again.

> It's documented here:
> 
>   http://svn.collab.net/repos/svn/trunk/notes/fs_dumprestore.txt
> 
> The original versions 1 and 2 of this format always included full-texts
> of files.  But ghudson introduced format version 3 in svn 1.1, which
> allows binary diffs instead of full-texts.  That's a huge win.
> 
> Is this a good starting point?  Doesn't it seem like we're mostly there?

Yes, indeed it does.  We should think of a dump file as just an action diff
from the empty tree.  That unifies the concept of a dump with an action diff.

The unification was about as far as I got in my previous thinking.
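
The unification can be sketched with a toy model in which a tree is just
a mapping from path to file content.  Everything here (the tuple layout,
the function names) is hypothetical and ignores properties, copies and
revisions; it only shows that a "dump" falls out as a diff whose starting
point is the empty tree.

```python
def action_diff(old: dict, new: dict) -> list:
    """List the actions that turn tree `old` into tree `new`."""
    actions = []
    for path in sorted(old.keys() - new.keys()):
        actions.append(("delete", path))
    for path in sorted(new):
        if path not in old:
            actions.append(("add", path, new[path]))
        elif old[path] != new[path]:
            actions.append(("modify", path, new[path]))
    return actions


def apply_actions(tree: dict, actions: list) -> dict:
    """Return a new tree with `actions` applied to `tree`."""
    result = dict(tree)
    for action in actions:
        kind, path = action[0], action[1]
        if kind == "delete":
            del result[path]
        else:  # "add" and "modify" both carry the full new content
            result[path] = action[2]
    return result


def dump(tree: dict) -> list:
    """A dump is the action diff from the empty tree."""
    return action_diff({}, tree)


if __name__ == "__main__":
    r1 = {"trunk/a": "1\n"}
    r2 = {"trunk/a": "2\n", "trunk/b": "x\n"}
    assert apply_actions({}, dump(r2)) == r2
    assert apply_actions(r1, action_diff(r1, r2)) == r2
```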
-- 
		<a href="http://www.catb.org/~esr/">Eric S. Raymond</a>


Re: svn diff output

Posted by Ben Collins-Sussman <su...@collab.net>.
On Wed, 2004-09-15 at 20:41, Eric S. Raymond wrote:
> Ben Reser <be...@reser.org>:
> > 1) Develop a new format that supports our features:
> >
> > Thoughts?
> 
> I'm not a committer, so I'm not sure my opinion either counts or
> *should* count in a discussion like this. 

You don't need to be a committer for this list to acknowledge a
well-thought-out opinion.  It's open to the public.  :-)

> 
> But for whatever it's worth, I support Ben's proposal.  I even have
> a name for it:  "action diff".

Subversion already has a function vtable called an 'editor_t'.  It's the
main mechanism used to describe the difference between two trees:  adds,
deletes, copies, propchanges, text-changes.  (See svn_delta.h for a
verbose description.)   That's how the svn client and server send
'action diffs' to each other right now... by marshalling these vtable
function-calls over a network.
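
To make the vtable idea concrete, here is a heavily simplified sketch in
Python.  It is not the real interface from svn_delta.h: the method names,
the flat (no directories) tree model and the JSON marshalling are all
assumptions chosen for illustration.  A "driver" compares two trees and
describes the difference purely by calling methods on a receiver, and the
recorded calls can then be serialized and shipped elsewhere.

```python
import json


class RecordingEditor:
    """Hypothetical receiver that records each call so it can be marshalled."""

    def __init__(self):
        self.calls = []

    def add_file(self, path, copyfrom=None):
        self.calls.append(["add_file", path, copyfrom])

    def delete_entry(self, path):
        self.calls.append(["delete_entry", path])

    def change_prop(self, path, name, value):
        self.calls.append(["change_prop", path, name, value])

    def apply_text(self, path, text):
        self.calls.append(["apply_text", path, text])

    def close_edit(self):
        self.calls.append(["close_edit"])


def drive(editor, old, new):
    """Describe how to turn tree `old` into tree `new` via editor calls.

    Each tree maps a path to a (text, props) pair.
    """
    for path in sorted(old.keys() - new.keys()):
        editor.delete_entry(path)
    for path in sorted(new):
        text, props = new[path]
        if path not in old:
            editor.add_file(path)
            old_text, old_props = None, {}
        else:
            old_text, old_props = old[path]
        if text != old_text:
            editor.apply_text(path, text)
        for name in sorted(props.keys() | old_props.keys()):
            if props.get(name) != old_props.get(name):
                editor.change_prop(path, name, props.get(name))
    editor.close_edit()


if __name__ == "__main__":
    old = {"a.txt": ("one\n", {})}
    new = {"a.txt": ("one\n", {"svn:eol-style": "native"}),
           "b.txt": ("two\n", {})}
    editor = RecordingEditor()
    drive(editor, old, new)
    print(json.dumps(editor.calls))  # the "marshalled" form of the edit
```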

We also already have a patch-like representation of these tree deltas: 
it's our dump format.  The dump format is meant to be portable, and has
an RFC-822 "look and feel".  Take a look at it.  It may be 90% of what
we need for defining this new patch format.

It's documented here:

  http://svn.collab.net/repos/svn/trunk/notes/fs_dumprestore.txt

The original versions 1 and 2 of this format always included full-texts
of files.  But ghudson introduced format version 3 in svn 1.1, which
allows binary diffs instead of full-texts.  That's a huge win.

Is this a good starting point?  Doesn't it seem like we're mostly there?
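
For readers who have not looked at a dump file, the RFC-822 flavour means
each node record starts with "Name: value" header lines.  The sketch
below parses such a header block in Python.  The sample record is
hand-written for illustration (made-up paths and revision), and the
notes file linked above remains the authoritative list of fields.

```python
SAMPLE_RECORD = """\
Node-path: trunk/README
Node-kind: file
Node-action: add
Node-copyfrom-rev: 3
Node-copyfrom-path: branches/old/README

(body would follow here)
"""


def parse_headers(record: str) -> dict:
    """Parse "Name: value" lines up to the first blank line."""
    headers = {}
    for line in record.splitlines():
        if not line:
            break  # a blank line ends the header block
        name, _, value = line.partition(": ")
        headers[name] = value
    return headers


if __name__ == "__main__":
    headers = parse_headers(SAMPLE_RECORD)
    if "Node-copyfrom-path" in headers:
        print("copy of", headers["Node-copyfrom-path"],
              "at rev", headers["Node-copyfrom-rev"])
```

Note that the copy is expressed as two header lines rather than as file
content, which is the succinct representation a patch format would want.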




Re: svn diff output

Posted by "Eric S. Raymond" <es...@thyrsus.com>.
Branko Čibej <br...@xbc.nu>:
> >The answer is yes, I think every repository should have all its
> >history, ala BK.  I think file-IDs in the arch sense are a mistake.
> >Rather, I am imagining a system in which the basic primitives
> >are diff and merge *on change histories* rather than on working
> >copy files.
> > 
> >
> Doesn't that make it impossible to keep parts of a replicated repository 
> hidden (for security reasons)?

It does, yes.  I don't see this as a serious problem.
-- 
		<a href="http://www.catb.org/~esr/">Eric S. Raymond</a>


Re: svn diff output

Posted by Branko Čibej <br...@xbc.nu>.
Eric S. Raymond wrote:

>Greg Hudson <gh...@MIT.EDU>:
>>On Tue, 2004-10-05 at 01:13, Rob Weir wrote:
>>>Eric, how do you plan to implement a distributed version control without
>>>file-ids[0]?  Or do you think every repository should have all its
>>>history, ala BK?
>>>
>>>[0]: this was more or less the point of the discussions on the changeset
>>>list last year.
>>
>>Yeah, and I expect we'll hit the same dead end; we're not currently
>>prepared to reconceptualize Subversion in arch terms in order to have
>>arch-style distributed version control.
>
>Sorry, I missed the earlier messages in this thread because of the
>subject.  I gather the "Eric" that question is addressed to is me.
>
>The answer is yes, I think every repository should have all its
>history, ala BK.  I think file-IDs in the arch sense are a mistake.
>Rather, I am imagining a system in which the basic primitives
>are diff and merge *on change histories* rather than on working
>copy files.

Doesn't that make it impossible to keep parts of a replicated repository 
hidden (for security reasons)?

>(The conceptual resemblance to BK is not an accident; I had some influence on
>Larry McVoy's thinking back in 1998.)

I'll only note that, as far as I can see, BK has no notion of path-based 
security.

-- Brane



Re: svn diff output

Posted by "Eric S. Raymond" <es...@thyrsus.com>.
Greg Hudson <gh...@MIT.EDU>:
> On Tue, 2004-10-05 at 01:13, Rob Weir wrote:
> > Eric, how do you plan to implement a distributed version control without
> > file-ids[0]?  Or do you think every repository should have all its
> > history, ala BK?
> 
> > [0]: this was more or less the point of the discussions on the changeset
> > list last year.
> 
> Yeah, and I expect we'll hit the same dead end; we're not currently
> prepared to reconceptualize Subversion in arch terms in order to have
> arch-style distributed version control.

Sorry, I missed the earlier messages in this thread because of the
subject.  I gather the "Eric" that question is addressed to is me.

The answer is yes, I think every repository should have all its
history, ala BK.  I think file-IDs in the arch sense are a mistake.
Rather, I am imagining a system in which the basic primitives
are diff and merge *on change histories* rather than on working
copy files.

(The conceptual resemblance to BK is not an accident; I had some influence on
Larry McVoy's thinking back in 1998.)
-- 
		<a href="http://www.catb.org/~esr/">Eric S. Raymond</a>


Re: svn diff output

Posted by Greg Hudson <gh...@MIT.EDU>.
On Tue, 2004-10-05 at 01:13, Rob Weir wrote:
> Eric, how do you plan to implement a distributed version control without
> file-ids[0]?  Or do you think every repository should have all its
> history, ala BK?

> [0]: this was more or less the point of the discussions on the changeset
> list last year.

Yeah, and I expect we'll hit the same dead end; we're not currently
prepared to reconceptualize Subversion in arch terms in order to have
arch-style distributed version control.

But I do have one new argument against file IDs: in a world where some
of development history is supposed to remain hidden from some parties,
file IDs leak information.

To use an example not too far from the real world, suppose you are a
fab, and you work with several fiercely competitive chip design companies
within the same repository, which has a "public" area (accessible to
everyone) and private areas which are accessible by the fab staff and a
particular partner.

Now, suppose one of the partners, company A, takes a public chip design,
does some work on it within the private area, and produces another chip
design for the public area.  To the fab staff and company A's staff, the
history of the files for this chip design should look like:

  second public incarnation <-- private <-- first public incarnation

To company B (which isn't even supposed to be privy to who the fab's
other partners are), the history of the second public incarnation should
hit a brick wall at the first arrow.  But file IDs will allow company B
to deduce that these files were eventually derived from the first public
incarnation, which may be information useful in competition ("it seems
someone is deriving memory chip designs from clock chip designs").
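
The leak can be made concrete with a toy model.  Everything in this
sketch is hypothetical (the Node class, the prefix-based access check,
the paths and the ID value); it only contrasts path-based history, which
stops at the first unreadable node, with a comparison of file IDs, which
does not.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class Node:
    path: str
    file_id: str
    copied_from: Optional["Node"] = None


def visible_history(node: Optional[Node], readable: Tuple[str, ...]) -> list:
    """Walk the copy chain, stopping at the first node the user cannot read."""
    history = []
    while node is not None and node.path.startswith(readable):
        history.append(node.path)
        node = node.copied_from
    return history


def same_lineage(a: Node, b: Node) -> bool:
    """With public file IDs, anyone who can read both nodes can compare them."""
    return a.file_id == b.file_id


first = Node("public/clock-chip", file_id="id-17")
private = Node("private/company-a/work", file_id="id-17", copied_from=first)
second = Node("public/memory-chip", file_id="id-17", copied_from=private)

COMPANY_B = ("public/",)              # company B reads only the public area
FAB_STAFF = ("public/", "private/")   # fab staff read everything

if __name__ == "__main__":
    print(visible_history(second, COMPANY_B))  # history hits the brick wall
    print(visible_history(second, FAB_STAFF))  # full chain
    print(same_lineage(first, second))         # the derivation is still visible
```

Company B's history walk ends at the second public incarnation, as
intended, yet both public nodes carry the same ID, so the derivation is
deducible without ever reading the private area.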

Does this mean public file IDs are an invalid idea for version control
software?  Not necessarily; arch has (as I understand it) little
interest in this kind of privilege separation, so there's no issue
there.  Can we have good distributed version control with good smart
merge support without file IDs?  Can copy-tracking be meaningful in a
distributed context?  I don't really know; distributed version control
and smart merging aren't really my areas of interest.  (I know svk has
implemented these things without file IDs, but I don't know how it
works.)  I'm just throwing this out there.  It may be that Subversion
eventually decides it is more interested in arch-style distributed
version control than it is in this kind of privilege separation.

I'm going to note this message in issue #1525 ('Use new "entity ID"
notion instead of "committed rev"') because the proposed "entity ID"
concept is a kind of public file ID.



Re: svn diff output

Posted by Rob Weir <rw...@ertius.org>.
[sorry for the late reply]

On Thu, Sep 16, 2004 at 08:57:43AM -0400, Eric S. Raymond said
> Ben Collins-Sussman <su...@collab.net>:
> > In defining an extended patch format, it seems like there are a lot of
> > different goals to choose from here.  And they all seem to be mutually
> > exclusive.  What's most important to us?
> > 
> >   * arch compatibility?
> 
> I don't think this will work.  Someone else pointed out that (a) the format
> is binary

And as I pointed out, this is actually not true.

> , and (b) it involves an arch-specific notion of file IDs.

Eric, how do you plan to implement a distributed version control without
file-ids[0]?  Or do you think every repository should have all its
history, ala BK?

-rob

[0]: this was more or less the point of the discussions on the changeset
list last year.

-- 
Words of the day:         Albright MD5 Qaddafi White House Ft. Knox Majic CESID


Re: svn diff output

Posted by Florian Weimer <fw...@deneb.enyo.de>.
* François Beausoleil:

> Eric S. Raymond wrote:
>
>> C. Michael Pilato <cm...@collab.net>:
>>>Add a Text-content-type: header, used only when "Text-deltas: true" is
>>>present, and whose value is either "text/plain" or
>>>"application/x-svndiff" (to mean "Larry Wall's Patch Format" or
>>>"Subversion binary diff data").
>> Good thinking!  I like it.
>
> Isn't there a better content type than "text/plain" for the patch
> format?  Something like text/x-patch?

If there isn't, it shouldn't be too hard to get something from IANA
(without the "x-" prefix, of course).



Re: svn diff output

Posted by François Beausoleil <fb...@ftml.net>.
Eric S. Raymond wrote:

> C. Michael Pilato <cm...@collab.net>:
>>Add a Text-content-type: header, used only when "Text-deltas: true" is
>>present, and whose value is either "text/plain" or
>>"application/x-svndiff" (to mean "Larry Wall's Patch Format" or
>>"Subversion binary diff data").
> 
> Good thinking!  I like it.

Isn't there a better content type than "text/plain" for the patch
format?  Something like text/x-patch?

Bye,
François



Re: svn diff output

Posted by "Eric S. Raymond" <es...@thyrsus.com>.
C. Michael Pilato <cm...@collab.net>:
> > 2. After step 1 is done, try to change the dumper/parser so that it handles
> > a format Larry Wall's patch(1) understands in simple cases.
> 
> Add a Text-content-type: header, used only when "Text-deltas: true" is
> present, and whose value is either "text/plain" or
> "application/x-svndiff" (to mean "Larry Wall's Patch Format" or
> "Subversion binary diff data").

Good thinking!  I like it.
-- 
		<a href="http://www.catb.org/~esr/">Eric S. Raymond</a>


Re: svn diff output

Posted by Ben Collins-Sussman <su...@collab.net>.
On Thu, 2004-09-16 at 09:03, C. Michael Pilato wrote:

> Add a Text-content-type: header, used only when "Text-deltas: true" is
> present, and whose value is either "text/plain" or
> "application/x-svndiff" (to mean "Larry Wall's Patch Format" or
> "Subversion binary diff data").

That's a really simple solution.  The more I think about it, the more
elegant it is.  

'patch' will ignore all the RFC822 headers -- actually, it will ignore
*everything* but unified diff blocks.  But 'svn' will parse everything,
picking up the tree changes, binary diffs, and unified diffs.  
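A rough Python sketch of that filtering idea follows. It is only a simplified model of "keep the unified diff blocks, skip everything else", not how patch(1) is actually implemented. The function name, the sample record, and its header values are made up for illustration.

```python
import re

# Matches a unified diff hunk header such as "@@ -1,2 +1,2 @@".
HUNK_RE = re.compile(r"^@@ -(\d+)(?:,(\d+))? \+(\d+)(?:,(\d+))? @@")


def unified_diff_lines(text):
    """Return only the lines that belong to unified diff blocks."""
    lines = text.splitlines()
    kept = []
    i = 0
    while i < len(lines):
        # A block starts with a "--- old" / "+++ new" header pair.
        if (lines[i].startswith("--- ") and i + 1 < len(lines)
                and lines[i + 1].startswith("+++ ")):
            kept.extend(lines[i:i + 2])
            i += 2
            # Consume every hunk that follows the header pair.
            while i < len(lines):
                m = HUNK_RE.match(lines[i])
                if not m:
                    break
                old_left = int(m.group(2) or 1)  # a missing count means 1
                new_left = int(m.group(4) or 1)
                kept.append(lines[i])
                i += 1
                while i < len(lines) and (old_left > 0 or new_left > 0):
                    tag = lines[i][:1]
                    if tag == " ":
                        old_left -= 1
                        new_left -= 1
                    elif tag == "-":
                        old_left -= 1
                    elif tag == "+":
                        new_left -= 1
                    else:
                        break
                    kept.append(lines[i])
                    i += 1
        else:
            i += 1  # headers, tree changes and anything else are skipped
    return kept


# Hypothetical record: RFC822-style headers around one unified diff.
SAMPLE = """\
Node-path: trunk/hello.txt
Node-action: change
Text-deltas: true
Text-content-type: text/plain

--- trunk/hello.txt
+++ trunk/hello.txt
@@ -1,2 +1,2 @@
 greeting
-hello wrold
+hello world
Node-path: trunk/new-dir
Node-action: add
"""

kept = unified_diff_lines(SAMPLE)
print("\n".join(kept))
```

In this model the two Node-* records survive only for a reader that parses the headers, which is the split between 'patch' and 'svn' described above.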





Re: svn diff output

Posted by "C. Michael Pilato" <cm...@collab.net>.
"Eric S. Raymond" <es...@thyrsus.com> writes:

> I suggest the following plan:
> 
> 1. Implement the action diff and repo sync primitives using whatever
> enhanced version of the editor_t object we turn out to need.  Stick
> with marshalling/unmarshalling editor_t objects to the dump format or
> a close enhancement.

The dump format is already marshaling editor_t calls.  This work is
done.

> 2. After step 1 is done, try to change the dumper/parser so that it handles
> a format Larry Wall's patch(1) understands in simple cases.

Add a Text-content-type: header, used only when "Text-deltas: true" is
present, and whose value is either "text/plain" or
"application/x-svndiff" (to mean "Larry Wall's Patch Format" or
"Subversion binary diff data").
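To make the header proposal concrete, here is a small Python sketch of how a reader might dispatch on it. It assumes each record is a block of RFC822-style headers followed by a blank line and a body, and it treats a missing or unknown Text-content-type as an error, which the proposal itself does not specify. The function name and the sample record are hypothetical.

```python
from email import message_from_string


def classify_text_delta(record):
    """Return (kind, body) for one header-plus-body record.

    kind is "none" when no text delta is present, "patch" for a
    unified diff body, and "svndiff" for Subversion binary diff data.
    """
    msg = message_from_string(record)
    # The new header only matters when "Text-deltas: true" is present.
    if msg.get("Text-deltas", "").strip().lower() != "true":
        return ("none", "")
    content_type = msg.get("Text-content-type", "").strip()
    if content_type == "text/plain":
        return ("patch", msg.get_payload())
    if content_type == "application/x-svndiff":
        # Real svndiff data is binary; a str body is a simplification.
        return ("svndiff", msg.get_payload())
    # Assumption: anything else is rejected rather than guessed at.
    raise ValueError("unsupported Text-content-type: %r" % content_type)


# Hypothetical record carrying a plain-text (patch-style) delta.
record = (
    "Node-path: trunk/hello.txt\n"
    "Text-deltas: true\n"
    "Text-content-type: text/plain\n"
    "\n"
    "--- trunk/hello.txt\n"
    "+++ trunk/hello.txt\n"
)
kind, body = classify_text_delta(record)
print(kind)
```

The same function would accept another media type, such as the text/x-patch suggested elsewhere in this thread, by adding one more branch.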


Re: svn diff output

Posted by Mark Phippard <Ma...@softlanding.com>.
"Eric S. Raymond" <es...@thyrsus.com> wrote on 09/16/2004 08:57:43 AM:

> By doing it in two steps, we make sure Subversion's functional objectives
> are satisfied *first*.  I think being patch(1)-compatible is a laudable
> goal but a secondary one.

I agree that satisfying Subversion's functional objectives should be the
primary goal of any such feature.

Mark



Re: svn diff output

Posted by "Eric S. Raymond" <es...@thyrsus.com>.
Ben Collins-Sussman <su...@collab.net>:
> In defining an extended patch format, it seems like there are a lot of
> different goals to choose from here.  And they all seem to be mutually
> exclusive.  What's most important to us?
> 
>   * arch compatibility?

I don't think this will work.  Someone else pointed out that (a) the format
is binary, and (b) it involves an arch-specific notion of file IDs.
I think we want something purely textual that we can mail around.
 
>   * svk compatibility?

I'm mildly against this, as I think svk is a kluge.  A kluge with some
interesting ideas in it and an energetic and clever project leader, mind 
you, but still a kluge.

>   * 'patch' compatibility?  (i.e. defining a format whereby
> tree-structure changes are just ignored by Larry Wall's 'patch'?)
> 
>   * trying to use what we have?  (tweaking our dumpfile format)
> 
> I think we can choose only one of these goals.  :-)

Oh?  I'm not sure why.  The dumpfile format is not written in stone, 
after all.  I can readily imagine moving the surface syntax to being
patch-compatible.  However, I don't think it would be wise to try
doing that at the same time we're working out the semantic issues in
dump/patch unification.  There would be too much risk of allowing the 
classic patch format to constrain our thinking.

I suggest the following plan:

1. Implement the action diff and repo sync primitives using whatever
enhanced version of the editor_t object we turn out to need.  Stick
with marshalling/unmarshalling editor_t objects to the dump format or
a close enhancement.

2. After step 1 is done, try to change the dumper/parser so that it handles
a format Larry Wall's patch(1) understands in simple cases.

By doing it in two steps, we make sure Subversion's functional objectives
are satisfied *first*.  I think being patch(1)-compatible is a laudable
goal but a secondary one.
-- 
		<a href="http://www.catb.org/~esr/">Eric S. Raymond</a>


Re: svn diff output

Posted by Sander Striker <st...@apache.org>.
> From: "Eric S. Raymond" <es...@thyrsus.com>
> Sent: Thursday, September 16, 2004 3:00 PM


> Folker Schamel <sc...@spinor.com>:
> >   * talk with svk, arch, xyz developers to specify a patch format standard?
> >     To be a successor of the 'patch' format?
> > 
> > (Yes, I'm idealistic ;-)
> 
> You know, I was actually thinking that if we design something really good
> it could be a candidate for an RFC.  If that happens, I can lead on the draft.
> I have experience with that process and some zorch with the IETF guys.

This is what we had in mind for the changesets@ lists.  Unfortunately
we weren't able to move forward much.

Sander


Re: svn diff output

Posted by "Eric S. Raymond" <es...@thyrsus.com>.
Folker Schamel <sc...@spinor.com>:
>   * talk with svk, arch, xyz developers to specify a patch format standard?
>     To be a successor of the 'patch' format?
> 
> (Yes, I'm idealistic ;-)

You know, I was actually thinking that if we design something really good
it could be a candidate for an RFC.  If that happens, I can lead on the draft.
I have experience with that process and some zorch with the IETF guys.
-- 
		<a href="http://www.catb.org/~esr/">Eric S. Raymond</a>


Re: svn diff output

Posted by Folker Schamel <sc...@spinor.com>.
Ben Collins-Sussman wrote:

> On Thu, 2004-09-16 at 02:59, John Peacock wrote:
> 
> 
>>He has also recently introduced a patch management feature which will allow 
>>patch submission and management, either as conventional unified diffs, or as a 
>>semi-custom format which has exactly the same effect on the Subversion level. 
>>He intends to build a simple patch manager app which someone could use to 
>>automate applying supplied patches to the repository.
>>
> 
> 
> Wow.
> 
> In defining an extended patch format, it seems like there are a lot of
> different goals to choose from here.  And they all seem to be mutually
> exclusive.  What's most important to us?
> 
>   * arch compatibility?
> 
>   * svk compatibility?
> 
>   * 'patch' compatibility?  (i.e. defining a format whereby
> tree-structure changes are just ignored by Larry Wall's 'patch'?)
> 
>   * trying to use what we have?  (tweaking our dumpfile format)


   * talk with svk, arch, xyz developers to specify a patch format standard?
     To be a successor of the 'patch' format?

(Yes, I'm idealistic ;-)


> 
> I think we can choose only one of these goals.  :-)



Re: svn diff output

Posted by Ben Collins-Sussman <su...@collab.net>.
On Thu, 2004-09-16 at 02:59, John Peacock wrote:

> He has also recently introduced a patch management feature which will allow 
> patch submission and management, either as conventional unified diffs, or as a 
> semi-custom format which has exactly the same effect on the Subversion level. 
> He intends to build a simple patch manager app which someone could use to 
> automate applying supplied patches to the repository.
> 

Wow.

In defining an extended patch format, it seems like there are a lot of
different goals to choose from here.  And they all seem to be mutually
exclusive.  What's most important to us?

  * arch compatibility?

  * svk compatibility?

  * 'patch' compatibility?  (i.e. defining a format whereby
tree-structure changes are just ignored by Larry Wall's 'patch'?)

  * trying to use what we have?  (tweaking our dumpfile format)

I think we can choose only one of these goals.  :-)




Re: svn diff output

Posted by "Eric S. Raymond" <es...@thyrsus.com>.
John Peacock <jp...@rowman.com>:
> Eric S. Raymond wrote:
> >With these and a very little state information ("I'm a daughter
> >repository of X, forked at commit T"), we could start doing
> >Bitkeeper-like repository sync operations.  That's the possibility
> >that interests me -- of supporting hierarchies of repositories in
> >which action diffs are used to support shipping changes up towards the
> >root.
> 
> Take a look at CLKao's work with SVK:
> 
> 	http://svk.elixus.org
> 
> which is a tool built on top of Subversion that among other things supports 
> repository mirrors (via VCP to CVS and P4 even).  It doesn't really do 
> hierarchies of repositories, but it does support mirroring a remote 
> repository and maintaining a local development stream which can ultimately 
> be used to commit to the parent repository.  He has also implemented smart 
> merging (and cherry pick merges), so no more need to keep track of when the 
> last merge was applied to a branch.

I have looked at svk a bit.  There are some interesting ideas there, but
the entirety feels to me like something of a kluge.
-- 
		<a href="http://www.catb.org/~esr/">Eric S. Raymond</a>


Re: svn diff output

Posted by John Peacock <jp...@rowman.com>.
Eric S. Raymond wrote:
> With these and a very little state information ("I'm a daughter
> repository of X, forked at commit T"), we could start doing
> Bitkeeper-like repository sync operations.  That's the possibility
> that interests me -- of supporting hierarchies of repositories in
> which action diffs are used to support shipping changes up towards the
> root.

Take a look at CLKao's work with SVK:

	http://svk.elixus.org

which is a tool built on top of Subversion that among other things supports 
repository mirrors (via VCP to CVS and P4 even).  It doesn't really do 
hierarchies of repositories, but it does support mirroring a remote repository 
and maintaining a local development stream which can ultimately be used to 
commit to the parent repository.  He has also implemented smart merging (and 
cherry pick merges), so no more need to keep track of when the last merge was 
applied to a branch.

He has also recently introduced a patch management feature which will allow 
patch submission and management, either as conventional unified diffs, or as a 
semi-custom format which has exactly the same effect on the Subversion level. 
He intends to build a simple patch manager app which someone could use to 
automate applying supplied patches to the repository.

John

-- 
John Peacock
Director of Information Research and Technology
Rowman & Littlefield Publishing Group
4720 Boston Way
Lanham, MD 20706
301-459-3366 x.5010
fax 301-429-5747


Re: svn diff output

Posted by "Eric S. Raymond" <es...@thyrsus.com>.
Ben Reser <be...@reser.org>:
> 1) Develop a new format that supports our features:
>
> Thoughts?

I'm not a committer, so I'm not sure my opinion either counts or
*should* count in a discussion like this. 

But for whatever it's worth, I support Ben's proposal.  I even have
a name for it:  "action diff".

I've been thinking about action diffs for a different reason. Let's
suppose we have two primitive operations: "generate action diff
between revisions M and N of repository X" and "apply action diff to
repository".

With these and a very little state information ("I'm a daughter
repository of X, forked at commit T"), we could start doing
Bitkeeper-like repository sync operations.  That's the possibility
that interests me -- of supporting hierarchies of repositories in
which action diffs are used to support shipping changes up towards the
root.
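As a toy illustration of those two primitives, here is a Python sketch. Everything in it is hypothetical: a repository is modelled as a plain list of revisions, each revision as a list of action tuples, and the names Repository, fork and push_to_parent are invented for the example rather than taken from Subversion.

```python
from dataclasses import dataclass, field


@dataclass
class Repository:
    name: str
    # revisions[i] holds the actions committed in revision i + 1.
    revisions: list = field(default_factory=list)
    parent: "Repository | None" = None
    forked_at: int = 0  # "I'm a daughter of parent, forked at commit T"


def generate_action_diff(repo, m, n):
    """Actions that take the repository from revision m to revision n."""
    return [action for rev in repo.revisions[m:n] for action in rev]


def apply_action_diff(repo, diff):
    """Commit an action diff as one new revision; return its number."""
    repo.revisions.append(list(diff))
    return len(repo.revisions)


def fork(parent):
    """Create a daughter repository that remembers its fork point."""
    return Repository(
        name=parent.name + "-daughter",
        revisions=[list(rev) for rev in parent.revisions],
        parent=parent,
        forked_at=len(parent.revisions),
    )


def push_to_parent(daughter):
    """Ship everything committed since the fork up towards the root."""
    diff = generate_action_diff(
        daughter, daughter.forked_at, len(daughter.revisions))
    return apply_action_diff(daughter.parent, diff)


root = Repository("X")
apply_action_diff(root, [("add", "trunk/book.txt")])

daughter = fork(root)
apply_action_diff(
    daughter, [("copy", "trunk/book.txt", "trunk/book-fr.txt")])
apply_action_diff(
    daughter, [("propset", "trunk/book-fr.txt", "svn:eol-style", "native")])

new_rev = push_to_parent(daughter)
print(new_rev, root.revisions[-1])
```

The sketch ignores conflicts and never updates the fork point after a push; a real sync operation would have to handle both.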
-- 
		<a href="http://www.catb.org/~esr/">Eric S. Raymond</a>


Re: svn diff output

Posted by Ben Reser <be...@reser.org>.
On Thu, Sep 16, 2004 at 09:32:35AM -0700, Ben Reser wrote:
> On Thu, Sep 16, 2004 at 09:26:37AM -0500, kfogel@collab.net wrote:
> > (Not sure what I said to make anyone think otherwise, sorry if I
> > misled somehow.)
> 
> From January 15th 2004, all times are US/Pacific aka UTC-7

Actually it would have been UTC-8 in January.  I hate daylight savings
time.

-- 
Ben Reser <be...@reser.org>
http://ben.reser.org

"Conscience is the inner voice which warns us somebody may be looking."
- H.L. Mencken


Re: svn diff output

Posted by Ben Reser <be...@reser.org>.
On Thu, Sep 16, 2004 at 10:09:43AM -0500, kfogel@collab.net wrote:
> It's clear (I hope) that what I meant was that I had other priorities
> then, and had no intention of taking up this task.
> 
> That is, I was making a passing comment in IRC about my priorities at
> that time.  I didn't mean that the Subversion project should never
> implement an improved patch format that can represent all the actions
> Subversion supports.  I even encouraged you to take it on toward the
> end there, see? :-)

Re: svn diff output

Posted by kf...@collab.net.
Ben Reser <be...@reser.org> writes:
> On Thu, Sep 16, 2004 at 09:26:37AM -0500, kfogel@collab.net wrote:
> > (Not sure what I said to make anyone think otherwise, sorry if I
> > misled somehow.)

Thanks, Ben.

It's clear (I hope) that what I meant was that I had other priorities
then, and had no intention of taking up this task.

That is, I was making a passing comment in IRC about my priorities at
that time.  I didn't mean that the Subversion project should never
implement an improved patch format that can represent all the actions
Subversion supports.  I even encouraged you to take it on toward the
end there, see? :-)

-Karl

> From January 15th 2004, all times are US/Pacific aka UTC-7
> 
> 15:52  * kfogel grumbles about patch not adding and deleting files
> 15:52  * breser wonders when someone is going to quit grumbling and fix
> that.
> 15:53 < dsp> probably when you get off your bum and stop wondering about
> it
> 15:53  * sussman vanishes in a puff of periwinkle smoke
> 15:53 -!- sussman [~sussman@newton.ch.collab.net] has quit ["Client
> exiting"]
> 15:53 < breser> Hey I'm fixing something else right now. :)
> 15:55 < CIA-4> kfogel committed revision 8328: Follow up to r8327:
> forgot to 'svn delete' and 'svn add' the
> 15:57 <@kfogel> breser: uh, that's not something we just "fix" in
> Subversion.
> 15:57 <@kfogel> The bug, or feature lack, is in 'patch' :-).
> 15:57 <@kfogel> Not that that's any excuse -- I should have been awake
> enough to add/del the files myself
> 15:57 <@kfogel> .
> 15:57 < breser> So replace patch.
> 15:58 <@kfogel> breser: Are you going to give me logins on everyone's
> system, on the whole Internet? :-)
> 15:58 < breser> Nobody will stop using patch until someone writes
> something better.
> 15:59 <@kfogel> breser: yes, sure.
> 15:59 <@kfogel> I have no intention of taking up that burden.
> 15:59 <@kfogel> Feel free, if you want it.
> 16:00  * kfogel suspects it's pretty huge :-)
> 
> -- 
> Ben Reser <be...@reser.org>
> http://ben.reser.org
> 
> "Conscience is the inner voice which warns us somebody may be looking."
> - H.L. Mencken


Re: svn diff output

Posted by Ben Reser <be...@reser.org>.
On Thu, Sep 16, 2004 at 09:26:37AM -0500, kfogel@collab.net wrote:
> (Not sure what I said to make anyone think otherwise, sorry if I
> misled somehow.)

Re: svn diff output

Posted by kf...@collab.net.
Ben Reser <be...@reser.org> writes:
> I remember talking with Karl about this and he didn't think it was this
> projects job to come up with a format that achieves that.  But I think
> it is.  Here's why:

Just for the record, I've always thought an enhanced patch format
would be a Good Thing.  I'm pretty sure I even started threads on it
in the distant past.

(Not sure what I said to make anyone think otherwise, sorry if I
misled somehow.)

-Karl
