Posted to dev@subversion.apache.org by Julian Foad <ju...@btopenworld.com> on 2006/03/13 11:18:25 UTC
Pegged diffs [was: [PATCH] svn diff -r1:BASE URL asserts]

Malcolm Rowe wrote (earlier):
> Okay, perhaps I have the definition of a 'pegged diff' wrong - what
> exactly is it?  I assumed, I guess, that it was one in which we needed
> to follow history to get the name of the item at the operative revisions.

Gosh, this thread has really made me think hard about what we mean.  After 
several hours and several revisions of my reply, here's what I think, starting 
from first principles and working up towards higher-level concepts.


THE MEANING OF PEG REVISIONS

An OBJECT under version control (I mean a file or a directory tree; would 
"node" be more consistent with other docs?) is identified by one of:

   * a WC-path (under version control)

   * a URL in a certain revision of the repository

An object has a LINE OF HISTORY which records its life from creation, 
potentially through modifications, renames, and/or deletion.  (Looking forward 
in history, there is the complication that there are potentially multiple lines 
of history because of copying, but that's not relevant to the present 
discussion.)  Identifying an object at any stage of its life implicitly 
identifies its line of history.

When a revision is used in conjunction with a path-in-repository to identify an 
object, and the object's line of history is followed in order to find a later 
or earlier revision in that history, the initial revision is known as the PEG 
REVISION.

If we just want to locate a particular object as it exists now or existed in 
history, and we know what its path was at that time and do not need to follow 
its history, we will specify the object in the same way (WC path, or URL in a 
particular revision), and we might still call this particular revision the 
"peg" revision, particularly when referring to the syntax "@REV" with which it 
is specified.

Any target to an 'svn' command is therefore specified as:

   WCPATH      (implicit @WC)

   URL         (implicit @HEAD)
   URL@HEAD
   URL@N
   URL@{DATE}

When we specify a plain WC path, we are specifying an object by the path it has 
"now" in this working copy, not at some previous time when its WC path might 
have been different.  After "svn move" or "svn rm", the base path will no 
longer exist on disk.  The following is not currently supported, but would be 
logical:

   WCPATH@BASE (to specify the WCPATH that existed before mv/rm)

The following do not make sense:

   URL@WC/BASE/COMMITTED/PREV  (these keywords have no meaning without a WC)

   WCPATH@HEAD/N/{DATE}        (a WC path has no intrinsic meaning in repos *)

   WCPATH@COMMITTED/PREV       (these keywords have no meaning until a WC is
                                consulted, but the peg syntax requires them to
                                identify a revision in which the WC path can
                                be found)
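
As a rough illustration of the rules above, here is a small Python sketch.
It is not Subversion's actual target parser; the names split_peg() and
resolve_peg() are made up for this example, and the "@" handling is
deliberately simplified.

```python
# Illustrative sketch only; none of these names exist in Subversion.
WC_ONLY_KEYWORDS = {"WC", "BASE", "COMMITTED", "PREV"}


def split_peg(target):
    """Split 'TARGET@REV' into (target, rev); rev is None if there is no peg.

    Simplification: an '@' followed by text containing '/' is taken to be
    part of the URL (for example 'user@host'), not a peg revision.
    """
    path, sep, rev = target.rpartition("@")
    if not sep or not path or not rev or "/" in rev:
        return target, None
    return path, rev


def resolve_peg(target):
    """Return (path, peg), filling in the implicit peg, or raise ValueError."""
    path, peg = split_peg(target)
    if "://" in path:                       # treat as a URL
        if peg is None:
            return path, "HEAD"             # URL means URL@HEAD
        if peg.upper() in WC_ONLY_KEYWORDS:
            raise ValueError(peg + " has no meaning without a WC")
        return path, peg                    # URL@HEAD, URL@N, URL@{DATE}
    if peg is None:
        return path, "WC"                   # WCPATH means WCPATH@WC
    if peg.upper() == "BASE":
        return path, "BASE"                 # logical, but not currently supported
    raise ValueError("a WC path cannot be pegged at " + peg)
```

Here resolve_peg("foo.c") gives ("foo.c", "WC"), a bare URL gets "HEAD", and
both "URL@BASE" and "foo.c@42" are rejected.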


Then, in many commands, a separate OPERATIVE REVISION (or a pair of them) can 
be specified with "--revision" and Subversion will trace the identified object 
through history to that revision, where its name might be different.

"-rX URL@REV" means:
   From the object found at URL in revision REV,
   trace back or forward to revision X.

"-rX wc-path" means:
   From the object found at wc-path (in the WC),
   trace back or forward to revision X.

Note that a peg revision applied to a local path doesn't make sense because a 
local path (assumed to refer to a versioned object) already locates the object 
uniquely; the peg is implicitly "WC".
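
To make the peg/operative distinction concrete, here is a toy Python model.
Everything in it is hypothetical (the data, the segment representation and
the resolve() helper); it only illustrates the idea of locating a line of
history at the peg revision and then following it to the operative revision.

```python
# Toy model: each line of history is a list of (first_rev, last_rev, path)
# segments, one segment per name the object has had.  Hypothetical data.
HISTORIES = [
    [(1, 4, "/trunk/old.c"), (5, 9, "/trunk/new.c")],  # renamed in r5
    [(7, 9, "/trunk/old.c")],       # unrelated object reusing the name in r7
]


def path_at(line, rev):
    """Return the path this line of history has in revision rev, or None."""
    for first, last, path in line:
        if first <= rev <= last:
            return path
    return None


def resolve(histories, path, peg_rev, operative_rev):
    """Find the object at path@peg_rev, then trace it to operative_rev."""
    for line in histories:
        if path_at(line, peg_rev) == path:
            found = path_at(line, operative_rev)
            if found is None:
                raise LookupError("object does not exist in r%d" % operative_rev)
            return found
    raise LookupError("nothing at %s@%d" % (path, peg_rev))
```

With this data, "-r3 /trunk/new.c@9" resolves to "/trunk/old.c", while
"-r8 /trunk/old.c@3" and "-r8 /trunk/old.c@9" name two different objects:
the first resolves to "/trunk/new.c" and the second to "/trunk/old.c".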

<bug>
The syntax described in "svn help" for some commands implies that we currently 
allow "wc-path@REV" for arbitrary REVs.

"-rX wc-path@REV" means... what?
   Trace from WC to REV and then further to X?  REV is redundant if so.
   Trace from WC to REV and ignore X?  "diff --old" does, but that's bad.

(* In the future we might want to give some meaning to "path@REV", where "path" 
looks like a WC path, but perhaps means a relative URL based on the current 
directory's URL: "foo@100" -> (url(".") + "/foo")@100.  This is not a proposal, 
just an example of a reason why we should disallow "wc-path@REV" now.)
</bug>

So "wc-path" always implies a "peg" revision of "WC", in the sense that the WC 
is where "wc-path" will be found.  (Let's assume for now that we won't allow 
the explicit syntax "wc-path@REV".)

For commands that operate on two revisions of a single object, primarily diff 
and merge, it is necessary to know in which revision the specified path is to 
be found, and a separate peg revision specifier is a convenient and flexible 
solution.  Commands that operate on a single object (at a time) support a 
separately specified peg revision only for convenience and could instead 
require the object's path to be specified exactly as it exists in the operative 
revision.


Does this all make sense so far?


The two TYPES OF DIFF INVOCATION that we support are:

   (a) Difference between one revision and another revision of an object.

   (b) Difference between one object and another which may be unrelated.  This
       is more general, and potentially covers the first type.

In both types, our user interface allows multiple targets, but those are 
handled by high-level iteration in svn_cl__diff().  Although that's 
inefficient, it means we don't presently have to think about multiple targets 
in this discussion, which pertains to libsvn_client and lower layers.

In type (a), there is one object's line of history that must be identified, and 
also a "start" revision and an "end" revision within that line.  This is what I 
think we have been calling a "pegged diff".  It corresponds to type (1) 
invocation syntax in "svn help diff" and is handled by svn_client_diff_peg3().
I would say now that "pegged" is not the clearest name for it; something like 
"diff of an object across revisions" would be better.

In type (b), there are two objects that must be identified.  Each can be 
identified by means of a local path, or a URL at a certain revision (which we 
might be tempted to call a "peg" revision even if we are not going to trace 
history from it to another revision).  This corresponds to type (2) syntax in 
"svn help diff" and is handled by svn_client_diff3().  If the two objects 
happen to be linked by history, then there may be scope for behavioural options 
which I won't discuss yet.
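
The information each type needs can be sketched as two small Python records.
This is only a way of writing down the distinction; the class names and the
plan_diff() helper are invented for illustration and do not correspond to
anything in libsvn_client.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Located:
    """A path plus the revision in which that path is to be found."""
    path: str
    rev: str          # "WC" for a working-copy path, else a repository revision


@dataclass(frozen=True)
class DiffAcrossRevisions:
    """Type (a): one line of history, two revisions within it."""
    obj: Located      # path and peg identify the line of history
    start: str
    end: str


@dataclass(frozen=True)
class DiffBetweenObjects:
    """Type (b): two independently located objects, possibly unrelated."""
    old: Located
    new: Located


def plan_diff(targets, start=None, end=None):
    """Pick the diff type from the number of located objects given."""
    if len(targets) == 1:
        if start is None or end is None:
            raise ValueError("type (a) needs a start and an end revision")
        return DiffAcrossRevisions(targets[0], start, end)
    if len(targets) == 2:
        return DiffBetweenObjects(targets[0], targets[1])
    raise ValueError("expected one or two located objects")
```

In this model a BASE:WORKING diff of "foo.c" is simply
plan_diff([Located("foo.c", "WC")], "BASE", "WC"), that is, type (a) with the
WC as the peg.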


I get the impression that there is some confusion about the "real" meaning of 
these concepts, commands and functions, as it is easy to think that a "pegged 
diff" is any of:

   * a diff where "TARGET@REV" syntax was used
   * a diff where at least one object has to be traced through history
   * a diff comparing two revisions of a single object

These are all slightly different.

Are we happy to accept the third variant, the concept of comparing two 
revisions of a single object (necessarily within its own line of history), as 
the primary concept that differentiates svn_client_diff3() and 
svn_client_diff_peg3() and the like?


Malcolm Rowe wrote:
>>Okay, perhaps that wasn't too clear.  In summary:
>>
>>* I don't think a BASE:WORKING diff is pegged, because by definition
>>  theres no history-tracing to do.  However...

Has the above discussion changed your mind, at least if by "pegged" we mean my 
type (a), for which we should therefore call svn_client_diff_peg3()?

>>* I don't mind if we just call the pegged version of diff unconditionally,
> 
> where possible (i.e., only in the single-path case)

I don't get that.  If you mean single-path as opposed to multi-path, I 
certainly don't want a special case for a single path; I thought this would 
work for any number of paths.  If you mean single-path as opposed to two-path, 
as in the "--old/--new" form, then yes, the two-path form is necessarily my 
type (b), which we may colloquially refer to as "non-pegged".


>>  as long as we pass a peg-revision of svn_opt_revision_unspecified for
>>  the BASE:WORKING case.

I'd say the peg should be svn_opt_revision_working in that case.


>>The latter currently isn't well-defined, though I'm just about to go
>>off and fix that, making the diff (and possibly merge; I've not checked)
>>functions more like the rest of the functions in libsvn_client.
> 
> though we will still need different functions, as the _peg versions only
> take a single path.

As opposed to two paths, I assume you mean.  I realise this conversation may be 
obsolete by now, but if it's still relevant, why is it a problem that the _peg 
functions only take a single path?

- Julian
