Posted to dev@subversion.apache.org by Paul Libbrecht <pa...@activemath.org> on 2004/07/26 14:27:45 UTC
Pluggeable diff...
Hello List,
We recently played with the diff-cmd and diff3-cmd settings in the hope
of getting "pluggable diffs" as we wished... and... well... it works,
but it is not at all what we understood as pluggable diffs.
Namely, we had expected such a setting to be the diff used to compute,
at commit time, the difference to be stored in the database (more or
less equivalent to what's stored as fragments of the ",v" files of
CVS), and, at update time, the merge algorithm to modify the files.
We have an XML-diff and would like to put it to use inside such a tool
as subversion. The latter should provide us the storage and transport
mechanism, being agnostic of the data of the diffs and updates... Maybe
I should call this the "patch" format.
Did I misunderstand?
How hard would it be to modify Subversion so that the patch format is
pluggable?
paul
---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org
Re: Pluggeable diff...
Posted by Branko Čibej <br...@xbc.nu>.
Paul Libbrecht wrote:
> On 26 Jul 04, at 16:32, C. Michael Pilato wrote:
>
>>> Did I misunderstand? How hard would it be to modify
>>> Subversion so that the patch format is pluggable?
>>
>> Why would you actually want this? I think some real use-cases --
>> including where Subversion is failing you now -- would help out here.
>
>
> One use case is to be able to build on a more descriptive
> representation of change to allow, for example:
> - source-respecting updates (e.g. respecting one's "own" indentation scheme)
> - more explicit merging, including the ability to show merging within
> a user interface (the other person has added this element and you have
> added this element as well, what should we do? Currently, such
> conflict resolution is done in the source!)
> - more explicit merging may mean a better computation of the update
> operations' commutativity, hence less frequent conflicts.
> - more explicit merging may also mean a better "management of change"
> where a tool may analyze the incoming changes and warn on the impact
> of things that depend on that (or do the same at commit time, so that
> you know whose content you may impact by publishing such a change)
>
> Hope that gives some light, I do think there's more but these are a
> few important ones, I think.
But none of these should touch the way the server and client exchange
data. I think there's a misconception at work here again. You can
already plug in your own diff on the client, using the diff-cmd and
diff3-cmd config options (granted, it would take a bit of work to make
their behaviour depend on the type of file), and you can maintain
invariants (e.g., enforce an indentation scheme) on the server in
pre-commit hooks. But changing the _delta_ algorithm -- the one that
determines how the server and client communicate changes, and how
changes are stored in the repository -- doesn't make sense. Yes, you
could get slightly more efficient storage for certain kinds of files,
but you wouldn't achieve any of the goals you mention above.
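To make the pre-commit route concrete, here is a minimal sketch of a hook that rejects a commit when an XML file is indented with tabs. The "no tabs" rule, the `.xml` filter and the helper names are assumptions chosen for illustration; the only Subversion pieces relied on are the `svnlook changed` and `svnlook cat` commands with `-t TXN`, and the convention that the hook receives the repository path and the transaction name as its two arguments.

```python
import subprocess
import sys


def tab_indented_lines(text):
    """Return the 1-based numbers of lines whose leading whitespace contains a tab."""
    bad = []
    for number, line in enumerate(text.splitlines(), start=1):
        indent = line[: len(line) - len(line.lstrip(" \t"))]
        if "\t" in indent:
            bad.append(number)
    return bad


def changed_paths(repos, txn):
    """Paths added or modified in the pending transaction, via `svnlook changed`."""
    out = subprocess.run(
        ["svnlook", "changed", "-t", txn, repos],
        check=True, capture_output=True, text=True,
    ).stdout
    paths = []
    for line in out.splitlines():
        # Each line is a short status column (e.g. "A   " or "U   ") plus the path.
        path = line[4:]
        if line[:1] in ("A", "U") and not path.endswith("/"):
            paths.append(path)
    return paths


def main(argv):
    """Hook entry point: argv[1] is the repository path, argv[2] the transaction."""
    repos, txn = argv[1], argv[2]
    failed = False
    for path in changed_paths(repos, txn):
        if not path.endswith(".xml"):
            continue
        content = subprocess.run(
            ["svnlook", "cat", "-t", txn, repos, path],
            check=True, capture_output=True, text=True,
        ).stdout
        for number in tab_indented_lines(content):
            # Messages go to stderr so they can be reported to the committer.
            print(f"{path}:{number}: tab used for indentation", file=sys.stderr)
            failed = True
    # A non-zero exit status makes Subversion abort the commit.
    return 1 if failed else 0
```

Installed as `hooks/pre-commit`, the script would end with `sys.exit(main(sys.argv))`. None of this touches the delta format; the check runs on the full file contents of the transaction.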
-- Brane
Re: Pluggeable diff...
Posted by Paul Libbrecht <pa...@activemath.org>.
On 26 Jul 04, at 16:32, C. Michael Pilato wrote:
>> We have an XML-diff and would like to put it to use inside such a tool
>> as subversion. The latter should provide us the storage and transport
>> mechanism, being agnostic of the data of the diffs and
>> updates... Maybe I should call this the "patch" format.
> Subversion is already agnostic -- it treats all files as binary when
> transferring changes between repos and client.
"changes", that's what we want to affect.
>> Did I misunderstand? How hard would it be to modify
>> Subversion so that the patch format is pluggable?
> Why would you actually want this? I think some real use-cases --
> including where Subversion is failing you now -- would help out here.
One use case is to be able to build on a more descriptive representation
of change to allow, for example:
- source-respecting updates (e.g. respecting one's "own" indentation scheme)
- more explicit merging, including the ability to show merging within a
user interface (the other person has added this element and you have added
this element as well, what should we do? Currently, such conflict
resolution is done in the source!)
- more explicit merging may mean a better computation of the update
operations' commutativity, hence less frequent conflicts.
- more explicit merging may also mean a better "management of change"
where a tool may analyze the incoming changes and warn on the impact of
things that depend on that (or do the same at commit time, so that you
know whose content you may impact by publishing such a change)
Hope that gives some light, I do think there's more but these are a few
important ones, I think.
paul
Re: Pluggeable diff...
Posted by Daniel Berlin <db...@dberlin.org>.
On Jul 26, 2004, at 3:46 PM, Paul Libbrecht wrote:
> Well, even this would be interesting, I think... that would be a half
> solution... but only half (as it would need to know the exact byte
> position of an XPath coordinate and thus wouldn't tolerate, e.g., a
> whitespace difference).
I'm not sure you are understanding (it's hard to tell), so just in case
you aren't:
svndiff is the name of the format that is actually stored in the
database.
It is a set of encoded instructions for how to produce some data (in
our case, a revision), given some other data (in our case, a previous
revision).
The binary diff algorithm svn uses, vdelta, produces svndiff
instructions.
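As a rough illustration of "instructions for producing some data given some other data", here is a simplified model in Python. The tuple layout and the names `apply_delta`, "copy" and "insert" are made up for this sketch; the real svndiff format is a compact binary encoding with more detail than shown here.

```python
def apply_delta(source, instructions):
    """Rebuild the target bytes from `source` and a list of instructions.

    Each instruction is either ("copy", offset, length), which copies bytes
    from the source, or ("insert", data), which adds new bytes verbatim.
    """
    target = bytearray()
    for op in instructions:
        if op[0] == "copy":
            _, offset, length = op
            if offset < 0 or length < 0 or offset + length > len(source):
                raise ValueError(f"copy out of range: {op!r}")
            target += source[offset:offset + length]
        elif op[0] == "insert":
            target += op[1]
        else:
            raise ValueError(f"unknown instruction: {op!r}")
    return bytes(target)


old_revision = b"<doc><p>hello</p></doc>"
delta = [
    ("copy", 0, 8),          # "<doc><p>"
    ("insert", b"goodbye"),  # new text that replaces "hello"
    ("copy", 13, 10),        # "</p></doc>"
]
new_revision = apply_delta(old_revision, delta)
```

The delta knows nothing about XML elements; it only refers to byte offsets in the previous revision.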
You have two options for changing the binary diff algorithm:
1. Either make yours produce svndiff instructions (which I don't see
the real benefit in, since it will still be just a set of svndiff
instructions on binary data)
or
2. Change svndiff so it can accommodate what you want to do.
svndiff is actually versioned now, with the current version number being
0 (I produced an svndiff version 1, with better encoding/compression,
and in the process, added a version number to svndiff), and the version
number is stored in the svndiff stream.
You could produce an "svndiff version 2" that was nothing like the
existing svndiff at all (you could make it an xml based diff that works
in a completely different way, whatever), and as long as you teach
libsvn_delta to produce/read it, everything should work.
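A sketch of what "the version number is stored in the stream" buys you: the reader can look at the header and pick a decoder. This assumes the stream opens with the three bytes `SVN` followed by a one-byte version number, which is my understanding of the svndiff header; the registry, the decoder functions and their return values are placeholders invented for the example and do no real decoding.

```python
MAGIC = b"SVN"

# Hypothetical registry: version number -> function that decodes the payload.
DECODERS = {}


def split_header(stream):
    """Return (version, payload) for bytes that start with an svndiff-style header."""
    if len(stream) < 4 or stream[:3] != MAGIC:
        raise ValueError("not an svndiff stream")
    return stream[3], stream[4:]


def register(version):
    """Decorator that records a decoder for one format version."""
    def wrap(func):
        DECODERS[version] = func
        return func
    return wrap


@register(0)
def decode_v0(payload):
    # Placeholder: a real decoder would parse byte-oriented copy/insert data.
    return ("byte-oriented delta", payload)


@register(2)
def decode_v2(payload):
    # Placeholder for the hypothetical XML-aware "svndiff version 2".
    return ("xml-aware delta", payload)


def decode(stream):
    """Dispatch on the version byte found in the stream itself."""
    version, payload = split_header(stream)
    try:
        decoder = DECODERS[version]
    except KeyError:
        raise ValueError(f"unsupported svndiff version {version}") from None
    return decoder(payload)
```

Because the choice is made from the stream, old revisions stay readable for as long as their decoder remains registered.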
I consider these easy, but then again, I've done this before (as I
said, I produced an svndiff version 1 whose encoding is completely
different from the current svndiff version 0). It took me roughly 3-4
weeks to produce working code that did this.
What you can't currently do without going outside libsvn_delta is
tell it to use svndiff version 0 for some files (i.e., use the standard
binary diff algorithm for some files), and tell it to use "svndiff
version 2" (your xml diff format) for other files. That would require
passing that information down from somewhere on high.
Of course, none of this is "pluggable" diffs, in the sense that they
are all hard-coded diff algorithms.
You could theoretically plug in any program you want to do the
encoding/decoding (as long as it can handle arbitrary data, and produce
something that we can store in the database), but it needs to always be
available, and always work, or else you'd seriously be f*cked.
The long and short of it is that hard-coding *new* diff algorithms for
certain types of files isn't actually all that hard. It's probably a
month or two of Subversion hacking for an experienced Subversion
hacker.
You can do special xml diffing that way if you wanted.
Plugging in random diff and merge programs at diff and merge time is a
completely intractable idea.
If your program changed its format, or failed to work with some
arbitrary data, it would make your revision database worthless (unless
you encode the way to decode the data, into the encoding format, which
is actually what svndiff does).
> Is the application of such a patch really scattered around the source
> code?
Theoretically, it should only touch libsvn_delta.
>
> paul
>
>
> On 26 Jul 04, at 19:59, Daniel Berlin wrote:
>
>> In fact, you *could* plug in your xmldiff, though it would likely be
>> pointless, since you'd have to take the results and turn it into
>> byte-oriented insert + copy instructions, which svndiff uses.
>>
>> :)
Re: Pluggeable diff...
Posted by Paul Libbrecht <pa...@activemath.org>.
Well, even this would be interesting, I think... that would be a half
solution... but only half (as it would need to know the exact byte
position of an XPath coordinate and thus wouldn't tolerate, e.g., a
whitespace difference).
Is the application of such a patch really scattered around the source
code?
paul
On 26 Jul 04, at 19:59, Daniel Berlin wrote:
> In fact, you *could* plug in your xmldiff, though it would likely be
> pointless, since you'd have to take the results and turn it into
> byte-oriented insert + copy instructions, which svndiff uses.
>
> :)
Re: Pluggeable diff...
Posted by Daniel Berlin <db...@dberlin.org>.
On Jul 26, 2004, at 10:32 AM, C. Michael Pilato wrote:
> Paul Libbrecht <pa...@activemath.org> writes:
>
>> We recently played with the diff-cmd and diff3-cmd settings in the
>> hope of getting "pluggable diffs" as we wished... and... well... it
>> works, but it is not at all what we understood as pluggable diffs.
>>
>> Namely, we had expected such a setting to be the diff used to compute,
>> at commit time, the difference to be stored in the database (more or
>> less equivalent to what's stored as fragments of the ",v" files of
>> CVS), and, at update time, the merge algorithm to modify the files.
>
> Er, no, that's not at all what it is. Subversion uses a binary
> differencing algorithm, very *not* pluggable,
I think you meant to say: You can plug in any diff algorithm you want
into the source code, as long as it generates the internal svndiff
format in the end, which is very non-pluggable ;)
In fact, you *could* plug in your xmldiff, though it would likely be
pointless, since you'd have to take the results and turn it into
byte-oriented insert + copy instructions, which svndiff uses.
:)
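To show what "turn it into byte-oriented insert + copy instructions" could look like, here is a small sketch. Python's `difflib.SequenceMatcher` stands in for the XML diff (an assumption, since the actual xmldiff tool is not shown in this thread), and the instruction tuples are a made-up simplification rather than the real svndiff encoding.

```python
import difflib


def to_copy_insert(old, new):
    """Express `new` as copy-from-`old` and insert instructions."""
    ops = []
    matcher = difflib.SequenceMatcher(None, old, new, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            # Unchanged run: refer back to the old bytes by offset and length.
            ops.append(("copy", i1, i2 - i1))
        elif tag in ("replace", "insert"):
            # New or changed run: carry the new bytes literally.
            ops.append(("insert", new[j1:j2]))
        # "delete" needs no instruction: those bytes are simply never copied.
    return ops


def replay(old, ops):
    """Apply the instructions to `old` to check that they reproduce `new`."""
    out = bytearray()
    for op in ops:
        if op[0] == "copy":
            out += old[op[1]:op[1] + op[2]]
        else:
            out += op[1]
    return bytes(out)


old = b"<list><item>a</item></list>"
new = b"<list><item>a</item><item>b</item></list>"
ops = to_copy_insert(old, new)
```

Once flattened this way, the fact that the change was "one `<item>` element added" is no longer recorded anywhere, only byte ranges, which is why the exercise would likely be pointless.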
Re: Pluggeable diff...
Posted by "C. Michael Pilato" <cm...@collab.net>.
Paul Libbrecht <pa...@activemath.org> writes:
> We recently played with the diff-cmd and diff3-cmd settings in the
> hope of getting "pluggable diffs" as we wished... and... well... it
> works, but it is not at all what we understood as pluggable diffs.
>
> Namely, we had expected such a setting to be the diff used to compute,
> at commit time, the difference to be stored in the database (more or
> less equivalent to what's stored as fragments of the ",v" files of
> CVS), and, at update time, the merge algorithm to modify the files.
Er, no, that's not at all what it is. Subversion uses a binary
differencing algorithm, very *not* pluggable, for getting information
to and from the repository. The --diff-cmd and --diff3-cmd options
are just for setting the programs used to generate contextual diffs,
client-side only, for the purposes of display and merging.
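For reference, a minimal sketch of a wrapper that could be named in `--diff-cmd` to pick a different program per file type. It assumes Subversion calls the program with GNU diff-style arguments (options, `-L` labels that begin with the file's path, and the two file paths last); the label parsing is a guess at that layout, and `xmldiff` is a hypothetical program name standing in for whatever XML-aware tool is available.

```python
import subprocess
import sys

# Hypothetical mapping from file suffix to an external diff program.
TOOLS = {".xml": "xmldiff"}
DEFAULT_TOOL = "diff"


def pick_tool(args):
    """Choose a diff program by looking for a known suffix in labels and paths."""
    labels = [args[i + 1] for i, a in enumerate(args[:-1]) if a == "-L"]
    candidates = labels + list(args[-2:])
    for suffix, tool in TOOLS.items():
        for text in candidates:
            # Assumed label layout: "path<TAB>(revision N)"; keep only the path.
            name = text.split("\t")[0].split(" (")[0]
            if name.endswith(suffix):
                return tool
    return DEFAULT_TOOL


def build_command(args):
    """Return the command line to run for the arguments passed by Subversion."""
    left, right = args[-2], args[-1]
    tool = pick_tool(args)
    if tool == DEFAULT_TOOL:
        return [tool] + list(args)  # pass everything through unchanged
    return [tool, left, right]      # assume the custom tool only wants two paths


def main(argv):
    """Run the chosen tool and hand its exit status back to the caller."""
    return subprocess.call(build_command(argv[1:]))
```

A script using this would end with `sys.exit(main(sys.argv))`. It only changes what is displayed on the client; the data sent to and stored in the repository is unaffected.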
> We have an XML-diff and would like to put it to use inside such a tool
> as subversion. The latter should provide us the storage and transport
> mechanism, being agnostic of the data of the diffs and
> updates... Maybe I should call this the "patch" format.
Subversion is already agnostic -- it treats all files as binary when
transferring changes between repos and client.
> Did I misunderstand? How hard would it be to modify
> Subversion so that the patch format is pluggable?
Why would you actually want this? I think some real use-cases --
including where Subversion is failing you now -- would help out here.