You are viewing a plain text version of this content. The canonical link for it is here.

Posted to dev@subversion.apache.org by Stefan Sperling <st...@elego.de> on 2010/03/30 14:29:56 UTC

allowing multiple conflicts in storage (Re: svn commit: r928806 - /subversion/trunk/notes/wc-ng/conflict-storage)

On Mon, Mar 29, 2010 at 03:37:51PM -0000, rhuijben@apache.org wrote:
> Author: rhuijben
> Date: Mon Mar 29 15:37:50 2010
> New Revision: 928806
> 
> URL: http://svn.apache.org/viewvc?rev=928806&view=rev
> Log:
> * notes/wc-ng/conflict-storage
>   Update schema to have some global information that applies to all
>   conflict data. Remove this data from the individual locations as it
>   can't differ between these locations anyway (or the node would have
>   been skipped). Finally add unversioned obstructions as a conflict
>   class.

> -Text conflicts
> ---------------
> +Common conflict data
> +--------------------
>  
> -Text conflicts only exist on files. The following information
> -is stored if there is a text conflict on the node:
> +Some information is shared for all conflict data that applies to a node. E.g.
> +when a node has a combination of text and property conflicts these were
> +always caused by the same operation. (Any later operation will skip the node
> +unless the conflicts are resolved)
>  

I'm still not sure that I agree with this.

I don't think it's smart to restrict the conflict storage layer
to accept only a single set of conflicts for a node, all of which
belong together to the point where you cannot do anything to the
node unless you have resolved *all* conflicts.

I'd like the storage layer to allow for resolving each type of conflicts
separately from the other types. Higher-level operations can then decide
whether it is safe to do something to the partially-conflicted node.
E.g. if a file has only property conflicts, why should a merge skip
merging changes into the file itself? If a directory is tree conflicted,
why should I not be allowed to merge some content into a file below
the tree-conflicted directory if doing so helps me to resolve the tree
conflict?

Of course, the higher layer can still decide not to allow certain
things which would "conflict with the conflict".

"Because our current API does not allow you to do that" is not a good
argument for not taking the opportunity of making the storage layer more
flexible. We can easily add new APIs that allow partial conflict
resolution to take place on nodes. The old API would still work as is,
i.e. if you use 1.6.x function to query about conflicts, any conflict
set on the node will trigger a positive answer. But the new API would be
more fine-grained, and would allow detailed queries about the data
stored about conflicts. Policiy decisions then happen above the storage
layer.

Stefan

Re: allowing multiple conflicts in storage (Re: svn commit: r928806 - /subversion/trunk/notes/wc-ng/conflict-storage)

Posted by Stefan Sperling <st...@elego.de>.

On Wed, Mar 31, 2010 at 09:50:33AM +0200, Bert Huijben wrote:
> > -----Original Message-----
> > From: Stefan Sperling
> > Why not allow application of a patch which carries a diff that would
> > not
> > touch a conflicted property before the user has dealt with property
> > conflicts?
> 
> Merge tracking?
> 
> And even your very simple merge of 'svn merge ^/blah/foo.c@2 ^/blah/foo.c@3
> bar.c' can bring in property changes. It is not a text only merge.

I'm well aware of that. Maybe you missed this part of my
sentence: "not touch a conflicted property" ?

> You would have to run a full dry run of the merge to determine whether it
> would bring something incompatible.. And merge is slow enough as is.
> If we don't do the dry run merge, how are you going to handle the case where
> you do get the conflict?
> (We don't have the necessary storage to perform a rollback to before you
> started the merge)

I know that merge right now cannot do this properly.
But that is because of how merge is currently implemented.
It's not because it is absolutely impossible to do.
So let's think outside the box for a minute. Just because merge has
this limitation now does not mean it will have this limitation forever.

E.g. we may some day change the merge to be 2-pass.
The client would first receive a summary describing intended edits,
tree- and property changes. It could then check the conflicted state
of all affected nodes, and either allow the merge to proceed or abort
the merge, depending on whether the merge wants to edit conflicted
content, change conflicted properties, or change conflicted trees.

A 2-pass scheme would also allow us to e.g. resolve tree conflicts
during the first pass (matching up the two trees involved in the merge,
possibly with interactive prompting) and then apply text/property
changes during the second pass using the current merge algorithm
which assumes that trees on either side are identical.

So, why do you think the storage layer should care about any of this?
I think it should not care. It should not assume that our merge
operation will forever behave as it does now.

And I am not simply making requirements up.
I have seen release engineering people in a company do giant merges,
and then ask many developers to their desk, one after the other,
to help them resolve conflicts.
They had to do lots of manual edits to get all the conflicts resolved.
They could not use svn merge to apply diffs, but it would have helped
them (arguably, better rename handling may also have helped them to
the same degree.)

> Same applies for 'svn patch'. That will allow updating properties (maybe not
> in 1.7, unless it applies autoprops on new files. (Does it?))

Yes, it will allow updating properties.
But that does not mean that we couldn't check for conflicted properties
before applying any property changes from the patch. If you try to apply
a patch which wants to change conflicted properties, you get:
svn: Cannot apply patch foo/bar because property svn:eol-style on path
     foo.c is still in conflict.

Regarding autoprops, no, I don't think it does, because autoprops are
done by libsvn_client. Patch calls the add function from libsvn_wc.
Good catch!

> > I don't think you understand what I want to do.
> > Adding new conflicts on top of existing conflicts is not what I was
> > suggesting to support. I'm suggesting proper layering to keep doors
> > open for future enhancements. I want the storage layer to care about
> > storage, not about policy.
> 
> I don't think you can use these 'normal' operations to resolve your
> conflict. 
> 
> High level tools, can use very similar operations to resolve the conflict
> for you. (E.g. visual merge that retrieves all versions and perform the
> merge). But 'svn merge', 'svn update' and 'svn switch' are not safe on a
> conflicted tree.

Well, it seems we're unlikely to come to a common opinion on this
given that we've been reiterating the same points over and over in
this thread.

So... Greg said that we can always easily change the conflict skels
if we need more flexibility. So even if we limit the storage now,
we can remove these limits later (that may mean some additional work
later, but whatever).

So, asking everyone again, would you rather:

1) Encode into the storage layer that "all conflicts are created
   through a single operation only, and no other operation can run
   until all conflicts on a node are resolved, whether or not the
   operation really has anything to do with the recorded conflicts".

2) Give the storage layer a new point of view, saying "I don't care
   how conflicts come about, I just record them, including which
   operation has caused them. I will allow you to look at each conflict
   in isolation from all others, and resolve each independently in
   whatever way you see fit (even if you want to use additional
   operations to do so). I will not allow higher layers to record
   conflicts which conflict with what I've already recorded, but other
   than that I don't care."

I'd prefer 2), but would go for 1) if the majority thinks that's
better. Making progress is more important right now than endlessly
discussing subtle points like this.

Stefan

RE: allowing multiple conflicts in storage (Re: svn commit: r928806 - /subversion/trunk/notes/wc-ng/conflict-storage)

Posted by Bert Huijben <be...@qqmail.nl>.


> -----Original Message-----
> From: Stefan Sperling [mailto:stsp@elego.de]
> Sent: dinsdag 30 maart 2010 18:15
> To: Bert Huijben
> Cc: dev@subversion.apache.org
> Subject: Re: allowing multiple conflicts in storage (Re: svn commit:
> r928806 - /subversion/trunk/notes/wc-ng/conflict-storage)
> 
> On Tue, Mar 30, 2010 at 05:55:00PM +0200, Bert Huijben wrote:
> >
> >
> > > -----Original Message-----
> > > From: Stefan Sperling [mailto:stsp@elego.de]
> > > Sent: dinsdag 30 maart 2010 16:30
> > > To: dev@subversion.apache.org
> > > Cc: commits@subversion.apache.org
> > > Subject: allowing multiple conflicts in storage (Re: svn commit:
> r928806 -
> > > /subversion/trunk/notes/wc-ng/conflict-storage)
> > >
> > > On Mon, Mar 29, 2010 at 03:37:51PM -0000, rhuijben@apache.org
> wrote:
> > > > Author: rhuijben
> > > > Date: Mon Mar 29 15:37:50 2010
> > > > New Revision: 928806
> > > >
> > > > URL: http://svn.apache.org/viewvc?rev=928806&view=rev
> > > > Log:
> > > > * notes/wc-ng/conflict-storage
> > > >   Update schema to have some global information that applies to
> all
> > > >   conflict data. Remove this data from the individual locations
> as it
> > > >   can't differ between these locations anyway (or the node would
> have
> > > >   been skipped). Finally add unversioned obstructions as a
> conflict
> > > >   class.
> > >
> > > > -Text conflicts
> > > > ---------------
> > > > +Common conflict data
> > > > +--------------------
> > > >
> > > > -Text conflicts only exist on files. The following information
> > > > -is stored if there is a text conflict on the node:
> > > > +Some information is shared for all conflict data that applies to
> a
> > node. E.g.
> > > > +when a node has a combination of text and property conflicts
> these were
> > > > +always caused by the same operation. (Any later operation will
> skip the
> > > node
> > > > +unless the conflicts are resolved)
> > > >
> > >
> > > I'm still not sure that I agree with this.
> > >
> > > I don't think it's smart to restrict the conflict storage layer
> > > to accept only a single set of conflicts for a node, all of which
> > > belong together to the point where you cannot do anything to the
> > > node unless you have resolved *all* conflicts.
> > >
> > > I'd like the storage layer to allow for resolving each type of
> conflicts
> > > separately from the other types. Higher-level operations can then
> decide
> > > whether it is safe to do something to the partially-conflicted
> node.
> > > E.g. if a file has only property conflicts, why should a merge skip
> > > merging changes into the file itself? If a directory is tree
> conflicted,
> > > why should I not be allowed to merge some content into a file below
> > > the tree-conflicted directory if doing so helps me to resolve the
> tree
> > > conflict?
> >
> > I intended to allow resolving only some parts of the conflict; but we
> can't
> > allow operations that can introduce conflicts on their own.
> 
> Sure.
> 
> > You can't apply a merge/update/switch/patch on a node that has a
> > text/property/obstruction/patch conflict, because a can cause a
> conflict
> > with the conflict state (How would you describe that?). You can't
> trust the
> > in-wc state if you have a conflict.
> >
> > In 1.6 the way to resolve a text conflict is to edit the in-wc text
> with all
> > the information provided and then mark it resolved.
> 
> Yes, and it's a pain that I can't just do:
> 
> 	svn merge ^/blah/foo.c@2 ^/blah/foo.c@3 bar.c
> 
> as part of my local edits.
> 
> And (speaking from a user perspective who doesn't know how svn works
> internally
> for a minute) why doesn't svn let me do this merge if bar.c is a tree
> conflict victim, but has not text conflicts recorded on it?
> 
> I know that it may be impossible for us to figure out what the merge
> really does before we carry out the merge. We might not know if the
> merge is safe. But why design the storage layer around this assumption?
> It's easy enough to enforce the same restriction at a higher layer.
> 
> > The same applies to properties: You read the '.prej' file and decide
> what
> > the new property value(s) must be and you mark all resolved. (Or just
> one
> > with the new storage).
> 
> Why force users to resolve property conflicts before they can e.g.
> merge
> additional content into a file using 'svn merge'?
> Why not allow application of a patch which carries a diff that would
> not
> touch a conflicted property before the user has dealt with property
> conflicts?

Merge tracking?

And even your very simple merge of 'svn merge ^/blah/foo.c@2 ^/blah/foo.c@3
bar.c' can bring in property changes. It is not a text only merge.


You would have to run a full dry run of the merge to determine whether it
would bring something incompatible.. And merge is slow enough as is.

If we don't do the dry run merge, how are you going to handle the case where
you do get the conflict?
(We don't have the necessary storage to perform a rollback to before you
started the merge)

Same applies for 'svn patch'. That will allow updating properties (maybe not
in 1.7, unless it applies autoprops on new files. (Does it?))

> 
> > But that doesn't make it possible to introduce new conflicts on top
> of
> > existing conflicts; that is +- impossible.
> 
> I don't think you understand what I want to do.
> Adding new conflicts on top of existing conflicts is not what I was
> suggesting to support. I'm suggesting proper layering to keep doors
> open for future enhancements. I want the storage layer to care about
> storage, not about policy.

I don't think you can use these 'normal' operations to resolve your
conflict. 

High level tools, can use very similar operations to resolve the conflict
for you. (E.g. visual merge that retrieves all versions and perform the
merge). But 'svn merge', 'svn update' and 'svn switch' are not safe on a
conflicted tree.

(update and switch just skip any conflicted node and merge tries to do some
things, but I don't know if your mergeinfo is correct if you merge over a
conflict)

	Bert
> 
> Stefan

Re: allowing multiple conflicts in storage

Posted by Julian Foad <ju...@wandisco.com>.

Stefan Sperling wrote:
> On Wed, Mar 31, 2010 at 07:34:41PM +0100, Julian Foad wrote:
> > OK I've attached a patch with ideas and suggestions.  (The above was
> > meant to be a concept that should be embodied in the design, not a
> > design in itself.)
> 
> I like it, please commit!

Done in r929709, excluding a couple of the queries which Greg cleared up
for me on IRC.

- Julian

> > In my dream world, the logic of "update" is
> > handled in the client layer as a sibling of "merge".
> 
> I also like your dream world :)
> 
> Stefan

Re: allowing multiple conflicts in storage

Posted by Stefan Sperling <st...@elego.de>.

On Wed, Mar 31, 2010 at 07:34:41PM +0100, Julian Foad wrote:
> OK I've attached a patch with ideas and suggestions.  (The above was
> meant to be a concept that should be embodied in the design, not a
> design in itself.)

I like it, please commit!

> In my dream world, the logic of "update" is
> handled in the client layer as a sibling of "merge".

I also like your dream world :)

Stefan

Re: allowing multiple conflicts in storage

Posted by Julian Foad <ju...@wandisco.com>.

On Wed, 2010-03-31, Stefan Sperling wrote:
> On Wed, Mar 31, 2010 at 01:44:23PM +0100, Julian Foad wrote:
> > The Source of the Incoming Change
> > =================================
> > 
> > I suggest something like (? means an optional field):
> > 
> >   * A diff between nodes that have a location in a repository.
> > 
> >       left=(repo, rev, relpath, sha1?)
> >       right=(repo, rev, relpath, sha1?)
> > 
> >       # This source description is normally from merge/up/sw.
> >       # Full texts are (at least in principle) available.
> > 
> >   * A Patch File
> > 
> >       patch-file=(path, (info to identify which part of the
> >                          patch file applies to this node - e.g. byte
> >                          offsets or other info)?)
> >       reported-left=(repo?, rev?, relpath?)
> >       reported-right=(repo?, rev?, relpath?)
> > 
> >       # This source description is normally from operation 'patch'.
> >       # The reported-{left,right} are taken from info in the patch
> >       #   file if available: e.g. from "+++ foo (r1000)" we can fill
> >       #   in at least 'rev' and maybe 'relpath'.  These cannot be
> >       #   considered trustworthy even when given, as patch files can
> >       #   be constructed and edited at will.
> > 
> >   * A diff between nodes that don't have a repository.
> >       # e.g. a merge from a diff between two locally-modified WCs.
> >       # We have no need of this yet (not supported) but we should
> >       # expect to support this kind of source of change some time.
> 
> This sounds very interesting.
> 
> Can you try to adjust notes/wc-ng/conflict-storage accordingly?
> The notation you used is different from what is used there,
> so I'm not sure I really understand whether the above is meant
> to supplement the current design or not.
> I'm making some suggestions below but maybe you have better ideas.

OK I've attached a patch with ideas and suggestions.  (The above was
meant to be a concept that should be embodied in the design, not a
design in itself.)

> > A Change Affects the Whole Node
> > ===============================
> > 
> > It is important to record the whole-node source of the change, as far as
> > possible, even when only one property conflicts, because I want to be
> > able to see what the other properties and text were on {left, right,
> > mine} to help me resolve the conflict.  So we must not think of "a
> > property conflict" as being a fundamental thing: the fundamental thing
> > is "a change to this node conflicts with another change to this node".
> > The assumption that all properties are independent from each other and
> > independent from the text is a high-level approximation which the client
> > makes, to be more convenient in simple cases, but not a fundamental
> > truth, and we must not tie the conflict storage design to only
> > supporting this assumption.
> 
> I think the whole-node information you want is stored by the OPERATION
> skel I added today (we can still augment the OPERATION skel if you want
> more information in it).

Yup, that looks more like it.
 
> The OPERATION skel contains information as to what caused the conflict
> (the source), and the TYPE skel records details about the nature of the
> conflict. See notes/wc-ng/conflict-storage.
> 
> > In the same way, when we record a tree conflict, we want to record the
> > {left, right, mine} versions of the whole node.  We don't have the
> > infrastructure to do this yet - we don't have a pristine store for sets
> > of properties, let alone for directory trees - but we could do in
> > principle and I hope we will do in practice one day.
> 
> Recording left,right,mine trees would be nice, yes.
> 
> > The Part(s) of the Node Affected By The Change
> > ==============================================
> > 
> > Briefly: the WC probably needs to know only "this node is in conflict"
> > or perhaps tree-conflict versus content-conflict, and no more detail
> > than that.  By "know" I mean what the WC layer itself needs to
> > understand and use.
> 
> In the current model answering "is this node in conflict" means comparing
> the conflict_data column to NULL. To find out what kinds of conflicts
> there are, libsvn_wc can iterate the list of TYPE skels and look at the
> very first atom in each skel.

Great.

> > It needs to be able to *supply* much more detail,
> > of course, to the client layer, but that can (and should) be in a blob
> > whose content is opaque to libsvn_wc.  Layering is Good (unless the
> > layers are too many, too thin; but the WC layer can hardly be accused of
> > that). So I think it IS important to have a common part of the info that
> > libsvn_wc understands itself, and a detailed part that libsvn_wc simply
> > reports to libsvn_client and which libsvn_client composes and
> > decomposes.
> 
> Then maybe we should split the TYPE-specific and OPERATION skels
> into a separate list, such as:
> 
>   ((KIND (OPERATION KIND-SPECIFIC)) ((KIND (OPERATION KIND-SPECIFIC) ...)
> 
> The WC layer would only ever look at KIND, and not parse the skels any
> further than that.
> The client layer would supply and interpret data in (OPERATION KIND-SPECIFIC),
> setting the "blob" via libsvn_wc functions.

(As I said in my follow-up, I was confused between libsvn_client with
upper layer of libsvn_wc.  In my dream world, the logic of "update" is
handled in the client layer as a sibling of "merge".)

> Would this sufficiently fulfill the layering requirement?
> Probably, but there's a slight problem with this scheme.
> 
> E.g. I'd like the client layer to be able to ask about a conflict
> on a specific property. Or about the reject conflict caused by a
> specific hunk. This means that the wc layer will need to record some
> sort of identifier (e.g. a property's name) to tell several recorded
> property conflicts apart.

It's all just a question of where you draw the line above/below that API
and all the other ones you need.  I don't see how this particular
calling out of property names is special.

> So, we'd have something like this instead:
> 
>   ((KIND [ID] (OPERATION KIND-SPECIFIC))
>    (KIND [ID] (OPERATION KIND-SPECIFIC)) ...)
> 
> For types which can only be recorded once, the ID can be ommitted.
> 
> In such as scheme the client layer would use a low-level function
> interface like:
> 
> svn_wc__add_text_conflict(char *conflict_data)
> svn_wc__add_prop_conflict(char *property_name, char *conflict_data)
> svn_wc__add_tree_conflict(char *conflict_data)
> svn_wc__add_reject_conflict(char *hunk_header, char* conflict_data)
> 
> svn_wc__list_prop_conflicts(apr_array_header_t **property_names)
> svn_wc__list_reject_conflicts(apr_array_header_t **hunk_headers)
> 
> svn_wc__get_text_conflict(char *conflict_data)
> svn_wc__get_prop_conflict(char *property_name, char *conflict_data)
> svn_wc__get_tree_conflict(char *conflict_data)
> svn_wc__get_reject_conflict(char *hunk_name, char *conflict_data)
> 
> svn_wc__del_text_conflict(char *conflict_data)
> svn_wc__del_prop_conflict(char *property_name, char *conflict_data)
> svn_wc__del_tree_conflict(char *conflict_data)
> svn_wc__del_reject_conflict(char *hunk_header, char* conflict_data)
> 
> (parameter types above are just for illustration, I know much of them
> could be const, and we'll also need pools etc.)
> 
> Higher-level conflict handling functions would then live entirely
> within libsvn_client.

No comment on API ideas.

> Note though that the current libsvn_client/libsvn_wc layering is
> not really clean (e.g. update editor in libsvn_wc, merge editor
> in libsvn_client), which may impose restrictions upon how dumb
> libsvn_wc can really be about conflicts.

Yup.

> > A reject is one way to record that some text change could not be applied
> > because it did not match the text found in the target file.  That is
> > logically a text conflict, but recorded in a different way, and with
> > less information about which lines of text go where.  It is certainly a
> > conflict of text rather than of properties or of tree structure.
> > 
> > So, we could have said:
> > 
> >   There are five types of conflict *description* (text conflict with
> > full texts, text conflict from a patch, property conflict, tree
> > conflict, and obstruction).
> > 
> > But property and tree conflicts can (later) come from a patch, so:
> > 
> >   There are seven types of conflict *description* (text conflict with
> > full texts, text conflict from a patch, property conflict with full
> > sources, property conflict from a patch, tree conflict with full
> > sources, tree conflict from a patch, and obstruction).
> > 
> > But in my opinion that not good.  As I said above, it's better to
> > distinguish the source of change from the kind of change as two separate
> > dimensions.
> 
> I believe we already distinguish the source of the change from the kind.
> I regard the reject as a separate kind of conflict at the storage layer
> because the information we store about rejects is different than what we
> store about text conflicts.

I think we're just arguing terminology and categorization.  We both now
understand that both the "reject" and "text" kinds of conflict
description describe a conflict in the text of the file.

- Julian

Re: allowing multiple conflicts in storage

Posted by Julian Foad <ju...@wandisco.com>.

On Wed, 2010-03-31, Stefan Sperling wrote:
> On Wed, Mar 31, 2010 at 01:44:23PM +0100, Julian Foad wrote:
> > The Source of the Incoming Change
> > =================================
> > 
> > I suggest something like (? means an optional field):
> > 
> >   * A diff between nodes that have a location in a repository.
> > 
> >       left=(repo, rev, relpath, sha1?)
> >       right=(repo, rev, relpath, sha1?)
> > 
> >       # This source description is normally from merge/up/sw.
> >       # Full texts are (at least in principle) available.
> > 
> >   * A Patch File
> > 
> >       patch-file=(path, (info to identify which part of the
> >                          patch file applies to this node - e.g. byte
> >                          offsets or other info)?)
> >       reported-left=(repo?, rev?, relpath?)
> >       reported-right=(repo?, rev?, relpath?)
> > 
> >       # This source description is normally from operation 'patch'.
> >       # The reported-{left,right} are taken from info in the patch
> >       #   file if available: e.g. from "+++ foo (r1000)" we can fill
> >       #   in at least 'rev' and maybe 'relpath'.  These cannot be
> >       #   considered trustworthy even when given, as patch files can
> >       #   be constructed and edited at will.
> > 
> >   * A diff between nodes that don't have a repository.
> >       # e.g. a merge from a diff between two locally-modified WCs.
> >       # We have no need of this yet (not supported) but we should
> >       # expect to support this kind of source of change some time.
> 
> This sounds very interesting.
> 
> Can you try to adjust notes/wc-ng/conflict-storage accordingly?
> The notation you used is different from what is used there,
> so I'm not sure I really understand whether the above is meant
> to supplement the current design or not.
> I'm making some suggestions below but maybe you have better ideas.

OK I've attached a patch with ideas and suggestions.  (The above was
meant to be a concept that should be embodied in the design, not a
design in itself.)

> > A Change Affects the Whole Node
> > ===============================
> > 
> > It is important to record the whole-node source of the change, as far as
> > possible, even when only one property conflicts, because I want to be
> > able to see what the other properties and text were on {left, right,
> > mine} to help me resolve the conflict.  So we must not think of "a
> > property conflict" as being a fundamental thing: the fundamental thing
> > is "a change to this node conflicts with another change to this node".
> > The assumption that all properties are independent from each other and
> > independent from the text is a high-level approximation which the client
> > makes, to be more convenient in simple cases, but not a fundamental
> > truth, and we must not tie the conflict storage design to only
> > supporting this assumption.
> 
> I think the whole-node information you want is stored by the OPERATION
> skel I added today (we can still augment the OPERATION skel if you want
> more information in it).

Yup, that looks more like it.
 
> The OPERATION skel contains information as to what caused the conflict
> (the source), and the TYPE skel records details about the nature of the
> conflict. See notes/wc-ng/conflict-storage.
> 
> > In the same way, when we record a tree conflict, we want to record the
> > {left, right, mine} versions of the whole node.  We don't have the
> > infrastructure to do this yet - we don't have a pristine store for sets
> > of properties, let alone for directory trees - but we could do in
> > principle and I hope we will do in practice one day.
> 
> Recording left,right,mine trees would be nice, yes.
> 
> > The Part(s) of the Node Affected By The Change
> > ==============================================
> > 
> > Briefly: the WC probably needs to know only "this node is in conflict"
> > or perhaps tree-conflict versus content-conflict, and no more detail
> > than that.  By "know" I mean what the WC layer itself needs to
> > understand and use.
> 
> In the current model answering "is this node in conflict" means comparing
> the conflict_data column to NULL. To find out what kinds of conflicts
> there are, libsvn_wc can iterate the list of TYPE skels and look at the
> very first atom in each skel.

Great.

> > It needs to be able to *supply* much more detail,
> > of course, to the client layer, but that can (and should) be in a blob
> > whose content is opaque to libsvn_wc.  Layering is Good (unless the
> > layers are too many, too thin; but the WC layer can hardly be accused of
> > that). So I think it IS important to have a common part of the info that
> > libsvn_wc understands itself, and a detailed part that libsvn_wc simply
> > reports to libsvn_client and which libsvn_client composes and
> > decomposes.
> 
> Then maybe we should split the TYPE-specific and OPERATION skels
> into a separate list, such as:
> 
>   ((KIND (OPERATION KIND-SPECIFIC)) ((KIND (OPERATION KIND-SPECIFIC) ...)
> 
> The WC layer would only ever look at KIND, and not parse the skels any
> further than that.
> The client layer would supply and interpret data in (OPERATION KIND-SPECIFIC),
> setting the "blob" via libsvn_wc functions.

(As I said in my follow-up, I was confused between libsvn_client with
upper layer of libsvn_wc.  In my dream world, the logic of "update" is
handled in the client layer as a sibling of "merge".)

> Would this sufficiently fulfill the layering requirement?
> Probably, but there's a slight problem with this scheme.
> 
> E.g. I'd like the client layer to be able to ask about a conflict
> on a specific property. Or about the reject conflict caused by a
> specific hunk. This means that the wc layer will need to record some
> sort of identifier (e.g. a property's name) to tell several recorded
> property conflicts apart.

It's all just a question of where you draw the line above/below that API
and all the other ones you need.  I don't see how this particular
calling out of property names is special.

> So, we'd have something like this instead:
> 
>   ((KIND [ID] (OPERATION KIND-SPECIFIC))
>    (KIND [ID] (OPERATION KIND-SPECIFIC)) ...)
> 
> For types which can only be recorded once, the ID can be ommitted.
> 
> In such as scheme the client layer would use a low-level function
> interface like:
> 
> svn_wc__add_text_conflict(char *conflict_data)
> svn_wc__add_prop_conflict(char *property_name, char *conflict_data)
> svn_wc__add_tree_conflict(char *conflict_data)
> svn_wc__add_reject_conflict(char *hunk_header, char* conflict_data)
> 
> svn_wc__list_prop_conflicts(apr_array_header_t **property_names)
> svn_wc__list_reject_conflicts(apr_array_header_t **hunk_headers)
> 
> svn_wc__get_text_conflict(char *conflict_data)
> svn_wc__get_prop_conflict(char *property_name, char *conflict_data)
> svn_wc__get_tree_conflict(char *conflict_data)
> svn_wc__get_reject_conflict(char *hunk_name, char *conflict_data)
> 
> svn_wc__del_text_conflict(char *conflict_data)
> svn_wc__del_prop_conflict(char *property_name, char *conflict_data)
> svn_wc__del_tree_conflict(char *conflict_data)
> svn_wc__del_reject_conflict(char *hunk_header, char* conflict_data)
> 
> (parameter types above are just for illustration, I know much of them
> could be const, and we'll also need pools etc.)
> 
> Higher-level conflict handling functions would then live entirely
> within libsvn_client.

No comment on API ideas.

> Note though that the current libsvn_client/libsvn_wc layering is
> not really clean (e.g. update editor in libsvn_wc, merge editor
> in libsvn_client), which may impose restrictions upon how dumb
> libsvn_wc can really be about conflicts.

Yup.

> > A reject is one way to record that some text change could not be applied
> > because it did not match the text found in the target file.  That is
> > logically a text conflict, but recorded in a different way, and with
> > less information about which lines of text go where.  It is certainly a
> > conflict of text rather than of properties or of tree structure.
> > 
> > So, we could have said:
> > 
> >   There are five types of conflict *description* (text conflict with
> > full texts, text conflict from a patch, property conflict, tree
> > conflict, and obstruction).
> > 
> > But property and tree conflicts can (later) come from a patch, so:
> > 
> >   There are seven types of conflict *description* (text conflict with
> > full texts, text conflict from a patch, property conflict with full
> > sources, property conflict from a patch, tree conflict with full
> > sources, tree conflict from a patch, and obstruction).
> > 
> > But in my opinion that not good.  As I said above, it's better to
> > distinguish the source of change from the kind of change as two separate
> > dimensions.
> 
> I believe we already distinguish the source of the change from the kind.
> I regard the reject as a separate kind of conflict at the storage layer
> because the information we store about rejects is different than what we
> store about text conflicts.

I think we're just arguing terminology and categorization.  We both now
understand that both the "reject" and "text" kinds of conflict
description describe a conflict in the text of the file.

- Julian

Re: allowing multiple conflicts in storage

Posted by Stefan Sperling <st...@elego.de>.

On Wed, Mar 31, 2010 at 01:44:23PM +0100, Julian Foad wrote:
> The Source of the Incoming Change
> =================================
> 
> I suggest something like (? means an optional field):
> 
>   * A diff between nodes that have a location in a repository.
> 
>       left=(repo, rev, relpath, sha1?)
>       right=(repo, rev, relpath, sha1?)
> 
>       # This source description is normally from merge/up/sw.
>       # Full texts are (at least in principle) available.
> 
>   * A Patch File
> 
>       patch-file=(path, (info to identify which part of the
>                          patch file applies to this node - e.g. byte
>                          offsets or other info)?)
>       reported-left=(repo?, rev?, relpath?)
>       reported-right=(repo?, rev?, relpath?)
> 
>       # This source description is normally from operation 'patch'.
>       # The reported-{left,right} are taken from info in the patch
>       #   file if available: e.g. from "+++ foo (r1000)" we can fill
>       #   in at least 'rev' and maybe 'relpath'.  These cannot be
>       #   considered trustworthy even when given, as patch files can
>       #   be constructed and edited at will.
> 
>   * A diff between nodes that don't have a repository.
>       # e.g. a merge from a diff between two locally-modified WCs.
>       # We have no need of this yet (not supported) but we should
>       # expect to support this kind of source of change some time.

This sounds very interesting.

Can you try to adjust notes/wc-ng/conflict-storage accordingly?
The notation you used is different from what is used there,
so I'm not sure I really understand whether the above is meant
to supplement the current design or not.
I'm making some suggestions below but maybe you have better ideas.

> A Change Affects the Whole Node
> ===============================
> 
> It is important to record the whole-node source of the change, as far as
> possible, even when only one property conflicts, because I want to be
> able to see what the other properties and text were on {left, right,
> mine} to help me resolve the conflict.  So we must not think of "a
> property conflict" as being a fundamental thing: the fundamental thing
> is "a change to this node conflicts with another change to this node".
> The assumption that all properties are independent from each other and
> independent from the text is a high-level approximation which the client
> makes, to be more convenient in simple cases, but not a fundamental
> truth, and we must not tie the conflict storage design to only
> supporting this assumption.

I think the whole-node information you want is stored by the OPERATION
skel I added today (we can still augment the OPERATION skel if you want
more information in it).

The OPERATION skel contains information as to what caused the conflict
(the source), and the TYPE skel records details about the nature of the
conflict. See notes/wc-ng/conflict-storage.

> In the same way, when we record a tree conflict, we want to record the
> {left, right, mine} versions of the whole node.  We don't have the
> infrastructure to do this yet - we don't have a pristine store for sets
> of properties, let alone for directory trees - but we could do in
> principle and I hope we will do in practice one day.

Recording left,right,mine trees would be nice, yes.

> The Part(s) of the Node Affected By The Change
> ==============================================
> 
> Briefly: the WC probably needs to know only "this node is in conflict"
> or perhaps tree-conflict versus content-conflict, and no more detail
> than that.  By "know" I mean what the WC layer itself needs to
> understand and use.

In the current model answering "is this node in conflict" means comparing
the conflict_data column to NULL. To find out what kinds of conflicts
there are, libsvn_wc can iterate the list of TYPE skels and look at the
very first atom in each skel.

> It needs to be able to *supply* much more detail,
> of course, to the client layer, but that can (and should) be in a blob
> whose content is opaque to libsvn_wc.  Layering is Good (unless the
> layers are too many, too thin; but the WC layer can hardly be accused of
> that). So I think it IS important to have a common part of the info that
> libsvn_wc understands itself, and a detailed part that libsvn_wc simply
> reports to libsvn_client and which libsvn_client composes and
> decomposes.

Then maybe we should split the TYPE-specific and OPERATION skels
into a separate list, such as:

  ((KIND (OPERATION KIND-SPECIFIC)) ((KIND (OPERATION KIND-SPECIFIC) ...)

The WC layer would only ever look at KIND, and not parse the skels any
further than that.
The client layer would supply and interpret data in (OPERATION KIND-SPECIFIC),
setting the "blob" via libsvn_wc functions.

Would this sufficiently fulfill the layering requirement?
Probably, but there's a slight problem with this scheme.

E.g. I'd like the client layer to be able to ask about a conflict
on a specific property. Or about the reject conflict caused by a
specific hunk. This means that the wc layer will need to record some
sort of identifier (e.g. a property's name) to tell several recorded
property conflicts apart.

So, we'd have something like this instead:

  ((KIND [ID] (OPERATION KIND-SPECIFIC))
   (KIND [ID] (OPERATION KIND-SPECIFIC)) ...)

For types which can only be recorded once, the ID can be ommitted.

In such as scheme the client layer would use a low-level function
interface like:

svn_wc__add_text_conflict(char *conflict_data)
svn_wc__add_prop_conflict(char *property_name, char *conflict_data)
svn_wc__add_tree_conflict(char *conflict_data)
svn_wc__add_reject_conflict(char *hunk_header, char* conflict_data)

svn_wc__list_prop_conflicts(apr_array_header_t **property_names)
svn_wc__list_reject_conflicts(apr_array_header_t **hunk_headers)

svn_wc__get_text_conflict(char *conflict_data)
svn_wc__get_prop_conflict(char *property_name, char *conflict_data)
svn_wc__get_tree_conflict(char *conflict_data)
svn_wc__get_reject_conflict(char *hunk_name, char *conflict_data)

svn_wc__del_text_conflict(char *conflict_data)
svn_wc__del_prop_conflict(char *property_name, char *conflict_data)
svn_wc__del_tree_conflict(char *conflict_data)
svn_wc__del_reject_conflict(char *hunk_header, char* conflict_data)

(parameter types above are just for illustration, I know much of them
could be const, and we'll also need pools etc.)

Higher-level conflict handling functions would then live entirely
within libsvn_client.

Note though that the current libsvn_client/libsvn_wc layering is
not really clean (e.g. update editor in libsvn_wc, merge editor
in libsvn_client), which may impose restrictions upon how dumb
libsvn_wc can really be about conflicts.

> A reject is one way to record that some text change could not be applied
> because it did not match the text found in the target file.  That is
> logically a text conflict, but recorded in a different way, and with
> less information about which lines of text go where.  It is certainly a
> conflict of text rather than of properties or of tree structure.
> 
> So, we could have said:
> 
>   There are five types of conflict *description* (text conflict with
> full texts, text conflict from a patch, property conflict, tree
> conflict, and obstruction).
> 
> But property and tree conflicts can (later) come from a patch, so:
> 
>   There are seven types of conflict *description* (text conflict with
> full texts, text conflict from a patch, property conflict with full
> sources, property conflict from a patch, tree conflict with full
> sources, tree conflict from a patch, and obstruction).
> 
> But in my opinion that not good.  As I said above, it's better to
> distinguish the source of change from the kind of change as two separate
> dimensions.

I believe we already distinguish the source of the change from the kind.
I regard the reject as a separate kind of conflict at the storage layer
because the information we store about rejects is different than what we
store about text conflicts.

Stefan

Re: allowing multiple conflicts in storage

Posted by Stefan Sperling <st...@elego.de>.

On Wed, Mar 31, 2010 at 02:14:13PM +0100, Julian Foad wrote:
> Where I talk about "libsvn_wc versus libsvn_client", I didn't mean that,
> I meant two sub-layers within libsvn_wc, the lower layer that handles
> purely storage versus the upper layer that performs updates etc. and
> needs to write the conflict descriptions into the storage.  I like to
> think of the software as layered like that, but you can ignore that part
> of the email if you like.

I guess having two layers within libsvn_wc makes it easier to
record conflicts from the update editor (unless we decide to
move that editor into libsvn_client).

Stefan

Re: allowing multiple conflicts in storage

Posted by Julian Foad <ju...@wandisco.com>.

Where I talk about "libsvn_wc versus libsvn_client", I didn't mean that,
I meant two sub-layers within libsvn_wc, the lower layer that handles
purely storage versus the upper layer that performs updates etc. and
needs to write the conflict descriptions into the storage.  I like to
think of the software as layered like that, but you can ignore that part
of the email if you like.

- Julian


On Wed, 2010-03-31 at 13:44 +0100, Julian Foad wrote:
> On Tue, 2010-03-30, Stefan Sperling wrote:
> > On Tue, Mar 30, 2010 at 05:56:54PM +0100, Julian Foad wrote:
> > > 
> > > > Conflict meta data storage in wc-ng
> > > > ===================================
> [...]
> I'm putting the last paragraph first for clarity.
> 
> > > I don't think
> > > it is right to call "patch conflict" in the same space as those others.
> > > A patch is a potential *source* of the change that conflicts, not a kind
> > > of conflict.
> > 
> > Well, I think the terminology has misled you.
> > Maybe we should just toss that name and call it "reject conflict" instead.
> 
> Ah, yes, you've got it.  That would help.
> 
> > I guess calling it "patch conflict" is confusing because "patch" is also
> > the name of an operation. An operation always uses potential sources of
> > conflicts (be it a delta to be merged, a property to be updated,
> > or a patch to be applied). But that is not why we need this new conflict
> > type. We need it because 'svn patch' added a new type of conflict that
> > didn't exist before. [...]
> 
> It added a new way of recording a text conflict; I find it confusing to
> call that a new "type of conflict".
> 
> We need to clearly distinguish the source of the incoming change from
> the description of which part(s) of the node is/are affected by this
> conflict.  For example, a patch will (later) be able to make changes to
> properties and to tree structure.  In that sense, a patch file is a
> source of changes (and not just the name of an operation), and we need a
> way to refer to this source of changes that is independent from saying
> "this conflict affects the file's text" versus "this conflict affects
> the node's properties" versus "this conflict affects the tree
> structure".
> 
> 
> The Source of the Incoming Change
> =================================
> 
> I suggest something like (? means an optional field):
> 
>   * A diff between nodes that have a location in a repository.
> 
>       left=(repo, rev, relpath, sha1?)
>       right=(repo, rev, relpath, sha1?)
> 
>       # This source description is normally from merge/up/sw.
>       # Full texts are (at least in principle) available.
> 
>   * A Patch File
> 
>       patch-file=(path, (info to identify which part of the
>                          patch file applies to this node - e.g. byte
>                          offsets or other info)?)
>       reported-left=(repo?, rev?, relpath?)
>       reported-right=(repo?, rev?, relpath?)
> 
>       # This source description is normally from operation 'patch'.
>       # The reported-{left,right} are taken from info in the patch
>       #   file if available: e.g. from "+++ foo (r1000)" we can fill
>       #   in at least 'rev' and maybe 'relpath'.  These cannot be
>       #   considered trustworthy even when given, as patch files can
>       #   be constructed and edited at will.
> 
>   * A diff between nodes that don't have a repository.
>       # e.g. a merge from a diff between two locally-modified WCs.
>       # We have no need of this yet (not supported) but we should
>       # expect to support this kind of source of change some time.
> 
> 
> A Change Affects the Whole Node
> ===============================
> 
> It is important to record the whole-node source of the change, as far as
> possible, even when only one property conflicts, because I want to be
> able to see what the other properties and text were on {left, right,
> mine} to help me resolve the conflict.  So we must not think of "a
> property conflict" as being a fundamental thing: the fundamental thing
> is "a change to this node conflicts with another change to this node".
> The assumption that all properties are independent from each other and
> independent from the text is a high-level approximation which the client
> makes, to be more convenient in simple cases, but not a fundamental
> truth, and we must not tie the conflict storage design to only
> supporting this assumption.
> 
> In the same way, when we record a tree conflict, we want to record the
> {left, right, mine} versions of the whole node.  We don't have the
> infrastructure to do this yet - we don't have a pristine store for sets
> of properties, let alone for directory trees - but we could do in
> principle and I hope we will do in practice one day.
> 
> 
> The Part(s) of the Node Affected By The Change
> ==============================================
> 
> Briefly: the WC probably needs to know only "this node is in conflict"
> or perhaps tree-conflict versus content-conflict, and no more detail
> than that.  By "know" I mean what the WC layer itself needs to
> understand and use.  It needs to be able to *supply* much more detail,
> of course, to the client layer, but that can (and should) be in a blob
> whose content is opaque to libsvn_wc.  Layering is Good (unless the
> layers are too many, too thin; but the WC layer can hardly be accused of
> that).
> 
> 
> So I think it IS important to have a common part of the info that
> libsvn_wc understands itself, and a detailed part that libsvn_wc simply
> reports to libsvn_client and which libsvn_client composes and
> decomposes.  Of course this means that libsvn_client has to be
> responsible for making, reverting and resolving individual prop changes
> and file text changes, and then calling libsvn_wc with updated conflict
> info, rather than calling a WC function that does both the change and
> updates the conflict info.  Is that sane?  Yes, I believe so.  Is it
> easy to get there from here?  Maybe not too bad: the present libsvn_wc
> "revert the node"/"resolve the node" functions can simply set the whole
> conflict info to nil without needing to know the details.
> 
> 
> > > > There are five types of conflicts (text conflicts, property conflicts,
> > > > tree conflicts, patch conflicts, and obstructions).
> > > 
> > > Whoa.  A patch can cause a text conflict
> > 
> > No. It cannot cause a text conflict. It causes a "reject" instead.
> 
> A reject is one way to record that some text change could not be applied
> because it did not match the text found in the target file.  That is
> logically a text conflict, but recorded in a different way, and with
> less information about which lines of text go where.  It is certainly a
> conflict of text rather than of properties or of tree structure.
> 
> So, we could have said:
> 
>   There are five types of conflict *description* (text conflict with
> full texts, text conflict from a patch, property conflict, tree
> conflict, and obstruction).
> 
> But property and tree conflicts can (later) come from a patch, so:
> 
>   There are seven types of conflict *description* (text conflict with
> full texts, text conflict from a patch, property conflict with full
> sources, property conflict from a patch, tree conflict with full
> sources, tree conflict from a patch, and obstruction).
> 
> But in my opinion that not good.  As I said above, it's better to
> distinguish the source of change from the kind of change as two separate
> dimensions.
> 
> 
> 
> > > Sources of a conflict are:
> > > 
> > > That's all for now.
> > 
> > Did you mean to make a list sources but Evolution wouldn't let you?
> 
> Oops, no, it was just a left-over fragment from my editing.
> 
> - Julian
> 
>

Re: allowing multiple conflicts in storage

Posted by Julian Foad <ju...@wandisco.com>.

On Tue, 2010-03-30, Stefan Sperling wrote:
> On Tue, Mar 30, 2010 at 05:56:54PM +0100, Julian Foad wrote:
> > 
> > > Conflict meta data storage in wc-ng
> > > ===================================
[...]
I'm putting the last paragraph first for clarity.

> > I don't think
> > it is right to call "patch conflict" in the same space as those others.
> > A patch is a potential *source* of the change that conflicts, not a kind
> > of conflict.
> 
> Well, I think the terminology has misled you.
> Maybe we should just toss that name and call it "reject conflict" instead.

Ah, yes, you've got it.  That would help.

> I guess calling it "patch conflict" is confusing because "patch" is also
> the name of an operation. An operation always uses potential sources of
> conflicts (be it a delta to be merged, a property to be updated,
> or a patch to be applied). But that is not why we need this new conflict
> type. We need it because 'svn patch' added a new type of conflict that
> didn't exist before. [...]

It added a new way of recording a text conflict; I find it confusing to
call that a new "type of conflict".

We need to clearly distinguish the source of the incoming change from
the description of which part(s) of the node is/are affected by this
conflict.  For example, a patch will (later) be able to make changes to
properties and to tree structure.  In that sense, a patch file is a
source of changes (and not just the name of an operation), and we need a
way to refer to this source of changes that is independent from saying
"this conflict affects the file's text" versus "this conflict affects
the node's properties" versus "this conflict affects the tree
structure".

The Source of the Incoming Change
=================================

I suggest something like (? means an optional field):

  * A diff between nodes that have a location in a repository.

      left=(repo, rev, relpath, sha1?)
      right=(repo, rev, relpath, sha1?)

      # This source description is normally from merge/up/sw.
      # Full texts are (at least in principle) available.

  * A Patch File

      patch-file=(path, (info to identify which part of the
                         patch file applies to this node - e.g. byte
                         offsets or other info)?)
      reported-left=(repo?, rev?, relpath?)
      reported-right=(repo?, rev?, relpath?)

      # This source description is normally from operation 'patch'.
      # The reported-{left,right} are taken from info in the patch
      #   file if available: e.g. from "+++ foo (r1000)" we can fill
      #   in at least 'rev' and maybe 'relpath'.  These cannot be
      #   considered trustworthy even when given, as patch files can
      #   be constructed and edited at will.

  * A diff between nodes that don't have a repository.
      # e.g. a merge from a diff between two locally-modified WCs.
      # We have no need of this yet (not supported) but we should
      # expect to support this kind of source of change some time.

A Change Affects the Whole Node
===============================

It is important to record the whole-node source of the change, as far as
possible, even when only one property conflicts, because I want to be
able to see what the other properties and text were on {left, right,
mine} to help me resolve the conflict.  So we must not think of "a
property conflict" as being a fundamental thing: the fundamental thing
is "a change to this node conflicts with another change to this node".
The assumption that all properties are independent from each other and
independent from the text is a high-level approximation which the client
makes, to be more convenient in simple cases, but not a fundamental
truth, and we must not tie the conflict storage design to only
supporting this assumption.

In the same way, when we record a tree conflict, we want to record the
{left, right, mine} versions of the whole node.  We don't have the
infrastructure to do this yet - we don't have a pristine store for sets
of properties, let alone for directory trees - but we could do in
principle and I hope we will do in practice one day.

The Part(s) of the Node Affected By The Change
==============================================

Briefly: the WC probably needs to know only "this node is in conflict"
or perhaps tree-conflict versus content-conflict, and no more detail
than that.  By "know" I mean what the WC layer itself needs to
understand and use.  It needs to be able to *supply* much more detail,
of course, to the client layer, but that can (and should) be in a blob
whose content is opaque to libsvn_wc.  Layering is Good (unless the
layers are too many, too thin; but the WC layer can hardly be accused of
that).

So I think it IS important to have a common part of the info that
libsvn_wc understands itself, and a detailed part that libsvn_wc simply
reports to libsvn_client and which libsvn_client composes and
decomposes.  Of course this means that libsvn_client has to be
responsible for making, reverting and resolving individual prop changes
and file text changes, and then calling libsvn_wc with updated conflict
info, rather than calling a WC function that does both the change and
updates the conflict info.  Is that sane?  Yes, I believe so.  Is it
easy to get there from here?  Maybe not too bad: the present libsvn_wc
"revert the node"/"resolve the node" functions can simply set the whole
conflict info to nil without needing to know the details.

> > > There are five types of conflicts (text conflicts, property conflicts,
> > > tree conflicts, patch conflicts, and obstructions).
> > 
> > Whoa.  A patch can cause a text conflict
> 
> No. It cannot cause a text conflict. It causes a "reject" instead.

A reject is one way to record that some text change could not be applied
because it did not match the text found in the target file.  That is
logically a text conflict, but recorded in a different way, and with
less information about which lines of text go where.  It is certainly a
conflict of text rather than of properties or of tree structure.

So, we could have said:

  There are five types of conflict *description* (text conflict with
full texts, text conflict from a patch, property conflict, tree
conflict, and obstruction).

But property and tree conflicts can (later) come from a patch, so:

  There are seven types of conflict *description* (text conflict with
full texts, text conflict from a patch, property conflict with full
sources, property conflict from a patch, tree conflict with full
sources, tree conflict from a patch, and obstruction).

But in my opinion that not good.  As I said above, it's better to
distinguish the source of change from the kind of change as two separate
dimensions.

> > Sources of a conflict are:
> > 
> > That's all for now.
> 
> Did you mean to make a list sources but Evolution wouldn't let you?

Oops, no, it was just a left-over fragment from my editing.

- Julian

RE: allowing multiple conflicts in storage

Posted by Bert Huijben <be...@qqmail.nl>.

> -----Original Message-----
> From: Stefan Sperling [mailto:stsp@elego.de]
> Sent: dinsdag 30 maart 2010 19:34
> To: Julian Foad
> Cc: Bert Huijben; dev@subversion.apache.org
> Subject: Re: allowing multiple conflicts in storage
> 
> On Tue, Mar 30, 2010 at 05:56:54PM +0100, Julian Foad wrote:
> >
> > > Conflict meta data storage in wc-ng
> > > ===================================
> > >
> > > Conflict meta data is stored in the ACTUAL_NODE table, within the
> > > 'conflict_data' column. The data in this column is a skel containing
> > > conflict information, or NULL (meaning no conflict is present).
> >
> > Is storing a record in ACTUAL OK even for a node that (apart from the
> > conflict descripition) has no BASE and no ACTUAL in the WC?  E.g.
> > because it wasn't there and an incoming delete cause a tree conflict.
> > What values would the rest of the ACTUAL_NODE columns have in this
> case?
> 
> Not sure right now. I guess I don't know the schema well enough yet to
> answer this question. Anyone?

All columns in ACTUAL (except wc_id, local_relpath and parent_relpath) are
optional (default=NULL), so this is not an issue.

You have exactly the same thing when you would apply a changelist on a path
and then remove it.

The only special case is that you can have tree conflicts on nodes that are
neither in WORKING nor in BASE. (That is why we stored them on the parent
directory in 1.6)

We can just allow that case in WC-NG, by enumerating over the nodes in
ACTUAL in svn_wc__db_read_conflict_victims(), while
svn_wc__db_read_children() doesn't look at ACTUAL.

(svn_wc__db_read_conflict_victims() currently fakes this behavior by reading
ACTUAL and the parents ACTUAL, parsing the old conflict data)

	Bert

Re: allowing multiple conflicts in storage

Posted by Stefan Sperling <st...@elego.de>.

On Tue, Mar 30, 2010 at 05:56:54PM +0100, Julian Foad wrote:
> 
> > Conflict meta data storage in wc-ng
> > ===================================
> > 
> > Conflict meta data is stored in the ACTUAL_NODE table, within the
> > 'conflict_data' column. The data in this column is a skel containing
> > conflict information, or NULL (meaning no conflict is present).
> 
> Is storing a record in ACTUAL OK even for a node that (apart from the
> conflict descripition) has no BASE and no ACTUAL in the WC?  E.g.
> because it wasn't there and an incoming delete cause a tree conflict.
> What values would the rest of the ACTUAL_NODE columns have in this case?

Not sure right now. I guess I don't know the schema well enough yet to
answer this question. Anyone?

> > There are five types of conflicts (text conflicts, property conflicts,
> > tree conflicts, patch conflicts, and obstructions).
> 
> Whoa.  A patch can cause a text conflict

No. It cannot cause a text conflict. It causes a "reject" instead.
We cannot record it as a text/prop/tree/obstruction, because all we can tell
the user is "the following hunk in the patch did not match anywhere in the
target so svn does not know what to do with it." A text conflict implies
a 3-way merge which is not what patch does. Patch does search/replace.
(Remember how we figured out that diff3 merge doesn't work with patches?)

> and/or (in the future) property
> conflicts and/or a tree conflict and/or an obstruction.

It could in the future cause property conflicts, yes. But I don't see a
problem with recording that as a property conflict where operation="patch".

It currently does not flag obstructions, because patch simply skips anything
that is obstructed (like merge does today) without flagging a conflict.
I don't know if we'll ever change this behaviour. We might at some point,
but, likewise, this would become an "obstructed" conflict with
operation="patch".

> I don't think
> it is right to call "patch conflict" in the same space as those others.
> A patch is a potential *source* of the change that conflicts, not a kind
> of conflict.

Well, I think the terminology has misled you.
Maybe we should just toss that name and call it "reject conflict" instead.
I guess calling it "patch conflict" is confusing because "patch" is also
the name of an operation. An operation always uses potential sources of
conflicts (be it a delta to be merged, a property to be updated,
or a patch to be applied). But that is not why we need this new conflict
type. We need it because 'svn patch' added a new type of conflict that
didn't exist before. It does not flag this conflict yet but we will make
it do that as soon as we can.

> Sources of a conflict are:
> 
> That's all for now.
> 

Did you mean to make a list sources but Evolution wouldn't let you?

Stefan

Re: allowing multiple conflicts in storage

Posted by Julian Foad <ju...@wandisco.com>.

> Conflict meta data storage in wc-ng
> ===================================
> 
> Conflict meta data is stored in the ACTUAL_NODE table, within the
> 'conflict_data' column. The data in this column is a skel containing
> conflict information, or NULL (meaning no conflict is present).

Is storing a record in ACTUAL OK even for a node that (apart from the
conflict descripition) has no BASE and no ACTUAL in the WC?  E.g.
because it wasn't there and an incoming delete cause a tree conflict.
What values would the rest of the ACTUAL_NODE columns have in this case?

> There are five types of conflicts (text conflicts, property conflicts,
> tree conflicts, patch conflicts, and obstructions).

Whoa.  A patch can cause a text conflict and/or (in the future) property
conflicts and/or a tree conflict and/or an obstruction.  I don't think
it is right to call "patch conflict" in the same space as those others.
A patch is a potential *source* of the change that conflicts, not a kind
of conflict.  Sources of a conflict are:

That's all for now.

- Julian

Re: allowing multiple conflicts in storage (Re: svn commit: r928806 - /subversion/trunk/notes/wc-ng/conflict-storage)

Posted by Stefan Sperling <st...@elego.de>.

On Tue, Mar 30, 2010 at 08:04:55PM +0200, Stefan Sperling wrote:
> Actually, I think this becomes much cleaner once we recognize
> that what's right now stored in COMMON is actually operation-specific.

I've committed this in r929504.

We can still move the OPERATION skels around, depending on how
we end up wanting the storage layer to behave wrt operations.

Stefan

Re: allowing multiple conflicts in storage (Re: svn commit: r928806 - /subversion/trunk/notes/wc-ng/conflict-storage)

Posted by Stefan Sperling <st...@elego.de>.

On Tue, Mar 30, 2010 at 07:14:41PM +0200, Stefan Sperling wrote:
> On Tue, Mar 30, 2010 at 06:14:50PM +0200, Stefan Sperling wrote:
> > I want the storage layer to care about storage, not about policy.
> 
> FWIW, this diff changes the spec accordingly. Note how the "patch"
> conflict type does not use the COMMON data at all.
> 
> Any really strong reason not to do it this way?

Actually, I think this becomes much cleaner once we recognize
that what's right now stored in COMMON is actually operation-specific.
What about we apply this diff to the spec instead of my last one?


Index: notes/wc-ng/conflict-storage
===================================================================
--- notes/wc-ng/conflict-storage	(revision 929178)
+++ notes/wc-ng/conflict-storage	(working copy)
@@ -11,36 +11,23 @@
 tree conflicts, patch conflicts, and obstructions). The conflict skel
 has the form:
 
-  (COMMON (KIND KIND-SPECIFIC) (KIND KIND-SPECIFIC) ...)
+  ((KIND OPERATION KIND-SPECIFIC) (KIND OPERATION KIND-SPECIFIC) ...)
 
 where KIND is one of "text", "prop", "tree", "patch", or
-"obstructed". KIND-SPECIFIC is specific to each KIND, and is detailed
-below. The COMMON skel contains data that is common to all KINDs, and
-is detailed below.
+"obstructed". OPERATION indicates the operation which caused
+the conflict and is detailed below. KIND-SPECIFIC is specific
+to each KIND, and is detailed below.
 
-### stsp: I think the COMMON block should appear at the start of each
-###   type-specific block instead. For instance, if I apply a patch and
-###   rejects are recorded by the "patch" operation on the node,
-###   it does not hurt to allow update or merge operations on the node
-###   while it still has a patch conflict recorded.
-###   A similar situation is a text vs. prop conflicts. E.g. if a node
-###   node has only property conflicts, why not allow an update or merge
-###   which does not touch conflicted properties? Even if we do not want
-###   to allow this, I think embedding this decision into the storage
-###   layer design is a bad idea. We can make higher layers deal with it.
-###   See also http://svn.haxx.se/dev/archive-2010-03/0646.shtml
-
 ### stsp: need conflict data format version info inside skel, too?
 ###   or do we bump the entire wc.db format number if we need to tweak
 ###   this?
 ###
 ### gstein sez: the KIND can become "text-2" or somesuch if we need to
 ###   radically alter the kind-specific data. but we can easily append
-###   information to the skel without much problem. adjusting the
-###   COMMON skel shouldn't be hard, if appending is insufficient.
+###   information to the skel without much problem.
 
-If the 'conflict_data' column is not NULL, then COMMON must exist and
-at least one KIND of conflict skel, describing the conflict(s).
+If the 'conflict_data' column is not NULL, then at least one
+one KIND of conflict skel must exist, describing the conflict(s).
 
 Contrary to wc-1, wc-ng records sufficient information to help users
 understand, in hindsight, which operation led to the conflict (as long
@@ -69,54 +56,56 @@
         (which does not necessarily equal the current working version!)
 
 
-Common conflict data
---------------------
 
-Some information is shared for all conflict data that applies to a node. E.g.
-when a node has a combination of text and property conflicts these were
-always caused by the same operation. (Any later operation will skip the node
-unless the conflicts are resolved). The COMMON skel has the form:
+Operation skel
+--------------
 
-  (OPERATION LEFT_REV RIGHT_REV
-    (LEFT_UUID LEFT_ROOT_URL LEFT_RELPATH LEFT_PEG_REV)
-    (RIGHT_UUID RIGHT_ROOT_URL RIGHT_RELPATH RIGHT_PEG_REV) )
+The OPERATION skel has the following form:
 
-### BH: I don't know if all these values apply to obstructions and patch
-###     conflicts. And most of these values are not available for conflicts
-###     that are introduced via 1.6 (and some of our deprecated svn_wc apis)
+  (NAME OPERATION-SPECIFIC)
 
-### stsp: We can simply use empty strings for fields which don't make
-###   sense for the current conflict type.
+NAME is "update", "switch", "merge", or "patch".
 
-### BH: Should we have the (incoming_change local_change) block here?
-### BH: Should we have the (left_node_kind right_node_kind) block here?
-### BH: Do we need more data on 'older/mine' or is that handled via left/right?
+OPERATION-SPECIFIC is as follows:
 
-OPERATION is "update", "switch", "merge", or "patch", indicating during what
-type of operation the conflict occurred.
+For update: (BASE_REV TARGET_REV)
 
-{LEFT,RIGHT}_REV is the revision of the {left,right} side of
-the operation. With "update" and "switch", LEFT_REV is the base revision
-prior to the update/switch, and RIGHT_REV is the revision updated/switched
-to. During "merge", LEFT_REV is the merge-left revision, and RIGHT_REV
-is the merge-right revision, of a continuous revision range which was merged
-(merge tracking might split a merge up into multiple merges of continuous
-revision ranges).
+  BASE_REV is the base revision prior to the update.
+  TARGET_REV is the revision being updated to.
 
-{LEFT,RIGHT}_UUID is the UUID of the repository the {left,right}
-version of the item comes from, in order to recognize merges from foreign
-repositories.
+For switch: (BASE_REV TARGET_REV REPOS_ROOT_URL REPOS_RELPATH)
 
-{LEFT,RIGHT}_ROOT_URL is the repository root URL the {left,right}
-version of the item comes from.
+  REPOS_ROOT_URL is the repository root of the URL being switched to.
+  REPOS_RELPATH is the path in the repository being switched to.
 
-{LEFT,RIGHT}_RELPATH is the path in the repository of the {left,right}
-version of the item.
+For merge:
+  (LEFT_REV RIGHT_REV
+   (LEFT_REPOS_UUID LEFT_REPOS_ROOT_URL LEFT_REPOS_RELPATH LEFT_PEG_REV)
+   (RIGHT_REPOS_UUID RIGHT_REPOS_ROOT_URL RIGHT_REPOS_RELPATH RIGHT_PEG_REV) )
 
-{LEFT,RIGHT}_PEG_REV is the peg revision of the {left,right} version
-of the item.
+  LEFT_REV is the merge-left revision, and RIGHT_REV is the merge-right
+    revision of a continuous revision range which was merged (merge tracking
+    might split a merge up into multiple merges of continuous revision ranges).
 
+  {LEFT,RIGHT}_REPOS_UUID is the UUID of the repository the {left,right}
+    version of the item comes from, in order to recognize merges from foreign
+    repositories.
 
+  {LEFT,RIGHT}_REPOS_ROOT_URL is the repository root URL the {left,right}
+    version of the item comes from.
+
+  {LEFT,RIGHT}_REPOS_RELPATH is the path in the repository of the {left,right}
+    version of the item.
+
+  {LEFT,RIGHT}_PEG_REV is the peg revision of the {left,right} version
+    of the item.
+
+For patch: (PATCH_FILE_ABSPATH)
+
+  PATCH_FILE_ABSPATH is the absolute path of the patch file the
+  application of which led to conflicts.
+
+
 Text conflicts
 --------------
 
@@ -203,20 +192,17 @@
   ("obstructed")
 
 
-Patch conflicts (a.k.a. "rejects")
-----------------------------------
+Reject conflicts
+----------------
 For patches, the content of the left and right versions is not fully known,
 so the conflict is not a diff3-style text conflict. Rather, the conflict
 is the failure to find a match for a hunk's context in the patch target.
 
-  ("patch" PATCH_FILE_ABSPATH
+  ("reject"
     HUNK_ORIGINAL_OFFSET HUNK_ORIGINAL_LEN
     HUNK_MODIFIED_OFFSET HUNK_MODIFIED_LENGTH
     REJECT_DIFF_SHA1)
 
-PATCH_FILE_ABSPATH is the absolute path of the patch file the
-application of which to the node led to hunks being rejected.
-
 HUNK_{ORIGINAL,MODIFIED}_OFFSET and HUNK_{ORIGINAL,MODIFIED}_LENGTH
 are the hunk header values as parsed from the patch file (i.e. the "ID"
 of the hunk within the patch file). These also occur in the reject

Re: allowing multiple conflicts in storage (Re: svn commit: r928806 - /subversion/trunk/notes/wc-ng/conflict-storage)

Posted by Stefan Sperling <st...@elego.de>.

On Tue, Mar 30, 2010 at 06:14:50PM +0200, Stefan Sperling wrote:
> I want the storage layer to care about storage, not about policy.

FWIW, this diff changes the spec accordingly. Note how the "patch"
conflict type does not use the COMMON data at all.

Any really strong reason not to do it this way?

Note that my intention for the API is that we'll have fine-grained
functions to retreive data from skels of a specific conflict,
and then layer more general conflict-enquiry functions (which would
currently retrieve answers from COMMON only, except for patch I guess *cough*)
above these fine-grained functions. Callers of the general functions
would not get to see any redundantly stored data.

Index: notes/wc-ng/conflict-storage
===================================================================
--- notes/wc-ng/conflict-storage	(revision 929178)
+++ notes/wc-ng/conflict-storage	(working copy)
@@ -11,36 +11,22 @@
 tree conflicts, patch conflicts, and obstructions). The conflict skel
 has the form:
 
-  (COMMON (KIND KIND-SPECIFIC) (KIND KIND-SPECIFIC) ...)
+  ((KIND KIND-SPECIFIC) (KIND KIND-SPECIFIC) ...)
 
 where KIND is one of "text", "prop", "tree", "patch", or
 "obstructed". KIND-SPECIFIC is specific to each KIND, and is detailed
-below. The COMMON skel contains data that is common to all KINDs, and
-is detailed below.
+below.
 
-### stsp: I think the COMMON block should appear at the start of each
-###   type-specific block instead. For instance, if I apply a patch and
-###   rejects are recorded by the "patch" operation on the node,
-###   it does not hurt to allow update or merge operations on the node
-###   while it still has a patch conflict recorded.
-###   A similar situation is a text vs. prop conflicts. E.g. if a node
-###   node has only property conflicts, why not allow an update or merge
-###   which does not touch conflicted properties? Even if we do not want
-###   to allow this, I think embedding this decision into the storage
-###   layer design is a bad idea. We can make higher layers deal with it.
-###   See also http://svn.haxx.se/dev/archive-2010-03/0646.shtml
-
 ### stsp: need conflict data format version info inside skel, too?
 ###   or do we bump the entire wc.db format number if we need to tweak
 ###   this?
 ###
 ### gstein sez: the KIND can become "text-2" or somesuch if we need to
 ###   radically alter the kind-specific data. but we can easily append
-###   information to the skel without much problem. adjusting the
-###   COMMON skel shouldn't be hard, if appending is insufficient.
+###   information to the skel without much problem.
 
-If the 'conflict_data' column is not NULL, then COMMON must exist and
-at least one KIND of conflict skel, describing the conflict(s).
+If the 'conflict_data' column is not NULL, then at least one
+one KIND of conflict skel must exist, describing the conflict(s).
 
 Contrary to wc-1, wc-ng records sufficient information to help users
 understand, in hindsight, which operation led to the conflict (as long
@@ -69,29 +55,12 @@
         (which does not necessarily equal the current working version!)
 
 
-Common conflict data
---------------------
 
-Some information is shared for all conflict data that applies to a node. E.g.
-when a node has a combination of text and property conflicts these were
-always caused by the same operation. (Any later operation will skip the node
-unless the conflicts are resolved). The COMMON skel has the form:
+Common skel fields
+------------------
 
-  (OPERATION LEFT_REV RIGHT_REV
-    (LEFT_UUID LEFT_ROOT_URL LEFT_RELPATH LEFT_PEG_REV)
-    (RIGHT_UUID RIGHT_ROOT_URL RIGHT_RELPATH RIGHT_PEG_REV) )
+Some fields appear in several KIND-SPECIFIC conflict skels:
 
-### BH: I don't know if all these values apply to obstructions and patch
-###     conflicts. And most of these values are not available for conflicts
-###     that are introduced via 1.6 (and some of our deprecated svn_wc apis)
-
-### stsp: We can simply use empty strings for fields which don't make
-###   sense for the current conflict type.
-
-### BH: Should we have the (incoming_change local_change) block here?
-### BH: Should we have the (left_node_kind right_node_kind) block here?
-### BH: Do we need more data on 'older/mine' or is that handled via left/right?
-
 OPERATION is "update", "switch", "merge", or "patch", indicating during what
 type of operation the conflict occurred.
 
@@ -123,7 +92,10 @@
 Text conflicts only exist on files. The following skel represents the
 "text" KIND of conflict:
 
-  ("text" LEFT_SHA1 RIGHT_SHA1 MINE_SHA1)
+  ("text" (OPERATION LEFT_REV RIGHT_REV)
+    (LEFT_UUID LEFT_ROOT_URL LEFT_RELPATH LEFT_PEG_REV)
+    (RIGHT_UUID RIGHT_ROOT_URL RIGHT_RELPATH RIGHT_PEG_REV)
+    LEFT_SHA1 RIGHT_SHA1 MINE_SHA1) )
 
 ### BH: We need some marker here, but these values must also be stored
 ###     in the older_checksum, left_checksum, right_checksum colums of ACTUAL
@@ -147,7 +119,10 @@
 by one or more "prop" KIND conflicts. Each "prop" conflict has the
 following form:
 
-  ("prop" PROPERTY_NAME ([LEFT_VALUE]) ([RIGHT_VALUE]) ([MINE_VALUE]))
+  ("prop" (OPERATION LEFT_REV RIGHT_REV)
+    (LEFT_UUID LEFT_ROOT_URL LEFT_RELPATH LEFT_PEG_REV)
+    (RIGHT_UUID RIGHT_ROOT_URL RIGHT_RELPATH RIGHT_PEG_REV) 
+     PROPERTY_NAME ([LEFT_VALUE]) ([RIGHT_VALUE]) ([MINE_VALUE]))
 
 PROPERTY_NAME is the name of the property, such as "svn:eol-style".
 
@@ -162,7 +137,9 @@
 Tree conflicts exist on files or directories.
 The following information is stored if there is a tree conflict on the node:
 
-  ("tree"
+  ("tree" (OPERATION LEFT_REV RIGHT_REV)
+    (LEFT_UUID LEFT_ROOT_URL LEFT_RELPATH LEFT_PEG_REV)
+    (RIGHT_UUID RIGHT_ROOT_URL RIGHT_RELPATH RIGHT_PEG_REV)
     INCOMING_CHANGE LOCAL_CHANGE
     LEFT_NODE_KIND RIGHT_NODE_KIND
     LEFT_SHA1 RIGHT_SHA1 MINE_SHA1)
@@ -200,7 +177,9 @@
 There is no particular data which needs to be recorded for an
 obstruction. Thus, the "obstructed" conflict skel has the form:
 
-  ("obstructed")
+  ("obstructed"  (OPERATION LEFT_REV RIGHT_REV)
+    (LEFT_UUID LEFT_ROOT_URL LEFT_RELPATH LEFT_PEG_REV)
+    (RIGHT_UUID RIGHT_ROOT_URL RIGHT_RELPATH RIGHT_PEG_REV) )
 
 
 Patch conflicts (a.k.a. "rejects")

Re: allowing multiple conflicts in storage (Re: svn commit: r928806 - /subversion/trunk/notes/wc-ng/conflict-storage)

Posted by Stefan Sperling <st...@elego.de>.

On Tue, Mar 30, 2010 at 05:55:00PM +0200, Bert Huijben wrote:
> 
> 
> > -----Original Message-----
> > From: Stefan Sperling [mailto:stsp@elego.de]
> > Sent: dinsdag 30 maart 2010 16:30
> > To: dev@subversion.apache.org
> > Cc: commits@subversion.apache.org
> > Subject: allowing multiple conflicts in storage (Re: svn commit: r928806 -
> > /subversion/trunk/notes/wc-ng/conflict-storage)
> > 
> > On Mon, Mar 29, 2010 at 03:37:51PM -0000, rhuijben@apache.org wrote:
> > > Author: rhuijben
> > > Date: Mon Mar 29 15:37:50 2010
> > > New Revision: 928806
> > >
> > > URL: http://svn.apache.org/viewvc?rev=928806&view=rev
> > > Log:
> > > * notes/wc-ng/conflict-storage
> > >   Update schema to have some global information that applies to all
> > >   conflict data. Remove this data from the individual locations as it
> > >   can't differ between these locations anyway (or the node would have
> > >   been skipped). Finally add unversioned obstructions as a conflict
> > >   class.
> > 
> > > -Text conflicts
> > > ---------------
> > > +Common conflict data
> > > +--------------------
> > >
> > > -Text conflicts only exist on files. The following information
> > > -is stored if there is a text conflict on the node:
> > > +Some information is shared for all conflict data that applies to a
> node. E.g.
> > > +when a node has a combination of text and property conflicts these were
> > > +always caused by the same operation. (Any later operation will skip the
> > node
> > > +unless the conflicts are resolved)
> > >
> > 
> > I'm still not sure that I agree with this.
> > 
> > I don't think it's smart to restrict the conflict storage layer
> > to accept only a single set of conflicts for a node, all of which
> > belong together to the point where you cannot do anything to the
> > node unless you have resolved *all* conflicts.
> > 
> > I'd like the storage layer to allow for resolving each type of conflicts
> > separately from the other types. Higher-level operations can then decide
> > whether it is safe to do something to the partially-conflicted node.
> > E.g. if a file has only property conflicts, why should a merge skip
> > merging changes into the file itself? If a directory is tree conflicted,
> > why should I not be allowed to merge some content into a file below
> > the tree-conflicted directory if doing so helps me to resolve the tree
> > conflict?
> 
> I intended to allow resolving only some parts of the conflict; but we can't
> allow operations that can introduce conflicts on their own.

Sure.

> You can't apply a merge/update/switch/patch on a node that has a
> text/property/obstruction/patch conflict, because a can cause a conflict
> with the conflict state (How would you describe that?). You can't trust the
> in-wc state if you have a conflict.
> 
> In 1.6 the way to resolve a text conflict is to edit the in-wc text with all
> the information provided and then mark it resolved.

Yes, and it's a pain that I can't just do:

	svn merge ^/blah/foo.c@2 ^/blah/foo.c@3 bar.c

as part of my local edits.

And (speaking from a user perspective who doesn't know how svn works internally
for a minute) why doesn't svn let me do this merge if bar.c is a tree
conflict victim, but has not text conflicts recorded on it?

I know that it may be impossible for us to figure out what the merge
really does before we carry out the merge. We might not know if the
merge is safe. But why design the storage layer around this assumption?
It's easy enough to enforce the same restriction at a higher layer.

> The same applies to properties: You read the '.prej' file and decide what
> the new property value(s) must be and you mark all resolved. (Or just one
> with the new storage).

Why force users to resolve property conflicts before they can e.g. merge
additional content into a file using 'svn merge'?
Why not allow application of a patch which carries a diff that would not
touch a conflicted property before the user has dealt with property
conflicts?

> But that doesn't make it possible to introduce new conflicts on top of
> existing conflicts; that is +- impossible. 

I don't think you understand what I want to do.
Adding new conflicts on top of existing conflicts is not what I was
suggesting to support. I'm suggesting proper layering to keep doors
open for future enhancements. I want the storage layer to care about
storage, not about policy.

Stefan

RE: allowing multiple conflicts in storage (Re: svn commit: r928806 - /subversion/trunk/notes/wc-ng/conflict-storage)

Posted by Bert Huijben <be...@qqmail.nl>.

> -----Original Message-----
> From: Stefan Sperling [mailto:stsp@elego.de]
> Sent: dinsdag 30 maart 2010 16:30
> To: dev@subversion.apache.org
> Cc: commits@subversion.apache.org
> Subject: allowing multiple conflicts in storage (Re: svn commit: r928806 -
> /subversion/trunk/notes/wc-ng/conflict-storage)
> 
> On Mon, Mar 29, 2010 at 03:37:51PM -0000, rhuijben@apache.org wrote:
> > Author: rhuijben
> > Date: Mon Mar 29 15:37:50 2010
> > New Revision: 928806
> >
> > URL: http://svn.apache.org/viewvc?rev=928806&view=rev
> > Log:
> > * notes/wc-ng/conflict-storage
> >   Update schema to have some global information that applies to all
> >   conflict data. Remove this data from the individual locations as it
> >   can't differ between these locations anyway (or the node would have
> >   been skipped). Finally add unversioned obstructions as a conflict
> >   class.
> 
> > -Text conflicts
> > ---------------
> > +Common conflict data
> > +--------------------
> >
> > -Text conflicts only exist on files. The following information
> > -is stored if there is a text conflict on the node:
> > +Some information is shared for all conflict data that applies to a
node. E.g.
> > +when a node has a combination of text and property conflicts these were
> > +always caused by the same operation. (Any later operation will skip the
> node
> > +unless the conflicts are resolved)
> >
> 
> I'm still not sure that I agree with this.
> 
> I don't think it's smart to restrict the conflict storage layer
> to accept only a single set of conflicts for a node, all of which
> belong together to the point where you cannot do anything to the
> node unless you have resolved *all* conflicts.
> 
> I'd like the storage layer to allow for resolving each type of conflicts
> separately from the other types. Higher-level operations can then decide
> whether it is safe to do something to the partially-conflicted node.
> E.g. if a file has only property conflicts, why should a merge skip
> merging changes into the file itself? If a directory is tree conflicted,
> why should I not be allowed to merge some content into a file below
> the tree-conflicted directory if doing so helps me to resolve the tree
> conflict?

I intended to allow resolving only some parts of the conflict; but we can't
allow operations that can introduce conflicts on their own.

You can't apply a merge/update/switch/patch on a node that has a
text/property/obstruction/patch conflict, because a can cause a conflict
with the conflict state (How would you describe that?). You can't trust the
in-wc state if you have a conflict.

In 1.6 the way to resolve a text conflict is to edit the in-wc text with all
the information provided and then mark it resolved.

The same applies to properties: You read the '.prej' file and decide what
the new property value(s) must be and you mark all resolved. (Or just one
with the new storage).

But that doesn't make it possible to introduce new conflicts on top of
existing conflicts; that is +- impossible. 

Just looking at text conflicts: merging will fail because you see conflict
markers.
Same applies to property merges: You can't merge a single property change
into the WC. (Would you use left, right or mine?)

Patch conflicts... see text conflicts.

Tree conflicts: Do you apply the change on BASE_NODE, or on WORKING_NODE
when the node is in a tree conflict?

Obstructions: You can't do anything to the in-wc data

What do we have left?

This is exactly why you need interactive conflict resolution while merging:
That is the only way you can continue a compound operation.

	Bert

RE: allowing multiple conflicts in storage (Re: svn commit: r928806 - /subversion/trunk/notes/wc-ng/conflict-storage)

Posted by Bert Huijben <be...@qqmail.nl>.

> -----Original Message-----
> From: Stefan Sperling [mailto:stsp@elego.de]
> Sent: dinsdag 30 maart 2010 16:30
> To: dev@subversion.apache.org
> Cc: commits@subversion.apache.org
> Subject: allowing multiple conflicts in storage (Re: svn commit: r928806 -
> /subversion/trunk/notes/wc-ng/conflict-storage)
> 
> On Mon, Mar 29, 2010 at 03:37:51PM -0000, rhuijben@apache.org wrote:
> > Author: rhuijben
> > Date: Mon Mar 29 15:37:50 2010
> > New Revision: 928806
> >
> > URL: http://svn.apache.org/viewvc?rev=928806&view=rev
> > Log:
> > * notes/wc-ng/conflict-storage
> >   Update schema to have some global information that applies to all
> >   conflict data. Remove this data from the individual locations as it
> >   can't differ between these locations anyway (or the node would have
> >   been skipped). Finally add unversioned obstructions as a conflict
> >   class.
> 
> > -Text conflicts
> > ---------------
> > +Common conflict data
> > +--------------------
> >
> > -Text conflicts only exist on files. The following information
> > -is stored if there is a text conflict on the node:
> > +Some information is shared for all conflict data that applies to a
node. E.g.
> > +when a node has a combination of text and property conflicts these were
> > +always caused by the same operation. (Any later operation will skip the
> node
> > +unless the conflicts are resolved)
> >
> 
> I'm still not sure that I agree with this.
> 
> I don't think it's smart to restrict the conflict storage layer
> to accept only a single set of conflicts for a node, all of which
> belong together to the point where you cannot do anything to the
> node unless you have resolved *all* conflicts.
> 
> I'd like the storage layer to allow for resolving each type of conflicts
> separately from the other types. Higher-level operations can then decide
> whether it is safe to do something to the partially-conflicted node.
> E.g. if a file has only property conflicts, why should a merge skip
> merging changes into the file itself? If a directory is tree conflicted,
> why should I not be allowed to merge some content into a file below
> the tree-conflicted directory if doing so helps me to resolve the tree
> conflict?

I intended to allow resolving only some parts of the conflict; but we can't
allow operations that can introduce conflicts on their own.

You can't apply a merge/update/switch/patch on a node that has a
text/property/obstruction/patch conflict, because a can cause a conflict
with the conflict state (How would you describe that?). You can't trust the
in-wc state if you have a conflict.

In 1.6 the way to resolve a text conflict is to edit the in-wc text with all
the information provided and then mark it resolved.

The same applies to properties: You read the '.prej' file and decide what
the new property value(s) must be and you mark all resolved. (Or just one
with the new storage).

But that doesn't make it possible to introduce new conflicts on top of
existing conflicts; that is +- impossible. 

Just looking at text conflicts: merging will fail because you see conflict
markers.
Same applies to property merges: You can't merge a single property change
into the WC. (Would you use left, right or mine?)

Patch conflicts... see text conflicts.

Tree conflicts: Do you apply the change on BASE_NODE, or on WORKING_NODE
when the node is in a tree conflict?

Obstructions: You can't do anything to the in-wc data

What do we have left?

This is exactly why you need interactive conflict resolution while merging:
That is the only way you can continue a compound operation.

	Bert