You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@subversion.apache.org by "Hyrum K. Wright" <hy...@mail.utexas.edu> on 2010/10/26 21:48:31 UTC

wc-ng: Storing conflict information on the victim: trials and pitfalls

This mail is part 2 in a series regarding the conflict information
storage (see [1] for part 1).

The current Plan (as denoted in both wc-metadata.sql and the
conflict-storage notes) is to store conflict information in the ACTUAL
node.  This works fine (and is currently being done on trunk), but
seems to break an assumed invariant that every node will also have an
entry in NODES.  The docs in wc-metadata.sql state that "A NODES row
must exist if this node exists, but an ACTUAL_NODE row can exist on
its own if it is just recording info on a non-present node - a tree
conflict or a changelist, for example."  So in theory we allow ACTAUL
nodes with conflict info without a NODE entry, but not in practice.

One of the places this bites us is when retrieving children.  Child
nodes which are victims of a conflict should be returned as part of
the child list APIs (which ultimately call
libsvn_wc/wc_db.c:gather_children()).  Those APIs currently only look
in NODES to find children, without tacking on the conflict victims
embedded in ACTUAL.tree_conflict_data.  In crafting a patch to do
this, I run into various places in wc_db where a NODE is assumed if an
ACTUAL node exists.

For instance, I'd like to rewrite
libsvn_client/commit_util.c:bail_on_tree_conflicted_children() so that
it iterates over the children of the directory in question, rather
than use the (soon-to-disappear) "give me all the tree conflict
children on this node" API.  However, if conflict victims aren't
returned as part of gather_children(), that API has no way of knowing
about those children who exist only as conflict victims.  I believe
there are other reasons to include conflicted children in the child
lists as well.

In the end, I'm not really sure what the best solution is, or how to
get there.  I've thought about the possibility of having a NODE with
op_depth=-1 for nodes who only have an ACTUAL node, if only to satisfy
the "ACTUAL must have NODE" invariant.  Any other comments or thought
are welcome.

-Hyrum

[1] http://svn.haxx.se/dev/archive-2010-10/0392.shtml

RE: wc-ng: Storing conflict information on the victim: trials and pitfalls

Posted by Bert Huijben <be...@qqmail.nl>.

> -----Original Message-----
> From: hyrum@hyrumwright.org [mailto:hyrum@hyrumwright.org] On Behalf Of
> Hyrum K. Wright
> Sent: donderdag 28 oktober 2010 6:22
> To: Bert Huijben
> Cc: Subversion Development
> Subject: Re: wc-ng: Storing conflict information on the victim: trials
> and pitfalls
> 
> On Wed, Oct 27, 2010 at 4:06 AM, Bert Huijben <be...@qqmail.nl> wrote:
> >
> >
> >> -----Original Message-----
> >> From: hyrum@hyrumwright.org [mailto:hyrum@hyrumwright.org] On Behalf
> Of
> >> Hyrum K. Wright
> >> Sent: dinsdag 26 oktober 2010 23:49
> >> To: Subversion Development
> >> Subject: wc-ng: Storing conflict information on the victim: trials
> and
> >> pitfalls
> >>
> >> This mail is part 2 in a series regarding the conflict information
> >> storage (see [1] for part 1).
> >>
> >> The current Plan (as denoted in both wc-metadata.sql and the
> >> conflict-storage notes) is to store conflict information in the
> ACTUAL
> >> node.  This works fine (and is currently being done on trunk), but
> >> seems to break an assumed invariant that every node will also have
> an
> >> entry in NODES.  The docs in wc-metadata.sql state that "A NODES row
> >> must exist if this node exists, but an ACTUAL_NODE row can exist on
> >> its own if it is just recording info on a non-present node - a tree
> >> conflict or a changelist, for example."  So in theory we allow
> ACTAUL
> >> nodes with conflict info without a NODE entry, but not in practice.
> >>
> >> One of the places this bites us is when retrieving children.  Child
> >> nodes which are victims of a conflict should be returned as part of
> >> the child list APIs (which ultimately call
> >> libsvn_wc/wc_db.c:gather_children()).  Those APIs currently only
> look
> >> in NODES to find children, without tacking on the conflict victims
> >> embedded in ACTUAL.tree_conflict_data.  In crafting a patch to do
> >> this, I run into various places in wc_db where a NODE is assumed if
> an
> >> ACTUAL node exists.
> >>
> >> For instance, I'd like to rewrite
> >> libsvn_client/commit_util.c:bail_on_tree_conflicted_children() so
> that
> >> it iterates over the children of the directory in question, rather
> >> than use the (soon-to-disappear) "give me all the tree conflict
> >> children on this node" API.  However, if conflict victims aren't
> >> returned as part of gather_children(), that API has no way of
> knowing
> >> about those children who exist only as conflict victims.  I believe
> >> there are other reasons to include conflicted children in the child
> >> lists as well.
> >
> > In the current code we have a wc_db api to retrieve 'all conflicted
> > children' of a node, which can be used for this specific use case.
> 
> This API only exists because it is an artifact of the implementation.
> And the implementation is only an artifact of limitations in wc-1.
> Currently, all tree conflict information is stored on the parent,
> since in some scenarios in wc-1, there would be no entries file in
> which to store conflict information.  This model goes against what I
> feel is one of the great benefits of wc-ng, and is something I'm
> working to change (as part of the updated conflict store).
> 
> All that being said, I don't think that the current "get all
> conflicted children" API is a very good one to continue supporting,
> since the same information could conceivably be achieved in other
> ways.  It's also worth asking "why do we need this API to begin with?"
>  Though I know the immediate answer, I wonder if there is a
> higher-level issue which could be dealt with to remove this
> dependency.
> 
> > It's also completely impossible to create a svn_wc_entry_t for a node
> in
> > this state, so making it a 'proper node' will require some work in
> more than
> > a few places.
> 
> Then just don't.  For backward compat, pretend such a node doesn't
> exist, and continue pretending the conflict information is stored on
> the parent.

A svn_wc__db_status_not_present node *does* exist (in a different revision),
so if you want to see the difference (even if it is just for backwards
compatibility) you need a new status for an unversioned node that is
recorded in wc.db.

When you perform a merge or patch that introduces a delete-delete tree
conflict, you don't know if the node ever existed (or if it maybe *does*
exist in a specific revision) so you can't record a
svn_wc__db_status_not_present status for the node.

	Bert
> 
> >> In the end, I'm not really sure what the best solution is, or how to
> >> get there.  I've thought about the possibility of having a NODE with
> >> op_depth=-1 for nodes who only have an ACTUAL node, if only to
> satisfy
> >> the "ACTUAL must have NODE" invariant.  Any other comments or
> thought
> >> are welcome.
> >
> > I would suggest dropping the 'ACTUAL must have NODE' invariant and
> keeping
> > the rest of the wc_db api as close to the current state as possible.
> >
> > Making delete-delete tree conflicts proper nodes, will also make
> simple
> > users of svn_wc__db_read_info(), as svn_wc_read_kind() return some
> node kind
> > for this node. (How would we handle that?)
> >
> > I think svn_wc__db_read_info() should just return
> SVN_ERR_WC_PATH_NOT_FOUND
> > on these nodes.
> > (Note that we can fix the delete-delete tree conflicts caused by
> switching
> > and updating, by just keeping a not-present NODE. But this doesn't
> work for
> > delete-delete tree conflicts introduced by merging, as we can only
> introduce
> > these not-present markers in BASE)
> 
> Dunno; need to think about these a bit more.
> 
> -Hyrum

Re: wc-ng: Storing conflict information on the victim: trials and pitfalls

Posted by "Hyrum K. Wright" <hy...@mail.utexas.edu>.
On Wed, Oct 27, 2010 at 4:06 AM, Bert Huijben <be...@qqmail.nl> wrote:
>
>
>> -----Original Message-----
>> From: hyrum@hyrumwright.org [mailto:hyrum@hyrumwright.org] On Behalf Of
>> Hyrum K. Wright
>> Sent: dinsdag 26 oktober 2010 23:49
>> To: Subversion Development
>> Subject: wc-ng: Storing conflict information on the victim: trials and
>> pitfalls
>>
>> This mail is part 2 in a series regarding the conflict information
>> storage (see [1] for part 1).
>>
>> The current Plan (as denoted in both wc-metadata.sql and the
>> conflict-storage notes) is to store conflict information in the ACTUAL
>> node.  This works fine (and is currently being done on trunk), but
>> seems to break an assumed invariant that every node will also have an
>> entry in NODES.  The docs in wc-metadata.sql state that "A NODES row
>> must exist if this node exists, but an ACTUAL_NODE row can exist on
>> its own if it is just recording info on a non-present node - a tree
>> conflict or a changelist, for example."  So in theory we allow ACTAUL
>> nodes with conflict info without a NODE entry, but not in practice.
>>
>> One of the places this bites us is when retrieving children.  Child
>> nodes which are victims of a conflict should be returned as part of
>> the child list APIs (which ultimately call
>> libsvn_wc/wc_db.c:gather_children()).  Those APIs currently only look
>> in NODES to find children, without tacking on the conflict victims
>> embedded in ACTUAL.tree_conflict_data.  In crafting a patch to do
>> this, I run into various places in wc_db where a NODE is assumed if an
>> ACTUAL node exists.
>>
>> For instance, I'd like to rewrite
>> libsvn_client/commit_util.c:bail_on_tree_conflicted_children() so that
>> it iterates over the children of the directory in question, rather
>> than use the (soon-to-disappear) "give me all the tree conflict
>> children on this node" API.  However, if conflict victims aren't
>> returned as part of gather_children(), that API has no way of knowing
>> about those children who exist only as conflict victims.  I believe
>> there are other reasons to include conflicted children in the child
>> lists as well.
>
> In the current code we have a wc_db api to retrieve 'all conflicted
> children' of a node, which can be used for this specific use case.

This API only exists because it is an artifact of the implementation.
And the implementation is only an artifact of limitations in wc-1.
Currently, all tree conflict information is stored on the parent,
since in some scenarios in wc-1, there would be no entries file in
which to store conflict information.  This model goes against what I
feel is one of the great benefits of wc-ng, and is something I'm
working to change (as part of the updated conflict store).

All that being said, I don't think that the current "get all
conflicted children" API is a very good one to continue supporting,
since the same information could conceivably be achieved in other
ways.  It's also worth asking "why do we need this API to begin with?"
 Though I know the immediate answer, I wonder if there is a
higher-level issue which could be dealt with to remove this
dependency.

> It's also completely impossible to create a svn_wc_entry_t for a node in
> this state, so making it a 'proper node' will require some work in more than
> a few places.

Then just don't.  For backward compat, pretend such a node doesn't
exist, and continue pretending the conflict information is stored on
the parent.

>> In the end, I'm not really sure what the best solution is, or how to
>> get there.  I've thought about the possibility of having a NODE with
>> op_depth=-1 for nodes who only have an ACTUAL node, if only to satisfy
>> the "ACTUAL must have NODE" invariant.  Any other comments or thought
>> are welcome.
>
> I would suggest dropping the 'ACTUAL must have NODE' invariant and keeping
> the rest of the wc_db api as close to the current state as possible.
>
> Making delete-delete tree conflicts proper nodes, will also make simple
> users of svn_wc__db_read_info(), as svn_wc_read_kind() return some node kind
> for this node. (How would we handle that?)
>
> I think svn_wc__db_read_info() should just return SVN_ERR_WC_PATH_NOT_FOUND
> on these nodes.
> (Note that we can fix the delete-delete tree conflicts caused by switching
> and updating, by just keeping a not-present NODE. But this doesn't work for
> delete-delete tree conflicts introduced by merging, as we can only introduce
> these not-present markers in BASE)

Dunno; need to think about these a bit more.

-Hyrum

RE: wc-ng: Storing conflict information on the victim: trials and pitfalls

Posted by Bert Huijben <be...@qqmail.nl>.

> -----Original Message-----
> From: hyrum@hyrumwright.org [mailto:hyrum@hyrumwright.org] On Behalf Of
> Hyrum K. Wright
> Sent: dinsdag 26 oktober 2010 23:49
> To: Subversion Development
> Subject: wc-ng: Storing conflict information on the victim: trials and
> pitfalls
> 
> This mail is part 2 in a series regarding the conflict information
> storage (see [1] for part 1).
> 
> The current Plan (as denoted in both wc-metadata.sql and the
> conflict-storage notes) is to store conflict information in the ACTUAL
> node.  This works fine (and is currently being done on trunk), but
> seems to break an assumed invariant that every node will also have an
> entry in NODES.  The docs in wc-metadata.sql state that "A NODES row
> must exist if this node exists, but an ACTUAL_NODE row can exist on
> its own if it is just recording info on a non-present node - a tree
> conflict or a changelist, for example."  So in theory we allow ACTAUL
> nodes with conflict info without a NODE entry, but not in practice.
> 
> One of the places this bites us is when retrieving children.  Child
> nodes which are victims of a conflict should be returned as part of
> the child list APIs (which ultimately call
> libsvn_wc/wc_db.c:gather_children()).  Those APIs currently only look
> in NODES to find children, without tacking on the conflict victims
> embedded in ACTUAL.tree_conflict_data.  In crafting a patch to do
> this, I run into various places in wc_db where a NODE is assumed if an
> ACTUAL node exists.
> 
> For instance, I'd like to rewrite
> libsvn_client/commit_util.c:bail_on_tree_conflicted_children() so that
> it iterates over the children of the directory in question, rather
> than use the (soon-to-disappear) "give me all the tree conflict
> children on this node" API.  However, if conflict victims aren't
> returned as part of gather_children(), that API has no way of knowing
> about those children who exist only as conflict victims.  I believe
> there are other reasons to include conflicted children in the child
> lists as well.

In the current code we have a wc_db api to retrieve 'all conflicted
children' of a node, which can be used for this specific use case.

It's also completely impossible to create a svn_wc_entry_t for a node in
this state, so making it a 'proper node' will require some work in more than
a few places.

> In the end, I'm not really sure what the best solution is, or how to
> get there.  I've thought about the possibility of having a NODE with
> op_depth=-1 for nodes who only have an ACTUAL node, if only to satisfy
> the "ACTUAL must have NODE" invariant.  Any other comments or thought
> are welcome.

I would suggest dropping the 'ACTUAL must have NODE' invariant and keeping
the rest of the wc_db api as close to the current state as possible. 

Making delete-delete tree conflicts proper nodes, will also make simple
users of svn_wc__db_read_info(), as svn_wc_read_kind() return some node kind
for this node. (How would we handle that?)

I think svn_wc__db_read_info() should just return SVN_ERR_WC_PATH_NOT_FOUND
on these nodes.
(Note that we can fix the delete-delete tree conflicts caused by switching
and updating, by just keeping a not-present NODE. But this doesn't work for
delete-delete tree conflicts introduced by merging, as we can only introduce
these not-present markers in BASE)

	Bert