Posted to dev@subversion.apache.org by Greg Stein <gs...@gmail.com> on 2010/04/07 20:15:01 UTC

fourth tree: "INHERITED" (was: wc-ng base/working nodes in a copied tree)

On Wed, Apr 7, 2010 at 04:52, Philip Martin <ph...@wandisco.com> wrote:
> Greg Stein <gs...@gmail.com> writes:
>
>> I believe that we have the following operations:
>>
>> add: plain old add
>> add-within-copy/move: add within a subtree
>
> Those two are much the same.  It makes little difference whether it's
> a plain add, an add within an add or an add within a copy/move.

Well... we couldn't distinguish these two cases before. With a few
additional presence values, we can.

>> replace: delete-base + add
>> replace-within-copy/move: delete-child + add
>
> There is also
>
> replace: base node not-present + add
>
> the old deleted=true state.  This is explicitly excluded from being a
> replace in svn_wc__internal_is_replaced but that may be a bug (it
> causes revert to erroneously remove the not-present base node).

It is excluded from being svn_wc_schedule_replace. I'd prefer to
ignore that fact.


>
>> delete-base: deleting a base node
>> delete-child: deleting a child of an add/copy/move
>> copy(-within): same as add*
>> moved-here(-within): same as add*
>> moved-away(-within): same as delete*
>>
>> I'm thinking that we add the following presence values (for
>> WORKING_NODE.presence):
>>
>> "added": same as "normal" today. clarifies what this really means. we
>> map this to status_added, so why not use this for the presence
>> anyways? no need to "minimize/conserve" the set of presence values.
>>
>> "replaced": indicates an added/copied-here/moved-here node that
>> replaces a child of a copied-here/moved-here subtree.
>>
>> "deleted": same as "not-present" today. clarifies what this really means.
>>
>> "inherit": applied to children of copied-here/moved-here and
>> deleted/base-deleted nodes. implies that no commit operations are
>> required for these nodes.
>
> Being able to distinguish add and replace is not enough for full 1.6
> compatibility.  When a node replaces a copied child it overwrites the
> child's data, things like checksum and properties.  This data is not
> derived or inherited from the copied parent, so it cannot be restored
> after being overwritten.  In 1.6 it is possible to revert the replace
> and restore to the copied child.

Urgh. Yeah. I've only ever looked at that scenario as "no partial
reverts (yet). revert the whole thing". In that case, meaning the
whole copy. But that's not right, if the user is reverting
SOME/SUBDIR/PATH. Only the change(s) to the child should be reverted.

> Another problem is a copy of a mixed revision tree that includes base
> nodes that are not-present.  In 1.6 we represent these as "fake"
> schedule deletes in the copy, so that they are explicitly deleted when
> the copy is committed.  This works but has problems, the main one
> being that if one tries to revert the delete the full node information
> is not available (because the not-present source doesn't have it).
> Perhaps we should have a distinct presence for this type of node?
> There are similar questions about absent and excluded nodes.

Okay. In this case, reverting the delete should result in an excluded
node. That's the best we can do. "svn update --depth=infinity
NOTPRESENT" can restore the file.

Note that we *could* revert a replaced-child into an excluded node,
too. That would be a feature loss compared to 1.6, however.


It seems like we need one more tree, to hold "inherit" data. If you do
a further operation on an operation-root (something in WORKING_NODE),
then it will alter that node and all inherited nodes. But if you
perform an operation on an inherited node, then you're establishing a
new operation-root in WORKING_NODE. The inherited data can still
remain elsewhere, but WORKING_NODE now refers to a new operation-root.
Reverting that operation would bring back the inherited node.

Thus, WORKING_NODE would become *just* explicit operations. ie. the
stuff we send during a commit.

Hmm. Almost...

A delete operation could mark the root in WORKING_NODE (we'd probably
want a new name for this!), and then drop a bunch of markers in
WORKING_NODE to occlude any children present in the inherited tree and
the BASE tree. Those markers couldn't be individually reverted
however, and they don't represent data to send during commit. We
*could* alter a presence-like flag in the inherited tree instead, yet
leave all the data intact for a revert. Or place markers in the
inherited tree to occlude the BASE nodes.

Thoughts?

Cheers,
-g

Re: fourth tree: "INHERITED"

Posted by Greg Stein <gs...@gmail.com>.
On Tue, Apr 13, 2010 at 11:57, Philip Martin <ph...@wandisco.com> wrote:
> Greg Stein <gs...@gmail.com> writes:
>...
>> As you note later, operations on a node cannot be layered/stacked. You
>> can modify the operation at a node, but you can't layer over it.
>> Columns like copyfrom_* and moved_* are about the operation, rather
>> than the node's data. It says *how* the node got there, rather than
>> talk about the node itself.
>
> I'm confused by the copy of a mixed revision working copy, say a
> directory at revision N and a file at revision N+1.  We need a working
> node for the directory, to store N, and a working node for the file,
> to store N+1.

Yes.

>  Now delete the copied file and replace with something
> else, that overwrites the N+1 in the working node.  Now reverting the
> replace cannot restore the original copy since the N+1 is gone, so
> what does the copied file look like, revision N?

That is the only option, and it would be presence=excluded since we
don't have revision N of the thing. (or maybe presence=absent)

Oh hell... we don't even know if the file exists at N.

Consider:

$ svn cp D T  # at rev N
$ svn up D  # D/f now appears
$ svn cp D/f T/f  # at rev N+1

vs

$ svn cp D T
$ svn rm T/f
$ svn up D
$ svn cp D/f T/f

When using *two* copy operations, we have enough information in
WORKING_NODE to restore file@N (or omit it because it wasn't there).

But with a copy of a mixed-rev subtree, the *source* doesn't have
enough information for us to perform the kinds of reverts above.

I'll also note that we can run into similar problems with a switched
source. The repos_relpath changes, rather than the revision; thus, the
WORKING_NODE would need to generate some additional copyfrom roots to
denote the changes.

>  I suppose that is
> acceptable, the original copy, prior to the replace, was not a simple
> copy.  We could argue that the copy of a mixed revision working copy
> is already a bunch of roots.  Suppose I delete the copied file at N+1
> and then revert.  Does it go back to N+1 or N?

Reverting a replace or reverting a delete are the same thing, in this
scenario. You're reverting an operation, but the ancestor operation
doesn't have enough information to reconstruct a node.


Okay. I think that I have an idea.

I originally suggested the following columns for NODE_DATA:

  kind, [checksum], changed_*, properties, [symlink_target]

The repos_id, repos_relpath, and revnum columns from BASE_NODE were
omitted because WORKING nodes do not have a location in the repository
(yet). However, what if we put those in? For nodes corresponding to
the BASE_NODE table (op_depth==0), these are the *true* repository
locations. For op_depth>0, these are copied/moved-here source
locations.

This would enable us to represent a mixed-rev and switched copy/move
source, and a single operation root in the WORKING_NODE table.

In the above scenarios, if you replace a child, then it creates a new
operation root (which is later revertable), and it establishes a new
layer in NODE_DATA (preserving the original mixed/switched source).
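
A minimal sketch of that idea in Python with sqlite3, not the actual
wc-ng schema. The reduced column set, the omission of wc_id, and the
revision numbers (N = 10) are assumptions made for illustration. It
shows a mixed-rev copy of D to T keeping its per-node source locations
at op_depth 1, a replacement of T/f adding a layer at op_depth 2, and a
revert of that replacement uncovering the N+1 source again.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE NODE_DATA (
        local_relpath TEXT    NOT NULL,
        op_depth      INTEGER NOT NULL,
        repos_relpath TEXT,
        revnum        INTEGER,
        PRIMARY KEY (local_relpath, op_depth)
    )
    """
)

N = 10  # illustrative revision number, not from the thread

# Copy of mixed-rev D to T: one operation root (T), but each node keeps
# its own copied-from location and revision.
conn.executemany(
    "INSERT INTO NODE_DATA VALUES (?, ?, ?, ?)",
    [("T", 1, "D", N), ("T/f", 1, "D/f", N + 1)],
)


def source_of(relpath):
    """Return (op_depth, repos_relpath, revnum) of the topmost layer."""
    return conn.execute(
        "SELECT op_depth, repos_relpath, revnum FROM NODE_DATA "
        "WHERE local_relpath = ? ORDER BY op_depth DESC LIMIT 1",
        (relpath,),
    ).fetchone()


# Replacing T/f with a plain add creates a new layer; the plain add has
# no repository source, so those columns are NULL here.
conn.execute("INSERT INTO NODE_DATA VALUES (?, ?, NULL, NULL)", ("T/f", 2))
replaced = source_of("T/f")

# Reverting the replace removes only the op_depth 2 row.
conn.execute(
    "DELETE FROM NODE_DATA WHERE local_relpath = ? AND op_depth = ?",
    ("T/f", 2),
)
restored = source_of("T/f")

print(replaced)
print(restored)
```

The point of the sketch is only that the op_depth 1 row for T/f is never
overwritten, so the N+1 source survives the replace.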

>...
> So if I copy a directory with children is there a working node for
> each child?

I would say "yes", so that we have a place to store translated_size
and last_mod_time. Not entirely sure.

I think we need every child for deleted nodes. We can't alter the
presence in NODE_DATA to record a delete, since we need that original
info to revert the delete (e.g. NODE_DATA.presence may indicate
excluded/absent/not-present nodes).
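
A rough sketch of that reasoning in Python, using plain dicts in place
of the tables. The dict names, the helper names, and the presence
strings are assumptions for illustration. The delete is recorded as one
marker per node on the WORKING side, while the presence stored with the
node data is left alone, so reverting the delete brings back an
"excluded" child as excluded.

```python
# Presence stored with the node data for a copy of A to D, where A/B/C
# was excluded in the source (hypothetical values).
node_data_presence = {
    "D": "normal",
    "D/B": "normal",
    "D/B/C": "excluded",
}

# Delete markers on the WORKING side: one entry per deleted node.
working_presence = {}


def in_subtree(relpath, root):
    """True if relpath is root itself or lies below it."""
    return relpath == root or relpath.startswith(root + "/")


def delete_subtree(root):
    """Record a delete for root and every known child, without touching node data."""
    for relpath in node_data_presence:
        if in_subtree(relpath, root):
            working_presence[relpath] = "deleted"


def revert_delete(root):
    """Drop the delete markers; the original presence values become visible again."""
    for relpath in list(working_presence):
        if in_subtree(relpath, root):
            del working_presence[relpath]


def visible_presence(relpath):
    """A WORKING marker wins; otherwise fall back to the stored presence."""
    return working_presence.get(relpath, node_data_presence.get(relpath))


delete_subtree("D/B")
print(visible_presence("D/B/C"))
revert_delete("D/B")
print(visible_presence("D/B/C"))
```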

Your thoughts on this piece would be welcome. I'm pretty sure we need
them, but am not (yet) fully confident. Some more brain-percolation
time is needed...

Cheers,
-g

Re: fourth tree: "INHERITED"

Posted by Philip Martin <ph...@wandisco.com>.
Greg Stein <gs...@gmail.com> writes:

> On Tue, Apr 13, 2010 at 06:45, Philip Martin <ph...@wandisco.com> wrote:
>> Greg Stein <gs...@gmail.com> writes:
>>
>> I think NODE_DATA needs more or less everything that is in the current
>> WORKING_NODE.  When a layer is reverted to uncover the layer below all
>> the old columns need to be available.  As far as I can see we need to
>> remove the WORKING_NODE tree and replace it with the NODE_DATA tree,
>> or to put it another way we need to add the op_depth column to
>> WORKING_NODE.
>
> I don't think so.
>
> As you note later, operations on a node cannot be layered/stacked. You
> can modify the operation at a node, but you can't layer over it.
> Columns like copyfrom_* and moved_* are about the operation, rather
> than the node's data. It says *how* the node got there, rather than
> talk about the node itself.

I'm confused by the copy of a mixed revision working copy, say a
directory at revision N and a file at revision N+1.  We need a working
node for the directory, to store N, and a working node for the file,
to store N+1.  Now delete the copied file and replace with something
else, that overwrites the N+1 in the working node.  Now reverting the
replace cannot restore the original copy since the N+1 is gone, so
what does the copied file look like, revision N?  I suppose that is
acceptable, the original copy, prior to the replace, was not a simple
copy.  We could argue that the copy of a mixed revision working copy
is already a bunch of roots.  Suppose I delete the copied file at N+1
and then revert.  Does it go back to N+1 or N?

> The BASE_NODE table is "what", not "how", so it more closely resembles
> the suggested NODE_DATA table.
>
>>> Those columns, plus the key, may be about it. I don't know that this
>>> table needs a presence column, as the "visible" state is determined by
>>> the BASE and WORKING trees. This is why I suggest that maybe we're
>>> looking more at how to represent (in the database) the WORKING tree,
>>> than truly adding a new "tree".
>>
>> One thing that occurs to me is that this layering always occurs on
>> deleted children of copied parents, it never occurs on roots of
>> operations (be they adds, deletes, copies or moves).
>
> I can copy/move subtrees into another copied-subtree without
> replacement. But you're right: all the resulting nodes are disjoint.
> No true layering occurs.
>
>>  Roots can never
>> lie one on top of the other.  I wonder if we should make WORKING_NODE
>> only hold roots, and have a different node type for children.  The
>> child node would not need the columns that are inherited from the
>> parent,
>
> That is how NODE_DATA is defined :-) ... the WORKING_NODE table just
> defines operations and all the data for the nodes lives over in
> NODE_DATA.

The two ideas are probably closer than I realised :)

So if I copy a directory with children is there a working node for
each child?

-- 
Philip

Re: fourth tree: "INHERITED"

Posted by Greg Stein <gs...@gmail.com>.
On Tue, Apr 13, 2010 at 06:45, Philip Martin <ph...@wandisco.com> wrote:
> Greg Stein <gs...@gmail.com> writes:
>
>> After some further discussion on IRC, and some thought...
>>
>> I think this may be more of a representational problem, and might not
>> be a "true" fourth tree. Especially because supporting the revert
>> scenario actually implies N trees. Bert tried to describe this a while
>> back, but I didn't understand his description (too many "A" nodes).
>> Consider the following:
>>
>> $ svn cp A X  # copies A/Y/Z/file
>> $ svn cp B X/Y  # copies B/Z/file
>> $ svn cp C X/Y/Z  # copies C/file
>> $ svn cp file X/Y/Z/file
>
> Just to be clear, the second, third and fourth copies need the
> destination to be deleted first.

Ah. True, yes.

>> We have four operation roots, and four layers of "file". Reverting
>> each op-root will reveal the previous layer.
>>
>> In 1.6, we probably had just one layer, but if we're going to solve
>> this, then let's do it right.
>
> The current three tree model can support the creation of all those
> copies, it's only the step-by-step revert that is a problem.  The
> current wc-ng only really allows the revert of all the copies in one
> go.

Right.

And you demonstrated where 1.6 could do one level of revert.
Therefore, we should be able to do (at least) one level in 1.7.

>> I propose that we create a new table called NODE_DATA which is keyed
>> by <wc_id, local_relpath, op_depth>. The first two are the usual, and
>> op_depth is the "operation depth". In the above example, we have four
>> WORKING_NODE rows, each establishing an operation root, with
>> local_relpath values of [X, X/Y, X/Y/Z, X/Y/Z/file]. In the NODE_DATA
>> table, we have the following four rows:
>>
>> <1, X/Y/Z/file, 1>  # from the X op-root
>> <1, X/Y/Z/file, 2>  # from the X/Y op-root
>> <1, X/Y/Z/file, 3>  # from the X/Y/Z op-root
>> <1, X/Y/Z/file, 4>  # from the X/Y/Z/file op-root
>>
>> Essentially, op_depth = oproot_relpath.count('/') + 1
>>
>> We can record BASE node data as op_depth == 0.
>>
>> Looking up the data for "file" is a query like this:
>>
>> SELECT * from NODE_DATA
>> WHERE wc_id = ?1 AND local_relpath = ?2
>> ORDER BY op_depth DESC
>> LIMIT 1;
>>
>> That provides the "current" file data.
>>
>> Some of the common columns between BASE_NODE and WORKING_NODE move to
>> this new NODE_DATA table. I think they are:
>>
>>   kind, [checksum], changed_*, properties
>
> I think NODE_DATA needs more or less everything that is in the current
> WORKING_NODE.  When a layer is reverted to uncover the layer below all
> the old columns need to be available.  As far as I can see we need to
> remove the WORKING_NODE tree and replace it with the NODE_DATA tree,
> or to put it another way we need to add the op_depth column to
> WORKING_NODE.

I don't think so.

As you note later, operations on a node cannot be layered/stacked. You
can modify the operation at a node, but you can't layer over it.
Columns like copyfrom_* and moved_* are about the operation, rather
than the node's data. It says *how* the node got there, rather than
talk about the node itself.

The BASE_NODE table is "what", not "how", so it more closely resembles
the suggested NODE_DATA table.

>> Those columns, plus the key, may be about it. I don't know that this
>> table needs a presence column, as the "visible" state is determined by
>> the BASE and WORKING trees. This is why I suggest that maybe we're
>> looking more at how to represent (in the database) the WORKING tree,
>> than truly adding a new "tree".
>
> One thing that occurs to me is that this layering always occurs on
> deleted children of copied parents, it never occurs on roots of
> operations (be they adds, deletes, copies or moves).

I can copy/move subtrees into another copied-subtree without
replacement. But you're right: all the resulting nodes are disjoint.
No true layering occurs.

>  Roots can never
> lie one on top of the other.  I wonder if we should make WORKING_NODE
> only hold roots, and have a different node type for children.  The
> child node would not need the columns that are inherited from the
> parent,

That is how NODE_DATA is defined :-) ... the WORKING_NODE table just
defines operations and all the data for the nodes lives over in
NODE_DATA.

> but it would have a column that defined how many generations
> the child is from the root.

Using op_depth allows you to find all the children for a given
operation. Using a descending sort on op_depth still allows you to use
LIMIT 1 to fetch the most-recent/current node.

>  Selecting a node's data then involves
> looking in WORKING_CHILD_NODE, WORKING_NODE and BASE_NODE.
>
> SELECT * from WORKING_CHILD_NODE
> where wc_id = ?1 AND local_relpath = ?2
> ORDER BY generation
> LIMIT 1
>
> If a WORKING_CHILD_NODE is found then the generation column allows
> easy access to the related WORKING_NODE root, if it is not found then
> look in WORKING_NODE directly for a root (and if not found there then
> look in BASE_NODE).

op_depth also provides this quick access to the root.

A/B/C/D/file, op_depth=2 means that A/B is the operation root. I think
it is a bit clearer than the generation version.
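
A small Python sketch of that mapping. The function name op_root is
hypothetical; the rule it encodes is just the one stated above, that
the operation root is the first op_depth components of the path.

```python
def op_root(local_relpath: str, op_depth: int) -> str:
    """Return the operation root for a node at the given op_depth.

    Assumes a '/'-separated relpath and op_depth >= 1 (op_depth 0 is
    BASE data and has no operation root).
    """
    parts = local_relpath.split("/")
    if op_depth < 1 or op_depth > len(parts):
        raise ValueError("op_depth out of range for this path")
    return "/".join(parts[:op_depth])


print(op_root("A/B/C/D/file", 2))
```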

Cheers,
-g

Re: fourth tree: "INHERITED"

Posted by Philip Martin <ph...@wandisco.com>.
Greg Stein <gs...@gmail.com> writes:

> After some further discussion on IRC, and some thought...
>
> I think this may be more of a representational problem, and might not
> be a "true" fourth tree. Especially because supporting the revert
> scenario actually implies N trees. Bert tried to describe this a while
> back, but I didn't understand his description (too many "A" nodes).
> Consider the following:
>
> $ svn cp A X  # copies A/Y/Z/file
> $ svn cp B X/Y  # copies B/Z/file
> $ svn cp C X/Y/Z  # copies C/file
> $ svn cp file X/Y/Z/file

Just to be clear, the second, third and fourth copies need the
destination to be deleted first.

> We have four operation roots, and four layers of "file". Reverting
> each op-root will reveal the previous layer.
>
> In 1.6, we probably had just one layer, but if we're going to solve
> this, then let's do it right.

The current three tree model can support the creation of all those
copies, it's only the step-by-step revert that is a problem.  The
current wc-ng only really allows the revert of all the copies in one
go.

> I propose that we create a new table called NODE_DATA which is keyed
> by <wc_id, local_relpath, op_depth>. The first two are the usual, and
> op_depth is the "operation depth". In the above example, we have four
> WORKING_NODE rows, each establishing an operation root, with
> local_relpath values of [X, X/Y, X/Y/Z, X/Y/Z/file]. In the NODE_DATA
> table, we have the following four rows:
>
> <1, X/Y/Z/file, 1>  # from the X op-root
> <1, X/Y/Z/file, 2>  # from the X/Y op-root
> <1, X/Y/Z/file, 3>  # from the X/Y/Z op-root
> <1, X/Y/Z/file, 4>  # from the X/Y/Z/file op-root
>
> Essentially, op_depth = oproot_relpath.count('/') + 1
>
> We can record BASE node data as op_depth == 0.
>
> Looking up the data for "file" is a query like this:
>
> SELECT * from NODE_DATA
> WHERE wc_id = ?1 AND local_relpath = ?2
> ORDER BY op_depth DESC
> LIMIT 1;
>
> That provides the "current" file data.
>
> Some of the common columns between BASE_NODE and WORKING_NODE move to
> this new NODE_DATA table. I think they are:
>
>   kind, [checksum], changed_*, properties

I think NODE_DATA needs more or less everything that is in the current
WORKING_NODE.  When a layer is reverted to uncover the layer below all
the old columns need to be available.  As far as I can see we need to
remove the WORKING_NODE tree and replace it with the NODE_DATA tree,
or to put it another way we need to add the op_depth column to
WORKING_NODE.

> Those columns, plus the key, may be about it. I don't know that this
> table needs a presence column, as the "visible" state is determined by
> the BASE and WORKING trees. This is why I suggest that maybe we're
> looking more at how to represent (in the database) the WORKING tree,
> than truly adding a new "tree".

One thing that occurs to me is that this layering always occurs on
deleted children of copied parents, it never occurs on roots of
operations (be they adds, deletes, copies or moves).  Roots can never
lie one on top of the other.  I wonder if we should make WORKING_NODE
only hold roots, and have a different node type for children.  The
child node would not need the columns that are inherited from the
parent, but it would have a column that defined how many generations
the child is from the root.  Selecting a node's data then involves
looking in WORKING_CHILD_NODE, WORKING_NODE and BASE_NODE.

SELECT * from WORKING_CHILD_NODE
where wc_id = ?1 AND local_relpath = ?2
ORDER BY generation
LIMIT 1

If a WORKING_CHILD_NODE is found then the generation column allows
easy access to the related WORKING_NODE root, if it is not found then
look in WORKING_NODE directly for a root (and if not found there then
look in BASE_NODE).
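
A sketch of that lookup order in Python with sqlite3. The table names
come from the text above; the column sets, the sample rows, and the
function name find_node are assumptions for illustration only.

```python
import sqlite3


def make_db() -> sqlite3.Connection:
    """Create the three tables with a few sample rows (all hypothetical)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE BASE_NODE (wc_id INTEGER, local_relpath TEXT, kind TEXT);
        CREATE TABLE WORKING_NODE (wc_id INTEGER, local_relpath TEXT, kind TEXT);
        CREATE TABLE WORKING_CHILD_NODE (
            wc_id INTEGER, local_relpath TEXT, generation INTEGER, kind TEXT);
        INSERT INTO BASE_NODE VALUES (1, 'A', 'dir');
        INSERT INTO WORKING_NODE VALUES (1, 'X', 'dir');
        INSERT INTO WORKING_CHILD_NODE VALUES (1, 'X/Y/Z/file', 3, 'file');
        """
    )
    return conn


def find_node(conn, wc_id, local_relpath):
    """Return (table, kind, root_relpath) for the first table holding the node."""
    row = conn.execute(
        "SELECT generation, kind FROM WORKING_CHILD_NODE "
        "WHERE wc_id = ? AND local_relpath = ? "
        "ORDER BY generation LIMIT 1",
        (wc_id, local_relpath),
    ).fetchone()
    if row is not None:
        generation, kind = row
        # The root sits `generation` path components above the child.
        root = "/".join(local_relpath.split("/")[:-generation])
        return ("WORKING_CHILD_NODE", kind, root)

    # No child row: try a root in WORKING_NODE, then fall back to BASE_NODE.
    for table in ("WORKING_NODE", "BASE_NODE"):
        row = conn.execute(
            f"SELECT kind FROM {table} WHERE wc_id = ? AND local_relpath = ?",
            (wc_id, local_relpath),
        ).fetchone()
        if row is not None:
            root = local_relpath if table == "WORKING_NODE" else None
            return (table, row[0], root)
    return None


conn = make_db()
for path in ("X/Y/Z/file", "X", "A", "missing"):
    print(path, find_node(conn, 1, path))
```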

-- 
Philip

Re: fourth tree: "INHERITED" (was: wc-ng base/working nodes in a copied tree)

Posted by Greg Stein <gs...@gmail.com>.
On Wed, Apr 7, 2010 at 17:46, Bert Huijben <be...@qqmail.nl> wrote:
>...
>> Some of the common columns between BASE_NODE and WORKING_NODE move to
>> this new NODE_DATA table. I think they are:
>>
>>   kind, [checksum], changed_*, properties

Note that I missed symlink_target above.

> I think you need some kind of 'not-present'/'excluded' status in this
> copied/inherited/fourth tree too.
>
> If you have a working copy
>
> /A
> /A/B
> /A/B/C
>
> Where you exclude C
>
> /A
> /A/B
> [/A/B/C] (excluded)
>
> Then you make a copy of /A to /D
>
> /A
> [/A/B/C] (excluded)
> /D (copy)
> /D/B (child of copy)
> [/D/B/C] (excluded child of copy)
>
> And then we delete /D/B
>
> /A
> /A/B[/C]
> /D (copy)
> /D/B (deleted)
>
> Then I want to revert the delete of /D/B.
>
> /A
> /A/B[/C]
> /D (copy)
> /D/B (child of copy)
>
> Note that there is no:
>  [/D/B/C] (excluded child of copy)
>
> The new COPIED table allows me to get all the information on B back, but I
> miss the information that it once had a child C that was excluded.

Ah. Gotcha.

Will ponder...

>...

Cheers,
-g

RE: fourth tree: "INHERITED" (was: wc-ng base/working nodes in a copied tree)

Posted by Bert Huijben <be...@qqmail.nl>.

> -----Original Message-----
> From: Greg Stein [mailto:gstein@gmail.com]
> Sent: woensdag 7 april 2010 23:25
> To: Philip Martin
> Cc: dev@subversion.apache.org
> Subject: Re: fourth tree: "INHERITED" (was: wc-ng base/working nodes in
> a copied tree)
> 
> After some further discussion on IRC, and some thought...
> 
> I think this may be more of a representational problem, and might not
> be a "true" fourth tree. Especially because supporting the revert
> scenario actually implies N trees. Bert tried to describe this a while
> back, but I didn't understand his description (too many "A" nodes).
> Consider the following:
> 
> $ svn cp A X  # copies A/Y/Z/file
> $ svn cp B X/Y  # copies B/Z/file
> $ svn cp C X/Y/Z  # copies C/file
> $ svn cp file X/Y/Z/file
> 
> We have four operation roots, and four layers of "file". Reverting
> each op-root will reveal the previous layer.
> 
> In 1.6, we probably had just one layer, but if we're going to solve
> this, then let's do it right.
> 
> I propose that we create a new table called NODE_DATA which is keyed
> by <wc_id, local_relpath, op_depth>. The first two are the usual, and
> op_depth is the "operation depth". In the above example, we have four
> WORKING_NODE rows, each establishing an operation root, with
> local_relpath values of [X, X/Y, X/Y/Z, X/Y/Z/file]. In the NODE_DATA
> table, we have the following four rows:
> 
> <1, X/Y/Z/file, 1>  # from the X op-root
> <1, X/Y/Z/file, 2>  # from the X/Y op-root
> <1, X/Y/Z/file, 3>  # from the X/Y/Z op-root
> <1, X/Y/Z/file, 4>  # from the X/Y/Z/file op-root
> 
> Essentially, op_depth = oproot_relpath.count('/') + 1
> 
> We can record BASE node data as op_depth == 0.
> 
> Looking up the data for "file" is a query like this:
> 
> SELECT * from NODE_DATA
> WHERE wc_id = ?1 AND local_relpath = ?2
> ORDER BY op_depth DESC
> LIMIT 1;
> 
> That provides the "current" file data.
> 
> Some of the common columns between BASE_NODE and WORKING_NODE move to
> this new NODE_DATA table. I think they are:
> 
>   kind, [checksum], changed_*, properties

I think you need some kind of 'not-present'/'excluded' status in this
copied/inherited/fourth tree too.

If you have a working copy

/A
/A/B
/A/B/C

Where you exclude C

/A
/A/B
[/A/B/C] (excluded)

Then you make a copy of /A to /D

/A
[/A/B/C] (excluded)
/D (copy)
/D/B (child of copy)
[/D/B/C] (excluded child of copy)

And then we delete /D/B

/A
/A/B[/C]
/D (copy)
/D/B (deleted)

Then I want to revert the delete of /D/B.

/A
/A/B[/C]
/D (copy)
/D/B (child of copy)

Note that there is no:
 [/D/B/C] (excluded child of copy)

The new COPIED table allows me to get all the information on B back, but I
miss the information that it once had a child C that was excluded.

So if I commit the tree I get a working copy that assumes that I still have
/D/B/C, because nothing recorded that I don't have that node locally.

	Bert

Re: fourth tree: "INHERITED" (was: wc-ng base/working nodes in a copied tree)

Posted by Greg Stein <gs...@gmail.com>.
After some further discussion on IRC, and some thought...

I think this may be more of a representational problem, and might not
be a "true" fourth tree. Especially because supporting the revert
scenario actually implies N trees. Bert tried to describe this a while
back, but I didn't understand his description (too many "A" nodes).
Consider the following:

$ svn cp A X  # copies A/Y/Z/file
$ svn cp B X/Y  # copies B/Z/file
$ svn cp C X/Y/Z  # copies C/file
$ svn cp file X/Y/Z/file

We have four operation roots, and four layers of "file". Reverting
each op-root will reveal the previous layer.

In 1.6, we probably had just one layer, but if we're going to solve
this, then let's do it right.

I propose that we create a new table called NODE_DATA which is keyed
by <wc_id, local_relpath, op_depth>. The first two are the usual, and
op_depth is the "operation depth". In the above example, we have four
WORKING_NODE rows, each establishing an operation root, with
local_relpath values of [X, X/Y, X/Y/Z, X/Y/Z/file]. In the NODE_DATA
table, we have the following four rows:

<1, X/Y/Z/file, 1>  # from the X op-root
<1, X/Y/Z/file, 2>  # from the X/Y op-root
<1, X/Y/Z/file, 3>  # from the X/Y/Z op-root
<1, X/Y/Z/file, 4>  # from the X/Y/Z/file op-root

Essentially, op_depth = oproot_relpath.count('/') + 1

We can record BASE node data as op_depth == 0.
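
As a quick check of that rule, here is a minimal Python sketch. The
function name op_depth_for_root is hypothetical; only the formula comes
from the text above.

```python
def op_depth_for_root(oproot_relpath: str) -> int:
    """Depth of an operation root: number of '/' separators plus one.

    Assumes a non-empty, '/'-separated relpath with no trailing slash.
    BASE data is recorded separately as op_depth == 0.
    """
    return oproot_relpath.count("/") + 1


for root in ("X", "X/Y", "X/Y/Z", "X/Y/Z/file"):
    print(root, op_depth_for_root(root))
```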

Looking up the data for "file" is a query like this:

SELECT * from NODE_DATA
WHERE wc_id = ?1 AND local_relpath = ?2
ORDER BY op_depth DESC
LIMIT 1;

That provides the "current" file data.
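
To make the layering concrete, here is a runnable sketch in Python with
sqlite3. It is an illustration under assumptions, not the real wc-ng
schema: the kind and checksum columns carry placeholder values, and the
helper names (make_db, current_layer, revert_op_root) are hypothetical.
It loads the four rows for X/Y/Z/file, runs the lookup query, and shows
that removing the top layer reveals the one below.

```python
import sqlite3


def make_db() -> sqlite3.Connection:
    """Build an in-memory NODE_DATA table holding four layers of X/Y/Z/file."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """
        CREATE TABLE NODE_DATA (
            wc_id         INTEGER NOT NULL,
            local_relpath TEXT    NOT NULL,
            op_depth      INTEGER NOT NULL,
            kind          TEXT    NOT NULL,
            checksum      TEXT,
            PRIMARY KEY (wc_id, local_relpath, op_depth)
        )
        """
    )
    conn.executemany(
        "INSERT INTO NODE_DATA VALUES (?, ?, ?, ?, ?)",
        [(1, "X/Y/Z/file", depth, "file", f"layer-{depth}") for depth in (1, 2, 3, 4)],
    )
    return conn


def current_layer(conn, wc_id, local_relpath):
    """Return (op_depth, checksum) of the topmost layer, or None if no row exists."""
    return conn.execute(
        "SELECT op_depth, checksum FROM NODE_DATA "
        "WHERE wc_id = ? AND local_relpath = ? "
        "ORDER BY op_depth DESC LIMIT 1",
        (wc_id, local_relpath),
    ).fetchone()


def revert_op_root(conn, wc_id, oproot_relpath):
    """Drop the layer created by one operation root (its row plus descendants)."""
    depth = oproot_relpath.count("/") + 1
    prefix = oproot_relpath + "/"
    conn.execute(
        "DELETE FROM NODE_DATA "
        "WHERE wc_id = ? AND op_depth = ? "
        "AND (local_relpath = ? OR substr(local_relpath, 1, ?) = ?)",
        (wc_id, depth, oproot_relpath, len(prefix), prefix),
    )


conn = make_db()
print(current_layer(conn, 1, "X/Y/Z/file"))
revert_op_root(conn, 1, "X/Y/Z/file")
print(current_layer(conn, 1, "X/Y/Z/file"))
```

The prefix comparison uses substr rather than LIKE so that relpaths
containing '%' or '_' are matched literally.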

Some of the common columns between BASE_NODE and WORKING_NODE move to
this new NODE_DATA table. I think they are:

  kind, [checksum], changed_*, properties

Those columns, plus the key, may be about it. I don't know that this
table needs a presence column, as the "visible" state is determined by
the BASE and WORKING trees. This is why I suggest that maybe we're
looking more at how to represent (in the database) the WORKING tree,
than truly adding a new "tree".

Cheers,
-g

On Wed, Apr 7, 2010 at 16:15, Greg Stein <gs...@gmail.com> wrote:
> On Wed, Apr 7, 2010 at 04:52, Philip Martin <ph...@wandisco.com> wrote:
>> Greg Stein <gs...@gmail.com> writes:
>>
>>> I believe that we have the following operations:
>>>
>>> add: plain old add
>>> add-within-copy/move: add within a subtree
>>
>> Those two are much the same.  It makes little difference whether it's
>> a plain add, an add within an add or an add within a copy/move.
>
> Well... we couldn't distinguish these two cases before. With a few
> additional presence values, we can.
>
>>> replace: delete-base + add
>>> replace-within-copy/move: delete-child + add
>>
>> There is also
>>
>> replace: base node not-present + add
>>
>> the old deleted=true state.  This is explicitly excluded from being a
>> replace in svn_wc__internal_is_replaced but that may be a bug (it
>> causes revert to erroneously remove the not-present base node).
>
> It is excluded from being svn_wc_schedule_replace. I'd prefer to
> ignore that fact.
>
>
>>
>>> delete-base: deleting a base node
>>> delete-child: deleting a child of an add/copy/move
>>> copy(-within): same as add*
>>> moved-here(-within): same as add*
>>> moved-away(-within): same as delete*
>>>
>>> I'm thinking that we add the following presence values (for
>>> WORKING_NODE.presence):
>>>
>>> "added": same as "normal" today. clarifies what this really means. we
>>> map this to status_added, so why not use this for the presence
>>> anyways? no need to "minimize/conserve" the set of presence values.
>>>
>>> "replaced": indicates an added/copied-here/moved-here node that
>>> replaces a child of a copied-here/moved-here subtree.
>>>
>>> "deleted": same as "not-present" today. clarifies what this really means.
>>>
>>> "inherit": applied to children of copied-here/moved-here and
>>> deleted/base-deleted nodes. implies that no commit operations are
>>> required for these nodes.
>>
>> Being able to distinguish add and replace is not enough for full 1.6
>> compatibility.  When a node replaces a copied child it overwrites the
>> child's data, things like checksum and properties.  This data is not
>> derived or inherited from the copied parent, so it cannot be restored
>> after being overwritten.  In 1.6 it is possible to revert the replace
>> and restore to the copied child.
>
> Urgh. Yeah. I've only ever looked at that scenario as "no partial
> reverts (yet). revert the whole thing". In that case, meaning the
> whole copy. But that's not right, if the user is reverting
> SOME/SUBDIR/PATH. Only the change(s) to the child should be reverted.
>
>> Another problem is a copy of a mixed revision tree that includes base
>> nodes that are not-present.  In 1.6 we represent these as "fake"
>> schedule deletes in the copy, so that they are explicitly deleted when
>> the copy is committed.  This works but has problems, the main one
>> being that if one tries to revert the delete the full node information
>> is not available (because the not-present source doesn't have it).
>> Perhaps we should have a distinct presence for this type of node?
>> There are similar questions about absent and excluded nodes.
>
> Okay. In this case, reverting the delete should result in an excluded
> node. That's the best we can do. "svn update --depth=infinity
> NOTPRESENT" can restore the file.
>
> Note that we *could* revert a replaced-child into an excluded node,
> too. That would be a feature loss compared to 1.6, however.
>
>
> It seems like we need one more tree, to hold "inherit" data. If you do
> a further operation on an operation-root (something in WORKING_NODE),
> then it will alter that node and all inherited nodes. But if you
> perform an operation on an inherited node, then you're establishing a
> new operation-root in WORKING_NODE. The inherited data can still
> remain elsewhere, but WORKING_NODE now refers to a new operation-root.
> Reverting that operation would bring back the inherited node.
>
> Thus, WORKING_NODE would become *just* explicit operations. ie. the
> stuff we send during a commit.
>
> Hmm. Almost...
>
> A delete operation could mark the root in WORKING_NODE (we'd probably
> want a new name for this!), and then drop a bunch of markers in
> WORKING_NODE to occlude any children present in the inherited tree and
> the BASE tree. Those markers couldn't be individually reverted
> however, and they don't represent data to send during commit. We
> *could* alter a presence-like flag in the inherited tree instead, yet
> leave all the data intact for a revert. Or place markers in the
> inherited tree to occlude the BASE nodes.
>
> Thoughts?
>
> Cheers,
> -g
>