Posted to dev@subversion.apache.org by Philip Martin <ph...@wandisco.com> on 2010/03/22 16:48:25 UTC

wc-ng base/working nodes in a copied tree

Consider copying an unmodified directory. Assume the source of the
copy exists in the repository, the target does not.  The copy could be
repo-to-wc or wc-to-wc.  The result of the copy is an added directory
in the working copy with a working_node and no base_node.  The
working_node has copyfrom data to mark this as a copy rather than a
plain add.  I believe this is the correct behaviour.

If the source directory contains a subdir the copied directory also
contains a subdir (assuming a full depth copy).  At present the copied
subdir has a base_node and no working_node and it doesn't have
copyfrom data as there is no such data in base_node. Is that the
correct behaviour or a bug?  Does it make sense for a node to have
a base_node when the parent has only a working_node?  If the subdir
should really have a working_node instead of a base_node how do we
distinguish a copied subdir from a plain added subdir?  Do we set
copyfrom data to the subdir working_node?  I thought copyfrom only
gets set on the root of the copy.
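
As a rough illustration of the layout in question (a sketch only: the
sets of paths below stand in for base_node and working_node rows, and
the path names are made up), this Python snippet lists nodes that have
a base_node while their parent has only a working_node:

```python
def base_under_working_only_parent(base_nodes, working_nodes):
    """Return paths with a base_node whose parent has only a working_node."""
    flagged = []
    for path in sorted(base_nodes):
        if not path:
            continue  # the working copy root has no parent
        parent = path.rpartition("/")[0]
        if parent in working_nodes and parent not in base_nodes:
            flagged.append(path)
    return flagged


# Hypothetical state after copying "src" to "copy", as described above:
# the copy root is WORKING-only, but its subdir ended up with a base_node.
base_nodes = {"", "src", "src/subdir", "copy/subdir"}
working_nodes = {"copy"}

suspicious = base_under_working_only_parent(base_nodes, working_nodes)
print(suspicious)
```

The check itself takes no position on which layout is right; it only
makes the mismatch between parent and child visible.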

-- 
Philip

RE: wc-ng base/working nodes in a copied tree

Posted by Bert Huijben <be...@qqmail.nl>.


> -----Original Message-----
> From: Philip Martin [mailto:philip.martin@wandisco.com]
> Sent: Wednesday 31 March 2010 11:20
> To: Greg Stein
> Cc: neels; dev@subversion.apache.org
> Subject: Re: wc-ng base/working nodes in a copied tree
> 
> Greg Stein <gs...@gmail.com> writes:
> 
> > On Wed, Mar 24, 2010 at 16:41, Philip Martin <ph...@wandisco.com> wrote:
> >> neels <ne...@gmail.com> writes:
> >>> On 23 March 2010 09:11, Greg Stein <gs...@gmail.com> wrote:
> >>>> On Mon, Mar 22, 2010 at 17:59, Philip Martin <ph...@wandisco.com> wrote:
> >>
> >> We should consider using copyfrom_repos_path.  The current method of
> >> only storing copyfrom_* on the root of the copy means that
> >> copyfrom_repos_path needs to be calculated every time its value is
> >> required.
> >
> > I doubt that we use it independently of the other fields, so scanning
> > upwards for the others can also compute the relpath.
> >
> > We do the same thing for the regular repos_id and repos_relpath.
> 
> I see there is a comment to that effect in wc-metadata.sql so perhaps
> that was the intent, but in practice repos_id and repos_relpath appear
> to be set in every base_node.

This is mostly caused by the write from entry code, which will disappear before 1.7.

But I expect that we will find quite a bit of code that relies on the data being available anyway. One thing that certainly relies on it is the retrieval of locks via a JOIN with the locks table.
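
To illustrate why such a JOIN depends on the data being stored in every row, here is a minimal sketch using Python's sqlite3 module. The two tables are hypothetical, heavily simplified stand-ins (the real wc-metadata.sql has more columns), and the paths and tokens are made up:

```python
import sqlite3

# Hypothetical, simplified tables; not the real wc-metadata.sql schema.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE base_node (
        local_relpath TEXT PRIMARY KEY,
        repos_id      INTEGER,
        repos_relpath TEXT
    );
    CREATE TABLE lock (
        repos_id      INTEGER NOT NULL,
        repos_relpath TEXT NOT NULL,
        lock_token    TEXT NOT NULL
    );
    -- 'A/f' stores its repository location, 'A/g' leaves it to be inherited.
    INSERT INTO base_node VALUES ('A',   1,    'trunk/A');
    INSERT INTO base_node VALUES ('A/f', 1,    'trunk/A/f');
    INSERT INTO base_node VALUES ('A/g', NULL, NULL);
    INSERT INTO lock VALUES (1, 'trunk/A/f', 'token-f');
    INSERT INTO lock VALUES (1, 'trunk/A/g', 'token-g');
""")


def locked_paths(connection):
    """Find locked nodes by joining on the stored repository location."""
    rows = connection.execute("""
        SELECT b.local_relpath, l.lock_token
        FROM base_node AS b
        JOIN lock AS l
          ON l.repos_id = b.repos_id
         AND l.repos_relpath = b.repos_relpath
        ORDER BY b.local_relpath
    """)
    return rows.fetchall()


found = locked_paths(conn)
print(found)
```

In this sketch the lock on 'A/g' is not found, because a NULL repos_relpath never compares equal in the JOIN; the location would have to be computed from the parent first.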

	Bert

Re: wc-ng base/working nodes in a copied tree

Posted by Greg Stein <gs...@gmail.com>.

On Thu, Apr 8, 2010 at 05:53, Philip Martin <ph...@wandisco.com> wrote:
> Greg Stein <gs...@gmail.com> writes:
>
>> Okay. So I guess we continue to mark that as sched-delete.
>
> sched-delete is what we can do in 1.6, but it's not perfect.  This is
> not a normal delete by the user; it's more of an internal Subversion
> delete.  One of the big differences is that the user cannot revert the
> delete, because the information about the node is not available.
> wc-ng could distinguish this from a normal delete.

We could distinguish it in some way, yes. At a minimum, we can mark
the node excluded when the user reverts the "deletion". (today, 1.6
fails, as I recall)

Cheers,
-g

Re: wc-ng base/working nodes in a copied tree

Posted by Philip Martin <ph...@wandisco.com>.

Greg Stein <gs...@gmail.com> writes:

> Okay. So I guess we continue to mark that as sched-delete.

sched-delete is what we can do in 1.6, but it's not perfect.  This is
not a normal delete by the user; it's more of an internal Subversion
delete.  One of the big differences is that the user cannot revert the
delete, because the information about the node is not available.
wc-ng could distinguish this from a normal delete.

-- 
Philip

Re: wc-ng base/working nodes in a copied tree

Posted by Greg Stein <gs...@gmail.com>.

On Wed, Apr 7, 2010 at 07:09, Philip Martin <ph...@wandisco.com> wrote:
> Greg Stein <gs...@gmail.com> writes:
>
>> Excluded in the wc is just that. It does not mean "delete upon commit." We
>> have other statii to mean that.
>
> In 1.6 when we copy a tree containing deleted=true we mark the copied
> node so that it gets deleted upon commit.  Are we going to change
> that?

That's my suggestion, yes. I believe that behavior is a bug. Consider
that you have DIR@10 and DIR/A@11 is not-present. If you copy DIR@10
to OTHER, then you should get DIR/A@10 along with it. We don't have
the data on the client, so we just mark that node excluded.

Hmm. But I guess the operation is "set up the local wc so that, when I
commit, it will look like DIR." Mixed-rev and all. And any local ops
in DIR would carry over, too.

Okay. So I guess we continue to mark that as sched-delete.

>  If we mark the node excluded do we use some additional mark to
> indicate delete?

To delete a node, it needs to be present in the working copy, i.e. depth=empty.

But never mind this particular scenario.

>...

Now back to the prior messages...

Cheers,
-g

Re: wc-ng base/working nodes in a copied tree

Posted by Philip Martin <ph...@wandisco.com>.

Greg Stein <gs...@gmail.com> writes:

> Excluded in the wc is just that. It does not mean "delete upon commit." We
> have other statii to mean that.

In 1.6 when we copy a tree containing deleted=true we mark the copied
node so that it gets deleted upon commit.  Are we going to change
that?  If we mark the node excluded do we use some additional mark to
indicate delete?

> Imagine a local-copy of a large tree, simultaneously excluding a large portion
> so that you don't have to keep/copy as much locally. That doesn't mean
> "delete". It is simply an organizational mechanism.

When one tags such a working copy should the organizational mechanism
be included in the tag?  I can see arguments for and against.  The
current behaviour is not even consistent: exporting is sparse,
wc-to-wc copy is sparse, wc-to-repo copy is not sparse, commit is not
sparse.

However that's not really an urgent question.  How we represent
replaces is more pressing.

-- 
Philip

Re: wc-ng base/working nodes in a copied tree

Posted by Greg Stein <gs...@gmail.com>.

Excluded in the wc is just that. It does not mean "delete upon commit." We
have other statii to mean that.

Imagine a local-copy of a large tree, simultaneously excluding a large portion
so that you don't have to keep/copy as much locally. That doesn't mean
"delete". It is simply an organizational mechanism.

On Apr 7, 2010 5:44 AM, "Philip Martin" <ph...@wandisco.com> wrote:

Greg Stein <gs...@gmail.com> writes:

> > Another problem is a copy of a mixed revision tree that includes base
> > nodes that are not-pre...

> on phone, so this will be terse, but wanted you to consider: I'd thought
> about the copy-of-not-p...
If the source has both excluded and not-present nodes, do we need to
distinguish them in the copy?  Would we delete all the excluded nodes
in the copy when committing?  There was a thread a few months ago
where a user reported that a wc-to-repo copy of a sparse working copy
didn't result in a sparse copy and asked if this was a bug; we didn't
really reach a consensus about what would be the correct behaviour.

--
Philip

Re: wc-ng base/working nodes in a copied tree

Posted by Greg Stein <gs...@gmail.com>.

On Wed, Apr 7, 2010 at 05:44, Philip Martin <ph...@wandisco.com> wrote:
> Greg Stein <gs...@gmail.com> writes:
>
>> > Another problem is a copy of a mixed revision tree that includes base
>> > nodes that are not-present.  In 1.6 we represent these as "fake"
>> > schedule deletes in the copy, so that they are explicitly deleted when
>> > the copy is committed.  This works but has problems, the main one
>> > being that if one tries to revert the delete the full node information
>> > is not available (because the not-present source doesn't have it).
>> > Perhaps we should have a distinct presence for this type of node?
>> > There are similar questions about absent and excluded nodes.
>>
>> on phone, so this will be terse, but wanted you to consider: I'd thought
>> about the copy-of-not-present case, and think we should actually represent
>> those nodes as excluded.
>
> If the source has both excluded and not-present nodes, do we need to
> distinguish them in the copy?  Would we delete all the excluded nodes
> in the copy when committing?  There was a thread a few months ago
> where a user reported that a wc-to-repo copy of a sparse working copy
> didn't result in a sparse copy and asked if this was a bug; we didn't
> really reach a consensus about what would be the correct behaviour.

Ignoring my "convert not-present to excluded" concept...

I think excluding a node is just cleaning up your working copy, but
those nodes participate in any ancestor operation (delete, move,
copy).

Note that we don't have a copy-with-depth operation at the moment, but
I think we could add one in the future, e.g. copy immediates, which
sets child dirs to depth=empty, or copy files, which sets child dirs to
excluded. In these cases, the children are still "there", but just not
in the working copy.

Cheers,
-g

Re: wc-ng base/working nodes in a copied tree

Posted by Philip Martin <ph...@wandisco.com>.

Greg Stein <gs...@gmail.com> writes:

> > Another problem is a copy of a mixed revision tree that includes base
> > nodes that are not-present.  In 1.6 we represent these as "fake"
> > schedule deletes in the copy, so that they are explicitly deleted when
> > the copy is committed.  This works but has problems, the main one
> > being that if one tries to revert the delete the full node information
> > is not available (because the not-present source doesn't have it).
> > Perhaps we should have a distinct presence for this type of node?
> > There are similar questions about absent and excluded nodes.
>
> on phone, so this will be terse, but wanted you to consider: I'd thought
> about the copy-of-not-present case, and think we should actually represent
> those nodes as excluded.

If the source has both excluded and not-present nodes, do we need to
distinguish them in the copy?  Would we delete all the excluded nodes
in the copy when committing?  There was a thread a few months ago
where a user reported that a wc-to-repo copy of a sparse working copy
didn't result in a sparse copy and asked if this was a bug; we didn't
really reach a consensus about what would be the correct behaviour.

-- 
Philip

Re: wc-ng base/working nodes in a copied tree

Posted by Greg Stein <gs...@gmail.com>.

on phone, so this will be terse, but wanted you to consider: I'd thought
about the copy-of-not-present case, and think we should actually represent
those nodes as excluded.

Thoughts?

On Apr 7, 2010 4:52 AM, "Philip Martin" <ph...@wandisco.com> wrote:

Greg Stein <gs...@gmail.com> writes:

> I believe that we have the following operations:
>
> add: plain old add
> add-within-copy/move: ad...
Those two are much the same.  It makes little difference whether it's
a plain add, an add within an add or an add within a copy/move.


> replace: delete-base + add
> replace-within-copy/move: delete-child + add
There is also

replace: base node not-present + add

the old deleted=true state.  This is explicitly excluded from being a
replace in svn_wc__internal_is_replaced but that may be a bug (it
causes revert to erroneously remove the not-present base node).


> delete-base: deleting a base node
> delete-child: deleting a child of an add/copy/move
> copy(-wi...

> I'm thinking that we add the following presence values (for
> WORKING_NODE.presence):
>
> "added":...
Being able to distinguish add and replace is not enough for full 1.6
compatibility.  When a node replaces a copied child it overwrites the
child's data, things like checksum and properties.  This data is not
derived or inherited from the copied parent, so it cannot be restored
after being overwritten.  In 1.6 it is possible to revert the replace
and restore to the copied child.

Another problem is a copy of a mixed revision tree that includes base
nodes that are not-present.  In 1.6 we represent these as "fake"
schedule deletes in the copy, so that they are explicitly deleted when
the copy is committed.  This works but has problems, the main one
being that if one tries to revert the delete the full node information
is not available (because the not-present source doesn't have it).
Perhaps we should have a distinct presence for this type of node?
There are similar questions about absent and excluded nodes.
--
Philip

Re: wc-ng base/working nodes in a copied tree

Posted by Philip Martin <ph...@wandisco.com>.

Greg Stein <gs...@gmail.com> writes:

> I believe that we have the following operations:
>
> add: plain old add
> add-within-copy/move: add within a subtree

Those two are much the same.  It makes little difference whether it's
a plain add, an add within an add or an add within a copy/move.

> replace: delete-base + add
> replace-within-copy/move: delete-child + add

There is also

replace: base node not-present + add

the old deleted=true state.  This is explicitly excluded from being a
replace in svn_wc__internal_is_replaced but that may be a bug (it
causes revert to erroneously remove the not-present base node).

> delete-base: deleting a base node
> delete-child: deleting a child of an add/copy/move
> copy(-within): same as add*
> moved-here(-within): same as add*
> moved-away(-within): same as delete*
>
> I'm thinking that we add the following presence values (for
> WORKING_NODE.presence):
>
> "added": same as "normal" today. clarifies what this really means. we
> map this to status_added, so why not use this for the presence
> anyways? no need to "minimize/conserve" the set of presence values.
>
> "replaced": indicates an added/copied-here/moved-here node that
> replaces a child of a copied-here/moved-here subtree.
>
> "deleted": same as "not-present" today. clarifies what this really means.
>
> "inherit": applied to children of copied-here/moved-here and
> deleted/base-deleted nodes. implies that no commit operations are
> required for these nodes.

Being able to distinguish add and replace is not enough for full 1.6
compatibility.  When a node replaces a copied child it overwrites the
child's data, things like checksum and properties.  This data is not
derived or inherited from the copied parent, so it cannot be restored
after being overwritten.  In 1.6 it is possible to revert the replace
and restore to the copied child.

Another problem is a copy of a mixed revision tree that includes base
nodes that are not-present.  In 1.6 we represent these as "fake"
schedule deletes in the copy, so that they are explicitly deleted when
the copy is committed.  This works but has problems, the main one
being that if one tries to revert the delete the full node information
is not available (because the not-present source doesn't have it).
Perhaps we should have a distinct presence for this type of node?
There are similar questions about absent and excluded nodes.
-- 
Philip

Re: wc-ng base/working nodes in a copied tree

Posted by Greg Stein <gs...@gmail.com>.

On Wed, Mar 31, 2010 at 05:19, Philip Martin <ph...@wandisco.com> wrote:
> Greg Stein <gs...@gmail.com> writes:
>...
>> I doubt that we use it independently of the other fields, so scanning
>> upwards for the others can also compute the relpath.
>>
>> We do the same thing for the regular repos_id and repos_relpath.
>
> I see there is a comment to that effect in wc-metadata.sql so perhaps
> that was the intent, but in practice repos_id and repos_relpath appear
> to be set in every base_node.

Yes. That isn't intentional, but is a current artifact of the implementation.

>...
>>> I suppose we could, but I think we already have enough storage for
>>> this problem.  If we were to adopt a new presence I think I'd make the
>>> copied child have the new value.  Nodes that are simple adds are very
>>> similar to nodes that are the root of a copy: both represent new items
>>> in the repository and it seems reasonable that they have the same
>>> presence value.  Copied children are the ones that are special or
>>> different.
>>
>> Children of copies/moves/adds are all about the same.
>
> I'd disagree.
>
>> I think it is
>> the root that stands out, especially because it is *that* node where
>> we store the copy/move information. Thus, I would suggest a presence
>> named "root" or "oproot" (operation root).
>
> Yes, roots stand out.  But plain adds are like roots: they show up in
> status, they need to be explicitly reported during a commit, they can
> be reverted.  Children of adds are themselves adds and are thus roots.
> It's the children of copies/moves that are different, they are not
> roots, their presence is implied by the presence of the parent.

After pondering... yup. I'd agree.

And note that deletes are similar -- the children do not need to be
reported/acted-upon separately from the parent operation.

>...

It is also near time for us to pick a solution here. I just had to
mark several tests as XFail() specifically because we cannot represent
an "add" within a copied subtree.

I would prefer a presence marker over "tricky interpretation" of the
copyfrom_* fields.

I believe that we have the following operations:

add: plain old add
add-within-copy/move: add within a subtree
replace: delete-base + add
replace-within-copy/move: delete-child + add
delete-base: deleting a base node
delete-child: deleting a child of an add/copy/move
copy(-within): same as add*
moved-here(-within): same as add*
moved-away(-within): same as delete*

I think the last three are the same as other stuff (meaning: no
consideration needed), but with some extra markers.

I'm thinking that we add the following presence values (for
WORKING_NODE.presence):

"added": same as "normal" today. clarifies what this really means. we
map this to status_added, so why not use this for the presence
anyways? no need to "minimize/conserve" the set of presence values.

"replaced": indicates an added/copied-here/moved-here node that
replaces a child of a copied-here/moved-here subtree.

"deleted": same as "not-present" today. clarifies what this really means.

"inherit": applied to children of copied-here/moved-here and
deleted/base-deleted nodes. implies that no commit operations are
required for these nodes.
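
A minimal sketch of how these four values might be modelled, in
Python. The enum and the helper are hypothetical names for
illustration only, and the sketch assumes that every value other than
"inherit" needs its own commit operation (only the "inherit" case is
stated above):

```python
from enum import Enum


class WorkingPresence(Enum):
    """Proposed WORKING_NODE.presence values (hypothetical Python model)."""
    ADDED = "added"
    REPLACED = "replaced"
    DELETED = "deleted"
    INHERIT = "inherit"


def needs_commit_operation(presence):
    """Assumption: only 'inherit' nodes are covered by a parent operation."""
    return presence is not WorkingPresence.INHERIT


# Hypothetical tree: A is a copy root, A/X a copied child, A/Y a plain
# add inside the copy, A/Z a node replacing a copied child.
tree = {
    "A": WorkingPresence.ADDED,
    "A/X": WorkingPresence.INHERIT,
    "A/Y": WorkingPresence.ADDED,
    "A/Z": WorkingPresence.REPLACED,
}

to_report = sorted(p for p, pres in tree.items() if needs_commit_operation(pres))
print(to_report)
```

With a marker like this, an add within a copied subtree (A/Y) is
distinguishable from a copied child (A/X) without interpreting the
copyfrom_* fields.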

The APIs from wc_db should remain the same. This is a database-only
change. The concepts are surfaced through read_info(),
scan_addition(), and scan_deletion() as before.

Thoughts welcome. There is no particular rush on this, but I'd hope to
complete it in the next few weeks. This change will require a format
bump, which I would perform *before* the props-in-database change
(which still has db-state inconsistency problems to work through).

Cheers,
-g

Re: wc-ng base/working nodes in a copied tree

Posted by Philip Martin <ph...@wandisco.com>.

Greg Stein <gs...@gmail.com> writes:

> On Wed, Mar 24, 2010 at 16:41, Philip Martin <ph...@wandisco.com> wrote:
>> neels <ne...@gmail.com> writes:
>>> On 23 March 2010 09:11, Greg Stein <gs...@gmail.com> wrote:
>>>> On Mon, Mar 22, 2010 at 17:59, Philip Martin <ph...@wandisco.com> wrote:
>>
>> We should consider using copyfrom_repos_path.  The current method of
>> only storing copyfrom_* on the root of the copy means that
>> copyfrom_repos_path needs to be calculated every time its value is
>> required.
>
> I doubt that we use it independently of the other fields, so scanning
> upwards for the others can also compute the relpath.
>
> We do the same thing for the regular repos_id and repos_relpath.

I see there is a comment to that effect in wc-metadata.sql so perhaps
that was the intent, but in practice repos_id and repos_relpath appear
to be set in every base_node.

>> The other copyfrom_* fields contain the same value through
>> the copy, so it makes sense to elide those where possible.  We could
>> use something like:
>>
>>  copyfrom_repos_id == NULL copyfrom_repos_path == NULL : added, no copy
>>  copyfrom_repos_id != NULL copyfrom_repos_path != NULL : root of copy/move
>>  copyfrom_repos_id == NULL copyfrom_repos_path != NULL : child of copy/move
>
> Not sure about this.
>
>>> May I suggest using the WORKING node's 'presence', as we already do
>>> with subpath deletions inside copied trees. A presence of
>>> 'not-present' currently indicates that a subpath of a recursive copy
>>> is excluded from the copy, IOW that it is the root of a delete
>>> operation inside a copy. A new value called 'not-related' could
>>> indicate that a path is the root of an *add* operation that is not
>>> related to the add operation of its parent.
>>
>> I suppose we could, but I think we already have enough storage for
>> this problem.  If we were to adopt a new presence I think I'd make the
>> copied child have the new value.  Nodes that are simple adds are very
>> similar to nodes that are the root of a copy: both represent new items
>> in the repository and it seems reasonable that they have the same
>> presence value.  Copied children are the ones that are special or
>> different.
>
> Children of copies/moves/adds are all about the same.

I'd disagree.

> I think it is
> the root that stands out, especially because it is *that* node where
> we store the copy/move information. Thus, I would suggest a presence
> named "root" or "oproot" (operation root).

Yes, roots stand out.  But plain adds are like roots: they show up in
status, they need to be explicitly reported during a commit, they can
be reverted.  Children of adds are themselves adds and are thus roots.
It's the children of copies/moves that are different, they are not
roots, their presence is implied by the presence of the parent.

> This will solve the "add into another operation's tree", but it does
> not solve the "did I replace a node in that tree? thus, do I need to
> issue a DELETE before issuing this new operation?"
>
> Having a "root" presence also means we can easily scan upwards for the
> base of an operation. Hmm. Tho I guess the base of a deletion wouldn't
> be marked root...

-- 
Philip

Re: wc-ng base/working nodes in a copied tree

Posted by Greg Stein <gs...@gmail.com>.

On Wed, Mar 24, 2010 at 16:41, Philip Martin <ph...@wandisco.com> wrote:
> neels <ne...@gmail.com> writes:
>> On 23 March 2010 09:11, Greg Stein <gs...@gmail.com> wrote:
>>> On Mon, Mar 22, 2010 at 17:59, Philip Martin <ph...@wandisco.com> wrote:
>>>>   $ svn cp $url/A wc
>>>>   $ svn add wc/A/Y
>>>>
>>>> Suppose $url/A contains $url/A/X.  How do I distinguish between a
>>>> copied child, like wc/A/X, and an added node like wc/A/Y?  Neither has
>>>> copyfrom set.  How do I know that A/X inherits from its parent A
>>>> while A/Y does not?
>>>
>>> Yes, you brought up this hole in the design a while back, and we've
>>> had some discussion on ways to solve it.
>
> As I recall the last problem was a copied child that was first deleted
> and then replaced, and the problem is that there is only one working
> node to represent both the deleted and added nodes.
>
> This problem is vaguely similar but here we have two working nodes, so
> we have enough storage; it is just a matter of deciding which values to
> store.

Okay.

>>> As Bert points out, you can
>>> use changed_* to detect the local-add, rather than local-copies.
>>>
>>> We may introduce a special copyfrom_* value to indicate "local-add"
>>> rather than "copy-from". Or maybe rely on changed_*. It is unclear
>>> what the best approach is right now.
>
> We should consider using copyfrom_repos_path.  The current method of
> only storing copyfrom_* on the root of the copy means that
> copyfrom_repos_path needs to be calculated every time its value is
> required.

I doubt that we use it independently of the other fields, so scanning
upwards for the others can also compute the relpath.

We do the same thing for the regular repos_id and repos_relpath.
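
A rough sketch of that upward scan, in Python. The dict layout and the
function name are made up for illustration; this is not the wc_db
implementation:

```python
def scan_copyfrom(working_nodes, local_relpath):
    """Walk up to the nearest node carrying copyfrom data.

    Returns (copyfrom_repos_id, computed copyfrom path) or None.
    """
    current = local_relpath
    suffix = []  # names below the operation root, deepest first
    while current in working_nodes:
        node = working_nodes[current]
        if node["copyfrom_repos_path"] is not None:
            parts = [node["copyfrom_repos_path"]] + suffix[::-1]
            return node["copyfrom_repos_id"], "/".join(parts)
        parent, _, name = current.rpartition("/")
        if not name:
            break  # reached the working copy root
        suffix.append(name)
        current = parent
    return None


# Hypothetical WORKING rows: only the copy root A stores copyfrom_*.
nodes = {
    "A": {"copyfrom_repos_id": 1, "copyfrom_repos_path": "trunk/A"},
    "A/X": {"copyfrom_repos_id": None, "copyfrom_repos_path": None},
    "A/X/f": {"copyfrom_repos_id": None, "copyfrom_repos_path": None},
}

result = scan_copyfrom(nodes, "A/X/f")
print(result)
```

Note that a scan like this cannot tell a copied child from a plain add
inside the copy: both have empty copyfrom_* and both would resolve to
the root's data, which is the hole being discussed.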

> The other copyfrom_* fields contain the same value through
> the copy, so it makes sense to elide those where possible.  We could
> use something like:
>
>  copyfrom_repos_id == NULL copyfrom_repos_path == NULL : added, no copy
>  copyfrom_repos_id != NULL copyfrom_repos_path != NULL : root of copy/move
>  copyfrom_repos_id == NULL copyfrom_repos_path != NULL : child of copy/move

Not sure about this.

>> May I suggest using the WORKING node's 'presence', as we already do
>> with subpath deletions inside copied trees. A presence of
>> 'not-present' currently indicates that a subpath of a recursive copy
>> is excluded from the copy, IOW that it is the root of a delete
>> operation inside a copy. A new value called 'not-related' could
>> indicate that a path is the root of an *add* operation that is not
>> related to the add operation of its parent.
>
> I suppose we could, but I think we already have enough storage for
> this problem.  If we were to adopt a new presence I think I'd make the
> copied child have the new value.  Nodes that are simple adds are very
> similar to nodes that are the root of a copy: both represent new items
> in the repository and it seems reasonable that they have the same
> presence value.  Copied children are the ones that are special or
> different.

Children of copies/moves/adds are all about the same. I think it is
the root that stands out, especially because it is *that* node where
we store the copy/move information. Thus, I would suggest a presence
named "root" or "oproot" (operation root).

This will solve the "add into another operation's tree", but it does
not solve the "did I replace a node in that tree? thus, do I need to
issue a DELETE before issuing this new operation?"

Having a "root" presence also means we can easily scan upwards for the
base of an operation. Hmm. Tho I guess the base of a deletion wouldn't
be marked root...

Anyways. Any thoughts?

Cheers,
-g

Re: wc-ng base/working nodes in a copied tree

Posted by Philip Martin <ph...@wandisco.com>.

neels <ne...@gmail.com> writes:

> On 23 March 2010 09:11, Greg Stein <gs...@gmail.com> wrote:
>> On Mon, Mar 22, 2010 at 17:59, Philip Martin <ph...@wandisco.com> wrote:
>>>   $ svn cp $url/A wc
>>>   $ svn add wc/A/Y
>>>
>>> Suppose $url/A contains $url/A/X.  How do I distinguish between a
>>> copied child, like wc/A/X, and an added node like wc/A/Y?  Neither has
>>> copyfrom set.  How do I know that A/X inherits from its parent A
>>> while A/Y does not?
>>
>> Yes, you brought up this hole in the design a while back, and we've
>> had some discussion on ways to solve it.

As I recall the last problem was a copied child that was first deleted
and then replaced, and the problem is that there is only one working
node to represent both the deleted and added nodes.

This problem is vaguely similar but here we have two working nodes, so
we have enough storage; it is just a matter of deciding which values to
store.

>> As Bert points out, you can
>> use changed_* to detect the local-add, rather than local-copies.
>>
>> We may introduce a special copyfrom_* value to indicate "local-add"
>> rather than "copy-from". Or maybe rely on changed_*. It is unclear
>> what the best approach is right now.

We should consider using copyfrom_repos_path.  The current method of
only storing copyfrom_* on the root of the copy means that
copyfrom_repos_path needs to be calculated every time its value is
required. The other copyfrom_* fields contain the same value through
the copy, so it makes sense to elide those where possible.  We could
use something like:

  copyfrom_repos_id == NULL copyfrom_repos_path == NULL : added, no copy
  copyfrom_repos_id != NULL copyfrom_repos_path != NULL : root of copy/move
  copyfrom_repos_id == NULL copyfrom_repos_path != NULL : child of copy/move
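
The three combinations above could be decoded along these lines (a
Python sketch only; the function name is hypothetical, and the fourth
combination, which the table does not list, is treated here as an
error by assumption):

```python
def classify_working_node(copyfrom_repos_id, copyfrom_repos_path):
    """Decode the proposed copyfrom_* encoding; None stands for SQL NULL."""
    if copyfrom_repos_path is None:
        if copyfrom_repos_id is None:
            return "added, no copy"
        # Assumption: id without path is not part of the proposal.
        raise ValueError("copyfrom_repos_id set without copyfrom_repos_path")
    if copyfrom_repos_id is not None:
        return "root of copy/move"
    return "child of copy/move"


kinds = [
    classify_working_node(None, None),
    classify_working_node(1, "trunk/A"),
    classify_working_node(None, "trunk/A/X"),
]
print(kinds)
```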

> May I suggest using the WORKING node's 'presence', as we already do
> with subpath deletions inside copied trees. A presence of
> 'not-present' currently indicates that a subpath of a recursive copy
> is excluded from the copy, IOW that it is the root of a delete
> operation inside a copy. A new value called 'not-related' could
> indicate that a path is the root of an *add* operation that is not
> related to the add operation of its parent.

I suppose we could, but I think we already have enough storage for
this problem.  If we were to adopt a new presence I think I'd make the
copied child have the new value.  Nodes that are simple adds are very
similar to nodes that are the root of a copy: both represent new items
in the repository and it seems reasonable that they have the same
presence value.  Copied children are the ones that are special or
different.

-- 
Philip

Re: wc-ng base/working nodes in a copied tree

Posted by neels <ne...@gmail.com>.

On 23 March 2010 09:11, Greg Stein <gs...@gmail.com> wrote:
> On Mon, Mar 22, 2010 at 17:59, Philip Martin <ph...@wandisco.com> wrote:
>>   $ svn cp $url/A wc
>>   $ svn add wc/A/Y
>>
>> Suppose $url/A contains $url/A/X.  How do I distinguish between a
>> copied child, like wc/A/X, and an added node like wc/A/Y?  Neither has
>> copyfrom set.  How do I know that A/X inherits from its parent A
>> while A/Y does not?
>
> Yes, you brought up this hole in the design a while back, and we've
> had some discussion on ways to solve it. As Bert points out, you can
> use changed_* to detect the local-add, rather than local-copies.
>
> We may introduce a special copyfrom_* value to indicate "local-add"
> rather than "copy-from". Or maybe rely on changed_*. It is unclear
> what the best approach is right now.

May I suggest using the WORKING node's 'presence', as we already do
with subpath deletions inside copied trees. A presence of
'not-present' currently indicates that a subpath of a recursive copy
is excluded from the copy, IOW that it is the root of a delete
operation inside a copy. A new value called 'not-related' could
indicate that a path is the root of an *add* operation that is not
related to the add operation of its parent. (Copies inside copies
would also have this presence. And we'd have a new svn_wc__db_status_t
value.)

subversion/libsvn_wc/wc-metadata.sql
[[[
 CREATE TABLE WORKING_NODE (
...
/* Is this node "present" or has it been excluded for some reason?
     Only allowed values: normal, not-present, incomplete, base-deleted.
     (the others do not make sense for the WORKING tree)

     normal: this node has been added/copied/moved-here. There may be an
       underlying BASE node at this location, implying this is a replace.
       Scan upwards from here looking for copyfrom or moved_here values
       to detect the type of operation constructing this node.

     not-present: the node (or parent) was originally copied or moved-here.
[Note: only makes sense when a *parent* was originally copied here!]
       A subtree of that source has since been deleted. There may be
       underlying BASE node to replace. For a move-here or copy-here, the
       records are simply removed rather than switched to not-present.
       Note this reflects a deletion only. It is not possible move-away
       nodes from the WORKING tree. The purported destination would receive
       a copy from the original source of a copy-here/move-here, or if the
       nodes were plain adds, those nodes would be shifted to that target
       for addition.

     incomplete: nodes are being added into the WORKING tree, and the full
       information about this node is not (yet) present.

     base-deleted: the underlying BASE node has been marked for deletion due
       to a delete or a move-away (see the moved_to column to determine
       which), and has not been replaced.  */
  presence  TEXT NOT NULL,
...
]]]

~Neels

Re: wc-ng base/working nodes in a copied tree

Posted by Greg Stein <gs...@gmail.com>.

On Mon, Mar 22, 2010 at 17:59, Philip Martin <ph...@wandisco.com> wrote:
> Greg Stein <gs...@gmail.com> writes:
>
>> The tree at copied-here should only have WORKING nodes. No BASE nodes.
>>
>> If it has BASE nodes, then that is a bug.
>>
>> The tree is distinguished as a copy because of the copyfrom_*
>> information at the operation root. All the children have empty
>> copyfrom_* data. If you make a second copy into that tree, then that
>> new subtree will have copyfrom_* at its root.
>
> My question was about an add rather than a second copy.  Consider
>
>   $ svn cp $url/A wc
>   $ svn add wc/A/Y
>
> Suppose $url/A contains $url/A/X.  How do I distinguish between a
> copied child, like wc/A/X, and an added node like wc/A/Y?  Neither has
> copyfrom set.  How do I know that A/X inherits from its parent A
> while A/Y does not?

Yes, you brought up this hole in the design a while back, and we've
had some discussion on ways to solve it. As Bert points out, you can
use changed_* to detect the local-add, rather than local-copies.

We may introduce a special copyfrom_* value to indicate "local-add"
rather than "copy-from". Or maybe rely on changed_*. It is unclear
what the best approach is right now.

Cheers,
-g

RE: wc-ng base/working nodes in a copied tree

Posted by Bert Huijben <be...@qqmail.nl>.


> -----Original Message-----
> From: Philip Martin [mailto:philip.martin@wandisco.com]
> Sent: maandag 22 maart 2010 23:00
> To: Greg Stein
> Cc: dev@subversion.apache.org
> Subject: Re: wc-ng base/working nodes in a copied tree
> 
> Greg Stein <gs...@gmail.com> writes:
> 
> > The tree at copied-here should only have WORKING nodes. No BASE
> nodes.
> >
> > If it has BASE nodes, then that is a bug.
> >
> > The tree is distinguished as a copy because of the copyfrom_*
> > information at the operation root. All the children have empty
> > copyfrom_* data. If you make a second copy into that tree, then that
> > new subtree will have copyfrom_* at its root.
> 
> My question was about an add rather than a second copy.  Consider
> 
>    $ svn cp $url/A wc
>    $ svn add wc/A/Y
> 
> Suppose $url/A contains $url/A/X.  How do I distinguish between a
> copied child, like wc/A/X, and an added node like wc/A/Y?  Neither has
> copyfrom set.  How do I know that A/X inherits from its parent A
> while A/Y does not?

There is no clean way to determine this; that should be fixed by making the
distinction explicit (how is still undetermined).

For the time being you can see a difference in the changed_* columns. Local
additions have no (=NULL) values here, while copies have at least a
changed_rev value. (This is how the entries read code currently determines
whether it should set .copied or not)
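
As a minimal sketch of that heuristic (Python with an in-memory SQLite
database, not the actual entries code): the table below is a cut-down,
assumed subset of WORKING_NODE, and is_copied() is a hypothetical helper
name. Only the rule itself, "NULL changed_rev means local add", comes from
the description above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Simplified, assumed subset of WORKING_NODE; the real table has more columns.
conn.execute("""
    CREATE TABLE WORKING_NODE (
        local_relpath       TEXT PRIMARY KEY,
        presence            TEXT NOT NULL,
        copyfrom_repos_path TEXT,
        changed_rev         INTEGER
    )""")
conn.executemany(
    "INSERT INTO WORKING_NODE VALUES (?, ?, ?, ?)",
    [
        ("A",   "normal", "A",  5),     # root of the copy
        ("A/X", "normal", None, 3),     # copied child: changed_rev is set
        ("A/Y", "normal", None, None),  # local add: changed_rev is NULL
    ],
)


def is_copied(relpath):
    """Heuristic: a non-NULL changed_rev implies the node came from a copy."""
    row = conn.execute(
        "SELECT changed_rev FROM WORKING_NODE WHERE local_relpath = ?",
        (relpath,),
    ).fetchone()
    return row is not None and row[0] is not None
```

Here is_copied("A/X") is true and is_copied("A/Y") is false, even though
neither row has copyfrom data of its own.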

	Bert

Re: wc-ng base/working nodes in a copied tree

Posted by Philip Martin <ph...@wandisco.com>.

Greg Stein <gs...@gmail.com> writes:

> The tree at copied-here should only have WORKING nodes. No BASE nodes.
>
> If it has BASE nodes, then that is a bug.
>
> The tree is distinguished as a copy because of the copyfrom_*
> information at the operation root. All the children have empty
> copyfrom_* data. If you make a second copy into that tree, then that
> new subtree will have copyfrom_* at its root.

My question was about an add rather than a second copy.  Consider

   $ svn cp $url/A wc
   $ svn add wc/A/Y

Suppose $url/A contains $url/A/X.  How do I distinguish between a
copied child, like wc/A/X, and an added node like wc/A/Y?  Neither has
copyfrom set.  How do I know that A/X inherits from its parent A
while A/Y does not?

-- 
Philip

Re: wc-ng base/working nodes in a copied tree

Posted by Greg Stein <gs...@gmail.com>.

The tree at copied-here should only have WORKING nodes. No BASE nodes.

If it has BASE nodes, then that is a bug.

The tree is distinguished as a copy because of the copyfrom_*
information at the operation root. All the children have empty
copyfrom_* data. If you make a second copy into that tree, then that
new subtree will have copyfrom_* at its root.
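
A small sketch of the lookup this implies, in Python for brevity. The
WORKING dictionary and find_copy_root() are hypothetical stand-ins, not
the wc_db API: only the operation root carries copyfrom data, so a child
finds its copy source by walking towards the working copy root.

```python
import posixpath

# Hypothetical in-memory stand-in for WORKING_NODE rows, keyed by local
# relpath.  Only the operation root of the copy carries copyfrom data.
WORKING = {
    "A":   {"copyfrom_path": "A", "copyfrom_rev": 5},
    "A/X": {"copyfrom_path": None, "copyfrom_rev": None},  # copied child
    "A/Y": {"copyfrom_path": None, "copyfrom_rev": None},  # plain 'svn add'
}


def find_copy_root(relpath):
    """Return the nearest ancestor-or-self with copyfrom data, or None."""
    while relpath:
        row = WORKING.get(relpath)
        if row is None:
            return None  # walked out of the WORKING tree
        if row["copyfrom_path"] is not None:
            return relpath
        relpath = posixpath.dirname(relpath)
    return None
```

Note that with copyfrom data alone, a node added by a plain 'svn add'
inside the copy (A/Y here) resolves to the same root as a genuinely copied
child (A/X); the scan cannot tell the two apart.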

On Mon, Mar 22, 2010 at 12:48, Philip Martin <ph...@wandisco.com> wrote:
> Consider copying an unmodified directory. Assume the source of the
> copy exists in the repository, the target does not.  The copy could be
> repo-to-wc or wc-to-wc.  The result of the copy is an added directory
> in the working copy with a working_node and no base_node.  The
> working_node has copyfrom data to mark this as a copy rather than a
> plain add.  I believe this is the correct behaviour.
>
> If the source directory contains a subdir the copied directory also
> contains a subdir (assuming a full depth copy).  At present the copied
> subdir has a base_node and no working_node and it doesn't have
> copyfrom data as there is no such data in base_node. Is that the
> correct behaviour or a bug?  Does it make sense for a node to have
> a base_node when the parent has only a working_node?  If the subdir
> should really have a working_node instead of a base_node how do we
> distinguish a copied subdir from a plain added subdir?  Do we set
> copyfrom data to the subdir working_node?  I thought copyfrom only
> gets set on the root of the copy.
>
> --
> Philip
>