Posted to dev@subversion.apache.org by Erik Huelsmann <eh...@gmail.com> on 2008/09/11 14:26:58 UTC

Re: svn commit: r33021 - branches/explore-wc/subversion/libsvn_wc

Hi Greg,

> +svn_error_t *
> +generic_walker(svn_wc__db_t *db,
> +               const char *path,
> +               walker_func_t walk_func,
> +               void *walk_baton,
> +               apr_pool_t *scratch_pool)
> +{
> +    apr_array_header_t *queue = apr_array_make(scratch_pool, 0, 10);
> +    apr_pool_t *iterpool = svn_pool_create(scratch_pool);
> +    struct walker_entry *entry = apr_palloc(scratch_pool, sizeof(*entry));
> +
> +    entry->dirpath = path;
> +    entry->name = "";
> +    APR_ARRAY_PUSH(queue, struct walker_entry *) = entry;
> +
> +    while (queue->nelts > 0)
> +      {
> +        const char *nodepath;
> +        svn_wc__db_kind_t kind;
> +
> +        svn_pool_clear(iterpool);
> +
> +        /* pull entries off the end of the queue */
> +        entry = APR_ARRAY_IDX(queue, queue->nelts - 1, struct walker_entry *);
> +
> +        nodepath = svn_path_join(entry->dirpath, entry->name, iterpool);
> +        SVN_ERR(svn_wc__db_read_info(&kind, NULL, NULL, NULL, NULL, NULL,
> +                                     NULL, NULL, NULL, NULL, NULL, NULL,
> +                                     db, nodepath, iterpool, iterpool));
> +        if (kind == svn_wc__db_kind_dir)
> +          {
> +            const apr_array_header_t *children;
> +
> +            /* copy the path into a long-lived pool */
> +            const char *dirpath = apr_pstrdup(scratch_pool, nodepath);
> +
> +            SVN_ERR(svn_wc__db_read_children(&children, db, nodepath,
> +                                             scratch_pool, iterpool));
> +
> +            append_entries(queue, dirpath, children, scratch_pool);
> +          }
> +
> +        (*walk_func)(nodepath, walk_baton, iterpool);
> +      }
> +
> +    svn_pool_destroy(iterpool);
> +    return NULL;
> +}

As I see it, one of the current problems in the WC is that we use a
database to determine what *should* be there, then use stat() to
find out if it's actually there. This is one of our biggest
performance problems.

I have been thinking that we could try to shift that around: using
readdir() to determine what's there and then using the database to
find out if our administration agrees. This way, stat()s could be
concentrated, probably even optimized by the OS, instead of requiring
single stat() calls for everything we want to know.
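
A minimal sketch of that readdir()-first idea, in Python rather than C.
The db_children mapping is a hypothetical stand-in for whatever the WC
database would report, and the status labels are only illustrative;
symlinks are lumped in with files to keep it short.

```python
import os

def compare_dir_with_db(dirpath, db_children):
    """List a directory once, then ask the (hypothetical) DB if it agrees.

    db_children maps child name -> expected kind ('file' or 'dir').
    """
    on_disk = {}
    with os.scandir(dirpath) as entries:
        for entry in entries:
            # DirEntry can usually answer is_dir() from the directory
            # listing itself, without a separate stat() per entry.
            kind = 'dir' if entry.is_dir(follow_symlinks=False) else 'file'
            on_disk[entry.name] = kind

    report = {}
    for name, kind in on_disk.items():
        if name not in db_children:
            report[name] = 'unversioned'
        elif db_children[name] != kind:
            report[name] = 'kind mismatch'
        else:
            report[name] = 'present'
    # Anything the DB expects that the listing did not show is missing.
    for name in db_children:
        if name not in on_disk:
            report[name] = 'missing'
    return report
```

The directory is read in one pass and the database is consulted
afterwards, which is the reverse of the order described above.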

Not that there's anything wrong with the design above; even with what
I describe, there may still be use cases for this API. I just wanted to
point out that using a BASE or WORKING tree as the basis to operate
from (instead of the ACTUAL tree) may perpetuate existing problems.

It's just one of those mind-dumps. Sorry if this is completely out of
context. :-)


Bye,

Erik.

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org

Re: svn commit: r33021 - branches/explore-wc/subversion/libsvn_wc

Posted by Julian Foad <ju...@btopenworld.com>.
On Thu, 2008-09-11 at 12:19 -0700, Greg Stein wrote:
> On Thu, Sep 11, 2008 at 9:13 AM, Julian Foad <ju...@btopenworld.com> wrote:
> >...
> > That definition of ACTUAL is a rather special case of a "tree": it is
> > not a complete Subversion tree on its own, because it doesn't know
> > anything about properties.
> 
> I see no problem with that.

I see a problem with that as a potential API user, if the API exposes
the concept of "ACTUAL tree" in a way that can involve properties. I
want to know what the API is going to tell me about these nodes'
properties. Or must the WC design have absolutely no place in the API
where it is possible to ask the WC to expose information about
properties when asking about the ACTUAL tree? If that is the case, and
there is, for example, a "diff between BASE and WORKING" API which does
expose property diffs and a "diff between WORKING and ACTUAL" API which
cannot expose property diffs, then they have to have two different
diff-representation interfaces.

What I'm saying is it would be better to define that the properties of
the nodes in the ACTUAL tree are none (or that they are the same as the
WORKING tree, or something, anything, as long as it's defined) so that
we can then share APIs like 'svn_wc_diff_callbacks3_t' and use them in a
deterministic way.
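
A rough Python sketch of the first option (ACTUAL's properties are
defined to be none). Tree, ActualTree and diff_props are hypothetical
names for illustration, not anything in libsvn_wc; the point is only
that one property-diff routine can then serve any pair of trees.

```python
class Tree:
    """Hypothetical read-only tree: maps path -> dict of properties."""

    def __init__(self, name, props_by_path):
        self.name = name
        self._props = props_by_path

    def get_props(self, path):
        return dict(self._props.get(path, {}))


class ActualTree(Tree):
    """ACTUAL only knows the disk, so its properties are defined as none."""

    def __init__(self, paths):
        super().__init__('ACTUAL', {p: {} for p in paths})


def diff_props(left, right, path):
    """Return {propname: (left_value, right_value)} for props that differ."""
    a = left.get_props(path)
    b = right.get_props(path)
    return {k: (a.get(k), b.get(k))
            for k in sorted(set(a) | set(b))
            if a.get(k) != b.get(k)}
```

Under this choice a WORKING-vs-ACTUAL diff reports every WORKING
property as removed. That is deterministic, though the "same as
WORKING" alternative would give an empty property diff instead.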

> > I feel that each definition of a "tree" needs to be the definition of a
> > complete tree including properties and file contents, so that it is
> > meaningful to perform any generic tree operation such as iteration, diff
> > against another tree, recursive propget, etc. without throwing an error.
> 
> Ugh. One of the problems with the current WC is that it is hard to
> tell *which* tree is being operated upon. Part of this effort is to
> always make it clear which trees are being read/modified. In other
> words, "generic tree operations" might be counter to cleaning this up.

If every API was generic and you had to specify which trees you wanted
it to operate on, that would be ugly. And if the three trees were
considered equal by every API, then the APIs and the WC in total
wouldn't be doing the things that make it special. There has to be a
bunch of special treatment, but it is immensely helpful if some of the
information-getting APIs like "proplist" and "get file content" can be
generic w.r.t. which tree they get from.

One obvious form of special treatment is in the APIs that modify the
working version. (mkdir, propset, etc.) These obviously only apply to
the WORKING/ACTUAL trees (perhaps jointly).

> > I think the most important concept to represent through one of these
> > trees is the user's view of their "working copy": the complete tree that
> 
> This is WORKING plus the file modifications that are found in ACTUAL.
> In the API when you read file contents for WORKING, they're going to
> come from ACTUAL.

Ahhh... good. I didn't notice you saying that anywhere else. So, from an
API user's point of view, the WORKING tree DOES include the contents of
the disk files, yes?

> ACTUAL is only relevant when you're looking for unversioned or missing nodes.

OK.

> >...
> > Do we want to amend the definition in that way?
> 
> I don't think so.

Well, maybe not in that way exactly, but there's a lot it doesn't
say :-)

- Julian




Re: svn commit: r33021 - branches/explore-wc/subversion/libsvn_wc

Posted by Greg Stein <gs...@gmail.com>.
On Thu, Sep 11, 2008 at 9:13 AM, Julian Foad <ju...@btopenworld.com> wrote:
>...
> That definition of ACTUAL is a rather special case of a "tree": it is
> not a complete Subversion tree on its own, because it doesn't know
> anything about properties.

I see no problem with that.

> I feel that each definition of a "tree" needs to be the definition of a
> complete tree including properties and file contents, so that it is
> meaningful to perform any generic tree operation such as iteration, diff
> against another tree, recursive propget, etc. without throwing an error.

Ugh. One of the problems with the current WC is that it is hard to
tell *which* tree is being operated upon. Part of this effort is to
always make it clear which trees are being read/modified. In other
words, "generic tree operations" might be counter to cleaning this up.

>...
> In the above definition of WORKING, I don't quite understand the comment
> about "overlap with BASE" and "add-with-history". Does it mean, "...
> where the file contents in this tree are those in the BASE tree"? That
> would seem odd to me.

As I said in my other note: it simply means that many of the nodes are
the same. Not that one tree includes another.

> I think the most important concept to represent through one of these
> trees is the user's view of their "working copy": the complete tree that

This is WORKING plus the file modifications that are found in ACTUAL.
In the API when you read file contents for WORKING, they're going to
come from ACTUAL.

ACTUAL is only relevant when you're looking for unversioned or missing nodes.

>...
> Do we want to amend the definition in that way?

I don't think so.

Cheers,
-g


Re: WC-NG: the trees BASE, WORKING and ACTUAL [was: svn commit: r33021 - branches/explore-wc/subversion/libsvn_wc]

Posted by Greg Stein <gs...@gmail.com>.
On Tue, Sep 16, 2008 at 8:42 AM, Julian Foad <ju...@btopenworld.com> wrote:
>...
>> The design is currently along the lines of (2), and the user of the
>> API will never have to redirect. You use one of the two tree APIs
>> based on what you're looking for.
>
> I'll take another look when we've got something more concrete to look
> at.

Haven't you looked at libsvn_wc/wc_db.h yet? It's quite concrete :-P
I'm experimenting with some client code using that API now. I expect
the underlying datastore API to look much like that, unless I run into
severe problems writing stuff on top of it.

Cheers,
-g


Re: WC-NG: the trees BASE, WORKING and ACTUAL [was: svn commit: r33021 - branches/explore-wc/subversion/libsvn_wc]

Posted by Julian Foad <ju...@btopenworld.com>.
On Fri, 2008-09-12 at 09:36 -0700, Greg Stein wrote:
> Hey Julian,
> 
> Thanks for all the thinking about this, but I'm just not seeing it as
> being all that complicated. Near the end of this note, you point out
> more or less what I'm thinking:
> 
> * one BASE tree and its API
> * one API to access the WORKING/ACTUAL "tree" in a blended form

Two trees should be fine as long as the APIs are common wherever
possible. For example, I don't want the "diff BASE against REPOS" API to
be completely different from the "diff WORKING/ACTUAL against REPOS"
API. I was looking to make it simpler by defining independent concepts,
not more complicated!

> The WORKING and ACTUAL trees are conceptually different, but are
> generally used together. I've detailed the potential differences in
> wc-ng-design.
> 
> I also tend to disagree with the notion of trying to work with these
> trees independently. All three are tied to a specific path in the
> local filesystem. Given PATH, you will have an associated BASE tree, a
> WORKING tree, and at PATH on the disk, the ACTUAL tree. I don't see a
> need to work with them independently because that will simply never
> happen (nor need to, afaik).

My example didn't strike a chord with you? OK, we'll see how it goes.

> And note that we generally shouldn't try to construct trees (and
> especially not their contents!) in memory since that is unbounded.
> Yes, I know we do, but we should avoid it whenever possible.

Yes - I pointed that out and the relevance of my example didn't depend
on that.

> The design is currently along the lines of (2), and the user of the
> API will never have to redirect. You use one of the two tree APIs
> based on what you're looking for.

I'll take another look when we've got something more concrete to look
at.

Thanks,
- Julian




Re: WC-NG: the trees BASE, WORKING and ACTUAL [was: svn commit: r33021 - branches/explore-wc/subversion/libsvn_wc]

Posted by Greg Stein <gs...@gmail.com>.
Hey Julian,

Thanks for all the thinking about this, but I'm just not seeing it as
being all that complicated. Near the end of this note, you point out
more or less what I'm thinking:

* one BASE tree and its API
* one API to access the WORKING/ACTUAL "tree" in a blended form

The WORKING and ACTUAL trees are conceptually different, but are
generally used together. I've detailed the potential differences in
wc-ng-design.

I also tend to disagree with the notion of trying to work with these
trees independently. All three are tied to a specific path in the
local filesystem. Given PATH, you will have an associated BASE tree, a
WORKING tree, and at PATH on the disk, the ACTUAL tree. I don't see a
need to work with them independently because that will simply never
happen (nor need to, afaik).

And note that we generally shouldn't try to construct trees (and
especially not their contents!) in memory since that is unbounded.
Yes, I know we do, but we should avoid it whenever possible.
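
As an illustration of walking without building the tree, here is a
small Python sketch. It walks a disk directory as a stand-in for a
tree stored in the WC database; walk_paths is a hypothetical name.
Only the stack of pending paths is held in memory, never the whole
tree and never any file contents.

```python
import os

def walk_paths(root):
    """Yield each path under root, one at a time, using an explicit stack."""
    stack = [root]
    while stack:
        path = stack.pop()      # pull entries off the end of the stack
        yield path
        if os.path.isdir(path) and not os.path.islink(path):
            # Push in reverse so children come off the stack in name order.
            for name in sorted(os.listdir(path), reverse=True):
                stack.append(os.path.join(path, name))
```

A caller handles each path as it arrives, for example
"for p in walk_paths(top): handle(p)", and can stop early without
having paid for the rest of the tree.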

The design is currently along the lines of (2), and the user of the
API will never have to redirect. You use one of the two tree APIs
based on what you're looking for.

Cheers,
-g

On Fri, Sep 12, 2008 at 3:38 AM, Julian Foad <ju...@btopenworld.com> wrote:
> Hi Greg.
>
> Here's a bunch more theoretical waffle from me on the subject! Enjoy :-)
>
>
> STAND-ALONE TREES, OR TREES LINKED INTO A WC
>
> One big distinction I now see is this:
>
> When we define the meaning of one kind of Tree (say WORKING), I prefer
> to define it as a stand-alone entity which can answer questions about
> itself. However, I did say "WORKING gets its file content from ACTUAL",
> which is contrary to that. The alternative design is that the concept of
> "WORKING tree" has meaning only when it is embedded in a WC which links
> it to a corresponding BASE tree and a corresponding ACTUAL tree. In that
> case, it can answer questions that involve getting data from its
> corresponding BASE or ACTUAL tree.
>
> I then went on to suggest as an option that ACTUAL could present the
> properties from WORKING. But it would be a bad idea to have each tree
> depend on the other like this, because it would introduce a cyclic
> dependency between those two trees. That doesn't sound too bad when you
> first think about reading from an existing tree, but when you think
> about preparing some modifications, or especially building a new tree
> from scratch, it would get really hairy.
>
> If we define the trees as stand-alone concepts that can exist with or
> without being linked in to a WC, it becomes relatively easy to build a
> new tree in memory, such as from the "dry run" result of a merge. All
> the tree manipulation functions can be used, and we don't have to link
> this dry-run tree into the WC in order to create and examine it. This
> could remove a whole bunch of complexity that is currently in wc-1.0 to
> handle dry runs of certain client-layer operations such as "merge" which
> the WC would otherwise not need to know so much about.
>
>
> On Thu, 2008-09-11 at 15:27 -0700, Greg Stein wrote:
>> On Thu, Sep 11, 2008 at 1:38 PM, Julian Foad <ju...@btopenworld.com> wrote:
>> >...
>> > Maybe you don't have the same definition of "a tree" as I do. I am
>> > assuming we mean the sort of tree that is described by a Subversion
>> > delta editor. A tree of nodes; each node is either a file or a dir; each
>> > node has properties; each dir has 0 or more child nodes; each file has
>> > content which is a blob of 0 or more bytes.
>>
>> Sure...
>>
>> > When you say, "Files/dirs present, but not in WORKING: unversioned
>> > nodes", what about them? They are part of the ACTUAL tree? Yes, I say.
>>
>> ACTUAL, yes.
>>
>> > When you say, "Files/dirs in WORKING, but not present: missing nodes",
>> > what about them? They are part of the ACTUAL tree? Yes, I say.
>>
>> No. Those nodes are *missing* from ACTUAL. They should be there since
>> WORKING says they should be. Thus, they are missing.
>
> I agree. Sorry, I made a careless copy-n-paste-o mistake here. So, we
> agree that such files/dirs are NOT part of the ACTUAL tree.
>
>> > And files/dirs that are in WORKING and present on disk as nodes of the
>> > correct type? Yes, I say. How about you?
>>
>> In both WORKING and ACTUAL, yes.
>>
>> > And files/dirs that are on disk where WORKING says there's a node of the
>> > other type? Yes, I say. How about you?
>>
>> The node is in both WORKING and ACTUAL, but there is now a problem. I
>> don't know that we have a name for this kind of change. This isn't
>> really unversioned or missing... something else.
>
> "Obstructed" is a word that we use.
>
> But, in terms of defining the meaning of the tree kind "ACTUAL", I don't
> see a problem with saying that the ACTUAL tree contains a directory at
> path "foo/bar", while the corresponding WORKING tree contains a file at
> path "foo/bar" (or a directory, or a symlink, or nothing).
>
> When we want to describe the CHANGE of node kind, that's when we're
> considering the relationship between two kinds of tree rooted at the
> same path. As far as I'm concerned, I am presently concentrating on the
> definition of one kind of tree. We'll come to expressing relationships
> between different kinds later.
>
>> Also: note that we should be talking about symlinks, too. They are
>> moving to a first-order node type in the new WC library.
>
> OK. Good.
>
>> > And the properties of each node are?
>>
>> Whatever WORKING says about the properties. ACTUAL cannot represent them.
>
> Ah... Two different levels of abstraction, perhaps.
>
> Your first answer, "Whatever WORKING says", is the answer to "from a
> high level point of view, what are the properties of the working node at
> PATH?" Indeed, from this point of view, ACTUAL does not have the
> answer[1], and so the desired answer needs to be fetched from WORKING
> instead. The question is, which layer redirects and fetches the answer
> from WORKING instead: the caller, or the ACTUAL tree?
>
> Let design 1 be: the caller redirects its question. The caller has to
> know that ACTUAL does not have the working properties. The model could
> be that the ACTUAL tree says "You ask me? I tell you there are no
> properties." The caller knows to ignore this answer and go elsewhere if
> it really wants to find the WORKING properties.[2]
>
> Let design 2 be: the ACTUAL tree redirects to the WORKING tree. The
> model of the ACTUAL tree is that it knows the properties, even though
> under the hood it has to go to the corresponding WORKING tree to find
> them. This model is very different because the trees are not
> independent. Whenever we ask a question about an ACTUAL tree there has
> to be a corresponding WORKING tree linked to it, or provided by the
> caller.
>
> Let's implement the "svn add" subcommand, in pseudo-Python, assuming
> design (1).
>
>  # Add the disk node at PATH to the tree TREE, recursively.
>  # Make a full in-memory representation, including file contents.
>  #   (That's not a good example of how to implement for real.)
>  # Give every node in the tree no properties.
>
>  def build_actual_tree(tree, path):
>    disk_node_kind = os.get_node_kind(path)
>    if disk_node_kind == file:
>      tree.add_file(path = path,
>                    content = os.readfile(path),
>                    properties = {})
>    elif disk_node_kind == dir:
>      tree.add_dir(path = path,
>                   properties = {})
>      for child_path in os.readdir(path):
>        build_actual_tree(tree, child_path)
>
>  # Take an unversioned "actual" tree NEW_ACTUAL_SUBTREE, and
>  # schedule it for addition in the working copy WC.
>  # Assume NEW_ACTUAL_SUBTREE has no properties, and set the
>  # "working" properties to ones calculated by the auto-props
>  # mechanism.
>
>  def wc.add_unversioned_tree(new_actual_subtree):
>    new_working_subtree = new_actual_subtree.deep_copy()
>    for node in new_working_subtree:
>      assert node.properties == {}
>      node.properties = generate_auto_props(node)
>    new_base_subtree = SvnTreeCreateEmpty()
>    wc.add_subtree(new_base_subtree,
>                   new_working_subtree,
>                   new_actual_subtree)
>
>  # Make the unversioned disk tree at TARGET_PATH become versioned
>  # in the working copy WC which must already include TARGET_PATH's
>  # parent dir as a versioned directory.
>
>  def svn_client_add(wc, target_path):
>    new_actual_subtree = SvnTreeCreateEmpty()
>    build_actual_tree(new_actual_subtree, target_path)
>    wc.add_unversioned_tree(new_actual_subtree)
>
>
> The point I hope it demonstrates is that we can construct and manipulate
> an ACTUAL tree model by itself, and only later link it to a WORKING tree
> and a BASE tree within a WC.
>
> I should repeat the experiment with design (2) and contrast them, but I
> haven't time.
>
>
>> Unversioned nodes (things in ACTUAL, but not WORKING) will (obviously)
>> have no properties.
>
>> >> >> BASE + Subversion-managed changes = WORKING.
>> >> >> WORKING + non-Subversion-managed changes = ACTUAL.
>> >>
>> >> Yup. Note that WORKING *may* include text-mod flags. If somebody does
>> >> an "svn edit", then a flag will get recorded saying "looks like this
>> >> file was modified" (or is likely to have been). But WORKING is purely
>> >> an admin thing. You have to look at ACTUAL to find *real* text mods.
>> >
>> > You're now talking about WORKING including "flags". This is not
>> > impossible: I've wondered whether these "trees" need to be augmented by
>> > bits of metadata like this. So, are you saying that the term "WORKING
>> > tree" defines a set of state recorded in the implementation, rather
>> > than defining an abstract tree concept?
>>
>> Not sure what you mean by "recorded in the implementation". The
>> WORKING tree has a set of flags (and other state) that records its
>> delta from BASE. Simple as that.
>
> Right, in the implementation of the WC library with its metadata store.
> But in the MODEL of the working tree, i.e. what the API user sees when
> asking questions about it, the tree consists of only files and
> directories and properties. Well, and some other metadata about it (its
> relationship to the repository etc.), but it should not expose the flags
> that record its delta from BASE. In other words, I would expect an API
> like
>
>  svn_wc_get_property 'MIME type' of path 'A/foo' in WORKING tree ()
>
> to respond with
>
>  "text/plain"
>
> not with
>
>  "same as in BASE"
>
> Basically, I am talking purely from a black-box perspective as a user of
> the WC library, whereas I think you are talking about what goes on
> inside the WC library.
>
>
> [...]
>> > Whether we need to expose three
>> > trees, to be able to distinguish not only the pristine version but also
>> > between the working version as told to Subversion, and the nodes on disk
>> > as modified outside Subversion, I'm not 100% sure, but it seems
>> > reasonable that we do need to distinguish these.
>>
>> Yes. The BASE is very distinct, and has separate APIs to operate on
>> it. The WORKING/ACTUAL is a much more grey boundary, and the API
>> doesn't try to expose them as two entirely separate trees.
>
> I was trying to say we should expose three trees (BASE, WORKING, ACTUAL)
> separately, but you're saying we should expose two (BASE,
> WORKING/ACTUAL). You may well be right. In that case, we need an
> out-of-band (out-of-tree) mechanism for describing the differences
> between WORKING and ACTUAL.
>
>> Definitely seems that it would be a Good Thing to enumerate how these
>> two trees can differ. It is a finite list. I'll update the doc with
>> that.
>>
>> Cheers,
>> -g
>
> [1] assuming that the definitions of WORKING and ACTUAL are, as we have
> mostly been assuming, fairly closely tied to what data is stored where
> in the implementation. For example, saying that ACTUAL does not directly
> "have" the properties because they are not operating-system artifacts.
>
> [2] Or the model could be that the ACTUAL tree says "You ask me for
> properties? Don't be daft. ERROR!"
>
> - Julian
>
>
>


Re: WC-NG: the trees BASE, WORKING and ACTUAL [was: svn commit: r33021 - branches/explore-wc/subversion/libsvn_wc]

Posted by Julian Foad <ju...@btopenworld.com>.
Hi Greg.

Here's a bunch more theoretical waffle from me on the subject! Enjoy :-)


STAND-ALONE TREES, OR TREES LINKED INTO A WC

One big distinction I now see is this:

When we define the meaning of one kind of Tree (say WORKING), I prefer
to define it as a stand-alone entity which can answer questions about
itself. However, I did say "WORKING gets its file content from ACTUAL",
which is contrary to that. The alternative design is that the concept of
"WORKING tree" has meaning only when it is embedded in a WC which links
it to a corresponding BASE tree and a corresponding ACTUAL tree. In that
case, it can answer questions that involve getting data from its
corresponding BASE or ACTUAL tree.

I then went on to suggest as an option that ACTUAL could present the
properties from WORKING. But it would be a bad idea to have each tree
depend on the other like this, because it would introduce a cyclic
dependency between those two trees. That doesn't sound too bad when you
first think about reading from an existing tree, but when you think
about preparing some modifications, or especially building a new tree
from scratch, it would get really hairy.

If we define the trees as stand-alone concepts that can exist with or
without being linked in to a WC, it becomes relatively easy to build a
new tree in memory, such as from the "dry run" result of a merge. All
the tree manipulation functions can be used, and we don't have to link
this dry-run tree into the WC in order to create and examine it. This
could remove a whole bunch of complexity that is currently in wc-1.0 to
handle dry runs of certain client-layer operations such as "merge" which
the WC would otherwise not need to know so much about.


On Thu, 2008-09-11 at 15:27 -0700, Greg Stein wrote:
> On Thu, Sep 11, 2008 at 1:38 PM, Julian Foad <ju...@btopenworld.com> wrote:
> >...
> > Maybe you don't have the same definition of "a tree" as I do. I am
> > assuming we mean the sort of tree that is described by a Subversion
> > delta editor. A tree of nodes; each node is either a file or a dir; each
> > node has properties; each dir has 0 or more child nodes; each file has
> > content which is a blob of 0 or more bytes.
> 
> Sure...
> 
> > When you say, "Files/dirs present, but not in WORKING: unversioned
> > nodes", what about them? They are part of the ACTUAL tree? Yes, I say.
> 
> ACTUAL, yes.
> 
> > When you say, "Files/dirs in WORKING, but not present: missing nodes",
> > what about them? They are part of the ACTUAL tree? Yes, I say.
> 
> No. Those nodes are *missing* from ACTUAL. They should be there since
> WORKING says they should be. Thus, they are missing.

I agree. Sorry, I made a careless copy-n-paste-o mistake here. So, we
agree that such files/dirs are NOT part of the ACTUAL tree.

> > And files/dirs that are in WORKING and present on disk as nodes of the
> > correct type? Yes, I say. How about you?
> 
> In both WORKING and ACTUAL, yes.
> 
> > And files/dirs that are on disk where WORKING says there's a node of the
> > other type? Yes, I say. How about you?
> 
> The node is in both WORKING and ACTUAL, but there is now a problem. I
> don't know that we have a name for this kind of change. This isn't
> really unversioned or missing... something else.

"Obstructed" is a word that we use.

But, in terms of defining the meaning of the tree kind "ACTUAL", I don't
see a problem with saying that the ACTUAL tree contains a directory at
path "foo/bar", while the corresponding WORKING tree contains a file at
path "foo/bar" (or a directory, or a symlink, or nothing).

When we want to describe the CHANGE of node kind, that's when we're
considering the relationship between two kinds of tree rooted at the
same path. As far as I'm concerned, I am presently concentrating on the
definition of one kind of tree. We'll come to expressing relationships
between different kinds later.
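
For reference, the cases discussed so far can be tabulated in a few
lines of Python. The terms "unversioned", "missing" and "obstructed"
are the ones used in this thread; "normal" and "absent" are labels
assumed here for the remaining cases, and classify is a hypothetical
helper name.

```python
def classify(working_kind, actual_kind):
    """Compare the node kinds at one path; None means 'no node here'.

    Kinds are plain strings such as 'file', 'dir' or 'symlink'.
    """
    if working_kind is None and actual_kind is None:
        return 'absent'
    if working_kind is None:
        return 'unversioned'    # on disk, but WORKING knows nothing of it
    if actual_kind is None:
        return 'missing'        # WORKING says it should be there
    if working_kind != actual_kind:
        return 'obstructed'     # e.g. a dir on disk where WORKING has a file
    return 'normal'
```

Note that each tree on its own only supplies a kind (or nothing) for
the path; the classification exists only when two trees are compared.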

> Also: note that we should be talking about symlinks, too. They are
> moving to a first-order node type in the new WC library.

OK. Good.

> > And the properties of each node are?
> 
> Whatever WORKING says about the properties. ACTUAL cannot represent them.

Ah... Two different levels of abstraction, perhaps.

Your first answer, "Whatever WORKING says", is the answer to "from a
high level point of view, what are the properties of the working node at
PATH?" Indeed, from this point of view, ACTUAL does not have the
answer[1], and so the desired answer needs to be fetched from WORKING
instead. The question is, which layer redirects and fetches the answer
from WORKING instead: the caller, or the ACTUAL tree?

Let design 1 be: the caller redirects its question. The caller has to
know that ACTUAL does not have the working properties. The model could
be that the ACTUAL tree says "You ask me? I tell you there are no
properties." The caller knows to ignore this answer and go elsewhere if
it really wants to find the WORKING properties.[2]

Let design 2 be: the ACTUAL tree redirects to the WORKING tree. The
model of the ACTUAL tree is that it knows the properties, even though
under the hood it has to go to the corresponding WORKING tree to find
them. This model is very different because the trees are not
independent. Whenever we ask a question about an ACTUAL tree there has
to be a corresponding WORKING tree linked to it, or provided by the
caller.
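
The two designs can be contrasted in a short Python sketch. All the
class and function names here are hypothetical, chosen only to show
where the redirection happens.

```python
class WorkingTree:
    """Holds the working properties for each path."""

    def __init__(self, props_by_path):
        self._props = props_by_path

    def get_props(self, path):
        return dict(self._props.get(path, {}))


class StandaloneActual:
    """Design 1: ACTUAL answers for itself and reports no properties."""

    def get_props(self, path):
        return {}


class LinkedActual:
    """Design 2: ACTUAL must be linked to a WORKING tree and redirects."""

    def __init__(self, working):
        self._working = working

    def get_props(self, path):
        return self._working.get_props(path)


def working_props_design1(actual, working, path):
    """Under design 1 the caller knows to ask WORKING, not ACTUAL."""
    return working.get_props(path)
```

StandaloneActual can be constructed with no WORKING tree at all, while
LinkedActual cannot exist without one. That is the dependence described
above.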

Let's implement the "svn add" subcommand, in pseudo-Python, assuming
design (1).

  # Add the disk node at PATH to the tree TREE, recursively.
  # Make a full in-memory representation, including file contents.
  #   (That's not a good example of how to implement for real.)
  # Give every node in the tree no properties.

  def build_actual_tree(tree, path):
    disk_node_kind = os.get_node_kind(path)
    if disk_node_kind == file:
      tree.add_file(path = path,
                    content = os.readfile(path),
                    properties = {})
    elif disk_node_kind == dir:
      tree.add_dir(path = path,
                   properties = {})
      for child_path in os.readdir(path):
        build_actual_tree(tree, child_path)

  # Take an unversioned "actual" tree NEW_ACTUAL_SUBTREE, and
  # schedule it for addition in the working copy WC.
  # Assume NEW_ACTUAL_SUBTREE has no properties, and set the
  # "working" properties to ones calculated by the auto-props
  # mechanism.

  def wc.add_unversioned_tree(new_actual_subtree):
    new_working_subtree = new_actual_subtree.deep_copy()
    for node in new_working_subtree:
      assert node.properties == {}
      node.properties = generate_auto_props(node)
    new_base_subtree = SvnTreeCreateEmpty()
    wc.add_subtree(new_base_subtree,
                   new_working_subtree,
                   new_actual_subtree)
 
  # Make the unversioned disk tree at TARGET_PATH become versioned
  # in the working copy WC which must already include TARGET_PATH's
  # parent dir as a versioned directory.

  def svn_client_add(wc, target_path):
    new_actual_subtree = SvnTreeCreateEmpty()
    build_actual_tree(new_actual_subtree, target_path)
    wc.add_unversioned_tree(new_actual_subtree)


The point I hope it demonstrates is that we can construct and manipulate
an ACTUAL tree model by itself, and only later link it to a WORKING tree
and a BASE tree within a WC.
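
For anyone who wants to run it, here is one way the pseudo-code could
be made executable. It is a sketch under assumptions: trees are plain
dicts mapping path to a Node, generate_auto_props is a made-up rule
standing in for the real auto-props mechanism, and linking the three
trees into a WC is reduced to returning them. As noted above, reading
whole file contents into memory is not how a real implementation
should work.

```python
import os

class Node:
    def __init__(self, kind, content=None):
        self.kind = kind            # 'file' or 'dir'
        self.content = content      # bytes for files, None for dirs
        self.properties = {}

def build_actual_tree(tree, path):
    """Record the disk node at path, and its children, in the dict tree."""
    if os.path.isdir(path):
        tree[path] = Node('dir')
        for name in sorted(os.listdir(path)):
            build_actual_tree(tree, os.path.join(path, name))
    elif os.path.isfile(path):
        with open(path, 'rb') as f:
            tree[path] = Node('file', f.read())

def generate_auto_props(path, node):
    """Hypothetical stand-in for the auto-props mechanism."""
    if node.kind == 'file' and path.endswith('.txt'):
        return {'svn:mime-type': 'text/plain'}
    return {}

def add_unversioned_tree(actual):
    """Return (base, working, actual) for a tree scheduled for addition."""
    working = {}
    for path, node in actual.items():
        assert node.properties == {}
        copy = Node(node.kind, node.content)
        copy.properties = generate_auto_props(path, node)
        working[path] = copy
    base = {}   # the new subtree has no BASE nodes yet
    return base, working, actual

def svn_client_add(target_path):
    actual = {}
    build_actual_tree(actual, target_path)
    return add_unversioned_tree(actual)
```

The ACTUAL tree is built and inspected entirely on its own; WORKING
and BASE only appear in the final step.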

I should repeat the experiment with design (2) and contrast them, but I
haven't time.
      

> Unversioned nodes (things in ACTUAL, but not WORKING) will (obviously)
> have no properties.

> >> >> BASE + Subversion-managed changes = WORKING.
> >> >> WORKING + non-Subversion-managed changes = ACTUAL.
> >>
> >> Yup. Note that WORKING *may* include text-mod flags. If somebody does
> >> an "svn edit", then a flag will get recorded saying "looks like this
> >> file was modified" (or is likely to have been). But WORKING is purely
> >> an admin thing. You have to look at ACTUAL to find *real* text mods.
> >
> > You're now talking about WORKING including "flags". This is not
> > impossible: I've wondered whether these "trees" need to be augmented by
> > bits of metadata like this. So, are you saying that the term "WORKING
> > tree" defines a set of state recorded in the implementation, rather
> > than defining an abstract tree concept?
> 
> Not sure what you mean by "recorded in the implementation". The
> WORKING tree has a set of flags (and other state) that records its
> delta from BASE. Simple as that.

Right, in the implementation of the WC library with its metadata store.
But in the MODEL of the working tree, i.e. what the API user sees when
asking questions about it, the tree consists of only files and
directories and properties. Well, and some other metadata about it (its
relationship to the repository etc.), but it should not expose the flags
that record its delta from BASE. In other words, I would expect an API
like

  svn_wc_get_property 'MIME type' of path 'A/foo' in WORKING tree ()

to respond with

  "text/plain"

not with

  "same as in BASE"

Basically, I am talking purely from a black-box perspective as a user of
the WC library, whereas I think you are talking about what goes on
inside the WC library.
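
To make the black-box view concrete, here is a minimal Python sketch. It
assumes a made-up metadata store in which WORKING properties are recorded
only as deltas against BASE; the class and method names are hypothetical
and are not part of libsvn_wc. The caller always gets a resolved value,
never the internal "same as in BASE" state.

```python
class WorkingTreeView:
    """Hypothetical black-box view of WORKING properties (not a real svn API)."""

    def __init__(self, base_props, prop_deltas):
        # base_props:  {path: {prop_name: value}} as recorded for BASE.
        # prop_deltas: {path: {prop_name: value_or_None}}; None marks a
        #              property deleted in WORKING. A missing entry means
        #              "same as in BASE", which callers never see directly.
        self._base_props = base_props
        self._prop_deltas = prop_deltas

    def get_property(self, path, name):
        """Return the resolved WORKING value, or None if the property is unset."""
        delta = self._prop_deltas.get(path, {})
        if name in delta:
            return delta[name]
        return self._base_props.get(path, {}).get(name)


base = {"A/foo": {"svn:mime-type": "text/plain"}}
unmodified = WorkingTreeView(base, {})
modified = WorkingTreeView(base, {"A/foo": {"svn:mime-type": "text/html"}})
```

Here unmodified.get_property("A/foo", "svn:mime-type") answers with the
BASE value itself, which is the behaviour argued for above.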


[...]
> > Whether we need to expose three
> > trees, to be able to distinguish not only the pristine version but also
> > between the working version as told to Subversion, and the nodes on disk
> > as modified outside Subversion, I'm not 100% sure, but it seems
> > reasonable that we do need to distinguish these.
> 
> Yes. The BASE is very distinct, and has separate APIs to operate on
> it. The WORKING/ACTUAL is a much more grey boundary, and the API
> doesn't try to expose them as two entirely separate trees.

I was trying to say we should expose three trees (BASE, WORKING, ACTUAL)
separately, but you're saying we should expose two (BASE,
WORKING/ACTUAL). You may well be right. In that case, we need an
out-of-band (out-of-tree) mechanism for describing the differences
between WORKING and ACTUAL.

> Definitely seems that it would be a Good Thing to enumerate how these
> two trees can differ. It is a finite list. I'll update the doc with
> that.
> 
> Cheers,
> -g

[1] assuming that the definitions of WORKING and ACTUAL are, as we have
mostly been assuming, fairly closely tied to what data is stored where
in the implementation. For example, saying that ACTUAL does not directly
"have" the properties because they are not operating-system artifacts.

[2] Or the model could be that the ACTUAL tree says "You ask me for
properties? Don't be daft. ERROR!"

- Julian



---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org

Re: WC-NG: the trees BASE, WORKING and ACTUAL [was: svn commit: r33021 - branches/explore-wc/subversion/libsvn_wc]

Posted by Stefan Sperling <st...@elego.de>.
On Thu, Sep 11, 2008 at 03:27:58PM -0700, Greg Stein wrote:
> On Thu, Sep 11, 2008 at 1:38 PM, Julian Foad <ju...@btopenworld.com> wrote:
> > And files/dirs that are on disk where WORKING says there's a node of the
> > other type? Yes, I say. How about you?
> 
> The node is in both WORKING and ACTUAL, but there is now a problem. I
> don't know that we have a name for this kind of change. This isn't
> really unversioned or missing... something else.

I'd put it like this:

  The node in WORKING is obstructed by an incompatible node in ACTUAL.

So we could use "obstructed", or "incompatible", for example.
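
As a rough illustration of how that name would sit next to "unversioned"
and "missing", here is a small Python sketch. It assumes each tree is
reduced to a mapping from path to node kind; the function name and the
status strings are invented for this example and are not wc-ng API.

```python
def classify_nodes(working, actual):
    """Compare two {path: kind} mappings, kind being 'file', 'dir' or 'symlink'.

    Returns {path: status} for every path found in either tree.
    """
    statuses = {}
    for path in sorted(set(working) | set(actual)):
        working_kind = working.get(path)
        actual_kind = actual.get(path)
        if working_kind is None:
            statuses[path] = "unversioned"   # on disk, unknown to WORKING
        elif actual_kind is None:
            statuses[path] = "missing"       # expected by WORKING, not on disk
        elif working_kind != actual_kind:
            statuses[path] = "obstructed"    # both present, kinds disagree
        else:
            statuses[path] = "present"
    return statuses


working = {"file.txt": "file", "gone.txt": "file", "lib": "dir"}
actual = {"file.txt": "dir", "lib": "dir", "notes.tmp": "file"}
statuses = classify_nodes(working, actual)
```

With these inputs file.txt comes out as "obstructed", which is the
rm-then-mkdir case raised elsewhere in this thread.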

Stefan


Re: WC-NG: the trees BASE, WORKING and ACTUAL [was: svn commit: r33021 - branches/explore-wc/subversion/libsvn_wc]

Posted by Greg Stein <gs...@gmail.com>.
On Thu, Sep 11, 2008 at 1:38 PM, Julian Foad <ju...@btopenworld.com> wrote:
>...
> Maybe you don't have the same definition of "a tree" as I do. I am
> assuming we mean the sort of tree that is described by a Subversion
> delta editor. A tree of nodes; each node is either a file or a dir; each
> node has properties; each dir has 0 or more child nodes; each file has
> content which is a blob of 0 or more bytes.

Sure...

> When you say, "Files/dirs present, but not in WORKING: unversioned
> nodes", what about them? They are part of the ACTUAL tree? Yes, I say.

ACTUAL, yes.

> When you say, "Files/dirs in WORKING, but not present: missing nodes",
> what about them? They are part of the ACTUAL tree? Yes, I say.

No. Those nodes are *missing* from ACTUAL. They should be there since
WORKING says they should be. Thus, they are missing.

> And files/dirs that are in WORKING and present on disk as nodes of the
> correct type? Yes, I say. How about you?

In both WORKING and ACTUAL, yes.

> And files/dirs that are on disk where WORKING says there's a node of the
> other type? Yes, I say. How about you?

The node is in both WORKING and ACTUAL, but there is now a problem. I
don't know that we have a name for this kind of change. This isn't
really unversioned or missing... something else.

Also: note that we should be talking about symlinks, too. They are
moving to a first-order node type in the new WC library.

> And the properties of each node are?

Whatever WORKING says about the properties. ACTUAL cannot represent them.

Unversioned nodes (things in ACTUAL, but not WORKING) will (obviously)
have no properties.

>> >> BASE + Subversion-managed changes = WORKING.
>> >> WORKING + non-Subversion-managed changes = ACTUAL.
>>
>> Yup. Note that WORKING *may* include text-mod flags. If somebody does
>> an "svn edit", then a flag will get recorded saying "looks like this
>> file was modified" (or is likely to have been). But WORKING is purely
>> an admin thing. You have to look at ACTUAL to find *real* text mods.
>
> You're now talking about WORKING including "flags". This is not
> impossible: I've wondered whether these "trees" need to be augmented by
> bits of metadata like this. So, are you saying that the term "WORKING
> tree" defines a set of state recorded in the implementation, rather
> than defining an abstract tree concept?

Not sure what you mean by "recorded in the implementation". The
WORKING tree has a set of flags (and other state) that records its
delta from BASE. Simple as that.

>...
>> > represent? It seems to me that it represents an implementation artifact:
>> > the set of modifications that Subversion records explicitly in its
>> > meta-data rather than the modifications that Subversion scans for
>> > dynamically. That's not a distinction of much interest to the higher
>> > layers of software.
>>
>> WORKING is entirely an admin thing. To find the complete set of
>> modifications, you also have to look at the ACTUAL files which
>> correspond to WORKING files. (you don't have to examine the entire
>> ACTUAL tree! ... sometimes unversioned files are irrelevant)
>
> Ahh... I was envisaging these "kinds of tree" as concepts that would be
> visible through the API. That there would be ways to ask through the
> API, "what are the differences between our WORKING tree and our ACTUAL
> tree?" (so I can remind the user that they need to issue some Subversion
> tree-rearrangement commands), or "what is the value of svn:mime-type on
> the WORKING version of file 'foo'?" (so I can display it in an
> appropriate editor).
>
> That's where I want there to be a clear external concept of "I'm asking
> about the user's working version" versus "Now I'm asking the same
> question about the pristine version".

Right. And the API embodies that. There are functions with "_base_" in
the name. Those operate *only* on the BASE tree (aka "pristine").

There are other APIs that operate on the WORKING/ACTUAL tree. The line
here gets a bit fuzzier. The ACTUAL tree can only differ from WORKING
in very limited ways. And the idea of a "modified file in the WORKING
tree" comes from the ACTUAL tree: that is where the contents are
located, where they get modified, and where the state information is
recorded.

> Whether we need to expose three
> trees, to be able to distinguish not only the pristine version but also
> between the working version as told to Subversion, and the nodes on disk
> as modified outside Subversion, I'm not 100% sure, but it seems
> reasonable that we do need to distinguish these.

Yes. The BASE is very distinct, and has separate APIs to operate on
it. The WORKING/ACTUAL is a much more grey boundary, and the API
doesn't try to expose them as two entirely separate trees.

Definitely seems that it would be a Good Thing to enumerate how these
two trees can differ. It is a finite list. I'll update the doc with
that.

Cheers,
-g


Re: WC-NG: the trees BASE, WORKING and ACTUAL [was: svn commit: r33021 - branches/explore-wc/subversion/libsvn_wc]

Posted by Julian Foad <ju...@btopenworld.com>.
On Thu, 2008-09-11 at 11:57 -0700, Greg Stein wrote:
> In short, I'm not seeing any reason to modify the definitions. Mike
> explained them rather well, and I don't see any problems with those
> definitions.

Huh.

> On Thu, Sep 11, 2008 at 10:47 AM, Julian Foad
> <ju...@btopenworld.com> wrote:
> >...
> > Getting warmer. How about:
> >
> >  * ACTUAL is the tree on the local disk, ignoring Subversion
> >    administrative directories, and regarding every node as having
> >    no Subversion properties.
> 
> Eh? That's just what is on disk. I'm not sure that is a relevant tree.
> 
> What is on disk is only relevant in the context of a working copy.
> More specifically, how it relates/differs from WORKING.
> 
> Files/dirs present, but not in WORKING: unversioned nodes
> Files/dirs in WORKING, but not present: missing nodes

Maybe you don't have the same definition of "a tree" as I do. I am
assuming we mean the sort of tree that is described by a Subversion
delta editor. A tree of nodes; each node is either a file or a dir; each
node has properties; each dir has 0 or more child nodes; each file has
content which is a blob of 0 or more bytes.

When you say, "Files/dirs present, but not in WORKING: unversioned
nodes", what about them? They are part of the ACTUAL tree? Yes, I say.

When you say, "Files/dirs in WORKING, but not present: missing nodes",
what about them? They are part of the ACTUAL tree? Yes, I say.

And files/dirs that are in WORKING and present on disk as nodes of the
correct type? Yes, I say. How about you?

And files/dirs that are on disk where WORKING says there's a node of the
other type? Yes, I say. How about you?

And the properties of each node are?


> >...
> >> BASE + Subversion-managed changes = WORKING.
> >> WORKING + non-Subversion-managed changes = ACTUAL.
> 
> Yup. Note that WORKING *may* include text-mod flags. If somebody does
> an "svn edit", then a flag will get recorded saying "looks like this
> file was modified" (or is likely to have been). But WORKING is purely
> an admin thing. You have to look at ACTUAL to find *real* text mods.

You're now talking about WORKING including "flags". This is not
impossible: I've wondered whether these "trees" need to be augmented by
bits of metadata like this. So, are you saying that the term "WORKING
tree" defines a set of state recorded in the implementation, rather
than defining an abstract tree concept?

> > I was also questioning the intent of defining WORKING as a tree that has
> > the BASE file contents. That seems silly: what useful concept does that
> 
> It doesn't "have them" ... it is just that most of the WORKING tree's
> contents == BASE's contents, simply because they haven't been
> modified. There isn't any real "container" or "superset" or anything.

OK, I think we agree there.

> > represent? It seems to me that it represents an implementation artifact:
> > the set of modifications that Subversion records explicitly in its
> > meta-data rather than the modifications that Subversion scans for
> > dynamically. That's not a distinction of much interest to the higher
> > layers of software.
> 
> WORKING is entirely an admin thing. To find the complete set of
> modifications, you also have to look at the ACTUAL files which
> correspond to WORKING files. (you don't have to examine the entire
> ACTUAL tree! ... sometimes unversioned files are irrelevant)

Ahh... I was envisaging these "kinds of tree" as concepts that would be
visible through the API. That there would be ways to ask through the
API, "what are the differences between our WORKING tree and our ACTUAL
tree?" (so I can remind the user that they need to issue some Subversion
tree-rearrangement commands), or "what is the value of svn:mime-type on
the WORKING version of file 'foo'?" (so I can display it in an
appropriate editor).

That's where I want there to be a clear external concept of "I'm asking
about the user's working version" versus "Now I'm asking the same
question about the pristine version". Whether we need to expose three
trees, to be able to distinguish not only the pristine version but also
between the working version as told to Subversion, and the nodes on disk
as modified outside Subversion, I'm not 100% sure, but it seems
reasonable that we do need to distinguish these.

Clearly we're not yet seeing eye to eye. I hope we're getting a bit
closer.

- Julian




Re: WC-NG: the trees BASE, WORKING and ACTUAL [was: svn commit: r33021 - branches/explore-wc/subversion/libsvn_wc]

Posted by Greg Stein <gs...@gmail.com>.
In short, I'm not seeing any reason to modify the definitions. Mike
explained them rather well, and I don't see any problems with those
definitions.

On Thu, Sep 11, 2008 at 10:47 AM, Julian Foad
<ju...@btopenworld.com> wrote:
>...
> Getting warmer. How about:
>
>  * ACTUAL is the tree on the local disk, ignoring Subversion
>    administrative directories, and regarding every node as having
>    no Subversion properties.

Eh? That's just what is on disk. I'm not sure that is a relevant tree.

What is on disk is only relevant in the context of a working copy.
More specifically, how it relates/differs from WORKING.

Files/dirs present, but not in WORKING: unversioned nodes
Files/dirs in WORKING, but not present: missing nodes

>...
>> BASE + Subversion-managed changes = WORKING.
>> WORKING + non-Subversion-managed changes = ACTUAL.

Yup. Note that WORKING *may* include text-mod flags. If somebody does
an "svn edit", then a flag will get recorded saying "looks like this
file was modified" (or is likely to have been). But WORKING is purely
an admin thing. You have to look at ACTUAL to find *real* text mods.

> I was also questioning the intent of defining WORKING as a tree that has
> the BASE file contents. That seems silly: what useful concept does that

It doesn't "have them" ... it is just that most of the WORKING tree's
contents == BASE's contents, simply because they haven't been
modified. There isn't any real "container" or "superset" or anything.

> represent? It seems to me that it represents an implementation artifact:
> the set of modifications that Subversion records explicitly in its
> meta-data rather than the modifications that Subversion scans for
> dynamically. That's not a distinction of much interest to the higher
> layers of software.

WORKING is entirely an admin thing. To find the complete set of
modifications, you also have to look at the ACTUAL files which
correspond to WORKING files. (you don't have to examine the entire
ACTUAL tree! ... sometimes unversioned files are irrelevant)

> I submit that it is much more sensible to define WORKING as I suggested
> in my earlier mail. (i.e. having the disk file contents)

Sorry, but I'm not seeing it.

Thanks,
-g


Re: WC-NG: the trees BASE, WORKING and ACTUAL [was: svn commit: r33021 - branches/explore-wc/subversion/libsvn_wc]

Posted by Erik Huelsmann <eh...@gmail.com>.
On Thu, Sep 11, 2008 at 8:43 PM, C. Michael Pilato <cm...@collab.net> wrote:
> Erik Huelsmann wrote:
>>>> I submit that it is much more sensible to define WORKING as I suggested
>>>> in my earlier mail. (i.e. having the disk file contents)
>>> I agree.  I had considered this in the course of composing my mail, but lost
>>> it by the time I finished editing.  I should have said
>>>
>>>   BASE + Subversion-managed changes + file content modifications = WORKING.
>>>   WORKING + non-textual-and-non-Subversiony-changes = ACTUAL.
>>
>> Well, I would have liked to put it that way, but how does that work in
>> this scenario:
>>
>>  $ rm file.txt
>>  $ mkdir file.txt
>>
>> ?
>
> EVIL!  I suppose you wanna pitch a fork() and a 'chmod 666' into that
> recipe, too, right?!  SICKO!

Great! :-) It's nice to know we now agree that WORKING and ACTUAL are
actually different trees. Too bad this complicates the design of the
working copy though.

Bye,


Erik.


Re: WC-NG: the trees BASE, WORKING and ACTUAL [was: svn commit: r33021 - branches/explore-wc/subversion/libsvn_wc]

Posted by "C. Michael Pilato" <cm...@collab.net>.
Erik Huelsmann wrote:
>>> I submit that it is much more sensible to define WORKING as I suggested
>>> in my earlier mail. (i.e. having the disk file contents)
>> I agree.  I had considered this in the course of composing my mail, but lost
>> it by the time I finished editing.  I should have said
>>
>>   BASE + Subversion-managed changes + file content modifications = WORKING.
>>   WORKING + non-textual-and-non-Subversiony-changes = ACTUAL.
> 
> Well, I would have liked to put it that way, but how does that work in
> this scenario:
> 
>  $ rm file.txt
>  $ mkdir file.txt
> 
> ?

EVIL!  I suppose you wanna pitch a fork() and a 'chmod 666' into that
recipe, too, right?!  SICKO!

-- 
C. Michael Pilato <cm...@collab.net>
CollabNet   <>   www.collab.net   <>   Distributed Development On Demand


Re: WC-NG: the trees BASE, WORKING and ACTUAL [was: svn commit: r33021 - branches/explore-wc/subversion/libsvn_wc]

Posted by Erik Huelsmann <eh...@gmail.com>.
>> I submit that it is much more sensible to define WORKING as I suggested
>> in my earlier mail. (i.e. having the disk file contents)
>
> I agree.  I had considered this in the course of composing my mail, but lost
> it by the time I finished editing.  I should have said
>
>   BASE + Subversion-managed changes + file content modifications = WORKING.
>   WORKING + non-textual-and-non-Subversiony-changes = ACTUAL.

Well, I would have liked to put it that way, but how does that work in
this scenario:

 $ rm file.txt
 $ mkdir file.txt

?

> Are file contents the only exception of note here?  For example, I'm not
> sure how to categorize today's handling of merge reject files.  They live in
> the ACTUAL space, but their absence affects interpretation of the WORKING
> tree (conflicted flag staleness and ultimate irrelevance).

The same problem applies to properties merge failures. I have no
immediate answer.

Bye,

Erik.


Re: WC-NG: the trees BASE, WORKING and ACTUAL [was: svn commit: r33021 - branches/explore-wc/subversion/libsvn_wc]

Posted by "C. Michael Pilato" <cm...@collab.net>.
Julian Foad wrote:
> On Thu, 2008-09-11 at 13:16 -0400, C. Michael Pilato wrote:
>> Julian Foad wrote:
>>> On Thu, 2008-09-11 at 12:51 -0400, Greg Stein wrote:
>>>> Those are the correct definitions, per wc-by-design.
>>> (I assume you mean, "per 'notes/wc-ng-design'".)
>>>
>>> Both of you:
>>>
>>> Huh? I was trying to refine the definitions to something more precise
>>> than "ACTUAL is how the working copy really looks."  :-)
>> I don't know how better to explain it.  Uh... ACTUAL is the state of a
>> working copy directory if you imagine it's not a Subversion working copy?
>> ACTUAL is what you see with shell tools like 'ls'?
> 
> Getting warmer. How about:
> 
>   * ACTUAL is the tree on the local disk, ignoring Subversion
>     administrative directories, and regarding every node as having
>     no Subversion properties.
> 
> (Note that we might want to consider variations such as
> 
>   "... having no Subversion properties except for 'svn:executable' set
> to '*' iff the node's executable-by-owner permission is set."
> 
>   "... except as defined by the user's auto-props configuration."
> 
>   "... excluding nodes with the operating system's 'hidden' flag set."
> )
> 
> 
>> A "missing" file is caused by the ACTUAL directory lacking a file for which
>> the BASE tree contains a file of the same name, and for which the WORKING
>> tree does *not* indicate that the physical file should be absent (such as
>> would be expected were the file scheduled for deletion).
>>
>> BASE + Subversion-managed changes = WORKING.
>> WORKING + non-Subversion-managed changes = ACTUAL.
> 
> I was also questioning the intent of defining WORKING as a tree that has
> the BASE file contents. That seems silly: what useful concept does that
> represent? It seems to me that it represents an implementation artifact:
> the set of modifications that Subversion records explicitly in its
> meta-data rather than the modifications that Subversion scans for
> dynamically. That's not a distinction of much interest to the higher
> layers of software.
> 
> I submit that it is much more sensible to define WORKING as I suggested
> in my earlier mail. (i.e. having the disk file contents)

I agree.  I had considered this in the course of composing my mail, but lost
it by the time I finished editing.  I should have said

   BASE + Subversion-managed changes + file content modifications = WORKING.
   WORKING + non-textual-and-non-Subversiony-changes = ACTUAL.

Are file contents the only exception of note here?  For example, I'm not
sure how to categorize today's handling of merge reject files.  They live in
the ACTUAL space, but their absence affects interpretation of the WORKING
tree (conflicted flag staleness and ultimate irrelevance).

-- 
C. Michael Pilato <cm...@collab.net>
CollabNet   <>   www.collab.net   <>   Distributed Development On Demand


WC-NG: the trees BASE, WORKING and ACTUAL [was: svn commit: r33021 - branches/explore-wc/subversion/libsvn_wc]

Posted by Julian Foad <ju...@btopenworld.com>.
On Thu, 2008-09-11 at 13:16 -0400, C. Michael Pilato wrote:
> Julian Foad wrote:
> > On Thu, 2008-09-11 at 12:51 -0400, Greg Stein wrote:
> >> Those are the correct definitions, per wc-by-design.
> > 
> > (I assume you mean, "per 'notes/wc-ng-design'".)
> > 
> > Both of you:
> > 
> > Huh? I was trying to refine the definitions to something more precise
> > than "ACTUAL is how the working copy really looks."  :-)
> 
> I don't know how better to explain it.  Uh... ACTUAL is the state of a
> working copy directory if you imagine it's not a Subversion working copy?
> ACTUAL is what you see with shell tools like 'ls'?

Getting warmer. How about:

  * ACTUAL is the tree on the local disk, ignoring Subversion
    administrative directories, and regarding every node as having
    no Subversion properties.

(Note that we might want to consider variations such as

  "... having no Subversion properties except for 'svn:executable' set
to '*' iff the node's executable-by-owner permission is set."

  "... except as defined by the user's auto-props configuration."

  "... excluding nodes with the operating system's 'hidden' flag set."
)
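
A minimal Python sketch of that basic definition, for illustration only.
It assumes the administrative directory is named ".svn" and records just
the node kind, since under this definition every node has no properties.
The function names are invented; symlinks are not treated specially.

```python
import os
import tempfile

ADMIN_DIR = ".svn"  # assumption: name of the Subversion admin directory


def read_actual_tree(root):
    """Return {relative_path: kind} for the disk tree under root.

    Admin directories are skipped, and no properties are recorded because
    this definition of ACTUAL regards every node as having none.
    """
    tree = {"": "dir"}
    for dirpath, dirnames, filenames in os.walk(root):
        # Prune in place so os.walk never descends into admin directories.
        dirnames[:] = [d for d in dirnames if d != ADMIN_DIR]
        rel_dir = os.path.relpath(dirpath, root).replace(os.sep, "/")
        prefix = "" if rel_dir == "." else rel_dir + "/"
        for name in dirnames:
            tree[prefix + name] = "dir"
        for name in filenames:
            tree[prefix + name] = "file"
    return tree


def demo():
    """Build a throwaway directory with an admin dir and read it back."""
    with tempfile.TemporaryDirectory() as root:
        os.mkdir(os.path.join(root, ADMIN_DIR))
        open(os.path.join(root, ADMIN_DIR, "entries"), "w").close()
        os.mkdir(os.path.join(root, "A"))
        open(os.path.join(root, "A", "foo"), "w").close()
        return read_actual_tree(root)
```

The variations listed above would be small changes to this walk: an extra
filter for hidden nodes, or a per-node property dict filled in from the
executable bit or from auto-props.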


> A "missing" file is caused by the ACTUAL directory lacking a file for which
> the BASE tree contains a file of the same name, and for which the WORKING
> tree does *not* indicate that the physical file should be absent (such as
> would be expected were the file scheduled for deletion).
> 
> BASE + Subversion-managed changes = WORKING.
> WORKING + non-Subversion-managed changes = ACTUAL.

I was also questioning the intent of defining WORKING as a tree that has
the BASE file contents. That seems silly: what useful concept does that
represent? It seems to me that it represents an implementation artifact:
the set of modifications that Subversion records explicitly in its
meta-data rather than the modifications that Subversion scans for
dynamically. That's not a distinction of much interest to the higher
layers of software.

I submit that it is much more sensible to define WORKING as I suggested
in my earlier mail. (i.e. having the disk file contents)

- Julian




Re: svn commit: r33021 - branches/explore-wc/subversion/libsvn_wc

Posted by "C. Michael Pilato" <cm...@collab.net>.
Julian Foad wrote:
> On Thu, 2008-09-11 at 12:51 -0400, Greg Stein wrote:
>> Those are the correct definitions, per wc-by-design.
> 
> (I assume you mean, "per 'notes/wc-ng-design'".)
> 
> Both of you:
> 
> Huh? I was trying to refine the definitions to something more precise
> than "ACTUAL is how the working copy really looks."  :-)

I don't know how better to explain it.  Uh... ACTUAL is the state of a
working copy directory if you imagine it's not a Subversion working copy?
ACTUAL is what you see with shell tools like 'ls'?

A "missing" file is caused by the ACTUAL directory lacking a file for which
the BASE tree contains a file of the same name, and for which the WORKING
tree does *not* indicate that the physical file should be absent (such as
would be expected were the file scheduled for deletion).

BASE + Subversion-managed changes = WORKING.
WORKING + non-Subversion-managed changes = ACTUAL.
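
Read literally, that definition of "missing" is a three-way membership
test. A minimal sketch, assuming each tree is reduced to a set of paths
and that "WORKING says the file should be absent" is modelled as a set of
paths scheduled for deletion (all names here are made up for illustration):

```python
def is_missing(path, base_paths, scheduled_for_delete, actual_paths):
    """True if BASE has the path, WORKING does not expect it to be absent,
    and ACTUAL (the disk) nevertheless lacks it."""
    return (path in base_paths
            and path not in scheduled_for_delete
            and path not in actual_paths)


base_paths = {"a.txt", "b.txt", "c.txt"}
scheduled_for_delete = {"b.txt"}   # e.g. after "svn rm b.txt"
actual_paths = {"c.txt"}           # a.txt was removed behind Subversion's back
```

Under these assumptions a.txt is missing, while b.txt is not: its absence
from disk is exactly what WORKING expects.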

Is any of this helping?

-- 
C. Michael Pilato <cm...@collab.net>
CollabNet   <>   www.collab.net   <>   Distributed Development On Demand


Re: svn commit: r33021 - branches/explore-wc/subversion/libsvn_wc

Posted by Greg Stein <gs...@gmail.com>.
I know, but can't effectively respond right now. When I get back to my  
laptop...



On Sep 11, 2008, at 13:10, Julian Foad <ju...@btopenworld.com>  
wrote:

> On Thu, 2008-09-11 at 12:51 -0400, Greg Stein wrote:
>> Those are the correct definitions, per wc-by-design.
>
> (I assume you mean, "per 'notes/wc-ng-design'".)
>
> Both of you:
>
> Huh? I was trying to refine the definitions to something more precise
> than "ACTUAL is how the working copy really looks."  :-)
>
> - Julian
>
>
>> On Sep 11, 2008, at 12:18, "C. Michael Pilato" <cm...@collab.net>
>> wrote:
>>
>>> Julian Foad wrote:
>>>> On Thu, 2008-09-11 at 08:27 -0700, Greg Stein wrote:
>>>>> Thanks for the thoughts.
>>>>>
>>>>> Note that we need to be able to iterate over all three trees. This
>>>>> code just iterates over WORKING, so it won't even have to stat(). I'll
>>>>> need to expand this now to be able to iterate over BASE (no stats) and
>>>>> over ACTUAL (use readdir), so I can see what those look like.
>>>>>
>>>>> Now that I think about it, the read_info() needs to be able to signal
>>>>> an unversioned item when iterating over ACTUAL.
>>>>
>>>> When we say "WORKING" and "ACTUAL", I'm wondering what exactly we mean
>>>> by each of them.
>>>
>>> My guess:  BASE is what BASE is today.  WORKING is how the working copy
>>> ought to look given BASE plus the state changes recorded by the working copy
>>> library.  ACTUAL is how the working copy really looks.
>>>
>>>  $ svn rm file.txt
>>>  $ touch file.txt
>>>
>>> BASE has file.txt.  WORKING has not.  ACTUAL has.
>>>
>>> If I'm wrong, and if my guess is the same as Julian's, we might need some
>>> clearer terms.
>
>
>


Re: svn commit: r33021 - branches/explore-wc/subversion/libsvn_wc

Posted by Julian Foad <ju...@btopenworld.com>.
On Thu, 2008-09-11 at 12:51 -0400, Greg Stein wrote:
> Those are the correct definitions, per wc-by-design.

(I assume you mean, "per 'notes/wc-ng-design'".)

Both of you:

Huh? I was trying to refine the definitions to something more precise
than "ACTUAL is how the working copy really looks."  :-)

- Julian


> On Sep 11, 2008, at 12:18, "C. Michael Pilato" <cm...@collab.net>  
> wrote:
> 
> > Julian Foad wrote:
> >> On Thu, 2008-09-11 at 08:27 -0700, Greg Stein wrote:
> >>> Thanks for the thoughts.
> >>>
> >>> Note that we need to be able to iterate over all three trees. This
> >>> code just iterates over WORKING, so it won't even have to stat(). I'll
> >>> need to expand this now to be able to iterate over BASE (no stats) and
> >>> over ACTUAL (use readdir), so I can see what those look like.
> >>>
> >>> Now that I think about it, the read_info() needs to be able to signal
> >>> an unversioned item when iterating over ACTUAL.
> >>
> >> When we say "WORKING" and "ACTUAL", I'm wondering what exactly we mean
> >> by each of them.
> >
> > My guess:  BASE is what BASE is today.  WORKING is how the working copy
> > ought to look given BASE plus the state changes recorded by the working copy
> > library.  ACTUAL is how the working copy really looks.
> >
> >   $ svn rm file.txt
> >   $ touch file.txt
> >
> > BASE has file.txt.  WORKING has not.  ACTUAL has.
> >
> > If I'm wrong, and if my guess is the same as Julian's, we might need some
> > clearer terms.





Re: svn commit: r33021 - branches/explore-wc/subversion/libsvn_wc

Posted by Greg Stein <gs...@gmail.com>.
Those are the correct definitions, per wc-by-design.



On Sep 11, 2008, at 12:18, "C. Michael Pilato" <cm...@collab.net>  
wrote:

> Julian Foad wrote:
>> On Thu, 2008-09-11 at 08:27 -0700, Greg Stein wrote:
>>> Thanks for the thoughts.
>>>
>>> Note that we need to be able to iterate over all three trees. This
>>> code just iterates over WORKING, so it won't even have to stat(). I'll
>>> need to expand this now to be able to iterate over BASE (no stats) and
>>> over ACTUAL (use readdir), so I can see what those look like.
>>>
>>> Now that I think about it, the read_info() needs to be able to signal
>>> an unversioned item when iterating over ACTUAL.
>>
>> When we say "WORKING" and "ACTUAL", I'm wondering what exactly we mean
>> by each of them.
>
> My guess:  BASE is what BASE is today.  WORKING is how the working copy
> ought to look given BASE plus the state changes recorded by the working copy
> library.  ACTUAL is how the working copy really looks.
>
>   $ svn rm file.txt
>   $ touch file.txt
>
> BASE has file.txt.  WORKING has not.  ACTUAL has.
>
> If I'm wrong, and if my guess is the same as Julian's, we might need some
> clearer terms.
>
> -- 
> C. Michael Pilato <cm...@collab.net>
> CollabNet   <>   www.collab.net   <>   Distributed Development On Demand
>
>


Re: svn commit: r33021 - branches/explore-wc/subversion/libsvn_wc

Posted by "C. Michael Pilato" <cm...@collab.net>.
Julian Foad wrote:
> On Thu, 2008-09-11 at 08:27 -0700, Greg Stein wrote:
>> Thanks for the thoughts.
>>
>> Note that we need to be able to iterate over all three trees. This
>> code just iterates over WORKING, so it won't even have to stat(). I'll
>> need to expand this now to be able to iterate over BASE (no stats) and
>> over ACTUAL (use readdir), so I can see what those look like.
>>
>> Now that I think about it, the read_info() needs to be able to signal
>> an unversioned item when iterating over ACTUAL.
> 
> When we say "WORKING" and "ACTUAL", I'm wondering what exactly we mean
> by each of them.

My guess:  BASE is what BASE is today.  WORKING is how the working copy
ought to look given BASE plus the state changes recorded by the working copy
library.  ACTUAL is how the working copy really looks.

   $ svn rm file.txt
   $ touch file.txt

BASE has file.txt.  WORKING has not.  ACTUAL has.

If I'm wrong, and if my guess is the same as Julian's, we might need some
clearer terms.
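
One way to make the three terms concrete is a small C sketch that tracks,
for a single path, which of the trees contain it. The struct and function
names are hypothetical and are not part of libsvn_wc. The sketch assumes
that 'svn rm' both records the deletion and removes the file from disk,
while a plain 'touch' changes only the disk.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model: which of the three trees contains a given node. */
struct node_presence {
    bool in_base;     /* pristine state, as checked out or last updated */
    bool in_working;  /* BASE plus changes recorded by 'svn ...' commands */
    bool in_actual;   /* what is really on the local disk */
};

/* 'svn rm': the WC library records the deletion and removes the file. */
static void sketch_svn_rm(struct node_presence *n)
{
    assert(n != NULL);
    n->in_working = false;
    n->in_actual = false;
}

/* Plain 'touch': only the disk changes; nothing is recorded. */
static void sketch_touch(struct node_presence *n)
{
    assert(n != NULL);
    n->in_actual = true;
}
```

Starting from a node present in all three trees, calling sketch_svn_rm()
and then sketch_touch() leaves it in BASE and ACTUAL but not in WORKING,
which matches the file.txt example.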

-- 
C. Michael Pilato <cm...@collab.net>
CollabNet   <>   www.collab.net   <>   Distributed Development On Demand


Re: svn commit: r33021 - branches/explore-wc/subversion/libsvn_wc

Posted by Julian Foad <ju...@btopenworld.com>.
On Thu, 2008-09-11 at 08:27 -0700, Greg Stein wrote:
> Thanks for the thoughts.
> 
> Note that we need to be able to iterate over all three trees. This
> code just iterates over WORKING, so it won't even have to stat(). I'll
> need to expand this now to be able to iterate over BASE (no stats) and
> over ACTUAL (use readdir), so I can see what those look like.
> 
> Now that I think about it, the read_info() needs to be able to signal
> an unversioned item when iterating over ACTUAL.

When we say "WORKING" and "ACTUAL", I'm wondering what exactly we mean
by each of them.

From notes/wc-ng-design: 
>  * WORKING: The tree as it is in modified form, based on the
>      administrative information recorded by the transforming
>      'svn ..' commands
>      Note: This tree will -as far as text bases goes- generally
>            overlap with BASE, but isn't required to;
>            e.g. "add-with-history"
> 
>  * ACTUAL: The tree as it is in modified form on the local disk.
>      This tree may differ from WORKING when having been modified
>      with non-Subversion transforming commands (such as plain 'rm').

That definition of ACTUAL is a rather special case of a "tree": it is
not a complete Subversion tree on its own, because it doesn't know
anything about properties.

I feel that each definition of a "tree" needs to be the definition of a
complete tree including properties and file contents, so that it is
meaningful to perform any generic tree operation such as iteration, diff
against another tree, recursive propget, etc. without throwing an error.
If so, then maybe we ought to say that ACTUAL is defined as the above,
with a note that in this tree's view there are no properties on any
node.
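
As a rough illustration of that idea, the C sketch below defines a property
query that is valid for all three trees, with ACTUAL reporting no
properties rather than failing. The enum, struct, and function names are
hypothetical and are not taken from the svn_wc__db API.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical identifiers; not part of the svn_wc__db API. */
enum sketch_tree { SKETCH_BASE, SKETCH_WORKING, SKETCH_ACTUAL };

struct sketch_node {
    size_t base_prop_count;     /* properties in the pristine state */
    size_t working_prop_count;  /* properties after recorded local changes */
};

/* A generic "how many properties?" query that is defined for every tree:
   ACTUAL simply reports none instead of raising an error. */
static size_t
sketch_prop_count(const struct sketch_node *node, enum sketch_tree tree)
{
    assert(node != NULL);
    switch (tree) {
    case SKETCH_BASE:    return node->base_prop_count;
    case SKETCH_WORKING: return node->working_prop_count;
    case SKETCH_ACTUAL:  return 0;
    }
    return 0;
}
```

With this shape, a recursive propget or a tree-to-tree diff can treat all
three trees uniformly and needs no special error path for ACTUAL.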

In the above definition of WORKING, I don't quite understand the comment
about "overlap with BASE" and "add-with-history". Does it mean, "...
where the file contents in this tree are those in the BASE tree"? That
would seem odd to me.

I think the most important concept to represent through one of these
trees is the user's view of their "working copy": the complete tree that
they think of as the one on which Subversion will perform operations
such as commit, diff, status and propset. This could be defined as:

  * WORKING: The tree that represents the user's view of their
      Subversion working copy with their local modifications. That is,
      the tree structure and properties defined by the administrative
      information recorded by the transforming 'svn ...' commands,
      and the file content on the local disk. (Where a file cannot
      be accessed because the tree structure on the local disk does
      not accord, ...?)

Do we want to amend the definition in that way?

(If we were not to define ACTUAL and WORKING as complete trees, but were
to define them partially and require the higher layer libraries to
access both WORKING and ACTUAL and combine the results themselves, that
would be escalating a task that is, I think, the WC's responsibility.)

- Julian


> On Thu, Sep 11, 2008 at 7:26 AM, Erik Huelsmann <eh...@gmail.com> wrote:
> > Hi Greg,
> >
> >> +svn_error_t *
> >> +generic_walker(svn_wc__db_t *db,
> >> +               const char *path,
> >> +               walker_func_t walk_func,
> >> +               void *walk_baton,
> >> +               apr_pool_t *scratch_pool)
> >...
> >
> > As I see it, one of the current problems in the WC is that we use a
> > database to determine what *should* be there, then use stat() to
> > find out if it's actually there. This is one of our biggest
> > performance problems.
> >
> > I have been thinking that we could try to shift that around: using
> > readdir() to determine what's there and then using the database to
> > find out if our administration agrees. This way, stat()s could be
> > concentrated, probably even optimized by the OS, instead of requiring
> > single stat() calls for everything we want to know.
> >
> > Not that there's anything wrong with the design above; even with what
> > I describe, there may still be use cases for this API. I just wanted to
> > point out that using a BASE or WORKING tree as the basis to operate
> > from (instead of the ACTUAL tree) may perpetuate existing problems.




Re: svn commit: r33021 - branches/explore-wc/subversion/libsvn_wc

Posted by Greg Stein <gs...@gmail.com>.
Thanks for the thoughts.

Note that we need to be able to iterate over all three trees. This
code just iterates over WORKING, so it won't even have to stat(). I'll
need to expand this now to be able to iterate over BASE (no stats) and
over ACTUAL (use readdir), so I can see what those look like.

Now that I think about it, the read_info() needs to be able to signal
an unversioned item when iterating over ACTUAL.
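
A minimal C sketch of that signal, assuming the walker already has the
names produced by readdir() and the child names recorded by the
administration. The enum and function names are hypothetical; the real
svn_wc__db_kind_t and read_info() may look quite different.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical result kinds; the real svn_wc__db_kind_t may differ. */
enum sketch_kind {
    SKETCH_KIND_VERSIONED,    /* on disk and known to the administration */
    SKETCH_KIND_UNVERSIONED   /* on disk but not recorded anywhere */
};

/* Linear search; good enough for a sketch. */
static bool
name_in_list(const char *name, const char *const *list, size_t count)
{
    for (size_t i = 0; i < count; i++)
        if (strcmp(name, list[i]) == 0)
            return true;
    return false;
}

/* Classify one name returned by readdir() against the recorded children. */
static enum sketch_kind
sketch_classify(const char *disk_name,
                const char *const *recorded, size_t n_recorded)
{
    assert(disk_name != NULL);
    return name_in_list(disk_name, recorded, n_recorded)
           ? SKETCH_KIND_VERSIONED
           : SKETCH_KIND_UNVERSIONED;
}
```

An ACTUAL walk would call sketch_classify() once per directory entry, so
the disk listing drives the iteration and the database is only consulted
to confirm what was found.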

Cheers,
-g

On Thu, Sep 11, 2008 at 7:26 AM, Erik Huelsmann <eh...@gmail.com> wrote:
> Hi Greg,
>
>> +svn_error_t *
>> +generic_walker(svn_wc__db_t *db,
>> +               const char *path,
>> +               walker_func_t walk_func,
>> +               void *walk_baton,
>> +               apr_pool_t *scratch_pool)
>...
>
> As I see it, one of the current problems in the WC is that we use a
> database to determine what *should* be there, then use stat() to
> find out if it's actually there. This is one of our biggest
> performance problems.
>
> I have been thinking that we could try to shift that around: using
> readdir() to determine what's there and then using the database to
> find out if our administration agrees. This way, stat()s could be
> concentrated, probably even optimized by the OS, instead of requiring
> single stat() calls for everything we want to know.
>
> Not that there's anything wrong with the design above; even with what
> I describe, there may still be use cases for this API. I just wanted to
> point out that using a BASE or WORKING tree as the basis to operate
> from (instead of the ACTUAL tree) may perpetuate existing problems.
>
> It's just one of those mind-dumps. Sorry if this is completely out of
> context. :-)
>
>
> Bye,
>
> Erik.
>
