Posted to dev@subversion.apache.org by Julian Foad <ju...@wandisco.com> on 2010/08/12 17:11:30 UTC

[PATCH] Tweak description of BASE_NODE table

Can someone check this before I commit?

[[[
Index: subversion/libsvn_wc/wc_db.h
===================================================================
--- subversion/libsvn_wc/wc_db.h	(revision 984862)
+++ subversion/libsvn_wc/wc_db.h	(working copy)
@@ -438,14 +438,19 @@ svn_wc__db_get_wcroot(const char **wcroo
 
 /* @defgroup svn_wc__db_base  BASE tree management
 
-   BASE should be what we get from the server. The *absolute* pristine copy.
-   Nothing can change it -- it is always a reflection of the repository.
+   BASE is what we get from the server.  It is the *absolute* pristine copy.
    You need to use checkout, update, switch, or commit to alter your view of
    the repository.
 
+   In the BASE tree, each node corresponds to a particular node-rev in the
+   repository.  It can be a mixed-revision tree.  Each node holds either a
+   copy of the node-rev as it exists in the repository (if presence =
+   'normal'), or a place-holder (if presence = 'absent' or 'excluded' or
+   'not-present').
+
    @{
 */
]]]

- Julian


Re: [PATCH] Tweak description of BASE_NODE table

Posted by Julian Foad <ju...@wandisco.com>.
On Thu, 2010-08-12, Greg Stein wrote:
> Looks good!

Committed.  Now I'm trying to understand and document BASE_NODE in more
detail first.  Then I/we can tackle the other tables.  In particular, it
should give me (and anyone else who may be struggling to follow) a
starting point for understanding the NODE_DATA change.

I want to help write a top-down description that new developers can
understand, and at the same time I'm aiming for something we can use as
a *definitive* specification when it comes to deciding what's a bug,
writing unit tests, etc., so it must not gloss over any important
details.

My starting point is trying to describe what I think it means now.  I
haven't tested each statement against current behaviour.  If we find any
unnecessary quirks and exceptions, I want to first document them as
such, and then we can decide whether and when to eliminate them.

Are any parts of this inaccurate?


[[[
Index: subversion/libsvn_wc/wc-metadata.sql
===================================================================
--- subversion/libsvn_wc/wc-metadata.sql	(revision 985836)
+++ subversion/libsvn_wc/wc-metadata.sql	(working copy)
@@ -73,21 +73,111 @@ CREATE TABLE WCROOT (
 
 CREATE UNIQUE INDEX I_LOCAL_ABSPATH ON WCROOT (local_abspath);
 
 
 /* ------------------------------------------------------------------------- */
 
+/*
+The BASE_NODE table:
+
+  BASE is what we get from the server.  It is the *absolute* pristine copy.
+  You need to use checkout, update, switch, or commit to alter your view of
+  the repository.
+
+  In the BASE tree, each node corresponds to a particular node-rev in the
+  repository.  It can be a mixed-revision tree.  Each node holds either a
+  copy of the node-rev as it exists in the repository (if presence ==
+  'normal'), or a place-holder (if presence == 'absent' or 'excluded' or
+  'not-present').
+                                                  [Quoted from wc_db.h]
+
+Overview of BASE_NODE columns:
+
+  Indexing columns: (wc_id, local_relpath, parent_relpath)
+
+  (presence)
+    - One of the following values:
+
+      'presence'      Meaning       Node-Rev?     Content?  Last-Change?
+      ----------      -----------   -----------   --------  ------------
+      normal      =>  Present       Existing      Yes       Yes
+      absent      =>  Unauthz       Existing      No        No
+      excluded    =>  Unwanted      Existing      No        No
+      not-pres    =>  Nonexistent   Nonexistent   No        No
+      incomplete  =>  ### undefined
+
+  Node-Rev columns: (repos_id, repos_relpath, revnum)
+    - Always points to the corresponding repository node-rev.
+    - Points to an existing node-rev, unless presence==not-present in which
+      case it points to a nonexistent node-rev.
+    - ### A comment on 'repos_id' and 'repos_relpath' says they may be null;
+      is this true and wanted?
+    - ### A comment on 'revnum' says, "this could be NULL for non-present
+      nodes -- no info"; is this true and wanted?
+
+  Content columns: (kind, properties, depth, symlink_target, checksum)
+    - One of:       ----  ----------  -----  --------------  --------
+                    'dir'    Yes      Yes    null            null
+                'symlink'    Yes      null   Yes             null
+                   'file'    Yes      null   null            Yes
+                'unknown'    null     null   null            null
+    - Content is present iff presence==normal, otherwise kind==unknown and
+      the other columns are null.
+    - If kind==dir: the children are represented by the existence of other
+      BASE_NODE rows.  For each immediate child of 'repos_relpath'@'revnum'
+      that is included by 'depth', a BASE_NODE row exists with its
+      'local_relpath' being this node's 'local_relpath' plus the child's
+      basename.  (Rows may also exist for additional children which are
+      outside the scope of 'depth' or do not exist as children of this
+      node-rev in the repository, including 'externals' and paths updated to
+      a revision in which they do exist.)  There is no distinction between
+      depth=immediates and depth=infinity here.
+    - If kind==symlink: the target path is contained in 'symlink_target'.
+    - If kind==file: the content is contained in the Pristine Store,
+      referenced by its SHA-1 checksum 'checksum'.
+
+  Last-Change columns: (changed_rev, changed_date, changed_author)
+    - Last-Change info is present iff presence==normal, otherwise null.
+    - Specifies the revision in which the content was last changed before
+      Node-Rev, following copies and not counting the copy operation itself
+      as a change.
+    - Does not specify the revision in which this node first appeared at
+      the repository path 'repos_relpath', which could be more recent than
+      the last change of this node's content.
+    - Includes a copy of the corresponding date and author rev-props.
+
+  Working file status: (translated_size, last_mod_time)
+    - Present iff kind==file and node has no WORKING_NODE row, otherwise
+      null.  (If kind==file and node has a WORKING_NODE row, the info is
+      recorded in that row.)  ### True?
+    - Records the status of the working file on disk, for the purpose of
+      detecting quickly whether that file has been modified.
+    - Logically belongs to the ACTUAL_NODE table but is recorded in the
+      BASE_NODE and WORKING_NODE tables instead to avoid the overhead of
+      storing an ACTUAL_NODE row for each unmodified file.
+    - Records the actual size and mod-time of the disk file at the time when
+      its content was last determined to be logically unmodified relative to
+      its base, taking account of keywords and EOL style.
+
+  (dav_cache)
+
+  (incomplete_children)
+    - Obsolete, unused.
+
+  (file_external)
+*/
+
 CREATE TABLE BASE_NODE (
   /* specifies the location of this node in the local filesystem. wc_id
      implies an absolute path, and local_relpath is relative to that
      location (meaning it will be "" for the wcroot). */
   wc_id  INTEGER NOT NULL REFERENCES WCROOT (id),
   local_relpath  TEXT NOT NULL,
 
-  /* the repository this node is part of, and the relative path [to its
-     root] within revision "revnum" of that repository.  These may be NULL,
+  /* The repository this node is part of, and the relative path (from its
+     root) within revision "revnum" of that repository.  These may be NULL,
      implying they should be derived from the parent and local_relpath.
      Non-NULL typically indicates a switched node.
 
      Note: they must both be NULL, or both non-NULL. */
   repos_id  INTEGER REFERENCES REPOSITORY (id),
   repos_relpath  TEXT,
@@ -109,20 +199,18 @@ CREATE TABLE BASE_NODE (
      this could be NULL for non-present nodes -- no info. */
   revnum  INTEGER,
 
   /* If this node is a file, then the SHA-1 checksum of the pristine text. */
   checksum  TEXT,
 
-  /* The size in bytes of the working file when it had no local text
-     modifications. This means the size of the text when translated from
-     repository-normal format to working copy format with EOL style
-     translated and keywords expanded according to the properties in the
-     "properties" column of this row.
+  /* The size in bytes of the working file when it was last determined to be
+     logically unmodified relative to its base, taking account of keywords
+     and EOL style.
 
-     NULL if this node is not a file or if the size has not (yet) been
-     computed. */
+     NULL if this node is not a file or if this info has not yet been
+     determined. */
   translated_size  INTEGER,
 
   /* Information about the last change to this node. changed_rev must be
      not-null if this node has presence=="normal". changed_date and
      changed_author may be null if the corresponding revprops are missing.
 
@@ -134,18 +222,25 @@ CREATE TABLE BASE_NODE (
   /* NULL depth means "default" (typically svn_depth_infinity) */
   depth  TEXT,
 
   /* for kind==symlink, this specifies the target. */
   symlink_target  TEXT,
 
+  /* The mod-time of the working file when it was last determined to be
+     logically unmodified relative to its base, taking account of keywords
+     and EOL style.
+
+     NULL if this node is not a file or if this info has not yet been
+     determined.
+   */
   /* ### Do we need this?  We've currently got various mod time APIs
      ### internal to libsvn_wc, but those might be used in answering some
      ### question which is better answered some other way. */
   last_mod_time  INTEGER,  /* an APR date/time (usec since 1970) */
 
-  /* serialized skel of this node's properties. could be NULL if we
+  /* serialized skel of this node's properties. NULL if we
      have no information about the properties (a non-present node). */
   properties  BLOB,
 
   /* serialized skel of this node's dav-cache.  could be NULL if the
      node does not have any dav-cache. */
   dav_cache  BLOB,

]]]
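
The presence/kind rules in the draft comment above can be read as a set of
per-row invariants.  The following is only a sketch of how such a check
might look, for instance as the basis of a unit test.  Every name in it
(the enums, 'struct base_node_row', 'base_node_row_matches_draft') is
hypothetical and does not exist in libsvn_wc; NULL pointers stand in for
SQL NULL.  It assumes the draft description is accurate, treats
'incomplete' as unchecked because the draft leaves it undefined, and
allows a NULL 'depth' on directories because the schema comment says NULL
depth means "default".

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of one BASE_NODE row; not a libsvn_wc type. */
enum base_presence {
  PRESENCE_NORMAL, PRESENCE_ABSENT, PRESENCE_EXCLUDED,
  PRESENCE_NOT_PRESENT, PRESENCE_INCOMPLETE
};
enum base_kind { KIND_DIR, KIND_SYMLINK, KIND_FILE, KIND_UNKNOWN };

struct base_node_row {
  enum base_presence presence;
  enum base_kind kind;
  const char *properties;      /* NULL models SQL NULL */
  const char *depth;
  const char *symlink_target;
  const char *checksum;
  bool has_changed_rev;        /* false models a NULL changed_rev */
};

/* Return true if ROW satisfies the invariants stated in the draft
   description of BASE_NODE.  This is an assumption-laden sketch, not
   a statement of what the working copy library actually enforces. */
bool base_node_row_matches_draft(const struct base_node_row *row)
{
  assert(row != NULL);

  /* The draft marks 'incomplete' as undefined, so nothing is checked. */
  if (row->presence == PRESENCE_INCOMPLETE)
    return true;

  /* Place-holder rows: no content and no last-change info. */
  if (row->presence != PRESENCE_NORMAL)
    return row->kind == KIND_UNKNOWN
           && row->properties == NULL && row->depth == NULL
           && row->symlink_target == NULL && row->checksum == NULL
           && !row->has_changed_rev;

  /* presence == normal: properties and changed_rev must be present. */
  if (row->properties == NULL || !row->has_changed_rev)
    return false;

  switch (row->kind)
    {
    case KIND_DIR:
      /* depth may be NULL here, meaning "default". */
      return row->symlink_target == NULL && row->checksum == NULL;
    case KIND_SYMLINK:
      return row->depth == NULL && row->symlink_target != NULL
             && row->checksum == NULL;
    case KIND_FILE:
      return row->depth == NULL && row->symlink_target == NULL
             && row->checksum != NULL;
    default:
      return false;  /* kind==unknown is not valid with presence==normal */
    }
}
```

If any of the "###" questions in the draft are answered differently, the
corresponding branch would need to change with them.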


wc-metadata.sql looks like a good home for this documentation for the
time being.  Perhaps it will eventually live separately.

- Julian


Re: [PATCH] Tweak description of BASE_NODE table

Posted by Greg Stein <gs...@gmail.com>.
Looks good!

On Aug 12, 2010 1:12 PM, "Julian Foad" <ju...@wandisco.com> wrote:
> Can someone check this before I commit?
>
> [[[
> Index: subversion/libsvn_wc/wc_db.h
> ===================================================================
> --- subversion/libsvn_wc/wc_db.h (revision 984862)
> +++ subversion/libsvn_wc/wc_db.h (working copy)
> @@ -438,14 +438,19 @@ svn_wc__db_get_wcroot(const char **wcroo
>
> /* @defgroup svn_wc__db_base BASE tree management
>
> - BASE should be what we get from the server. The *absolute* pristine copy.
> - Nothing can change it -- it is always a reflection of the repository.
> + BASE is what we get from the server. It is the *absolute* pristine copy.
> You need to use checkout, update, switch, or commit to alter your view of
> the repository.
>
> + In the BASE tree, each node corresponds to a particular node-rev in the
> + repository. It can be a mixed-revision tree. Each node holds either a
> + copy of the node-rev as it exists in the repository (if presence =
> + 'normal'), or a place-holder (if presence = 'absent' or 'excluded' or
> + 'not-present').
> +
> @{
> */
> ]]]
>
> - Julian
>
>