You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@subversion.apache.org by st...@apache.org on 2011/07/30 03:18:21 UTC

svn commit: r1152410 - in /subversion/trunk/subversion/libsvn_wc: wc-queries.sql wc_db.c

Author: stsp
Date: Sat Jul 30 01:18:20 2011
New Revision: 1152410

URL: http://svn.apache.org/viewvc?rev=1152410&view=rev
Log:
Store moved-to relpaths in the BASE tree (op_depth = 0) instead of
nodes_current. All move operations are relative to the BASE, so having
moved-to information in op_depth layer 0 is semantically more correct.

It will also make it easier to deal with stuff like cyclic moves and
replacements later. E.g. before this commit moved-to information was lost
if the delete-half of a move was replaced, and fixing this as a special
case in the code that adds the replacing node would have been ugly.

Also, clear moved-to relpaths from the BASE tree during 'revert' so
we don't leave phantom moved-to information in the DB (are there any
other places where we need to clear it?).

With help, ideas, testing, and lots of sanity-checking by Neels.

* subversion/libsvn_wc/wc-queries.sql
  (STMT_INSERT_DELETE_NODE): Drop MOVED_TO parameter, we don't use this
   query anymore for adding moved-to relpaths because this query does not
   operate on BASE.
  (STMT_SELECT_MOVED_FROM_RELPATH,
   STMT_SELECT_MOVED_HERE_CHILDREN): Look in BASE instead of nodes_current.
  (STMT_UPDATE_MOVED_TO_RELPATH): Update BASE instead of nodes_current.
  (STMT_CLEAR_MOVED_TO_RELPATH,
   STMT_CLEAR_MOVED_TO_RELPATH_RECURSIVE): New statements for op_revert
    and op_revert_recursive.

* subversion/libsvn_wc/wc_db.c
  (op_revert_txn, op_revert_recursive_txn): Clear moved-to relpaths in BASE.
  (op_delete_txn): Use STMT_UPDATE_MOVED_TO_RELPATH instead of
   STMT_INSERT_DELETE_NODE to insert/update moved-to relpaths.

Patch by: me
          neels

Modified:
    subversion/trunk/subversion/libsvn_wc/wc-queries.sql
    subversion/trunk/subversion/libsvn_wc/wc_db.c

Modified: subversion/trunk/subversion/libsvn_wc/wc-queries.sql
URL: http://svn.apache.org/viewvc/subversion/trunk/subversion/libsvn_wc/wc-queries.sql?rev=1152410&r1=1152409&r2=1152410&view=diff
==============================================================================
--- subversion/trunk/subversion/libsvn_wc/wc-queries.sql (original)
+++ subversion/trunk/subversion/libsvn_wc/wc-queries.sql Sat Jul 30 01:18:20 2011
@@ -817,9 +817,9 @@ WHERE wc_id = ?1
 
 -- STMT_INSERT_DELETE_NODE
 INSERT INTO nodes (
-    wc_id, local_relpath, op_depth, parent_relpath, presence, kind, moved_to)
+    wc_id, local_relpath, op_depth, parent_relpath, presence, kind)
 SELECT wc_id, local_relpath, ?4 /*op_depth*/, parent_relpath, 'base-deleted',
-       kind, ?5 /* moved_to */
+       kind
 FROM nodes
 WHERE wc_id = ?1
   AND local_relpath = ?2
@@ -1332,19 +1332,29 @@ WHERE wc_id = ?1
   AND file_external IS NULL
 
 -- STMT_SELECT_MOVED_FROM_RELPATH
-SELECT local_relpath FROM nodes_current
-WHERE wc_id = ?1 AND moved_to = ?2
+SELECT local_relpath FROM nodes
+WHERE wc_id = ?1 AND moved_to = ?2 AND op_depth = 0
 
 -- STMT_UPDATE_MOVED_TO_RELPATH
-UPDATE NODES SET moved_to = ?3
-WHERE wc_id = ?1 AND local_relpath = ?2
-  AND op_depth =
-   (SELECT MAX(op_depth) FROM nodes
-    WHERE wc_id = ?1 AND local_relpath = ?2 AND op_depth > 0)
+UPDATE nodes SET moved_to = ?3
+WHERE wc_id = ?1 AND local_relpath = ?2 AND op_depth = 0
+
+-- STMT_CLEAR_MOVED_TO_RELPATH
+UPDATE nodes SET moved_to = NULL
+WHERE wc_id = ?1 AND local_relpath = ?2 AND op_depth = 0
+
+-- STMT_CLEAR_MOVED_TO_RELPATH_RECURSIVE
+UPDATE nodes SET moved_to = NULL
+WHERE wc_id = ?1
+  AND (?2 = ''
+       OR local_relpath = ?2
+       OR (local_relpath > ?2 || '/' AND local_relpath < ?2 || '0'))
+  AND op_depth = 0
 
 -- STMT_SELECT_MOVED_HERE_CHILDREN
-SELECT moved_to, local_relpath FROM nodes_current
-WHERE wc_id = ?1 AND (moved_to > ?2 || '/' AND moved_to < ?2 || '0')
+SELECT moved_to, local_relpath FROM nodes
+WHERE wc_id = ?1 AND op_depth = 0
+  AND (moved_to > ?2 || '/' AND moved_to < ?2 || '0')
 
 /* ------------------------------------------------------------------------- */
 

Modified: subversion/trunk/subversion/libsvn_wc/wc_db.c
URL: http://svn.apache.org/viewvc/subversion/trunk/subversion/libsvn_wc/wc_db.c?rev=1152410&r1=1152409&r2=1152410&view=diff
==============================================================================
--- subversion/trunk/subversion/libsvn_wc/wc_db.c (original)
+++ subversion/trunk/subversion/libsvn_wc/wc_db.c Sat Jul 30 01:18:20 2011
@@ -5251,6 +5251,13 @@ op_revert_txn(void *baton,
                                         STMT_DELETE_WC_LOCK_ORPHAN));
       SVN_ERR(svn_sqlite__bindf(stmt, "is", wcroot->wc_id, local_relpath));
       SVN_ERR(svn_sqlite__step_done(stmt));
+
+      /* Clear the moved-to path of the BASE node. */
+      SVN_ERR(svn_sqlite__get_statement(&stmt, wcroot->sdb,
+                                        STMT_CLEAR_MOVED_TO_RELPATH));
+      SVN_ERR(svn_sqlite__bindf(stmt, "is", wcroot->wc_id, local_relpath));
+      SVN_ERR(svn_sqlite__step_done(stmt));
+      SVN_ERR(svn_sqlite__reset(stmt));
     }
 
   SVN_ERR(svn_sqlite__get_statement(&stmt, wcroot->sdb,
@@ -5347,6 +5354,13 @@ op_revert_recursive_txn(void *baton,
                             local_relpath));
   SVN_ERR(svn_sqlite__step_done(stmt));
 
+  /* Clear any moved-to paths of the BASE nodes. */
+  SVN_ERR(svn_sqlite__get_statement(&stmt, wcroot->sdb,
+                                    STMT_CLEAR_MOVED_TO_RELPATH_RECURSIVE));
+  SVN_ERR(svn_sqlite__bindf(stmt, "is", wcroot->wc_id, local_relpath));
+  SVN_ERR(svn_sqlite__step_done(stmt));
+  SVN_ERR(svn_sqlite__reset(stmt));
+
   return SVN_NO_ERROR;
 }
 
@@ -6083,11 +6097,8 @@ op_delete_txn(void *baton,
       if (status == svn_wc__db_status_moved_here)
         {
           /* The node has already been moved, possibly along with a parent,
-           * and is being moved again.
-           * Update the existing moved_to path at the delete-half of
-           * the prior move. The source of a move is in the BASE tree
-           * so it remains constant if a node is moved around multiple
-           * times. */
+           * and is being moved again. Update the existing moved_to path
+           * in the BASE node. */
           SVN_ERR(svn_sqlite__get_statement(&stmt, wcroot->sdb,
                                             STMT_UPDATE_MOVED_TO_RELPATH));
           SVN_ERR(svn_sqlite__bindf(stmt, "iss", wcroot->wc_id,
@@ -6256,18 +6267,22 @@ op_delete_txn(void *baton,
       /* Delete the node at LOCAL_RELPATH, and possibly mark it as moved. */
       SVN_ERR(svn_sqlite__get_statement(&stmt, wcroot->sdb,
                                  STMT_INSERT_DELETE_NODE));
-      if (b->moved_to_relpath)
-        SVN_ERR(svn_sqlite__bindf(stmt, "isiis",
-                                  wcroot->wc_id, local_relpath,
-                                  select_depth, b->delete_depth,
-                                  b->moved_to_relpath));
-      else
-        SVN_ERR(svn_sqlite__bindf(stmt, "isii",
-                                  wcroot->wc_id, local_relpath,
-                                  select_depth, b->delete_depth));
-
+      SVN_ERR(svn_sqlite__bindf(stmt, "isii",
+                                wcroot->wc_id, local_relpath,
+                                select_depth, b->delete_depth));
       SVN_ERR(svn_sqlite__step_done(stmt));
 
+      if (b->moved_to_relpath)
+        {
+          /* Record moved-to relpath in BASE. */
+          SVN_ERR(svn_sqlite__get_statement(&stmt, wcroot->sdb,
+                                            STMT_UPDATE_MOVED_TO_RELPATH));
+          SVN_ERR(svn_sqlite__bindf(stmt, "iss",
+                                    wcroot->wc_id, local_relpath,
+                                    b->moved_to_relpath));
+          SVN_ERR(svn_sqlite__step_done(stmt));
+        }
+
       /* Delete children, if any. */
       SVN_ERR(svn_sqlite__get_statement(&stmt, wcroot->sdb,
                                  STMT_INSERT_DELETE_FROM_NODE_RECURSIVE));



Re: svn commit: r1152410 - in /subversion/trunk/subversion/libsvn_wc: wc-queries.sql wc_db.c

Posted by Neels J Hofmeyr <ne...@elego.de>.

On 08/01/2011 07:10 PM, Stefan Sperling wrote:
> On Mon, Aug 01, 2011 at 06:56:13PM +0200, Neels J Hofmeyr wrote:
>> On 08/01/2011 04:17 PM, Stefan Sperling wrote:
>>>> And at the same time this update of op_depth==0 rows during revert was not
>>>> necessary before this patch.
>>>
>>> It wasn't necessary because it was cleared by the existing revert code.
>>
>> We had to add to the recursive revert code AFAIR
> 
> We added code to recursive revert when we started storing moved-to
> in op_depth=0. Before then moved-to was in op_depth > 0 and cleared
> out as part of existing operation of the revert code.

exactly. What were we talking about again? ;)

> 
>> So pretty much my only concern is: do we want to avoid modifying the
>> op_depth==0 nodes?
> 
> The benefits of storing moved-to at op_depth 0 should now be quite
> clear. But what's the benefit of not modifying op_depth = 0?
> Note that op_depth=0 is always modified during commit and update.
> It isn't read-only as such.

No, of course not. But it remains untouched all the while that local mods,
including merges, get tossed around in the working copy. There's one
certainty: blow away op_depth>0 and you're back to upstream.

I don't really know what that certainty is good for, and it seems there is
no performance penalty with keeping moved-to in op_depth==0. FWIW the new
certainty goes: blow away all op_depth>0 and set all moved-to to NULL...

While it's only us two arguing and there's no general uproar, I'd say we're
done here. (Y)our patch is good.

~Neels


Re: svn commit: r1152410 - in /subversion/trunk/subversion/libsvn_wc: wc-queries.sql wc_db.c

Posted by Stefan Sperling <st...@elego.de>.
On Mon, Aug 01, 2011 at 06:56:13PM +0200, Neels J Hofmeyr wrote:
> On 08/01/2011 04:17 PM, Stefan Sperling wrote:
> >> And at the same time this update of op_depth==0 rows during revert was not
> >> necessary before this patch.
> > 
> > It wasn't necessary because it was cleared by the existing revert code.
> 
> We had to add to the recursive revert code AFAIR

We added code to recursive revert when we started storing moved-to
in op_depth=0. Before then moved-to was in op_depth > 0 and cleared
out as part of existing operation of the revert code.

> So pretty much my only concern is: do we want to avoid modifying the
> op_depth==0 nodes?

The benefits of storing moved-to at op_depth 0 should now be quite
clear. But what's the benefit of not modifying op_depth = 0?
Note that op_depth=0 is always modified during commit and update.
It isn't read-only as such.

Re: svn commit: r1152410 - in /subversion/trunk/subversion/libsvn_wc: wc-queries.sql wc_db.c

Posted by Neels J Hofmeyr <ne...@elego.de>.

On 08/01/2011 04:17 PM, Stefan Sperling wrote:
> On Mon, Aug 01, 2011 at 03:04:38PM +0200, Neels J Hofmeyr wrote:
>> - but on moves and reverts, the op_depth==0 row needs to be updated. So
>> - the op_depth==0 rows' moved-to columns can be corrupted by interrupted
>> operations. Yet this is easily remedied by a revert, clearing that column.
> 
> Not sure what you mean here. In what way could columns become corrupted?
> 
> Interruptions only happen on a higher level than individual column updates.
> We wrap most (all?) operations that run multiple statements in sqlite
> transactions -- see the various *_txn() functions in wc_db.c.
> Either all SQL statements in such functions fail or they all succeed.

Of course. Forget that point of mine.

> 
> So if an op_depth==0 row has a non-NULL moved-to relpath column
> the relpath was inserted by a successful operation.
> 
>>> Also, clear moved-to relpaths from the BASE tree during 'revert' so
>>> we don't leave phantom moved-to information in the DB (are there any
>>> other places where we need to clear it?).
>>
>> ^ here would be the remedy to any interrupted operations involving moved-to.
>> And at the same time this update of op_depth==0 rows during revert was not
>> necessary before this patch.
> 
> It wasn't necessary because it was cleared by the existing revert code.

We had to add to the recursive revert code AFAIR


> I.e. this part of the required maintenance happened to be performed by
> existing code. But that doesn't mean less maintenance. We need to maintain
> moved-to data either way. I think it is easier to manage it in op_depth==0.
> 
> Because of the transactional nature of our DB operations there are
> only two kinds of inconsistencies we have to worry about, at a fairly
> high level:

so true.

So pretty much my only concern is: do we want to avoid modifying the
op_depth==0 nodes?

~Neels


Re: svn commit: r1152410 - in /subversion/trunk/subversion/libsvn_wc: wc-queries.sql wc_db.c

Posted by Stefan Sperling <st...@elego.de>.
On Mon, Aug 01, 2011 at 03:04:38PM +0200, Neels J Hofmeyr wrote:
> - but on moves and reverts, the op_depth==0 row needs to be updated. So
> - the op_depth==0 rows' moved-to columns can be corrupted by interrupted
> operations. Yet this is easily remedied by a revert, clearing that column.

Not sure what you mean here. In what way could columns become corrupted?

Interruptions only happen on a higher level than individual column updates.
We wrap most (all?) operations that run multiple statements in sqlite
transactions -- see the various *_txn() functions in wc_db.c.
Either all SQL statements in such functions fail or they all succeed.

So if an op_depth==0 row has a non-NULL moved-to relpath column
the relpath was inserted by a successful operation.

> > Also, clear moved-to relpaths from the BASE tree during 'revert' so
> > we don't leave phantom moved-to information in the DB (are there any
> > other places where we need to clear it?).
> 
> ^ here would be the remedy to any interrupted operations involving moved-to.
> And at the same time this update of op_depth==0 rows during revert was not
> necessary before this patch.

It wasn't necessary because it was cleared by the existing revert code.
I.e. this part of the required maintenance happened to be performed by
existing code. But that doesn't mean less maintenance. We need to maintain
moved-to data either way. I think it is easier to manage it in op_depth==0.

Because of the transactional nature of our DB operations there are
only two kinds of inconsistencies we have to worry about, at a fairly
high level:

1) A copied node says "I was moved here" (this is a boolean field) but
   there is no corresponding node in the DB with a moved-to field that
   matches the relpath of the copied node.

   This can happen if the user hits Ctrl-C in the time window between
   the copy and the delete in svn_wc_move(). But the user is not able to
   interrupt the sqlite transaction that inserts the moved-to field,
   and we perform the copy before the delete. So any moved-to that
   made it into the DB is definitely valid. Or it was valid at the time
   it was entered and inconsistency 2) has happened since.
   
   (Moving both the copy and the delete into the same sqlite transaction
    might be possible but is hard given how existing code is structured.)

2) A moved-to field points to a node that doesn't exit, or doesn't say
   that it was moved-here.

   This can happen if something operates on a moved-here node (i.e. a
   copied node with moved-here=TRUE) and fails to update/clear the
   corresponding moved-to relpath at the deleted node, either because
   of some unexpected error or an interruption.
   It should be possible to prevent this type of inconsistency in many
   cases by writing code that changes moved-here nodes very carefully.
   I.e. if the moved-here status has changed, always update/clear moved-to
   data even in face of some unrelated error. IOW, don't do something
   like this outside of an sqlite transaction:
     change_moved_here(); SVN_ERR(foo()); update_moved_to();
   Because if foo() throws an error we've created the inconsistency.
   Likewise, if possible don't allow cancellation between changing
   moved_here and moved_to, etc.

We should always add assertions for either inconsistency in debug mode
to catch bugs that could introduce these inconsistencies.
In release mode we detect them at runtime, maybe even repair the DB
state by removing invalid data, and fall back to copy+delete behaviour.

Re: svn commit: r1152410 - in /subversion/trunk/subversion/libsvn_wc: wc-queries.sql wc_db.c

Posted by Neels J Hofmeyr <ne...@elego.de>.
On 07/30/2011 03:18 AM, stsp@apache.org wrote:
> Author: stsp
> Date: Sat Jul 30 01:18:20 2011
> New Revision: 1152410
> 
> URL: http://svn.apache.org/viewvc?rev=1152410&view=rev
> Log:
> Store moved-to relpaths in the BASE tree (op_depth = 0) instead of
> nodes_current. All move operations are relative to the BASE, so having
> moved-to information in op_depth layer 0 is semantically more correct.

I've been thinking some more about this, and I'm quite 50/50 on which level
is best suited for moved-to (op_depth == 0 vs. op_depth > 0). (If a BASE
node was moved, there always is an op_depth > 0 node available.)

I thought I'd jot some details of the choice down so others can have their
opinions.

When storing moved-to in op_depth == 0 (as this patch introduces),
- the meaning of the column is more intuitively clear. The moved-to
information tells you where the BASE was moved to locally. Straightforward.
- but on moves and reverts, the op_depth==0 row needs to be updated. So
- the op_depth==0 rows' moved-to columns can be corrupted by interrupted
operations. Yet this is easily remedied by a revert, clearing that column.

If moved-to is stored in op_depth > 0,
- it's true to the idea that a full revert is simply wiping all op_depth > 0
nodes away (is that still true up to now?)
- but the meaning of the column moved-to becomes more complex to explain and
code. An op_depth>0 row's moved-to column does not refer to any new node
added at the given op_depth, but always refers to the (unrelated) BASE node
with the same path. Say alpha was replaced after a move-away:
  mv alpha beta
  touch alpha; svn add alpha
Then there is a new op_depth=1 node for alpha, representing the new node
that has nothing to do with the history of BASE. But, confusingly, the
location where the BASE was moved-to is still saved in the row for that new
node.
- So code needs to take care that the moved-to column in op_depth>0 rows
remains unchanged, i.e. remains tied to that specific path, no matter how
much adds, deletes, moves etc. happen on the nodes of that path. Ugh.


So it's a dilemma, and I can't know which advantage is cooler.


> It will also make it easier to deal with stuff like cyclic moves and
> replacements later. E.g. before this commit moved-to information was lost
> if the delete-half of a move was replaced, and fixing this as a special
> case in the code that adds the replacing node would have been ugly.

^ here is one of those situations that generates ugly code.

> Also, clear moved-to relpaths from the BASE tree during 'revert' so
> we don't leave phantom moved-to information in the DB (are there any
> other places where we need to clear it?).

^ here would be the remedy to any interrupted operations involving moved-to.
And at the same time this update of op_depth==0 rows during revert was not
necessary before this patch.

~Neels