You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@subversion.apache.org by Daniel Shahaf <d....@daniel.shahaf.name> on 2010/11/11 21:25:40 UTC

FSFS locks questions and a concern

Philip observed today that, since 'hotcopy' copies the $REPOS/db/locks/???/*
files using a simple svn_io_copy_dir_recursively(), the correctness of
the copy is not guaranteed if locks are being added/removed while the
hotcopy is ongoing: the hotcopy might contain only an arbitrary subset
of the path-prefix-digest-files.  (It might contain any subset of the
three digest files md5("/"), md5("/trunk/iota"), md5("/trunk").  The
lock removal removes/updates those files in depth-first order.)

So, a few questions:

1. 'svn lock ^/trunk/iota' creates an md5("/trunk") digest file.  That
   digest file is not referenced by the md5("/") digest file; both of
   these digest files refer only and directly to the md5("/trunk/iota")
   digest file.

   Does the md5("/trunk") digest file have to be preserved in a hotcopy?
   Given that it is unreferenced, under what conditions would it be read
   --- when the reader is pre-1.7?

2. Given that the md5("/") digest file refers to iota's digest file
   directly, doesn't that mean that the md5("/") digest file will
   refer *every* locked file's digest file?  I'm concerned about the
   efficiency of this arrangement.

3. What's a reasonable strategy to hotcopy the locks?

   I've started on a patch that uses walk_digest_files() (I modified
   that function today), but currently that patch simply does a walk
   starting at the digest file md5("/"), and consequently it misses
   copying the unreferenced digest file md5("/trunk").

Re: FSFS locks questions and a concern

Posted by Philip Martin <ph...@codematters.co.uk>.
Philip Martin <ph...@wandisco.com> writes:

> Daniel Shahaf <d....@daniel.shahaf.name> writes:
>
>> 1. 'svn lock ^/trunk/iota' creates an md5("/trunk") digest file.  That
>>    digest file is not referenced by the md5("/") digest file; both of
>>    these digest files refer only and directly to the md5("/trunk/iota")
>>    digest file.
>>
>>    Does the md5("/trunk") digest file have to be preserved in a hotcopy?
>>    Given that it is unreferenced, under what conditions would it be read
>>    --- when the reader is pre-1.7?
>
> Yes, it's used by operations on /trunk:
>
> $ svnadmin create repo
> $ svn import -mm Makefile file://`pwd`/repo/trunk/f
> $ svn lock file://`pwd`/repo/trunk/f
> $ svn rm -mm file://`pwd`/repo/trunk
> svn: Cannot verify lock on path '/trunk/f'; no matching lock-token available
> $ rm -f repo/db/locks/9fd/9fdfca3f0419c7cff7f2d1f623525d29
> $ svn info file://`pwd`/repo/trunk/f | grep Owner
> Lock Owner: pm
> $ svn rm -mm file://`pwd`/repo/A

Typo, that should be

  $ svn rm -mm file://`pwd`/repo/trunk

> Committed revision 2.
>
> So deleting md5("/trunk") meant that the lock no longer stopped the
> commit.

-- 
Philip

Re: FSFS locks questions and a concern

Posted by Philip Martin <ph...@wandisco.com>.
Daniel Shahaf <d....@daniel.shahaf.name> writes:

> 1. 'svn lock ^/trunk/iota' creates an md5("/trunk") digest file.  That
>    digest file is not referenced by the md5("/") digest file; both of
>    these digest files refer only and directly to the md5("/trunk/iota")
>    digest file.
>
>    Does the md5("/trunk") digest file have to be preserved in a hotcopy?
>    Given that it is unreferenced, under what conditions would it be read
>    --- when the reader is pre-1.7?

Yes, it's used by operations on /trunk:

$ svnadmin create repo
$ svn import -mm Makefile file://`pwd`/repo/trunk/f
$ svn lock file://`pwd`/repo/trunk/f
$ svn rm -mm file://`pwd`/repo/trunk
svn: Cannot verify lock on path '/trunk/f'; no matching lock-token available
$ rm -f repo/db/locks/9fd/9fdfca3f0419c7cff7f2d1f623525d29
$ svn info file://`pwd`/repo/trunk/f | grep Owner
Lock Owner: pm
$ svn rm -mm file://`pwd`/repo/A
Committed revision 2.

So deleting md5("/trunk") meant that the lock no longer stopped the
commit.

-- 
Philip

Re: FSFS locks questions and a concern

Posted by Daniel Shahaf <d....@daniel.shahaf.name>.
C. Michael Pilato wrote on Fri, Nov 12, 2010 at 09:40:49 -0500:
> On 11/11/2010 04:25 PM, Daniel Shahaf wrote:
> > Philip observed today that, since 'hotcopy' copies the $REPOS/db/locks/???/*
> > files using a simple svn_io_copy_dir_recursively(), the correctness of
> > the copy is not guaranteed if locks are being added/removed while the
> > hotcopy is ongoing: the hotcopy might contain only an arbitrary subset
> > of the path-prefix-digest-files.  (It might contain any subset of the
> > three digest files md5("/"), md5("/trunk/iota"), md5("/trunk").  The
> > lock removal removes/updates those files in depth-first order.)
> > 
> > So, a few questions:
> 
> Maybe, just maybe, you can find some of the information you seek in the
> history of http://subversion.tigris.org/issues/show_bug.cgi?id=3660.  I had
> to study the FSFS lock storage and usage algorithms some time ago to fix a
> bug.  (Though, don't expect to find hotcopy strategies there.)

Indeed, I don't find hotcopy strategies there, but I do find there
discussion on whether the FSFS locks directory should, in each
non-immediate ancestor of a locked file, a pointer to that file's digest
file or a pointer to the next-ancestor-in-the-chain (i.e., /trunk to
/trunk/A)'s digest file.

I guess this answers my original point (2) by "It has always been this
way".  Following "if it ain't broken, don't fix it", I'll leave this bit
untouched until somebody complains (that 'svn lock' takes too long).

Thanks for the pointer,

Daniel

Re: FSFS locks questions and a concern

Posted by "C. Michael Pilato" <cm...@collab.net>.
On 11/11/2010 04:25 PM, Daniel Shahaf wrote:
> Philip observed today that, since 'hotcopy' copies the $REPOS/db/locks/???/*
> files using a simple svn_io_copy_dir_recursively(), the correctness of
> the copy is not guaranteed if locks are being added/removed while the
> hotcopy is ongoing: the hotcopy might contain only an arbitrary subset
> of the path-prefix-digest-files.  (It might contain any subset of the
> three digest files md5("/"), md5("/trunk/iota"), md5("/trunk").  The
> lock removal removes/updates those files in depth-first order.)
> 
> So, a few questions:

Maybe, just maybe, you can find some of the information you seek in the
history of http://subversion.tigris.org/issues/show_bug.cgi?id=3660.  I had
to study the FSFS lock storage and usage algorithms some time ago to fix a
bug.  (Though, don't expect to find hotcopy strategies there.)

-- 
C. Michael Pilato <cm...@collab.net>
CollabNet   <>   www.collab.net   <>   Distributed Development On Demand


Re: FSFS locks questions and a concern

Posted by Daniel Shahaf <d....@daniel.shahaf.name>.
Daniel Shahaf wrote on Sat, Nov 13, 2010 at 14:08:12 +0200:
> Philip Martin wrote on Fri, Nov 12, 2010 at 11:02:14 +0000:
> > Daniel Shahaf <d....@daniel.shahaf.name> writes:
> > > 3. What's a reasonable strategy to hotcopy the locks?
> > >
> > >    I've started on a patch that uses walk_digest_files() (I modified
> > >    that function today), but currently that patch simply does a walk
> > >    starting at the digest file md5("/"), and consequently it misses
> > >    copying the unreferenced digest file md5("/trunk").
> > 
> > It appears you have to read/copy md5("/") and get a list of locks, then
> > read/copy each lock file that still exists, then reconstruct the
> > intermediate files for the locks that were copied.
> > 
> 
> So, for example:
> 
>     - read md5("/")
>       * read md5("/trunk/A/mu")
>         + compute md5("/trunk/A")
>         + compute md5("/trunk")
>       * read md5("/trunk/iota")
>         + compute md5("/trunk")
> 
> where the 'compute()' steps follow those of normal lock creation: read
> the digest file, update list of children, write it back.

Attached patch implements this.  Comments?

http://people.apache.org/~danielsh/hot.logmsg
http://people.apache.org/~danielsh/hot.diff

[[[
FSFS: teach 'hotcopy' the semantics of the locks/ directory.

Found by: philip

* notes/fsfs:
    Document another limitation of rsync-style backup, which 'hotcopy' lacks.

* subversion/libsvn_fs_fs/fs_fs.c
  (svn_fs_fs__hotcopy):
    Use svn_fs_fs__hotcopy_locks() instead of svn_io_copy_dir_recursively().

* subversion/libsvn_fs_fs/lock.c
  (svn_fs_fs__hotcopy_locks):  New declaration.

* subversion/libsvn_fs_fs/lock.c
  (digest_dir_path_from_digest):
    New, like svn_dirent_dirname(digest_path_from_digest()).
  (hotcopy_walker):  New callback.
  (svn_fs_fs__hotcopy_locks):  New definition.
]]]

[[[
Index: notes/fsfs
===================================================================
--- notes/fsfs	(revision 1034745)
+++ notes/fsfs	(working copy)
@@ -248,6 +248,11 @@ manually removed.  "svnadmin hotcopy" does not cop
 transactions from an FSFS repository, although that might need to
 change if Subversion starts making use of long-lived transactions.
 
+Naively copying an FSFS repository might also copy an inconsistent view
+of the locks directory, if locks are being added or removed as the copy
+operation is running.  "svnadmin hotcopy" copies and generates the locks
+digest files properly.
+
 So, if you are using standard backup tools to make backups of a FSFS
 repository, configure the software to copy the "current" file before
 the numbered revision files, if possible, and configure it not to copy
Index: subversion/libsvn_fs_fs/fs_fs.c
===================================================================
--- subversion/libsvn_fs_fs/fs_fs.c	(revision 1034745)
+++ subversion/libsvn_fs_fs/fs_fs.c	(working copy)
@@ -1767,9 +1767,7 @@ svn_fs_fs__hotcopy(const char *src_path,
   src_subdir = svn_dirent_join(src_path, PATH_LOCKS_DIR, pool);
   SVN_ERR(svn_io_check_path(src_subdir, &kind, pool));
   if (kind == svn_node_dir)
-    SVN_ERR(svn_io_copy_dir_recursively(src_subdir, dst_path,
-                                        PATH_LOCKS_DIR, TRUE, NULL,
-                                        NULL, pool));
+    SVN_ERR(svn_fs_fs__hotcopy_locks(src_path, dst_path, pool));
 
   /* Now copy the node-origins cache tree. */
   src_subdir = svn_dirent_join(src_path, PATH_NODE_ORIGINS_DIR, pool);
Index: subversion/libsvn_fs_fs/lock.c
===================================================================
--- subversion/libsvn_fs_fs/lock.c	(revision 1034756)
+++ subversion/libsvn_fs_fs/lock.c	(working copy)
@@ -121,6 +121,19 @@ err_corrupt_lockfile(const char *fs_path, const ch
 
 /*** Digest file handling functions. ***/
 
+/* Return the path of the lock/entries directory for which DIGEST is the
+   hashed repository relative path. */
+static const char *
+digest_dir_path_from_digest(const char *fs_path,
+                            const char *digest,
+                            apr_pool_t *pool)
+{
+  return svn_dirent_join_many(pool, fs_path, PATH_LOCKS_DIR,
+                              apr_pstrmemdup(pool, digest, DIGEST_SUBDIR_LEN),
+                              NULL);
+}
+
+
 /* Return the path of the lock/entries file for which DIGEST is the
    hashed repository relative path. */
 static const char *
@@ -893,6 +906,65 @@ unlock_body(void *baton, apr_pool_t *pool)
 }
 
 
+/*** Hotcopying locks. ***/
+
+/* Implements walk_digests_callback_t. */
+static svn_error_t *
+hotcopy_walker(void *baton,
+               const char *fs_path,
+               const char *digest_path,
+               apr_hash_t *children /* ignored */,
+               svn_lock_t *lock /* ignored */,
+               svn_boolean_t have_write_lock,
+               apr_pool_t *pool)
+{
+  const char *dst_fs_path = baton;
+  const char *digest;
+  const char *src_digest_dir, *dst_digest_dir;
+  const char *perms_reference = digest_path;
+  
+  digest = svn_dirent_basename(digest_path, NULL);
+  src_digest_dir = svn_dirent_dirname(digest_path, pool);
+  dst_digest_dir = digest_dir_path_from_digest(dst_fs_path, digest, pool);
+
+  SVN_DBG(("Hotcopying fs='%s' dest='%s' digest='%s'\n", 
+           fs_path, dst_fs_path, digest));
+
+  /* Copy the digest file. */
+  SVN_ERR(svn_fs_fs__ensure_dir_exists(dst_digest_dir, dst_fs_path, pool));
+  SVN_ERR(svn_io_dir_file_copy(src_digest_dir, dst_digest_dir, digest, pool));
+
+  /* Generate the ancestor digest files (e.g., /trunk/A and /trunk, if we
+     locked /trunk/A/mu).  It will re-write DST_FS_PATH's digest_path, too.
+   */
+  if (lock)
+    SVN_ERR(set_lock(dst_fs_path, lock, perms_reference, pool));
+
+  return SVN_NO_ERROR;
+}
+
+svn_error_t *
+svn_fs_fs__hotcopy_locks(const char *fs_path,
+                         const char *dst_fs_path,
+                         apr_pool_t *pool)
+{
+  /* md5("/") == 6666cd76f96956469e7be39d750cc7d9 */
+  const char *md5_root = make_digest("/", pool);
+
+  SVN_ERR(svn_fs_fs__ensure_dir_exists(
+            svn_dirent_join(dst_fs_path, PATH_LOCKS_DIR, pool),
+            dst_fs_path,
+            pool));
+
+  SVN_ERR(walk_digest_files(fs_path,
+                            digest_path_from_digest(fs_path, md5_root, pool),
+                            hotcopy_walker, (void *)dst_fs_path,
+                            FALSE /* have_write_lock */,
+                            pool));
+  return SVN_NO_ERROR;
+}
+
+
 /*** Public API implementations ***/
 
 svn_error_t *
Index: subversion/libsvn_fs_fs/lock.h
===================================================================
--- subversion/libsvn_fs_fs/lock.h	(revision 1034745)
+++ subversion/libsvn_fs_fs/lock.h	(working copy)
@@ -96,6 +96,12 @@ svn_error_t *svn_fs_fs__allow_locked_operation(con
                                                svn_boolean_t have_write_lock,
                                                apr_pool_t *pool);
 
+/* Copy the locks hierarchy from the live FS at FS_PATH to DST_SUBDIR. */
+svn_error_t *
+svn_fs_fs__hotcopy_locks(const char *fs_path,
+                         const char *dst_subdir,
+                         apr_pool_t *scratch_pool);
+
 #ifdef __cplusplus
 }
 #endif /* __cplusplus */
]]]

Re: FSFS locks questions and a concern

Posted by Daniel Shahaf <d....@daniel.shahaf.name>.
Daniel Shahaf wrote on Sat, Nov 13, 2010 at 14:08:12 +0200:
> Philip Martin wrote on Fri, Nov 12, 2010 at 11:02:14 +0000:
> > Daniel Shahaf <d....@daniel.shahaf.name> writes:
> > > 3. What's a reasonable strategy to hotcopy the locks?
> > >
> > >    I've started on a patch that uses walk_digest_files() (I modified
> > >    that function today), but currently that patch simply does a walk
> > >    starting at the digest file md5("/"), and consequently it misses
> > >    copying the unreferenced digest file md5("/trunk").
> > 
> > It appears you have to read/copy md5("/") and get a list of locks, then
> > read/copy each lock file that still exists, then reconstruct the
> > intermediate files for the locks that were copied.
> > 
> 
> So, for example:
> 
>     - read md5("/")
>       * read md5("/trunk/A/mu")
>         + compute md5("/trunk/A")
>         + compute md5("/trunk")
>       * read md5("/trunk/iota")
>         + compute md5("/trunk")
> 
> where the 'compute()' steps follow those of normal lock creation: read
> the digest file, update list of children, write it back.

Attached patch implements this.  Comments?

http://people.apache.org/~danielsh/hot.logmsg
http://people.apache.org/~danielsh/hot.diff

[[[
FSFS: teach 'hotcopy' the semantics of the locks/ directory.

Found by: philip

* notes/fsfs:
    Document another limitation of rsync-style backup, which 'hotcopy' lacks.

* subversion/libsvn_fs_fs/fs_fs.c
  (svn_fs_fs__hotcopy):
    Use svn_fs_fs__hotcopy_locks() instead of svn_io_copy_dir_recursively().

* subversion/libsvn_fs_fs/lock.c
  (svn_fs_fs__hotcopy_locks):  New declaration.

* subversion/libsvn_fs_fs/lock.c
  (digest_dir_path_from_digest):
    New, like svn_dirent_dirname(digest_path_from_digest()).
  (hotcopy_walker):  New callback.
  (svn_fs_fs__hotcopy_locks):  New definition.
]]]

[[[
Index: notes/fsfs
===================================================================
--- notes/fsfs	(revision 1034745)
+++ notes/fsfs	(working copy)
@@ -248,6 +248,11 @@ manually removed.  "svnadmin hotcopy" does not cop
 transactions from an FSFS repository, although that might need to
 change if Subversion starts making use of long-lived transactions.
 
+Naively copying an FSFS repository might also copy an inconsistent view
+of the locks directory, if locks are being added or removed as the copy
+operation is running.  "svnadmin hotcopy" copies and generates the locks
+digest files properly.
+
 So, if you are using standard backup tools to make backups of a FSFS
 repository, configure the software to copy the "current" file before
 the numbered revision files, if possible, and configure it not to copy
Index: subversion/libsvn_fs_fs/fs_fs.c
===================================================================
--- subversion/libsvn_fs_fs/fs_fs.c	(revision 1034745)
+++ subversion/libsvn_fs_fs/fs_fs.c	(working copy)
@@ -1767,9 +1767,7 @@ svn_fs_fs__hotcopy(const char *src_path,
   src_subdir = svn_dirent_join(src_path, PATH_LOCKS_DIR, pool);
   SVN_ERR(svn_io_check_path(src_subdir, &kind, pool));
   if (kind == svn_node_dir)
-    SVN_ERR(svn_io_copy_dir_recursively(src_subdir, dst_path,
-                                        PATH_LOCKS_DIR, TRUE, NULL,
-                                        NULL, pool));
+    SVN_ERR(svn_fs_fs__hotcopy_locks(src_path, dst_path, pool));
 
   /* Now copy the node-origins cache tree. */
   src_subdir = svn_dirent_join(src_path, PATH_NODE_ORIGINS_DIR, pool);
Index: subversion/libsvn_fs_fs/lock.c
===================================================================
--- subversion/libsvn_fs_fs/lock.c	(revision 1034756)
+++ subversion/libsvn_fs_fs/lock.c	(working copy)
@@ -121,6 +121,19 @@ err_corrupt_lockfile(const char *fs_path, const ch
 
 /*** Digest file handling functions. ***/
 
+/* Return the path of the lock/entries directory for which DIGEST is the
+   hashed repository relative path. */
+static const char *
+digest_dir_path_from_digest(const char *fs_path,
+                            const char *digest,
+                            apr_pool_t *pool)
+{
+  return svn_dirent_join_many(pool, fs_path, PATH_LOCKS_DIR,
+                              apr_pstrmemdup(pool, digest, DIGEST_SUBDIR_LEN),
+                              NULL);
+}
+
+
 /* Return the path of the lock/entries file for which DIGEST is the
    hashed repository relative path. */
 static const char *
@@ -893,6 +906,65 @@ unlock_body(void *baton, apr_pool_t *pool)
 }
 
 
+/*** Hotcopying locks. ***/
+
+/* Implements walk_digests_callback_t. */
+static svn_error_t *
+hotcopy_walker(void *baton,
+               const char *fs_path,
+               const char *digest_path,
+               apr_hash_t *children /* ignored */,
+               svn_lock_t *lock /* ignored */,
+               svn_boolean_t have_write_lock,
+               apr_pool_t *pool)
+{
+  const char *dst_fs_path = baton;
+  const char *digest;
+  const char *src_digest_dir, *dst_digest_dir;
+  const char *perms_reference = digest_path;
+  
+  digest = svn_dirent_basename(digest_path, NULL);
+  src_digest_dir = svn_dirent_dirname(digest_path, pool);
+  dst_digest_dir = digest_dir_path_from_digest(dst_fs_path, digest, pool);
+
+  SVN_DBG(("Hotcopying fs='%s' dest='%s' digest='%s'\n", 
+           fs_path, dst_fs_path, digest));
+
+  /* Copy the digest file. */
+  SVN_ERR(svn_fs_fs__ensure_dir_exists(dst_digest_dir, dst_fs_path, pool));
+  SVN_ERR(svn_io_dir_file_copy(src_digest_dir, dst_digest_dir, digest, pool));
+
+  /* Generate the ancestor digest files (e.g., /trunk/A and /trunk, if we
+     locked /trunk/A/mu).  It will re-write DST_FS_PATH's digest_path, too.
+   */
+  if (lock)
+    SVN_ERR(set_lock(dst_fs_path, lock, perms_reference, pool));
+
+  return SVN_NO_ERROR;
+}
+
+svn_error_t *
+svn_fs_fs__hotcopy_locks(const char *fs_path,
+                         const char *dst_fs_path,
+                         apr_pool_t *pool)
+{
+  /* md5("/") == 6666cd76f96956469e7be39d750cc7d9 */
+  const char *md5_root = make_digest("/", pool);
+
+  SVN_ERR(svn_fs_fs__ensure_dir_exists(
+            svn_dirent_join(dst_fs_path, PATH_LOCKS_DIR, pool),
+            dst_fs_path,
+            pool));
+
+  SVN_ERR(walk_digest_files(fs_path,
+                            digest_path_from_digest(fs_path, md5_root, pool),
+                            hotcopy_walker, (void *)dst_fs_path,
+                            FALSE /* have_write_lock */,
+                            pool));
+  return SVN_NO_ERROR;
+}
+
+
 /*** Public API implementations ***/
 
 svn_error_t *
Index: subversion/libsvn_fs_fs/lock.h
===================================================================
--- subversion/libsvn_fs_fs/lock.h	(revision 1034745)
+++ subversion/libsvn_fs_fs/lock.h	(working copy)
@@ -96,6 +96,12 @@ svn_error_t *svn_fs_fs__allow_locked_operation(con
                                                svn_boolean_t have_write_lock,
                                                apr_pool_t *pool);
 
+/* Copy the locks hierarchy from the live FS at FS_PATH to DST_SUBDIR. */
+svn_error_t *
+svn_fs_fs__hotcopy_locks(const char *fs_path,
+                         const char *dst_subdir,
+                         apr_pool_t *scratch_pool);
+
 #ifdef __cplusplus
 }
 #endif /* __cplusplus */
]]]

Re: FSFS locks questions and a concern

Posted by Daniel Shahaf <d....@daniel.shahaf.name>.
Philip Martin wrote on Fri, Nov 12, 2010 at 11:02:14 +0000:
> Daniel Shahaf <d....@daniel.shahaf.name> writes:
> 
> > 3. What's a reasonable strategy to hotcopy the locks?
> >
> >    I've started on a patch that uses walk_digest_files() (I modified
> >    that function today), but currently that patch simply does a walk
> >    starting at the digest file md5("/"), and consequently it misses
> >    copying the unreferenced digest file md5("/trunk").
> 
> It appears you have to read/copy md5("/") and get a list of locks, then
> read/copy each lock file that still exists, then reconstruct the
> intermediate files for the locks that were copied.
> 

So, for example:

    - read md5("/")
      * read md5("/trunk/A/mu")
        + compute md5("/trunk/A")
        + compute md5("/trunk")
      * read md5("/trunk/iota")
        + compute md5("/trunk")

where the 'compute()' steps follow those of normal lock creation: read
the digest file, update list of children, write it back.

> -- 
> Philip

Re: FSFS locks questions and a concern

Posted by Philip Martin <ph...@wandisco.com>.
Daniel Shahaf <d....@daniel.shahaf.name> writes:

> 3. What's a reasonable strategy to hotcopy the locks?
>
>    I've started on a patch that uses walk_digest_files() (I modified
>    that function today), but currently that patch simply does a walk
>    starting at the digest file md5("/"), and consequently it misses
>    copying the unreferenced digest file md5("/trunk").

It appears you have to read/copy md5("/") and get a list of locks, then
read/copy each lock file that still exists, then reconstruct the
intermediate files for the locks that were copied.

-- 
Philip