You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@subversion.apache.org by Justin Erenkrantz <je...@apache.org> on 2002/07/11 08:47:11 UTC

[PATCH] Add entries caching

This is a first-cut patch that adds entries caching (only parses it
once per directory on a commit).  I'm posting this now so that people
can have a preview of what will come once the new adm_access baton
API is complete.

I'm not providing a log entry since this does really naughty
things and adds new functions to the API.  I figure that Philip
has a better grasp on the API than I do, so I'll wait for him to
get the adm_access_baton_t everywhere.  Once the API is in place,
I can work on cleaning up this code and integrating it.  The key
thing is that I now know this caching strategy works.

I've identified the key components in the harvest_committables
code path that rely on the entries file and added _cache function
variants that take the new baton and eventually pass that baton
to svn_wc_entry_cache().  The biggest changes are in commit_util.c.
Everything else is pretty much untouched.

This works for my linux kernel tree (local patch of 2.4.16->2.4.17).
Comparative (rough) timings for 'svn commit' (I wait for my editor
to come up and then quit, so it may be off a second or two):

Current SVN:
svn commit  226.25s user 9.53s system 85% cpu 4:37.00 total
This patch:
svn commit  10.58s user 16.62s system 18% cpu 2:27.09 total

Look at the CPU numbers and the total time.  I think with my
patch we start to get IO bound (I have a slow disk subsystem).

Oh, we are doing apr_stat()'s more than we should.  When we get
this entries parsing resolved, I can go back through and attempt
to reduce our apr_stat() calls.  First things first.

Please let me know what you think.  If you want to review the
code, feel free, but I *know* it is non-committable right now.
I just did this patch as a quick proof-of-concept.  -- justin

Index: ./subversion/include/svn_wc.h
===================================================================
--- ./subversion/include/svn_wc.h
+++ ./subversion/include/svn_wc.h	2002-07-11 00:19:34.000000000 -0700
@@ -215,6 +215,59 @@
 
 } svn_wc_diff_callbacks_t;
 
+/* ### Should this type be opaque in the public interface? */
+typedef struct svn_wc_adm_access_t
+{
+   /* PATH to directory which contains the administrative area */
+   const char *path;
+
+   enum svn_wc_adm_access_type {
+
+      /* SVN_WC_ADM_ACCESS_UNLOCKED indicates no lock is held allowing
+         read-only access without cacheing. */
+      svn_wc_adm_access_unlocked,
+
+#if 0
+      /* ### If read-only operations are allowed sufficient write access to
+         ### create read locks (did you follow that?) then entries cacheing
+         ### could apply to read-only operations as well.  This would
+         ### probably want to fall back to unlocked access if the
+         ### filesystem permissions prohibit writing to the administrative
+         ### area (consider running svn_wc_status on some other user's
+         ### working copy). */
+
+      /* SVN_WC_ADM_ACCESS_READ_LOCK indicates that read-only access and
+         cacheing are allowed. */
+      svn_wc_adm_access_read_lock,
+#endif
+
+      /* SVN_WC_ADM_ACCESS_WRITE_LOCK indicates that read-write access and
+         cacheing are allowed. */
+      svn_wc_adm_access_write_lock
+
+   } type;
+
+   /* LOCK_EXISTS is set TRUE when the write lock exists */
+   svn_boolean_t lock_exists;
+
+#if 1
+   /* ENTRIES_MODIFED is set TRUE when the entries cached in ENTRIES have
+      been modified from the original values read from the file. */
+   svn_boolean_t entries_modified;
+
+   /* Once the 'entries' file has been read, ENTRIES will cache the
+      contents if this access baton has an appropriate lock. Otherwise
+      ENTRIES will be NULL. */
+   apr_hash_t *entries;
+   apr_hash_t *non_deleted_entries;
+#endif
+
+   /* POOL is used to allocate cached items, they need to persist for the
+      lifetime of this access baton */
+   apr_pool_t *pool;
+
+} svn_wc_adm_access_t;
+
 
 
 /*** Asking questions about a working copy. ***/
@@ -248,6 +301,10 @@
                                      const char *filename,
                                      apr_pool_t *pool);
 
+svn_error_t *svn_wc_text_modified_cache_p (svn_boolean_t *modified_p,
+                                           svn_wc_adm_access_t *admin_baton,
+                                           const char *filename,
+                                           apr_pool_t *pool);
 
 /* Set *MODIFIED_P to non-zero if PATH's properties are modified
    w.r.t. the base revision, else set MODIFIED_P to zero. */
@@ -255,6 +312,12 @@
                                       const char *path,
                                       apr_pool_t *pool);
 
+/* Set *MODIFIED_P to non-zero if PATH's properties are modified
+   w.r.t. the base revision, else set MODIFIED_P to zero. */
+svn_error_t *svn_wc_props_modified_cache_p (svn_boolean_t *modified_p,
+                                            svn_wc_adm_access_t *admin_baton,
+                                            const char *path,
+                                            apr_pool_t *pool);
 
 
 
@@ -339,6 +402,11 @@
                            svn_boolean_t show_deleted,
                            apr_pool_t *pool);
 
+svn_error_t *svn_wc_entry_cache (svn_wc_entry_t **entry,
+                                 svn_wc_adm_access_t *access_baton, 
+                                 const char *path,
+                                 svn_boolean_t show_deleted,
+                                 apr_pool_t *pool);
 
 /* Parse the `entries' file for PATH and return a hash ENTRIES, whose
    keys are (const char *) entry names and values are (svn_wc_entry_t
@@ -1472,58 +1540,6 @@
 
 /*** Locking/Opening/Closing ***/
 
-/* ### Should this type be opaque in the public interface? */
-typedef struct svn_wc_adm_access_t
-{
-   /* PATH to directory which contains the administrative area */
-   const char *path;
-
-   enum svn_wc_adm_access_type {
-
-      /* SVN_WC_ADM_ACCESS_UNLOCKED indicates no lock is held allowing
-         read-only access without cacheing. */
-      svn_wc_adm_access_unlocked,
-
-#if 0
-      /* ### If read-only operations are allowed sufficient write access to
-         ### create read locks (did you follow that?) then entries cacheing
-         ### could apply to read-only operations as well.  This would
-         ### probably want to fall back to unlocked access if the
-         ### filesystem permissions prohibit writing to the administrative
-         ### area (consider running svn_wc_status on some other user's
-         ### working copy). */
-
-      /* SVN_WC_ADM_ACCESS_READ_LOCK indicates that read-only access and
-         cacheing are allowed. */
-      svn_wc_adm_access_read_lock,
-#endif
-
-      /* SVN_WC_ADM_ACCESS_WRITE_LOCK indicates that read-write access and
-         cacheing are allowed. */
-      svn_wc_adm_access_write_lock
-
-   } type;
-
-   /* LOCK_EXISTS is set TRUE when the write lock exists */
-   svn_boolean_t lock_exists;
-
-#if 0
-   /* ENTRIES_MODIFED is set TRUE when the entries cached in ENTRIES have
-      been modified from the original values read from the file. */
-   svn_boolean_t entries_modified;
-
-   /* Once the 'entries' file has been read, ENTRIES will cache the
-      contents if this access baton has an appropriate lock. Otherwise
-      ENTRIES will be NULL. */
-   apr_hash_t *entries;
-#endif
-
-   /* POOL is used to allocate cached items, they need to persist for the
-      lifetime of this access baton */
-   apr_pool_t *pool;
-
-} svn_wc_adm_access_t;
-
 /* Return an access baton in ADM_ACCESS for the working copy administrative
    area associated with the directory PATH.  If WRITE_LOCK is set the baton
    will include a write lock, otherwise the baton can only be used for read
Index: ./subversion/libsvn_wc/props.c
===================================================================
--- ./subversion/libsvn_wc/props.c
+++ ./subversion/libsvn_wc/props.c	2002-07-10 23:24:27.000000000 -0700
@@ -1365,6 +1365,139 @@
   return SVN_NO_ERROR;
 }
 
+svn_error_t *
+svn_wc_props_modified_cache_p (svn_boolean_t *modified_p,
+                               svn_wc_adm_access_t *admin_baton,
+                               const char *path,
+                               apr_pool_t *pool)
+{
+  svn_boolean_t bempty, wempty;
+  const char *prop_path;
+  const char *prop_base_path;
+  svn_boolean_t different_filesizes, equal_timestamps;
+  svn_wc_entry_t *entry;
+  apr_pool_t *subpool = svn_pool_create (pool);
+
+  /* First, get the paths of the working and 'base' prop files. */
+  SVN_ERR (svn_wc__prop_path (&prop_path, path, 0, subpool));
+  SVN_ERR (svn_wc__prop_base_path (&prop_base_path, path, 0, subpool));
+
+  /* Decide if either path is "empty" of properties. */
+  SVN_ERR (empty_props_p (&wempty, prop_path, subpool));
+  SVN_ERR (empty_props_p (&bempty, prop_base_path, subpool));
+
+  /* If something is scheduled for replacement, we do *not* want to
+     pay attention to any base-props;  they might be residual from the
+     old deleted file. */
+  SVN_ERR (svn_wc_entry_cache (&entry, admin_baton, path, TRUE, pool));  
+  if (entry 
+      && ((entry->schedule == svn_wc_schedule_replace)
+          || (entry->schedule == svn_wc_schedule_add)))
+    {
+      /* svn_wc_add() guarantees that a newly added file has no
+         working props at all; thus if this file is non-empty, the
+         user must have modified them.  Hopefully the caller will know
+         to ignore the baseprops as well!  */
+      *modified_p = wempty ? FALSE : TRUE;
+      goto cleanup;        
+    }
+
+  /* Easy out:  if the base file is empty, we know the answer
+     immediately. */
+  if (bempty)
+    {
+      if (! wempty)
+        {
+          /* base is empty, but working is not */
+          *modified_p = TRUE;
+          goto cleanup;
+        }
+      else
+        {
+          /* base and working are both empty */
+          *modified_p = FALSE;
+          goto cleanup;
+        }
+    }
+
+  /* OK, so the base file is non-empty.  One more easy out: */
+  if (wempty)
+    {
+      /* base exists, working is empty */
+      *modified_p = TRUE;
+      goto cleanup;
+    }
+
+  /* At this point, we know both files exists.  Therefore we have no
+     choice but to start checking their contents. */
+  
+  /* There are at least three tests we can try in succession. */
+  
+  /* Easy-answer attempt #1:  */
+  
+  /* Check if the the local and prop-base file have *definitely*
+     different filesizes. */
+  SVN_ERR (svn_io_filesizes_different_p (&different_filesizes,
+                                         prop_path,
+                                         prop_base_path,
+                                         subpool));
+  if (different_filesizes) 
+    {
+      *modified_p = TRUE;
+      goto cleanup;
+    }
+  
+  /* Easy-answer attempt #2:  */
+      
+  /* See if the local file's prop timestamp is the same as the one
+     recorded in the administrative directory.  */
+  SVN_ERR (svn_wc__timestamps_equal_p (&equal_timestamps, path,
+                                       svn_wc__prop_time, subpool));
+  if (equal_timestamps)
+    {
+      *modified_p = FALSE;
+      goto cleanup;
+    }
+  
+  /* Last ditch attempt:  */
+  
+  /* If we get here, then we know that the filesizes are the same,
+     but the timestamps are different.  That's still not enough
+     evidence to make a correct decision;  we need to look at the
+     files' contents directly.
+
+     However, doing a byte-for-byte comparison won't work.  The two
+     properties files may have the *exact* same name/value pairs, but
+     arranged in a different order.  (Our hashdump format makes no
+     guarantees about ordering.)
+
+     Therefore, rather than use contents_identical_p(), we use
+     svn_wc_get_local_propchanges(). */
+  {
+    apr_array_header_t *local_propchanges;
+    apr_hash_t *localprops = apr_hash_make (subpool);
+    apr_hash_t *baseprops = apr_hash_make (subpool);
+
+    SVN_ERR (svn_wc__load_prop_file (prop_path, localprops, subpool));
+    SVN_ERR (svn_wc__load_prop_file (prop_base_path,
+                                     baseprops,
+                                     subpool));
+    SVN_ERR (svn_wc_get_local_propchanges (&local_propchanges,
+                                           localprops,
+                                           baseprops,
+                                           subpool));
+                                         
+    if (local_propchanges->nelts > 0)
+      *modified_p = TRUE;
+    else
+      *modified_p = FALSE;
+  }
+ 
+ cleanup:
+  svn_pool_destroy (subpool);
+  
+  return SVN_NO_ERROR;
+}
 
 
 svn_error_t *
Index: ./subversion/libsvn_wc/entries.c
===================================================================
--- ./subversion/libsvn_wc/entries.c
+++ ./subversion/libsvn_wc/entries.c	2002-07-11 00:55:39.000000000 -0700
@@ -669,6 +669,148 @@
   return SVN_NO_ERROR;
 }
 
+static
+svn_error_t *
+prune_deleted_entries_cache (apr_hash_t *new,
+                             apr_hash_t *orig,
+                             apr_pool_t *pool)
+{
+  apr_hash_index_t *hi;
+
+  for (hi = apr_hash_first (pool, orig); hi; hi = apr_hash_next (hi))
+    {
+      const void *key;
+      apr_ssize_t keylen; 
+      void *val;
+      svn_wc_entry_t *entry;
+
+      /* Get the entry */
+      apr_hash_this (hi, &key, &keylen, &val);
+      entry = val;
+
+      if (! entry->deleted || (entry->schedule == svn_wc_schedule_add))
+        {
+          apr_hash_set(new, key, keylen, val);
+        }
+    }
+  return SVN_NO_ERROR;
+}
+
+/* Lazy creation. */
+static
+svn_error_t *
+populate_entries_cache (svn_wc_adm_access_t *admin_baton, const char *dir)
+{
+  assert (! strcmp(admin_baton->path, dir));
+  if ( ! admin_baton->entries )
+    {
+      admin_baton->entries = apr_hash_make(admin_baton->pool);
+      admin_baton->non_deleted_entries = apr_hash_make(admin_baton->pool);
+      
+      SVN_ERR (svn_wc_entries_read (&admin_baton->entries, dir, TRUE,
+               admin_baton->pool));
+      prune_deleted_entries_cache(admin_baton->non_deleted_entries,
+                                  admin_baton->entries, admin_baton->pool);
+    }
+
+  return SVN_NO_ERROR; 
+}
+
+svn_error_t *
+svn_wc_entry_cache (svn_wc_entry_t **entry,
+                    svn_wc_adm_access_t *admin_baton,
+                    const char *path,
+                    svn_boolean_t show_deleted,
+                    apr_pool_t *pool)
+{
+  enum svn_node_kind kind;
+  svn_boolean_t is_wc;
+  apr_hash_t *entries;
+
+  *entry = NULL;
+
+  SVN_ERR (svn_io_check_path (path, &kind, pool));
+
+  /* We need a canonicalized path! */
+  path = svn_path_canonicalize_nts(path, pool);
+
+  /* ### todo:
+     Make an innocent way to discover that a dir/path is or is not
+     under version control, so that this function can be robust.  I
+     think svn_wc_entries_read() will return an error right now if,
+     for example, PATH represents a new dir that svn still thinks is a
+     regular file under version control. */
+
+  if (kind == svn_node_dir)
+    {
+      SVN_ERR (svn_wc_check_wc (path, &is_wc, pool));
+      if (! is_wc)
+        return svn_error_createf
+          (SVN_ERR_WC_NOT_DIRECTORY, 0, NULL, pool,
+           "svn_wc_entry: %s is not a working copy directory", path);
+
+
+      /* If we can go from the cache (the admin_baton path matches up), go. */
+      if (! strcmp(admin_baton->path, path))
+        {
+          populate_entries_cache (admin_baton, path);
+          if ( show_deleted )
+            {
+              entries = admin_baton->entries;
+            }
+          else
+            {
+              entries = admin_baton->non_deleted_entries;
+            }
+        }
+      else
+        {
+          SVN_ERR (svn_wc_entries_read (&entries, path, show_deleted, pool));
+        }
+
+      *entry
+        = apr_hash_get (entries, SVN_WC_ENTRY_THIS_DIR, APR_HASH_KEY_STRING);
+    }
+
+  if (! *entry)
+    {
+      /* Maybe we're here because PATH is a directory, and we've
+         already tried and failed to retrieve its revision information
+         (we could have failed because PATH is under rev control as a
+         file, not a directory, i.e., the user rm'd the file and
+         created a dir there).
+         
+         Or maybe we're here because PATH is a regular file.
+         
+         Either way, if PATH is a versioned entity, it is versioned as
+         a file.  So look split and look in parent for entry info. */
+
+      const char *dir, *base_name;
+      svn_path_split_nts (path, &dir, &base_name, pool);
+
+      if (svn_path_is_empty_nts (dir))
+        dir = ".";
+
+      SVN_ERR (svn_wc_check_wc (dir, &is_wc, pool));
+      if (! is_wc)
+        return svn_error_createf
+          (SVN_ERR_WC_NOT_DIRECTORY, 0, NULL, pool,
+           "svn_wc_entry: %s is not a working copy directory", dir);
+
+      populate_entries_cache (admin_baton, dir);
+      if ( show_deleted )
+        {
+          entries = admin_baton->entries;
+        }
+      else
+        {
+          entries = admin_baton->non_deleted_entries;
+        }
+      *entry = apr_hash_get (entries, base_name, APR_HASH_KEY_STRING);
+    }
+
+  return SVN_NO_ERROR;
+}
 
 #if 0
 /* This is #if 0'd out until I decide where to use it. --cmpilato */
Index: ./subversion/libsvn_wc/lock.c
===================================================================
--- ./subversion/libsvn_wc/lock.c
+++ ./subversion/libsvn_wc/lock.c	2002-07-11 00:12:33.000000000 -0700
@@ -95,6 +95,8 @@
   lock->lock_exists = FALSE;
   lock->path = apr_pstrdup (pool, path);
   lock->pool = pool;
+  lock->entries = NULL;
+  lock->non_deleted_entries = NULL;
 
   if (write_lock)
     {
Index: ./subversion/libsvn_wc/questions.c
===================================================================
--- ./subversion/libsvn_wc/questions.c
+++ ./subversion/libsvn_wc/questions.c	2002-07-11 00:25:33.000000000 -0700
@@ -199,6 +199,82 @@
   return SVN_NO_ERROR;
 }
 
+svn_error_t *
+svn_wc__timestamps_equal_cache_p (svn_boolean_t *equal_p,
+                                  svn_wc_adm_access_t *admin_baton,
+                                  const char *path,
+                                  const enum svn_wc__timestamp_kind timestamp_kind,
+                                  apr_pool_t *pool)
+{
+  apr_time_t wfile_time, entrytime = 0;
+  const char *dirpath, *entryname;
+  struct svn_wc_entry_t *entry;
+  enum svn_node_kind kind;
+
+  SVN_ERR (svn_io_check_path (path, &kind, pool));
+  if (kind == svn_node_dir)
+    {
+      dirpath = path;
+      entryname = SVN_WC_ENTRY_THIS_DIR;
+    }
+  else
+    svn_path_split_nts (path, &dirpath, &entryname, pool);
+
+  /* Get the timestamp from the entries file */
+  SVN_ERR (svn_wc_entry_cache (&entry, admin_baton, dirpath, FALSE, pool));
+
+  /* Can't compare timestamps for an unversioned file. */
+  if (entry == NULL)
+    return svn_error_createf
+      (SVN_ERR_ENTRY_NOT_FOUND, 0, NULL, pool,
+       "timestamps_equal_p: `%s' not under revision control", entryname);
+
+  /* Get the timestamp from the working file and the entry */
+  if (timestamp_kind == svn_wc__text_time)
+    {
+      SVN_ERR (svn_io_file_affected_time (&wfile_time, path, pool));
+      entrytime = entry->text_time;
+    }
+  
+  else if (timestamp_kind == svn_wc__prop_time)
+    {
+      const char *prop_path;
+
+      SVN_ERR (svn_wc__prop_path (&prop_path, path, 0, pool));
+      SVN_ERR (svn_io_file_affected_time (&wfile_time, prop_path, pool));
+      entrytime = entry->prop_time;
+    }
+
+  if (! entrytime)
+    {
+      /* TODO: If either timestamp is inaccessible, the test cannot
+         return an answer.  Assume that the timestamps are
+         different. */
+      *equal_p = FALSE;
+      return SVN_NO_ERROR;
+    }
+
+  {
+    /* Put the disk timestamp through a string conversion, so it's
+       at the same resolution as entry timestamps. */
+    /* This string conversion here may be goodness, but it does
+       nothing currently _and_ it is somewhat expensive _and_ it eats
+       memory _and_ it is tested for in the regression tests. But I
+       will only comment it out because I do not possess the guts to
+       remove it altogether. */
+    /*
+    const char *tstr = svn_time_to_nts (wfile_time, pool);
+    SVN_ERR (svn_time_from_nts (&wfile_time, tstr, pool));
+    */
+  }
+  
+  if (wfile_time == entrytime)
+    *equal_p = TRUE;
+  else
+    *equal_p = FALSE;
+
+  return SVN_NO_ERROR;
+}
 
 /* Do a byte-for-byte comparison of FILE1 and FILE2. */
 static svn_error_t *
@@ -366,6 +442,61 @@
   return SVN_NO_ERROR;
 }
 
+svn_error_t *
+svn_wc_text_modified_cache_p (svn_boolean_t *modified_p,
+                              svn_wc_adm_access_t *admin_baton,
+                              const char *filename,
+                              apr_pool_t *pool)
+{
+  const char *textbase_filename;
+  svn_boolean_t equal_timestamps;
+  apr_pool_t *subpool = svn_pool_create (pool);
+  enum svn_node_kind kind;
+
+  /* Sanity check:  if the path doesn't exist, return FALSE. */
+  SVN_ERR (svn_io_check_path (filename, &kind, subpool));
+  if (kind != svn_node_file)
+    {
+      *modified_p = FALSE;
+      goto cleanup;
+    }
+
+  /* See if the local file's timestamp is the same as the one recorded
+     in the administrative directory.  This could, theoretically, be
+     wrong in certain rare cases, but with the addition of a forced
+     delay after commits (see revision 419 and issue #542) it's highly
+     unlikely to be a problem. */
+  SVN_ERR (svn_wc__timestamps_equal_cache_p (&equal_timestamps, admin_baton,
+                                             filename, svn_wc__text_time,
+                                             subpool));
+  if (equal_timestamps)
+    {
+      *modified_p = FALSE;
+      goto cleanup;
+    }
+      
+  /* If there's no text-base file, we have to assume the working file
+     is modified.  For example, a file scheduled for addition but not
+     yet committed. */
+  textbase_filename = svn_wc__text_base_path (filename, 0, subpool);
+  SVN_ERR (svn_io_check_path (textbase_filename, &kind, subpool));
+  if (kind != svn_node_file)
+    {
+      *modified_p = TRUE;
+      goto cleanup;
+    }
+  
+  /* Otherwise, fall back on the standard mod detector. */
+  SVN_ERR (svn_wc__versioned_file_modcheck (modified_p,
+                                            filename,
+                                            textbase_filename,
+                                            subpool));
+
+ cleanup:
+  svn_pool_destroy (subpool);
+
+  return SVN_NO_ERROR;
+}
 
 
 
Index: ./subversion/libsvn_client/commit_util.c
===================================================================
--- ./subversion/libsvn_client/commit_util.c
+++ ./subversion/libsvn_client/commit_util.c	2002-07-11 00:56:35.000000000 -0700
@@ -53,18 +53,26 @@
    allocations. */
 static svn_error_t *
 lock_dir (apr_hash_t *locked_dirs,
+          svn_wc_adm_access_t **admin_baton,
           const char *dir,
           apr_pool_t *pool)
 {
+  void *value;
   apr_pool_t *hash_pool = apr_hash_pool_get (locked_dirs);
 
-  if (! apr_hash_get (locked_dirs, dir, APR_HASH_KEY_STRING))
+  value = apr_hash_get (locked_dirs, dir, APR_HASH_KEY_STRING);
+
+  if (! value ) 
     {
       svn_wc_adm_access_t *adm_access;
       SVN_ERR (svn_wc_adm_open (&adm_access, dir, TRUE, hash_pool));
       apr_hash_set (locked_dirs, apr_pstrdup (hash_pool, dir), 
                     APR_HASH_KEY_STRING, adm_access);
+      value = adm_access;
     }
+  
+  *admin_baton = (svn_wc_adm_access_t*)value;
+
   return SVN_NO_ERROR;
 }
 
@@ -131,6 +139,7 @@
 static svn_error_t *
 harvest_committables (apr_hash_t *committables,
                       apr_hash_t *locked_dirs,
+                      svn_wc_adm_access_t *admin_baton,
                       const char *path,
                       const char *url,
                       const char *copyfrom_url,
@@ -142,7 +151,6 @@
                       apr_pool_t *pool)
 {
   apr_pool_t *subpool = svn_pool_create (pool);  /* ### why? */
-  apr_hash_t *entries = NULL;
   svn_boolean_t text_mod = FALSE, prop_mod = FALSE;
   apr_byte_t state_flags = 0;
   const char *p_path;
@@ -166,14 +174,7 @@
       /* ... then try to read its own entries file so we have a full
          entry for it (we were going to have to do this eventually to
          recurse anyway, so... ) */
-      svn_wc_entry_t *e = NULL;
-      if (svn_wc_entries_read (&entries, path, FALSE, subpool))
-        entries = NULL;
-
-      if ((entries) 
-          && ((e = apr_hash_get (entries, SVN_WC_ENTRY_THIS_DIR, 
-                                 APR_HASH_KEY_STRING))))
-        entry = e;
+      svn_wc_entry_cache (&entry, admin_baton, path, FALSE, subpool);
     }
 
   /* Test for a state of conflict, returning an error if an unresolved
@@ -277,8 +278,10 @@
     {
       /* Check for local mods: text+props for files, props alone for dirs. */
       if (entry->kind == svn_node_file)
-        SVN_ERR (svn_wc_text_modified_p (&text_mod, path, subpool));
-      SVN_ERR (svn_wc_props_modified_p (&prop_mod, path, subpool));
+        SVN_ERR (svn_wc_text_modified_cache_p (&text_mod, admin_baton, path,
+                                               subpool));
+      SVN_ERR (svn_wc_props_modified_cache_p (&prop_mod, admin_baton, path,
+                                              subpool));
     }
 
   /* Set text/prop modification flags accordingly. */
@@ -290,11 +293,13 @@
   /* Now, if this is something to commit, add it to our list. */
   if (state_flags)
     {
+      svn_wc_adm_access_t *ensure_access_baton;
+
       /* If the commit item is a directory, lock it, else lock its parent. */
       if (entry->kind == svn_node_dir)
-        lock_dir (locked_dirs, path, pool);
+        SVN_ERR (lock_dir (locked_dirs, &ensure_access_baton, path, pool));
       else
-        lock_dir (locked_dirs, p_path, pool);
+        SVN_ERR (lock_dir (locked_dirs, &ensure_access_baton, p_path, pool));
 
       /* Finally, add the committable item. */
       add_committable (committables, path, entry->kind, url,
@@ -305,7 +310,7 @@
   /* For directories, recursively handle each of their entries (except
      when the directory is being deleted, unless the deletion is part
      of a replacement ... how confusing). */
-  if ((entries) 
+  if (entry->kind == svn_node_dir
       && ((! (state_flags & SVN_CLIENT_COMMIT_ITEM_DELETE))
           || (state_flags & SVN_CLIENT_COMMIT_ITEM_ADD)))
     {
@@ -315,7 +320,7 @@
 
       /* Loop over all other entries in this directory, skipping the
          "this dir" entry. */
-      for (hi = apr_hash_first (subpool, entries);
+      for (hi = apr_hash_first (subpool, admin_baton->non_deleted_entries);
            hi;
            hi = apr_hash_next (hi))
         {
@@ -330,6 +335,8 @@
           const char *this_cf_url
             = cf_url ? apr_pstrdup (loop_pool, cf_url) : NULL;
 
+          svn_wc_adm_access_t *subdir_admin_baton;
+
           /* Get the next entry */
           apr_hash_this (hi, &key, &klen, &val);
           name = (const char *) key;
@@ -366,9 +373,19 @@
               used_url = this_url;
             }
 
+          if (this_entry->kind == svn_node_dir)
+            {
+              SVN_ERR (lock_dir (locked_dirs, &subdir_admin_baton, full_path,
+                                 subpool));
+            }
+          else
+            {
+              subdir_admin_baton = admin_baton; 
+            }
+
           /* Recurse. */
           SVN_ERR (harvest_committables 
-                   (committables, locked_dirs, full_path,
+                   (committables, locked_dirs, subdir_admin_baton, full_path,
                     used_url ? used_url : this_entry->url,
                     this_cf_url,
                     (svn_wc_entry_t *)val, 
@@ -397,13 +414,20 @@
                                   apr_pool_t *pool)
 {
   int i = 0;
-
+  svn_wc_adm_access_t *admin_baton;
+        
   /* Create the COMMITTABLES hash. */
   *committables = apr_hash_make (pool);
 
   /* Create the LOCKED_DIRS hash. */
   *locked_dirs = apr_hash_make (pool);
 
+  /* Normalize parent dir to not have /. */
+  parent_dir = svn_path_canonicalize_nts (parent_dir, pool);
+
+  /* Obtain the admin_baton, so we can be cacheable. */
+  SVN_ERR (lock_dir (*locked_dirs, &admin_baton, parent_dir, pool));
+
   /* ### Would be nice to use an iteration pool here, but need to
      first look into lifetime issues w/ anything passed to
      harvest_committables() and possibly stored by it. */ 
@@ -424,7 +448,7 @@
         }
 
       /* No entry?  This TARGET isn't even under version control! */
-      SVN_ERR (svn_wc_entry (&entry, target, FALSE, pool));
+      SVN_ERR (svn_wc_entry_cache (&entry, admin_baton, target, FALSE, pool));
       if (! entry)
         return svn_error_create 
           (SVN_ERR_ENTRY_NOT_FOUND, 0, NULL, pool, target);
@@ -458,7 +482,8 @@
           svn_path_split_nts (target, &parent, &base_name, pool);
           if (svn_path_is_empty_nts (parent))
             parent = ".";
-          SVN_ERR (svn_wc_entry (&p_entry, parent, FALSE, pool));
+          SVN_ERR (svn_wc_entry_cache (&p_entry, admin_baton, parent, FALSE,
+                   pool));
           if (! p_entry)
             return svn_error_createf 
               (SVN_ERR_WC_CORRUPT, 0, NULL, pool, 
@@ -490,9 +515,9 @@
            target);
 
       /* Handle our TARGET. */
-      SVN_ERR (harvest_committables (*committables, *locked_dirs, target, 
-                                     url, NULL, entry, NULL, FALSE, FALSE, 
-                                     nonrecursive, pool));
+      SVN_ERR (harvest_committables (*committables, *locked_dirs, admin_baton,
+                                     target, url, NULL, entry, NULL, FALSE,
+                                     FALSE, nonrecursive, pool));
 
       i++;
     }
@@ -510,6 +535,7 @@
                                    apr_pool_t *pool)
 {
   svn_wc_entry_t *entry;
+  svn_wc_adm_access_t *admin_baton;
 
   /* Create the COMMITTABLES hash. */
   *committables = apr_hash_make (pool);
@@ -517,16 +543,19 @@
   /* Create the LOCKED_DIRS hash. */
   *locked_dirs = apr_hash_make (pool);
 
+  SVN_ERR (lock_dir (*locked_dirs, &admin_baton, target, pool));
+
   /* Read the entry for TARGET. */
-  SVN_ERR (svn_wc_entry (&entry, target, FALSE, pool));
+  SVN_ERR (svn_wc_entry_cache (&entry, admin_baton, target, FALSE, pool));
+
   if (! entry)
     return svn_error_create 
       (SVN_ERR_ENTRY_NOT_FOUND, 0, NULL, pool, target);
       
   /* Handle our TARGET. */
-  SVN_ERR (harvest_committables (*committables, *locked_dirs, target, 
-                                 new_url, entry->url, entry, NULL,
-                                 FALSE, TRUE, FALSE, pool));
+  SVN_ERR (harvest_committables (*committables, *locked_dirs, admin_baton,
+                                 target, new_url, entry->url, entry,
+                                 NULL, FALSE, TRUE, FALSE, pool));
 
   return SVN_NO_ERROR;
 }

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org

Re: [PATCH] Add entries caching

Posted by Karl Fogel <kf...@newton.ch.collab.net>.
> > Current SVN:
> > svn commit  226.25s user 9.53s system 85% cpu 4:37.00 total
> > This patch:
> > svn commit  10.58s user 16.62s system 18% cpu 2:27.09 total

"Proof of concept" doesn't begin to do this justice... :-)



---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org

Re: [PATCH] Add entries caching

Posted by Ben Collins-Sussman <su...@collab.net>.
Justin Erenkrantz <je...@apache.org> writes:

> Current SVN:
> svn commit  226.25s user 9.53s system 85% cpu 4:37.00 total
> This patch:
> svn commit  10.58s user 16.62s system 18% cpu 2:27.09 total

** falls off chair **

Holy COW, that's INCREDIBLE!

I'm so glad you guys are attacking this.

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org