You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@subversion.apache.org by Sander Striker <st...@apache.org> on 2003/07/01 23:26:24 UTC

[PATCH] Cache open repository per connection, WAS: RE: general server performance (was Re: apache svn server memory usage?)

> From: sussman@collab.net [mailto:sussman@collab.net]
> Sent: Tuesday, July 01, 2003 5:43 PM

> 2. As was already mentioned, because HTTP request is stateless, apache
>    opens and closes/syncs the repository (BDB environment) with
>    *every* request.  (One user had write caching turned off on his
>    server;  this caused his http checkouts to arrive about 1 file
>    every 2 seconds!)  There's been discusion about keeping the
>    repository open for the whole TCP/IP "connection session", and
>    Sander has experimented with this, but it didn't seem to do much.
>    Still need to investigate.

And here is the limited tested patch.


Sander

Log:
Cache open repository per connection.

* subversion/mod_dav_svn/repos.c

  (dav_svn_get_resource): Store open repository in connection notes
    table, keyed by location and repositoryname.  Use this open
    repository for the duration of the connection.


Index: subversion/mod_dav_svn/repos.c
===================================================================
--- subversion/mod_dav_svn/repos.c      (revision 6386)
+++ subversion/mod_dav_svn/repos.c      (working copy)
@@ -22,6 +22,7 @@
 #include <http_protocol.h>
 #include <http_log.h>
 #include <http_core.h>  /* for ap_construct_url */
+#include <util_script.h> /* for ap_find_path_info */
 #include <mod_dav.h>

 #define APR_WANT_STRFUNC
@@ -1076,6 +1077,8 @@
   const char *repos_name;
   const char *relative;
   const char *repos_path;
+  const char *base_path;
+  const char *repos_key;
   const char *version_name;
   svn_error_t *serr;
   dav_error *err;
@@ -1181,15 +1184,27 @@
   /* Remember who is making this request */
   repos->username = r->user;

-  /* open the SVN FS */
-  serr = svn_repos_open(&(repos->repos), fs_path, r->pool);
-  if (serr != NULL)
+  /* Get the repository */
+  base_path = apr_pstrndup(r->pool, r->uri,
+                           ap_find_path_info(r->uri, r->path_info));
+  repos_key = apr_pstrcat(r->pool, "mod_dav_svn:", base_path, root_path);
+  repos->repos = (void *)apr_table_get(r->connection->notes, repos_key);
+  if (repos->repos == NULL)
     {
-      return dav_svn_convert_err(serr, HTTP_INTERNAL_SERVER_ERROR,
-                                 apr_psprintf(r->pool,
-                                              "Could not open the SVN "
-                                              "filesystem at %s",
-                                              fs_path));
+      serr = svn_repos_open(&(repos->repos), fs_path, r->connection->pool);
+      if (serr != NULL)
+        {
+          return dav_svn_convert_err(serr, HTTP_INTERNAL_SERVER_ERROR,
+                                     apr_psprintf(r->pool,
+                                                  "Could not open the SVN "
+                                                  "filesystem at %s",
+                                                  fs_path));
+        }
+
+      /* Cache the open repos for the next request on this connection */
+      apr_table_setn(r->connection->notes,
+                     apr_pstrdup(r->connection->pool, repos_key),
+                     (void *)repos->repos);
     }

   /* cache the filesystem object */

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org

Re: [PATCH] Cache open repository per connection, WAS: RE: general server performance (was Re: apache svn server memory usage?)

Posted by pl...@lanminds.com.
>>>>> On Mon, 14 Jul 2003, "Sander" == Sander Striker wrote:

  Sander> Like I said in my earlier comment: _don't_ worry about this
  Sander> patch, I will be applying it myself.

I'm completely confused now. Is what I filed as 1412 the same as what 
I e-mailed you about?

I had them marked as two different threads for some reason.  One of 
which you stated you'd commit yourself (Subject: Oops...Here is the 
patch), and the other, (Subject: Cache open repository per connection).

Of course, now that I look at the messages properly threaded in the 
archive, they do appear to be related :(

Sorry, I *suck* at patch management!

(I can't *wait* for Sander Roobol to get back from holiday!)
-- 

Seeya,
Paul



---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org

RE: [PATCH] Cache open repository per connection, WAS: RE: general server performance (was Re: apache svn server memory usage?)

Posted by Sander Striker <st...@apache.org>.
> From: Paul L Lussier [mailto:pll@lanminds.com]
> Sent: Monday, July 14, 2003 8:44 PM

> Filed as issue 1412:
> 
> 	http://subversion.tigris.org/issues/show_bug.cgi?id=1412

You gave me about 20 minutes to reply...  Which ofcourse I didn't
make.  The patch was already committed.  Like I said in my earlier
comment: _don't_ worry about this patch, I will be applying it myself.

Oh well,


Sander

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org

Re: [PATCH] Cache open repository per connection, WAS: RE: general server performance (was Re: apache svn server memory usage?)

Posted by Paul L Lussier <pl...@lanminds.com>.
Filed as issue 1412:

	http://subversion.tigris.org/issues/show_bug.cgi?id=1412
-- 

Seeya,
Paul



---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org

RE: [PATCH] Cache open repository per connection

Posted by Sander Striker <st...@apache.org>.
> From: Mukund [mailto:mukund@tessna.com]
> Sent: Wednesday, July 02, 2003 2:01 PM

> On Tue, Jul 01, 2003 at 07:25:54PM -0700, Greg Stein wrote:
> | > >    Sander has experimented with this, but it didn't seem to do much.
> | 
> | Bugs :-)
> | 
> 
> Sander, can you comment on Greg's message and if any changes to the patch
> are due, please make them. I would really appreciate this particular
> patch.

Yes, as soon as I get back tonight.

Sander

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org

Re: [PATCH] Cache open repository per connection

Posted by Mukund <mu...@tessna.com>.
On Tue, Jul 01, 2003 at 07:25:54PM -0700, Greg Stein wrote:
| > >    Sander has experimented with this, but it didn't seem to do much.
| 
| Bugs :-)
| 

Sander, can you comment on Greg's message and if any changes to the patch
are due, please make them. I would really appreciate this particular
patch.

Mukund


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org

RE: [PATCH] Cache open repository per connection

Posted by Sander Striker <st...@apache.org>.
> From: Greg Stein [mailto:gstein@lyra.org]
> Sent: Wednesday, July 02, 2003 4:26 AM

> Bugs :-)

'lets :)  Perceptions... ;)  Okay, okay, it was a hack.  Happy now? ;P :)

>>...
>> +++ subversion/mod_dav_svn/repos.c      (working copy)
>>...
>> +  /* Get the repository */
>> +  base_path = apr_pstrndup(r->pool, r->uri,
>> +                           ap_find_path_info(r->uri, r->path_info));
> 
> Hmm. I'm not really sure what this is extracting from the URL. A clearer
> comment might be helpful.

heh heh, it extracts the Location part.  I came across this piece of
code a while ago in the httpd codebase and it seems about the only
way to get to the Location part, apart from storing it in the dir config
at dir config creation time and retrieving it from there later on.
 
>> +  repos_key = apr_pstrcat(r->pool, "mod_dav_svn:", base_path, root_path);
> 
> Might want to put a ,NULL on the end there. Otherwise, your key is random
> and will never get a cache-hit :-)

Crap!  :)  You're right.  This would explain why I never saw any speedup ;) :)
 
> In any case, I'd recommend caching on the fs_path instead of the URI.

Nope.  See the discussion on list.  Greg Hudson was best able to express why
that can be a bad idea.  Basically it comes down to being able to map multiple
Locations to the same repository.  Come to think of it, root_path will prolly
do just fine... but we might consider blending in the hostname aswell.
 
>> +  repos->repos = (void *)apr_table_get(r->connection->notes, repos_key);
> 
> I'd recommend using r->connection->pool's userdata instead of the notes.
> Tables are not meant to store binary values; I'm not sure that it is very
> reliable.

Oh come one, live a little ;).

Sidenote: we need to fix the apr docs:

 * Tables are used to store entirely opaque structures
 * for applications, while Arrays are usually used to
 * deal with string lists.

Ofcourse this isn't true when you are using the add/set functions, as oposed
to addn/setn, since those try to copy the data.

> You could (again) see corrupted data or cache misses.

Not in this case.  No copying of the reference takes place.  But I agree that
using the connection pools userdata is cleaner.  New patch below.


Sander

Index: subversion/mod_dav_svn/repos.c
===================================================================
--- subversion/mod_dav_svn/repos.c      (revision 6386)
+++ subversion/mod_dav_svn/repos.c      (working copy)
@@ -1076,6 +1076,7 @@
   const char *repos_name;
   const char *relative;
   const char *repos_path;
+  const char *repos_key;
   const char *version_name;
   svn_error_t *serr;
   dav_error *err;
@@ -1181,15 +1182,27 @@
   /* Remember who is making this request */
   repos->username = r->user;

-  /* open the SVN FS */
-  serr = svn_repos_open(&(repos->repos), fs_path, r->pool);
-  if (serr != NULL)
+  /* Cache open repository.  Key it off by root_path, which should be more
+   * unique than the fs_path, given that two Locations may point to the
+   * same repository.
+   */
+  repos_key = apr_pstrcat(r->pool, "mod_dav_svn:", root_path, NULL);
+  apr_pool_userdata_get((void **)&repos->repos, repos_key, r->connection->pool);
+  if (repos->repos == NULL)
     {
-      return dav_svn_convert_err(serr, HTTP_INTERNAL_SERVER_ERROR,
-                                 apr_psprintf(r->pool,
-                                              "Could not open the SVN "
-                                              "filesystem at %s",
-                                              fs_path));
+      serr = svn_repos_open(&(repos->repos), fs_path, r->connection->pool);
+      if (serr != NULL)
+        {
+          return dav_svn_convert_err(serr, HTTP_INTERNAL_SERVER_ERROR,
+                                     apr_psprintf(r->pool,
+                                                  "Could not open the SVN "
+                                                  "filesystem at %s",
+                                                  fs_path));
+        }
+
+      /* Cache the open repos for the next request on this connection */
+      apr_pool_userdata_set(repos->repos, repos_key,
+                            NULL, r->connection->pool);
     }

   /* cache the filesystem object */

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org

Re: [PATCH] Cache open repository per connection

Posted by Greg Stein <gs...@lyra.org>.
On Wed, Jul 02, 2003 at 01:26:24AM +0200, Sander Striker wrote:
> > From: sussman@collab.net [mailto:sussman@collab.net]
> > Sent: Tuesday, July 01, 2003 5:43 PM
> 
> > 2. As was already mentioned, because HTTP request is stateless, apache
> >    opens and closes/syncs the repository (BDB environment) with
> >    *every* request.  (One user had write caching turned off on his
> >    server;  this caused his http checkouts to arrive about 1 file
> >    every 2 seconds!)  There's been discusion about keeping the
> >    repository open for the whole TCP/IP "connection session", and
> >    Sander has experimented with this, but it didn't seem to do much.

Bugs :-)

>...
> +++ subversion/mod_dav_svn/repos.c      (working copy)
>...
> +  /* Get the repository */
> +  base_path = apr_pstrndup(r->pool, r->uri,
> +                           ap_find_path_info(r->uri, r->path_info));

Hmm. I'm not really sure what this is extracting from the URL. A clearer
comment might be helpful.

> +  repos_key = apr_pstrcat(r->pool, "mod_dav_svn:", base_path, root_path);

Might want to put a ,NULL on the end there. Otherwise, your key is random
and will never get a cache-hit :-)

In any case, I'd recommend caching on the fs_path instead of the URI.

> +  repos->repos = (void *)apr_table_get(r->connection->notes, repos_key);

I'd recommend using r->connection->pool's userdata instead of the notes.
Tables are not meant to store binary values; I'm not sure that it is very
reliable. You could (again) see corrupted data or cache misses.

Cheers,
-g

-- 
Greg Stein, http://www.lyra.org/

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org