Posted to dev@subversion.apache.org by Chris Hecker <ch...@d6.com> on 2003/07/01 04:44:05 UTC

RE: apache svn server memory usage?

> > This is normal.  14-22M httpd isn't great, but it's much better than
> > the hundreds of megs it used to be, back when we had real memory
> > bugs.  :-)
>...
>Correct.  It does a check whether it should free() that memory or keep
>it around.  The default is keeping everything around.  Over time,
>memory usage will stabilize and thus no more system calls in the form
>of malloc()/free() are needed.

The bummer is it's being kept around for all the forked processes.  Would 
it be better to use one of the other mpms (threaded?) so they share some of 
this memory?  I know nothing about apache, of course.  The docs say some 
modules aren't thread-safe...is mod_dav + svn?

>The MaxMemFree directive allows you to control the total amount of
>memory somewhat.  httpd 2.0.47 will (most probably) obey this directive
>better than the current version does.

What will be good settings here?

Thanks,
Chris


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org

Re: general server performance (was Re: apache svn server memory usage?)

Posted by Branko Čibej <br...@xbc.nu>.
Chris Hecker wrote:

>
>> > Right, but even checkouts seem pokey...are they considered
>> > transactions as far as disk syncing as well (I assume not)?
>> I'm talking about database transactions, and yes, quite a few of those
>> take place during checkout.
>
>
> Ah, doesn't it seem a bit wrong to be doing logged transactions for
> read-only operations (like up and co)?  It seems like there'd be a
> lighter weight BDB process for that.

:-) It's not that simple. A SVN checkout or update isn't just about
reading from the database.

Of course, there are lots of places in the code where we could (and IMHO
should) stop using transactions. There's even an issue about this (409),
but as I've said before elsewhere, this is anything but a trivial thing
to do. It involves big changes in the FS implementation, and we can't
afford to do those ATM.


-- 
Brane Čibej   <br...@xbc.nu>   http://www.xbc.nu/brane/



Re: general server performance (was Re: apache svn server memory usage?)

Posted by Chris Hecker <ch...@d6.com>.
> > Right, but even checkouts seem pokey...are they considered
> > transactions as far as disk syncing as well (I assume not)?
>I'm talking about database transactions, and yes, quite a few of those
>take place during checkout.

Ah, doesn't it seem a bit wrong to be doing logged transactions for 
read-only operations (like up and co)?  It seems like there'd be a lighter 
weight BDB process for that.

Chris




Re: general server performance

Posted by Greg Stein <gs...@lyra.org>.
On Wed, Jul 02, 2003 at 01:04:37AM +0200, Branko Čibej wrote:
>...
> You can set the DB_TXN_NOSYNC option in DB_CONFIG, but of course if you
> do that, you're prone to irrecoverable database corruption if anything
> goes wrong with your system.

It would be *really* cool if we could adjust that on a per-FS-open basis.
"I'm going to do some work which can be lost." That would be just perfect
for update reports, where a crash on the server will cause the client to
simply restart. Any data loss is no big deal.

(of course, we don't want to have the BDB lose integrity and need to be
 recovered; we really want something that says "don't be loggy")

Cheers,
-g

-- 
Greg Stein, http://www.lyra.org/


Re: general server performance (was Re: apache svn server memory usage?)

Posted by Branko Čibej <br...@xbc.nu>.
Chris Hecker wrote:

>
>> Caching doesn't help you when you have to fsync the database log files
>> at every transaction commit.
>
>
> Right, but even checkouts seem pokey...are they considered
> transactions as far as disk syncing as well (I assume not)?

I'm talking about database transactions, and yes, quite a few of those
take place during checkout.

> Also, is there any way to trade risk for performance, and have it not
> sync to disk as often, or schedule it for the background, etc.?

You can set the DB_TXN_NOSYNC option in DB_CONFIG, but of course if you
do that, you're prone to irrecoverable database corruption if anything
goes wrong with your system.

> But even ignoring that, what explains why the net throughput is so low?

Throughput is the amount of data sent divided by the time it takes to send
it. If it takes longer to have the data available because the server
blocks on disk I/O, then of course that'll lower your throughput.
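
A rough sketch of why this hurts, not Subversion or BDB code: the
hypothetical write_records() below appends records to a file and either
calls fsync() after every record, the way a logged commit does, or only
once at the end, which is roughly the trade DB_TXN_NOSYNC makes. The
function name and record format are made up for illustration, and it
assumes a POSIX system.

```c
/* Ask for POSIX declarations (fsync, mkstemp) on strict compilers. */
#define _POSIX_C_SOURCE 200809L

#include <assert.h>
#include <stdlib.h>
#include <unistd.h>

/* Append `count` records to fd.  If sync_each is nonzero, fsync() after
 * every record (like a logged commit); otherwise fsync() once at the end.
 * Returns the number of fsync() calls made, or -1 on error. */
int write_records(int fd, int count, int sync_each)
{
    static const char record[] = "log record\n";
    const size_t len = sizeof(record) - 1;
    int syncs = 0;
    int i;

    assert(fd >= 0 && count >= 0);

    for (i = 0; i < count; i++) {
        if (write(fd, record, len) != (ssize_t)len)
            return -1;
        if (sync_each) {
            /* The caller blocks here until the disk has the data. */
            if (fsync(fd) != 0)
                return -1;
            syncs++;
        }
    }

    if (!sync_each) {
        /* One flush for the whole batch. */
        if (fsync(fd) != 0)
            return -1;
        syncs++;
    }
    return syncs;
}
```

With sync_each set, 100 records mean 100 waits on the disk; with it
clear there is a single fsync() at the end, at the price of losing
whatever was not yet synced if the machine goes down.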


-- 
Brane Čibej   <br...@xbc.nu>   http://www.xbc.nu/brane/



Re: general server performance (was Re: apache svn server memory usage?)

Posted by Philip Martin <ph...@codematters.co.uk>.
Chris Hecker <ch...@d6.com> writes:

> Right, but even checkouts seem pokey...are they considered
> transactions as far as disk syncing as well (I assume not)?

Checkouts are transactions.

> Also, is there any way to trade risk for performance, and have it
> not sync to disk as often, or schedule it for the background, etc.?

Perhaps 'svnadmin create --bdb-txn-nosync' is what you want?  You can
alter an existing repository by setting DB_TXN_NOSYNC/DB_TXN_WRITE_NOSYNC
in the repository's DB_CONFIG.
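
For an existing repository, a minimal DB_CONFIG fragment might look like
the following. This is only a sketch: it assumes the file lives in the
repository's db/ directory and that the installed Berkeley DB accepts
set_flags in DB_CONFIG. It trades durability for speed, so a crash can
leave the database unrecoverable.

```
# <repository>/db/DB_CONFIG
# Do not flush the log to disk at every transaction commit.
set_flags DB_TXN_NOSYNC
```

Berkeley DB reads DB_CONFIG when the environment is opened, so processes
that already have the repository open will most likely need to reopen it
before the setting applies.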

-- 
Philip Martin


RE: [PATCH] Cache open repository per connection

Posted by Sander Striker <st...@apache.org>.
> From: Mukund [mailto:mukund@tessna.com]
> Sent: Wednesday, July 02, 2003 2:01 PM

> On Tue, Jul 01, 2003 at 07:25:54PM -0700, Greg Stein wrote:
> | > >    Sander has experimented with this, but it didn't seem to do much.
> | 
> | Bugs :-)
> | 
> 
> Sander, can you comment on Greg's message and if any changes to the patch
> are due, please make them. I would really appreciate this particular
> patch.

Yes, as soon as I get back tonight.

Sander


Re: [PATCH] Cache open repository per connection

Posted by Mukund <mu...@tessna.com>.
On Tue, Jul 01, 2003 at 07:25:54PM -0700, Greg Stein wrote:
| > >    Sander has experimented with this, but it didn't seem to do much.
| 
| Bugs :-)
| 

Sander, can you comment on Greg's message and if any changes to the patch
are due, please make them. I would really appreciate this particular
patch.

Mukund



RE: [PATCH] Cache open repository per connection

Posted by Sander Striker <st...@apache.org>.
> From: Greg Stein [mailto:gstein@lyra.org]
> Sent: Wednesday, July 02, 2003 4:26 AM

> Bugs :-)

'lets :)  Perceptions... ;)  Okay, okay, it was a hack.  Happy now? ;P :)

>>...
>> +++ subversion/mod_dav_svn/repos.c      (working copy)
>>...
>> +  /* Get the repository */
>> +  base_path = apr_pstrndup(r->pool, r->uri,
>> +                           ap_find_path_info(r->uri, r->path_info));
> 
> Hmm. I'm not really sure what this is extracting from the URL. A clearer
> comment might be helpful.

heh heh, it extracts the Location part.  I came across this piece of
code a while ago in the httpd codebase and it seems about the only
way to get to the Location part, apart from storing it in the dir config
at dir config creation time and retrieving it from there later on.
 
>> +  repos_key = apr_pstrcat(r->pool, "mod_dav_svn:", base_path, root_path);
> 
> Might want to put a ,NULL on the end there. Otherwise, your key is random
> and will never get a cache-hit :-)

Crap!  :)  You're right.  This would explain why I never saw any speedup ;) :)
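
For anyone wondering why the missing NULL matters: apr_pstrcat() takes a
variable argument list and stops at the first NULL. The sketch below is
a plain-C stand-in that follows the same convention. concat_strings is a
made-up name, and it uses malloc() rather than an APR pool, so the
caller has to free the result.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdlib.h>
#include <string.h>

/* Concatenate a NULL-terminated list of strings into a malloc'd buffer.
 * The last argument MUST be NULL; that is the only way the function
 * knows where the list ends.  Returns NULL if allocation fails. */
char *concat_strings(const char *first, ...)
{
    va_list args;
    const char *s;
    size_t total = 0;
    char *result;

    assert(first != NULL);

    /* First pass: add up the lengths, stopping at the NULL sentinel. */
    va_start(args, first);
    for (s = first; s != NULL; s = va_arg(args, const char *))
        total += strlen(s);
    va_end(args);

    result = malloc(total + 1);
    if (result == NULL)
        return NULL;
    result[0] = '\0';

    /* Second pass: copy each piece into place. */
    va_start(args, first);
    for (s = first; s != NULL; s = va_arg(args, const char *))
        strcat(result, s);
    va_end(args);

    return result;
}
```

A call looks like concat_strings("mod_dav_svn:", root_path, (char *)NULL).
Leaving off the final NULL lets va_arg() read past the real arguments,
which is undefined behavior and fits the symptom of a key that never
matches.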
 
> In any case, I'd recommend caching on the fs_path instead of the URI.

Nope.  See the discussion on list.  Greg Hudson was best able to express why
that can be a bad idea.  Basically it comes down to being able to map multiple
Locations to the same repository.  Come to think of it, root_path will prolly
do just fine... but we might consider blending in the hostname as well.
 
>> +  repos->repos = (void *)apr_table_get(r->connection->notes, repos_key);
> 
> I'd recommend using r->connection->pool's userdata instead of the notes.
> Tables are not meant to store binary values; I'm not sure that it is very
> reliable.

Oh come on, live a little ;).

Sidenote: we need to fix the apr docs:

 * Tables are used to store entirely opaque structures
 * for applications, while Arrays are usually used to
 * deal with string lists.

Of course this isn't true when you are using the add/set functions, as opposed
to addn/setn, since those try to copy the data.

> You could (again) see corrupted data or cache misses.

Not in this case.  No copying of the reference takes place.  But I agree that
using the connection pool's userdata is cleaner.  New patch below.


Sander

Index: subversion/mod_dav_svn/repos.c
===================================================================
--- subversion/mod_dav_svn/repos.c      (revision 6386)
+++ subversion/mod_dav_svn/repos.c      (working copy)
@@ -1076,6 +1076,7 @@
   const char *repos_name;
   const char *relative;
   const char *repos_path;
+  const char *repos_key;
   const char *version_name;
   svn_error_t *serr;
   dav_error *err;
@@ -1181,15 +1182,27 @@
   /* Remember who is making this request */
   repos->username = r->user;

-  /* open the SVN FS */
-  serr = svn_repos_open(&(repos->repos), fs_path, r->pool);
-  if (serr != NULL)
+  /* Cache open repository.  Key it off by root_path, which should be more
+   * unique than the fs_path, given that two Locations may point to the
+   * same repository.
+   */
+  repos_key = apr_pstrcat(r->pool, "mod_dav_svn:", root_path, NULL);
+  apr_pool_userdata_get((void **)&repos->repos, repos_key, r->connection->pool);
+  if (repos->repos == NULL)
     {
-      return dav_svn_convert_err(serr, HTTP_INTERNAL_SERVER_ERROR,
-                                 apr_psprintf(r->pool,
-                                              "Could not open the SVN "
-                                              "filesystem at %s",
-                                              fs_path));
+      serr = svn_repos_open(&(repos->repos), fs_path, r->connection->pool);
+      if (serr != NULL)
+        {
+          return dav_svn_convert_err(serr, HTTP_INTERNAL_SERVER_ERROR,
+                                     apr_psprintf(r->pool,
+                                                  "Could not open the SVN "
+                                                  "filesystem at %s",
+                                                  fs_path));
+        }
+
+      /* Cache the open repos for the next request on this connection */
+      apr_pool_userdata_set(repos->repos, repos_key,
+                            NULL, r->connection->pool);
     }

   /* cache the filesystem object */
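
The lookup-then-open logic in the patch above can be sketched without
httpd or APR. Everything below is a hypothetical stand-in: the
connection struct, userdata_get/userdata_set and open_repository are
invented for illustration and only mimic the shape of
apr_pool_userdata_get/apr_pool_userdata_set and svn_repos_open.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* One cached item, keyed by string. */
typedef struct cache_entry {
    char *key;
    const void *data;
    struct cache_entry *next;
} cache_entry;

/* Stand-in for a connection: userdata lives as long as the connection. */
typedef struct connection {
    cache_entry *userdata;
    int opens;              /* how many real "opens" have happened */
} connection;

static const void *userdata_get(const connection *c, const char *key)
{
    const cache_entry *e;
    for (e = c->userdata; e != NULL; e = e->next)
        if (strcmp(e->key, key) == 0)
            return e->data;
    return NULL;
}

static int userdata_set(connection *c, const char *key, const void *data)
{
    size_t len = strlen(key) + 1;
    cache_entry *e = malloc(sizeof(*e));
    if (e == NULL)
        return -1;
    e->key = malloc(len);
    if (e->key == NULL) {
        free(e);
        return -1;
    }
    memcpy(e->key, key, len);
    e->data = data;
    e->next = c->userdata;
    c->userdata = e;
    return 0;
}

/* Placeholder for the expensive open; it just counts calls. */
static const void *open_repository(connection *c, const char *fs_path)
{
    c->opens++;
    return fs_path;
}

/* Return the cached repository for key, opening it only on a miss. */
const void *get_repository(connection *c, const char *key,
                           const char *fs_path)
{
    const void *repos;

    assert(c != NULL && key != NULL && fs_path != NULL);

    repos = userdata_get(c, key);
    if (repos == NULL) {
        repos = open_repository(c, fs_path);
        if (repos != NULL)
            userdata_set(c, key, repos);  /* cache for later requests */
    }
    return repos;
}

/* Free the cache when the connection goes away. */
void connection_destroy(connection *c)
{
    while (c->userdata != NULL) {
        cache_entry *next = c->userdata->next;
        free(c->userdata->key);
        free(c->userdata);
        c->userdata = next;
    }
}
```

Two calls to get_repository() with the same key on one connection bump
the opens counter only once; a different key opens again. That is the
behavior the patch is after for requests sharing a connection.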


Re: [PATCH] Cache open repository per connection

Posted by Greg Stein <gs...@lyra.org>.
On Wed, Jul 02, 2003 at 01:26:24AM +0200, Sander Striker wrote:
> > From: sussman@collab.net [mailto:sussman@collab.net]
> > Sent: Tuesday, July 01, 2003 5:43 PM
> 
> > 2. As was already mentioned, because HTTP request is stateless, apache
> >    opens and closes/syncs the repository (BDB environment) with
> >    *every* request.  (One user had write caching turned off on his
> >    server;  this caused his http checkouts to arrive about 1 file
> >    every 2 seconds!)  There's been discussion about keeping the
> >    repository open for the whole TCP/IP "connection session", and
> >    Sander has experimented with this, but it didn't seem to do much.

Bugs :-)

>...
> +++ subversion/mod_dav_svn/repos.c      (working copy)
>...
> +  /* Get the repository */
> +  base_path = apr_pstrndup(r->pool, r->uri,
> +                           ap_find_path_info(r->uri, r->path_info));

Hmm. I'm not really sure what this is extracting from the URL. A clearer
comment might be helpful.

> +  repos_key = apr_pstrcat(r->pool, "mod_dav_svn:", base_path, root_path);

Might want to put a ,NULL on the end there. Otherwise, your key is random
and will never get a cache-hit :-)

In any case, I'd recommend caching on the fs_path instead of the URI.

> +  repos->repos = (void *)apr_table_get(r->connection->notes, repos_key);

I'd recommend using r->connection->pool's userdata instead of the notes.
Tables are not meant to store binary values; I'm not sure that it is very
reliable. You could (again) see corrupted data or cache misses.

Cheers,
-g

-- 
Greg Stein, http://www.lyra.org/


Re: general server performance (was Re: apache svn server memory usage?)

Posted by Branko Čibej <br...@xbc.nu>.
Mukund wrote:

>On Wed, Jul 02, 2003 at 06:10:12PM +0200, Branko Čibej wrote:
>| Let's just make one thing clear here -- the fsync that happens at every
>| BDB transaction commit has nothing to do with how many times you open
>| the database. Keeping the DB open will help, yes, but it won't
>| significantly decrease the number of fsyncs.
>
>Hi Branko
>
>Perhaps you have not understood what I had meant. You can disable the
>fsync which happens at every transaction commit, using the DB_TXN_NOSYNC
>option. However, when you close the DB, the fsync still happens.
>
>When the DB is opened and closed at every request, the whole point of
>DB_TXN_NOSYNC is defeated, as you are literally syncing every small bunch of
>transactions per request as they happen. In an active repository, this
>keeps the disk constantly busy. Keeping an open connection pool
>helps in this case.
>
>I wonder if the sync at DB close can be disabled.
>  
>
It can, but not with an option in DB_CONFIG. You can pass the DB_NOSYNC
flag to the DB->close function. I would recommend against doing that in
our code, though.

-- 
Brane Čibej   <br...@xbc.nu>   http://www.xbc.nu/brane/



Re: general server performance (was Re: apache svn server memory usage?)

Posted by Mukund <mu...@tessna.com>.
On Wed, Jul 02, 2003 at 06:10:12PM +0200, Branko Čibej wrote:
| Let's just make one thing clear here -- the fsync that happens at every
| BDB transaction commit has nothing to do with how many times you open
| the database. Keeping the DB open will help, yes, but it won't
| significantly decrease the number of fsyncs.

Hi Branko

Perhaps you have not understood what I had meant. You can disable the
fsync which happens at every transaction commit, using the DB_TXN_NOSYNC
option. However, when you close the DB, the fsync still happens.

When the DB is opened and closed at every request, the whole point of
DB_TXN_NOSYNC is defeated, as you are literally syncing every small bunch of
transactions per request as they happen. In an active repository, this
keeps the disk constantly busy. Keeping an open connection pool
helps in this case.

I wonder if the sync at DB close can be disabled.

Mukund



Re: general server performance (was Re: apache svn server memory usage?)

Posted by Branko Čibej <br...@xbc.nu>.
Mukund wrote:

>On Tue, Jul 01, 2003 at 10:42:43AM -0500, Ben Collins-Sussman wrote:
>|    every 2 seconds!)  There's been discussion about keeping the
>|    repository open for the whole TCP/IP "connection session", and
>|    Sander has experimented with this, but it didn't seem to do much.
>|    Still need to investigate.
>
>Hi Sussman
>
>I am going to try this patch when Sander looks at Greg Stein's comments
>(in this thread) for his patch and releases a new one if he thinks
>modifications are due.
>
>I am not sure how keeping the repository open will not help, as the
>performance degradation is due to syncs of the accumulated
>transactions when the DB is closed at the end of every HTTP request.
>A checkout has the disk chugging like when an OS thrashes.
>  
>
Let's just make one thing clear here -- the fsync that happens at every
BDB transaction commit has nothing to do with how many times you open
the database. Keeping the DB open will help, yes, but it won't
significantly decrease the number of fsyncs.

-- 
Brane Čibej   <br...@xbc.nu>   http://www.xbc.nu/brane/



Re: general server performance (was Re: apache svn server memory usage?)

Posted by Mukund <mu...@tessna.com>.
On Tue, Jul 01, 2003 at 10:42:43AM -0500, Ben Collins-Sussman wrote:
|    every 2 seconds!)  There's been discussion about keeping the
|    repository open for the whole TCP/IP "connection session", and
|    Sander has experimented with this, but it didn't seem to do much.
|    Still need to investigate.

Hi Sussman

I am going to try this patch when Sander looks at Greg Stein's comments
(in this thread) for his patch and releases a new one if he thinks
modifications are due.

I am not sure how keeping the repository open will not help, as the
performance degradation is due to syncs of the accumulated
transactions when the DB is closed at the end of every HTTP request.
A checkout has the disk chugging like when an OS thrashes.

Mukund



Re: [PATCH] Cache open repository per connection, WAS: RE: general server performance (was Re: apache svn server memory usage?)

Posted by pl...@lanminds.com.
>>>>> On Mon, 14 Jul 2003, "Sander" == Sander Striker wrote:

  Sander> Like I said in my earlier comment: _don't_ worry about this
  Sander> patch, I will be applying it myself.

I'm completely confused now. Is what I filed as 1412 the same as what 
I e-mailed you about?

I had them marked as two different threads for some reason.  One of 
which you stated you'd commit yourself (Subject: Oops...Here is the 
patch), and the other, (Subject: Cache open repository per connection).

Of course, now that I look at the messages properly threaded in the 
archive, they do appear to be related :(

Sorry, I *suck* at patch management!

(I can't *wait* for Sander Roobol to get back from holiday!)
-- 

Seeya,
Paul




RE: [PATCH] Cache open repository per connection, WAS: RE: general server performance (was Re: apache svn server memory usage?)

Posted by Sander Striker <st...@apache.org>.
> From: Paul L Lussier [mailto:pll@lanminds.com]
> Sent: Monday, July 14, 2003 8:44 PM

> Filed as issue 1412:
> 
> 	http://subversion.tigris.org/issues/show_bug.cgi?id=1412

You gave me about 20 minutes to reply...  Which of course I didn't
make.  The patch was already committed.  Like I said in my earlier
comment: _don't_ worry about this patch, I will be applying it myself.

Oh well,


Sander


Re: [PATCH] Cache open repository per connection, WAS: RE: general server performance (was Re: apache svn server memory usage?)

Posted by Paul L Lussier <pl...@lanminds.com>.
Filed as issue 1412:

	http://subversion.tigris.org/issues/show_bug.cgi?id=1412
-- 

Seeya,
Paul




[PATCH] Cache open repository per connection, WAS: RE: general server performance (was Re: apache svn server memory usage?)

Posted by Sander Striker <st...@apache.org>.
> From: sussman@collab.net [mailto:sussman@collab.net]
> Sent: Tuesday, July 01, 2003 5:43 PM

> 2. As was already mentioned, because HTTP request is stateless, apache
>    opens and closes/syncs the repository (BDB environment) with
>    *every* request.  (One user had write caching turned off on his
>    server;  this caused his http checkouts to arrive about 1 file
>    every 2 seconds!)  There's been discussion about keeping the
>    repository open for the whole TCP/IP "connection session", and
>    Sander has experimented with this, but it didn't seem to do much.
>    Still need to investigate.

And here is the patch, with only limited testing.


Sander

Log:
Cache open repository per connection.

* subversion/mod_dav_svn/repos.c

  (dav_svn_get_resource): Store open repository in connection notes
    table, keyed by location and repository name.  Use this open
    repository for the duration of the connection.


Index: subversion/mod_dav_svn/repos.c
===================================================================
--- subversion/mod_dav_svn/repos.c      (revision 6386)
+++ subversion/mod_dav_svn/repos.c      (working copy)
@@ -22,6 +22,7 @@
 #include <http_protocol.h>
 #include <http_log.h>
 #include <http_core.h>  /* for ap_construct_url */
+#include <util_script.h> /* for ap_find_path_info */
 #include <mod_dav.h>

 #define APR_WANT_STRFUNC
@@ -1076,6 +1077,8 @@
   const char *repos_name;
   const char *relative;
   const char *repos_path;
+  const char *base_path;
+  const char *repos_key;
   const char *version_name;
   svn_error_t *serr;
   dav_error *err;
@@ -1181,15 +1184,27 @@
   /* Remember who is making this request */
   repos->username = r->user;

-  /* open the SVN FS */
-  serr = svn_repos_open(&(repos->repos), fs_path, r->pool);
-  if (serr != NULL)
+  /* Get the repository */
+  base_path = apr_pstrndup(r->pool, r->uri,
+                           ap_find_path_info(r->uri, r->path_info));
+  repos_key = apr_pstrcat(r->pool, "mod_dav_svn:", base_path, root_path);
+  repos->repos = (void *)apr_table_get(r->connection->notes, repos_key);
+  if (repos->repos == NULL)
     {
-      return dav_svn_convert_err(serr, HTTP_INTERNAL_SERVER_ERROR,
-                                 apr_psprintf(r->pool,
-                                              "Could not open the SVN "
-                                              "filesystem at %s",
-                                              fs_path));
+      serr = svn_repos_open(&(repos->repos), fs_path, r->connection->pool);
+      if (serr != NULL)
+        {
+          return dav_svn_convert_err(serr, HTTP_INTERNAL_SERVER_ERROR,
+                                     apr_psprintf(r->pool,
+                                                  "Could not open the SVN "
+                                                  "filesystem at %s",
+                                                  fs_path));
+        }
+
+      /* Cache the open repos for the next request on this connection */
+      apr_table_setn(r->connection->notes,
+                     apr_pstrdup(r->connection->pool, repos_key),
+                     (void *)repos->repos);
     }

   /* cache the filesystem object */


RE: general server performance (was Re: apache svn server memory usage?)

Posted by Steven Brown <sw...@ucsd.edu>.

> -----Original Message-----
> From: sussman@collab.net [mailto:sussman@collab.net]
> Sent: Tuesday, July 01, 2003 8:43 AM
> To: Chris Hecker; SVN Dev List
> Subject: Re: general server performance (was Re: apache svn server
> memory usage?)
>
>
> Chris Hecker <ch...@d6.com> writes:
>
> > I should be clear that I'm not complaining here, I know svn is still
> > in development, premature optimization and all that.  I'm just
> > wondering if there's something I've screwed up as the server admin or
> > if this is all stuff that code changes will be necessary to fix.
>
> I have some strong opinions here, and I'll state them at the risk of
> Greg Stein coming at me with an axe.  :-)
>
> I think there are two fundamental problems regarding the "slowness" of
> ra_dav/apache, compared to, say, ra_svn/svnserve:
>
> 1. HTTP is a stateless protocol.  It's just *not* the best choice in
>    the world for something like version control, no matter how you
>    drink the kool-aid.  Even though the client keeps a single TCP/IP
>    connection open to apache, there are still a whole lot of network
>    turnarounds, and the requests/responses are pretty "thick" with
>    headers.
>
>    Now granted, at the moment, we've not yet optimized ra_dav nearly
>    as much as we can.  It's still sending too many requests and
>    turnarounds, waaaay more than it should.  And it will be fixed.
>    And HTTP proxy caches will speed things up as well. But deep down,
>    I still believe that HTTP will never be quite as fast as our custom
>    stateful protocol.

I'm not too familiar with the methods subversion is using in its dav layer,
but I've definitely run into the performance problems with checkout.  Would
HTTP pipelining the requests be possible as a quick hack, i.e., no/minimal
dependency issues?  I'd guess that would remove almost all of the
performance issues related to the network.



Re: general server performance (was Re: apache svn server memory usage?)

Posted by Ben Collins-Sussman <su...@collab.net>.
Chris Hecker <ch...@d6.com> writes:

> I should be clear that I'm not complaining here, I know svn is still
> in development, premature optimization and all that.  I'm just
> wondering if there's something I've screwed up as the server admin or
> if this is all stuff that code changes will be necessary to fix.

I have some strong opinions here, and I'll state them at the risk of
Greg Stein coming at me with an axe.  :-)

I think there are two fundamental problems regarding the "slowness" of
ra_dav/apache, compared to, say, ra_svn/svnserve:

1. HTTP is a stateless protocol.  It's just *not* the best choice in
   the world for something like version control, no matter how you
   drink the kool-aid.  Even though the client keeps a single TCP/IP
   connection open to apache, there are still a whole lot of network
   turnarounds, and the requests/responses are pretty "thick" with
   headers.

   Now granted, at the moment, we've not yet optimized ra_dav nearly
   as much as we can.  It's still sending too many requests and
   turnarounds, waaaay more than it should.  And it will be fixed.
   And HTTP proxy caches will speed things up as well. But deep down,
   I still believe that HTTP will never be quite as fast as our custom
   stateful protocol.

2. As was already mentioned, because HTTP request is stateless, apache
   opens and closes/syncs the repository (BDB environment) with
   *every* request.  (One user had write caching turned off on his
   server;  this caused his http checkouts to arrive about 1 file
   every 2 seconds!)  There's been discussion about keeping the
   repository open for the whole TCP/IP "connection session", and
   Sander has experimented with this, but it didn't seem to do much.
   Still need to investigate.

At the moment, there's still a tradeoff decision to be made.  If you
use apache, you'll get slower performance than svnserve, but you get a
zillion other great features in return (no unix accounts required,
almost any sort of authentication, path-based authorization, some
degree of webDAV interoperability, etc.)  I think it's worth the trade
for most people.




general server performance (was Re: apache svn server memory usage?)

Posted by Chris Hecker <ch...@d6.com>.
>Caching doesn't help you when you have to fsync the database log files
>at every transaction commit.

Right, but even checkouts seem pokey...are they considered transactions as 
far as disk syncing as well (I assume not)?  Also, is there any way to 
trade risk for performance, and have it not sync to disk as often, or 
schedule it for the background, etc.?  But even ignoring that, what 
explains why the net throughput is so low?

Some quick empirical data (totally unscientific) on a fresh checkout:

4m3s "time svn co https://blah dir"
201 files (excluding all .svn/ files) (6kb median size, 63k average size 
uncompressed)
12.8mb non-.svn uncompressed file size sum
145kb sent/1.5mb received total net traffic during co
tar.gz of all non-.svn files 1.3mb (so the server->client compression is 
working well)

Taking the 1.5mb / 4m3s gives only 6.4kbps.  HTTP downloads over the same 
SSL connection to this server using wget achieve a steady 110kbps, so svn's 
utilization is not great.
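
The arithmetic behind that figure, as a sketch. It assumes "mb" means
1024 kilobytes and "kbps" means kilobytes per second, neither of which
is spelled out above; throughput_kb_per_s is a made-up helper.

```c
#include <assert.h>

/* Average throughput in kilobytes per second for a transfer of
 * `megabytes` that took `minutes` minutes and `seconds` seconds.
 * Assumes 1 megabyte = 1024 kilobytes. */
double throughput_kb_per_s(double megabytes, int minutes, int seconds)
{
    int elapsed = minutes * 60 + seconds;   /* total seconds */

    assert(elapsed > 0);
    return megabytes * 1024.0 / elapsed;
}
```

throughput_kb_per_s(1.5, 4, 3) comes out a little above 6 KB/s, in line
with the number quoted, and roughly one seventeenth of the 110 reported
for wget over the same connection.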

I should be clear that I'm not complaining here, I know svn is still in 
development, premature optimization and all that.  I'm just wondering if 
there's something I've screwed up as the server admin or if this is all 
stuff that code changes will be necessary to fix.

Thanks,
Chris

         



Re: apache svn server memory usage?

Posted by Justin Erenkrantz <ju...@erenkrantz.com>.
On Tue, Jul 01, 2003 at 12:18:15AM -0700, Chris Hecker wrote:
> I'm actually trying to figure out why svn seems slow (slow is defined as 
> relatively high latency and low throughput...each command takes a while to 
> run and doesn't move data very fast across the network), and I thought I'd 
> start with this.  However, according to atsar the machine's not paging at 

You probably don't have the streamy PROPFIND responses in mod_dav that
Ben just added.

These patches have been approved and will be in 2.0.47 - you can also
use APACHE_2_0_BRANCH from httpd-2.0's CVS until then.  -- justin 

---------------------------------------------------------------------

Re: apache svn server memory usage?

Posted by Branko Čibej <br...@xbc.nu>.
Chris Hecker wrote:

> The httpd process is only taking <10% of CPU when an svn command is
> running as well, and the net isn't remotely taxed.  So, I guess I need
> to look into disk IO, but I don't see why caching wouldn't take care
> of that (the machine was running with 200mb devoted to cache according
> to the last top I posted).

Caching doesn't help you when you have to fsync the database log files
at every transaction commit.
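
To illustrate the point (this is not Berkeley DB code, just a rough sketch 
of the general effect): the script below writes the same records twice, 
once forcing each record to disk with fsync and once syncing only at the 
end.  The record size, record count and file names are arbitrary choices 
for the example, and the timings depend heavily on the disk and filesystem.

```python
import os
import tempfile
import time


def write_records(path, records, sync_each):
    """Write records to path; optionally fsync after every record."""
    start = time.perf_counter()
    with open(path, "wb") as f:
        for rec in records:
            f.write(rec)
            if sync_each:
                f.flush()
                os.fsync(f.fileno())  # force this record to disk now
        f.flush()
        os.fsync(f.fileno())  # one final sync in either mode
    return time.perf_counter() - start


records = [b"x" * 512 for _ in range(200)]  # hypothetical "log records"
with tempfile.TemporaryDirectory() as tmp:
    synced_path = os.path.join(tmp, "synced.log")
    batched_path = os.path.join(tmp, "batched.log")
    synced = write_records(synced_path, records, True)
    batched = write_records(batched_path, records, False)
    synced_size = os.path.getsize(synced_path)
    batched_size = os.path.getsize(batched_path)

print(f"fsync per record: {synced:.4f}s")
print(f"single fsync:     {batched:.4f}s")
```

Both runs produce identical files; only the number of forced disk writes 
differs, and the page cache cannot hide that cost because fsync waits for 
the data to reach the disk.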


-- 
Brane Čibej   <br...@xbc.nu>   http://www.xbc.nu/brane/


---------------------------------------------------------------------

RE: apache svn server memory usage?

Posted by Chris Hecker <ch...@d6.com>.
>If you are on a platform with a decent thread library the worker mpm
>is a good bet.  mod_dav + svn is thread safe.

Is Debian Linux with kernel 2.4.21 considered to have a good thread library?

>It kind of depends on how much memory you have, what your usage pattern
>looks like etc.  Try switching to the worker mpm first and then decide
>how much memory you can live with being (permanently) taken up by httpd.
>Divide that number by the average number of httpd processes (with worker
>default config and no heavy load: 2).

Okay, thanks.  Is there any way to know when allocating more memory to 
httpd would speed things up?  In other words, how do I tune the number, 
besides just deciding a priori that httpd should get 100mb or something?

I'm actually trying to figure out why svn seems slow (slow is defined as 
relatively high latency and low throughput...each command takes a while to 
run and doesn't move data very fast across the network), and I thought I'd 
start with this.  However, according to atsar the machine's not paging at 
all, so I doubt the memory is the problem (but it's good to know how to 
constrain the beast, so the above is valuable :).  The httpd process is 
only taking <10% of CPU when an svn command is running as well, and the net 
isn't remotely taxed.  So, I guess I need to look into disk IO, but I don't 
see why caching wouldn't take care of that (the machine was running with 
200mb devoted to cache according to the last top I posted).  I wonder if 
running over https is slowing things down because of the SSL negotiation on 
each connection?  But, it seems like keepalive should reduce that.  I'm 
totally inexperienced at profiling things on unix, obviously.  :)

Thanks a lot,
Chris


---------------------------------------------------------------------

RE: apache svn server memory usage?

Posted by Sander Striker <st...@apache.org>.
> From: news [mailto:news@main.gmane.org]On Behalf Of Jeff Stuart
> Sent: Monday, July 07, 2003 8:33 PM

> On Tue, 1 Jul 2003 08:52:45 +0200
> "Sander Striker" <st...@apache.org> wrote:
> > > The bummer is it's being kept around for all the forked processes.  Would 
> > > it be better to use one of the other mpms (threaded?) so they share some of 
> > > this memory?  I know nothing about apache, of course.  The docs say some 
> > > modules aren't thread-safe...is mod_dav + svn?
> > 
> > If you are on a platform with a decent thread library the worker mpm
> > is a good bet.  mod_dav + svn is thread safe.
> 
> Does subversion work under the worker mpm now?  I know when I tried it a few months
> ago (read around 0.17 of subversion) I was seeing hard freezes when attempting to
> commit or checkout things with svn.

Works for me.

Sander

---------------------------------------------------------------------

Re: apache svn server memory usage?

Posted by Jeff Stuart <js...@computer-city.net>.
On Tue, 1 Jul 2003 08:52:45 +0200
"Sander Striker" <st...@apache.org> wrote:
> > The bummer is it's being kept around for all the forked processes.  Would 
> > it be better to use one of the other mpms (threaded?) so they share some of 
> > this memory?  I know nothing about apache, of course.  The docs say some 
> > modules aren't thread-safe...is mod_dav + svn?
> 
> If you are on a platform with a decent thread library the worker mpm
> is a good bet.  mod_dav + svn is thread safe.

Does subversion work under the worker mpm now?  I know when I tried it a
few months ago (read around 0.17 of subversion) I was seeing hard freezes
when attempting to commit or checkout things with svn.



---------------------------------------------------------------------

RE: apache svn server memory usage?

Posted by Sander Striker <st...@apache.org>.
> From: Chris Hecker [mailto:checker@d6.com]
> Sent: Tuesday, July 01, 2003 6:44 AM

Hi Chris,

> The bummer is it's being kept around for all the forked processes.  Would 
> it be better to use one of the other mpms (threaded?) so they share some of 
> this memory?  I know nothing about apache, of course.  The docs say some 
> modules aren't thread-safe...is mod_dav + svn?

If you are on a platform with a decent thread library the worker mpm
is a good bet.  mod_dav + svn is thread safe.
 
>> The MaxMemFree directive allows you to control the total amount of
>> memory somewhat.  httpd 2.0.47 will (most probably) obey this directive
>> better than the current version does.
> 
> What will be good settings here?

http://httpd.apache.org/docs-2.0/mod/mpm_common.html#maxmemfree

It kind of depends on how much memory you have, what your usage pattern
looks like etc.  Try switching to the worker mpm first and then decide
how much memory you can live with being (permanently) taken up by httpd.
Divide that number by the average number of httpd processes (with worker
default config and no heavy load: 2).
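
As a rough sketch of that division (the 100 MB budget is a hypothetical 
figure picked for the example, and the helper name is made up), the 
calculation could look like this in Python.  MaxMemFree takes its argument 
in KBytes, hence the conversion.

```python
def max_mem_free_kb(budget_mb, avg_processes):
    """Per-process share of a memory budget, in KBytes.

    MaxMemFree takes its argument in KBytes.
    """
    return int(budget_mb * 1024 / avg_processes)


budget_mb = 100      # hypothetical: memory you can live with httpd keeping
avg_processes = 2    # worker default config, no heavy load

print(f"MaxMemFree {max_mem_free_kb(budget_mb, avg_processes)}")
```

With these inputs the script prints "MaxMemFree 51200".  The directive 
limits the free memory each allocator may hold, so treat the result as a 
starting point to adjust against observed usage rather than a hard cap on 
the httpd footprint.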

HTH,

Sander

---------------------------------------------------------------------