You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@subversion.apache.org by Branko Čibej <br...@xbc.nu> on 2003/02/22 03:48:25 UTC

[PATCH] Disable fsync at transaction commit

The rather long thread about when to use transactions, how often to
checkpoint, and the impact of fsync on performance prompted me to come
up with a little demonstratoin. This patch does the following:

    * Adds a mechanism for passing configuration options to the FS layer
      in a way that's independent of the actual FS implementation
    * Adds an option to "svnadmin create" to configure the FS so that
      fsyncs on transaction commit are disabled
    * Uses this feature in all the FS and Python tests.

This patch reduces "make check" time over ra_local on my Windows box
from 56 to 34 minutes. I'd like to ask people to do two things: measure
the effects of this patch on their machines, and especially comment on
the FS configuration mechanism.

The new "svnadmin create" flag would be useful for cvs2svn'ing large
repositories, and "svnadmin load"ing them, too, not just for testing.
The administrator could remove the nosync flag from DB_CONFIG before
putting the newly-created repository into production.

The log message (if I were to commit this patch) would look a bit like
this (see, Karl, I /told/ you :svn_config_create should be publc :-) :

* include/svn_fs.h (svn_fs_new): New parameter "fs_config". All callers
changed.
* include/svn_repos.h (svn_repos_open, svn_repos_create): New parameter
"fs_config". All callers changed.
* include/svn_config.h (svn_config_create): New function.

* libsvn_subr/config.c (svn_config_create, svn_config_read): Extract the
svn_config_t allocation bits from svn_config_read into svn_config_create.

* libsvn_fs/fs.h (svn_fs_t): New member "config".
* libsvn_fs/fs.c (svn_fs_new): Initialise the "config" member of the
svn_fs_t.
(svn_fs_create_berkeley): Add "set_flags DB_TXN_NOSYNC" to DB_CONFIG if
the nosync option was set in fs->config.

* libsvn_repos/repos.c (svn_repos_create, get_repos): Pass fs_config to
svn_fs_new. All callers changed.

* svnadmin/main.c: New "create" option --bdb-txn-nosync. If set, pass
the nosync option to the repos layer when creating the repository.

* tests/fs-helpers.c, tests/libsvn_fs/fs-test.c: Pass the nosync option
to svn_fs_new and svn_repos_create.

* tests/clients/cmdline/svntest/main.py (create_repos): Pass
--bdb-txn-nosync to "svnadmin create".

-- 
Brane Čibej   <br...@xbc.nu>   http://www.xbc.nu/brane/

Re: [PATCH] Disable fsync at transaction commit

Posted by Branko Čibej <br...@xbc.nu>.
Branko Čibej wrote:

>Greg Stein wrote:
>
>  
>
>>On Sat, Feb 22, 2003 at 04:48:25AM +0100, Branko Cibej wrote:
>> 
>>
>>    
>>
>>>...
>>>The log message (if I were to commit this patch) would look a bit like
>>>this (see, Karl, I /told/ you :svn_config_create should be publc :-) :
>>>   
>>>
>>>      
>>>
>>Blah. Just use a hash table of options. svn_config should be about reading
>>config files, not a replacment for simple name/value option pairs.
>>
>>"Hey! I've got this hammer here, and *that* looks like a nail! WHAM! WHAM!"
>>
>>:-)
>>
>>Seriously... just use a hash. Forget the svn_config_create stuff.
>>
>>    
>>
>The point of using an svn_config_t is that you can put
>FS-backend-specific options in fs-backend-specific sections, without
>having to know in advance which backend you're using. Yes, I know we
>have only the BDB backend at the moment. I'm sure gat appreciates every
>BDB-specific top-level interface we add. :-)
>  
>
Besides, I *hate* using APR hashes for such things, because you have to
cast the values all over the place...

void *value;
const char *real_value;
apr_hash_get(foo, &value);
reap_value = value;

Blech! Give me svn_config_get any day. (And I might just go and change
that interface to return a const char* instead of returning it by
reference. I don't know what I was thinking when I wrote that.)

-- 
Brane Čibej   <br...@xbc.nu>   http://www.xbc.nu/brane/


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org

RE: [PATCH] Disable fsync at transaction commit

Posted by Sander Striker <st...@apache.org>.
> From: Greg Stein [mailto:gstein@lyra.org]
> Sent: Saturday, February 22, 2003 11:57 AM

> On Sat, Feb 22, 2003 at 11:12:21AM +0100, Branko Cibej wrote:
>>...
>> O.K., all right already, I give in. I like my bikesheds in a lavender
>> and crimson spiral pattern, but I can live with this plain lilac one.
> 
> Red polka dots, man... red polka dots.
> 
> With a background subshading of purple and green stripes, of course.
> 
> :-)

Woah! This makes you want to wish to be color blind ;)
 
> I *do* believe that we might toss around an svn_config_t at some point to
> deal with the *runtime* aspects. The runtime would load up a config file for
> gunk about server operation. In this particular case, however, we don't have
> a config file in sight.
> 
> IOW, when that future point comes along, then yes: we add svn_config_t to
> the svn_repos_open and svn_fs_open functions. The first purpose will most
> likely be to deal with SQL connection data.
> [ altho... in the Apache/mod_dav_svn case, we would *definitely* want to
>   pull that data from httpd.conf, so what is the mechanism for passing that
>   data along... ??? ]

An svn_config_t doesn't have to come from a config file per se.  Look at
clients/cmdline/main.c and see what we do for --diff-cmd and --diff3-cmd.
I would suggest we could fill in the relevant parts using the info we get
from the directives in httpd.conf just the same (in mod_dav_svn).

Sander


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org

Re: [PATCH] Disable fsync at transaction commit

Posted by Greg Stein <gs...@lyra.org>.
On Sat, Feb 22, 2003 at 11:12:21AM +0100, Branko Cibej wrote:
>...
> O.K., all right already, I give in. I like my bikesheds in a lavender
> and crimson spiral pattern, but I can live with this plain lilac one.

Red polka dots, man... red polka dots.

With a background subshading of purple and green stripes, of course.

:-)

I *do* believe that we might toss around an svn_config_t at some point to
deal with the *runtime* aspects. The runtime would load up a config file for
gunk about server operation. In this particular case, however, we don't have
a config file in sight.

IOW, when that future point comes along, then yes: we add svn_config_t to
the svn_repos_open and svn_fs_open functions. The first purpose will most
likely be to deal with SQL connection data.
[ altho... in the Apache/mod_dav_svn case, we would *definitely* want to
  pull that data from httpd.conf, so what is the mechanism for passing that
  data along... ??? ]

Cheers,
-g

-- 
Greg Stein, http://www.lyra.org/

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org

Re: [PATCH] Disable fsync at transaction commit

Posted by Branko Čibej <br...@xbc.nu>.
Greg Stein wrote:

>On Sat, Feb 22, 2003 at 05:41:28AM +0100, Branko Cibej wrote:
>  
>
>>Greg Stein wrote:
>>...
>>    
>>
>>>Seriously... just use a hash. Forget the svn_config_create stuff.
>>>      
>>>
>>The point of using an svn_config_t is that you can put
>>FS-backend-specific options in fs-backend-specific sections, without
>>having to know in advance which backend you're using. Yes, I know we
>>have only the BDB backend at the moment. I'm sure gat appreciates every
>>BDB-specific top-level interface we add. :-)
>>    
>>
>
>Quick. What was the name of the cmdline option to svnadmin?
>
>  --bdb-txn-nosync
>
>Quick, what would the hash key be?
>
>  bdb-txn-nosync
>
>You've got ample namespace protection for the backends. Each backend is
>going to announce the specific things that it might look for. Hopefully,
>there won't be many.
>
O.K., all right already, I give in. I like my bikesheds in a lavender
and crimson spiral pattern, but I can live with this plain lilac one.


-- 
Brane Čibej   <br...@xbc.nu>   http://www.xbc.nu/brane/


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org

Re: [PATCH] Disable fsync at transaction commit

Posted by Greg Stein <gs...@lyra.org>.
On Sat, Feb 22, 2003 at 05:41:28AM +0100, Branko Cibej wrote:
> Greg Stein wrote:
>...
> >Seriously... just use a hash. Forget the svn_config_create stuff.
>
> The point of using an svn_config_t is that you can put
> FS-backend-specific options in fs-backend-specific sections, without
> having to know in advance which backend you're using. Yes, I know we
> have only the BDB backend at the moment. I'm sure gat appreciates every
> BDB-specific top-level interface we add. :-)

Quick. What was the name of the cmdline option to svnadmin?

  --bdb-txn-nosync

Quick, what would the hash key be?

  bdb-txn-nosync

You've got ample namespace protection for the backends. Each backend is
going to announce the specific things that it might look for. Hopefully,
there won't be many.

> >Hmm. It seems like there are two pieces here. One piece that passes config
> >to affect the creation, and one piece to pass params during normal
> >operation. But I didn't see any of the latter, so maybe the patch can be
> >trimmed back to just the creation params [for now] ?
> >
> >We can always come back at some future point to add runtime params.
>
> Well, svn_fs_new has to take the config parameter because that's where
> you create the svn_fs_t, not in svn_fs_create_berkeley.

Well, I might suggest that you pass the creation options to the
svn_fs_create_berkeley() function.

Not sure if I like that, but it doesn't mean you have to change all callers
of the svn_fs_new() function, especially given that we have no runtime
parameters that we know of.

That said, when we move to SQL, we *would* have runtime params for the
connection string (or user, pass, etc separate params). Would those be
passed to fs_new() or to fs_open()?

> I guess the
> changes to the repos interface could be limited to just svn_repos_create
> for now, with svn_repos_open always sending a NULL config to svn_fs_new.
> That should work, yes.

Ah. That would be nice. We could expand svn_repos_open() later, yes.

Cheers,
-g

-- 
Greg Stein, http://www.lyra.org/

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org

Re: [PATCH] Disable fsync at transaction commit

Posted by Branko Čibej <br...@xbc.nu>.
Greg Stein wrote:

>On Sat, Feb 22, 2003 at 04:48:25AM +0100, Branko Cibej wrote:
>  
>
>>...
>>The log message (if I were to commit this patch) would look a bit like
>>this (see, Karl, I /told/ you :svn_config_create should be publc :-) :
>>    
>>
>
>Blah. Just use a hash table of options. svn_config should be about reading
>config files, not a replacment for simple name/value option pairs.
>
>"Hey! I've got this hammer here, and *that* looks like a nail! WHAM! WHAM!"
>
>:-)
>
>Seriously... just use a hash. Forget the svn_config_create stuff.
>
The point of using an svn_config_t is that you can put
FS-backend-specific options in fs-backend-specific sections, without
having to know in advance which backend you're using. Yes, I know we
have only the BDB backend at the moment. I'm sure gat appreciates every
BDB-specific top-level interface we add. :-)

>Hmm. It seems like there are two pieces here. One piece that passes config
>to affect the creation, and one piece to pass params during normal
>operation. But I didn't see any of the latter, so maybe the patch can be
>trimmed back to just the creation params [for now] ?
>
>We can always come back at some future point to add runtime params.
>  
>
Well, svn_fs_new has to take the config parameter because that's where
you create the svn_fs_t, not in svn_fs_create_berkeley. I guess the
changes to the repos interface could be limited to just svn_repos_create
for now, with svn_repos_open always sending a NULL config to svn_fs_new.
That should work, yes.


-- 
Brane Čibej   <br...@xbc.nu>   http://www.xbc.nu/brane/


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org

Re: [PATCH] Disable fsync at transaction commit

Posted by Greg Stein <gs...@lyra.org>.
On Sat, Feb 22, 2003 at 04:48:25AM +0100, Branko Cibej wrote:
>...
> The log message (if I were to commit this patch) would look a bit like
> this (see, Karl, I /told/ you :svn_config_create should be publc :-) :

Blah. Just use a hash table of options. svn_config should be about reading
config files, not a replacment for simple name/value option pairs.

"Hey! I've got this hammer here, and *that* looks like a nail! WHAM! WHAM!"

:-)

Seriously... just use a hash. Forget the svn_config_create stuff.

>...
> +++ subversion/include/svn_fs.h	(working copy)
> @@ -30,6 +30,7 @@
>  #include "svn_error.h"
>  #include "svn_delta.h"
>  #include "svn_io.h"
> +#include "svn_config.h"

Oof. Just use a hash... just use a hash...

>...


Hmm. It seems like there are two pieces here. One piece that passes config
to affect the creation, and one piece to pass params during normal
operation. But I didn't see any of the latter, so maybe the patch can be
trimmed back to just the creation params [for now] ?

We can always come back at some future point to add runtime params.

Cheers,
-g

-- 
Greg Stein, http://www.lyra.org/

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org

Re: [PATCH] Disable fsync at transaction commit

Posted by Benjamin Pflugmann <be...@pflugmann.de>.
On Mon 2003-02-24 at 04:23:15 +0100, Branko ??ibej wrote:
> Benjamin Pflugmann wrote:
[...]
> >Very well possible. Dependend on implementation, RAID5 is slower than a
> >plain disk at writes, but never better.
              ^^^^^^^^^

> Uh, no, it will be better on reads, because it can parallelize them
> (just like RAID0) -- but that doesn't help the fsync, of course.

Yeah, what I said... I already limited my argument to writes.

> Nope, it's a H/W IDE raid controller (the 3ware Escalade, if you're
> interested about such things).

Ummm. That leaves the possibility that the raid controller disables
write caching for the IDE disks, which would be a sensible thing to
do.

[...]
> >which is more in the range of your results. 
> >
> Nope, mine was 56 to 34 minutes. :-)

:-)

    Benjamin.

Re: [PATCH] Disable fsync at transaction commit

Posted by Branko Čibej <br...@xbc.nu>.
Benjamin Pflugmann wrote:

>>How unfortunate. 10% is nothing to sneer at, but it seems that Windows
>>is a terrible host for any kind of database. My box is a dual Athlom MP
>>2200+ (1.8GHz) with 1GB of memory and 4-way RAID 5 disks, running
>>Windows XP. Hm, maybe the fsyncs are that much slower because of the
>>RAID thing.
>>    
>>
>
>Very well possible. Dependend on implementation, RAID5 is slower than a
>plain disk at writes, but never better.
>
Uh, no, it will be better on reads, because it can parallelize them
(just like RAID0) -- but that doesn't help the fsync, of course.

>But let me guess, the RAID uses SCSI disks?
>
Nope, it's a H/W IDE raid controller (the 3ware Escalade, if you're
interested about such things).

> Most (all?) IDE disks lie
>about writing. They are using write-caching, while SCSI disks only say
>they have written stuff to disk, when it really is on the platter.
>
>If I disable write-caching with my IDE disk (hdparm -W0 /dev/hdb, note
>that some drives ignore that request and continue to use the write
>cache), make check needs
>
>  - 48 mins (without the patch)
>  - 34 mins (with patch)
>
>which is more in the range of your results. 
>
Nope, mine was 56 to 34 minutes. :-)

[snip]

>Maybe there should also be some way for people to play it more fast
>than safe (e.g. allow to disable synchronous syncs at all), if that
>meets their priorities.
>  
>
That's the point of the "svnadmin create --bdb-txn-nosync" patch, which
I'll commit one of these days. I can imagine disabling the fsyncs for
cvs2svn and large dump/load cycles. You'd probably turn them back on for
production, of course.

-- 
Brane Čibej   <br...@xbc.nu>   http://www.xbc.nu/brane/


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org

Re: [PATCH] Disable fsync at transaction commit

Posted by Benjamin Pflugmann <be...@pflugmann.de>.
On Sat 2003-02-22 at 11:17:47 +0100, Branko ?ibej wrote:
> Greg Hudson wrote:
> 
> >Anyway, I did the grunt work.  I used a simplified form of your patch,
> >which I'll include at the end.  The "make check" time appears to drop
> >from 20 minutes to 18 minutes on my machine with DB_TXN_NOSYNC set.  I
> >have a 700MHz PIII Thinkpad A22P with 384MB of memory.

Just another data point: I get almost the same figures on Linux 2.4.21pre3
using an IBM Deskstar 120 GXP disk (IC35L060AVVA07: 7200RPM, 8.5ms seek,
Ultra ADA/100, 60GB). Btw, on a 500MHz Athlon, 384MB - but the relevant
part are the disks, of course. The times go down from 18.5 to 17 mins for
me.

> How unfortunate. 10% is nothing to sneer at, but it seems that Windows
> is a terrible host for any kind of database. My box is a dual Athlom MP
> 2200+ (1.8GHz) with 1GB of memory and 4-way RAID 5 disks, running
> Windows XP. Hm, maybe the fsyncs are that much slower because of the
> RAID thing.

Very well possible. Dependend on implementation, RAID5 is slower than a
plain disk at writes, but never better.

But let me guess, the RAID uses SCSI disks? Most (all?) IDE disks lie
about writing. They are using write-caching, while SCSI disks only say
they have written stuff to disk, when it really is on the platter.

If I disable write-caching with my IDE disk (hdparm -W0 /dev/hdb, note
that some drives ignore that request and continue to use the write
cache), make check needs

  - 48 mins (without the patch)
  - 34 mins (with patch)

which is more in the range of your results. 

Happier now? ;-)


One could say that users of IDE drives were already never really using
fsyncs.

I know, more precisely, they were using asynchronous syncs instead of
synchronous ones which is better than no syncs at all, but when you
get a power outage, you'll see it's effectively the same as no
syncs. *eg*

Maybe we should add that advice somewhere:

  If you are paranoid about your repository, make sure to use disks
  without write-caching: either use SCSI disks or disable write-caching
  on IDE disks *and* test that the drive honors it, or alternatively buy
  a UPS for the IDE disk.

I know this may sound like an overkill, but if it is, why being paranoid
about repository consistency to begin with? Since we are (at least that
is my understanding from what I have read so far), tell people what
hardware requirements are to be met to make best use of the software
layer paranoia.

Maybe there should also be some way for people to play it more fast
than safe (e.g. allow to disable synchronous syncs at all), if that
meets their priorities.

HTH,

	Benjamin.


Re: [PATCH] Disable fsync at transaction commit

Posted by Branko Čibej <br...@xbc.nu>.
Greg Hudson wrote:

>Anyway, I did the grunt work.  I used a simplified form of your patch,
>which I'll include at the end.  The "make check" time appears to drop
>from 20 minutes to 18 minutes on my machine with DB_TXN_NOSYNC set.  I
>have a 700MHz PIII Thinkpad A22P with 384MB of memory.
>
How unfortunate. 10% is nothing to sneer at, but it seems that Windows
is a terrible host for any kind of database. My box is a dual Athlom MP
2200+ (1.8GHz) with 1GB of memory and 4-way RAID 5 disks, running
Windows XP. Hm, maybe the fsyncs are that much slower because of the
RAID thing.

-- 
Brane Čibej   <br...@xbc.nu>   http://www.xbc.nu/brane/


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org

Re: [PATCH] Disable fsync at transaction commit

Posted by Greg Hudson <gh...@MIT.EDU>.
On Fri, 2003-02-21 at 22:48, Branko Čibej wrote:
> This patch reduces "make check" time over ra_local on my Windows box
> from 56 to 34 minutes. I'd like to ask people to do two things: measure
> the effects of this patch on their machines, and especially comment on
> the FS configuration mechanism.

I'm not sure I can bring myself to have an opinion on the FS
configuration mechanism.  On the one hand, I sympathize with Greg's
concern that svn_config should only be used when there are actual config
files; on the other hand, maybe you should be able to drop a config file
in the repository instead of passing command-line options.

Anyway, I did the grunt work.  I used a simplified form of your patch,
which I'll include at the end.  The "make check" time appears to drop
from 20 minutes to 18 minutes on my machine with DB_TXN_NOSYNC set.  I
have a 700MHz PIII Thinkpad A22P with 384MB of memory.

Here are the results of "time make check".  The first two are with the
original code, the third one with the patch.  Since the first two runs
were nearly identical (no noticeable caching effects), I didn't have the
patience to run the patched code twice.

  191.280u 43.930s 20:04.26 19.5% 0+0k 0+0io 1152035pf+0w
  193.520u 45.320s 20:05.55 19.8% 0+0k 0+0io 1150689pf+0w
  186.340u 44.550s 18:02.58 21.3% 0+0k 0+0io 1154886pf+0w

commit-tests 28 failed in all three runs, but that should only have
skewed the results slightly if at all.

Index: subversion/libsvn_fs/fs.c
===================================================================
--- subversion/libsvn_fs/fs.c	(revision 5014)
+++ subversion/libsvn_fs/fs.c	(working copy)
@@ -450,7 +450,14 @@
       "#\n"
       "# http://subversion.tigris.org/servlets/ReadMsg?list=dev&msgId=161960\n"
       "set_lg_bsize     262144\n"
-      "set_lg_max      1048576\n";
+      "set_lg_max      1048576\n"
+      "#\n"
+      "# Disable fsync of log files on transaction commit. Read the\n"
+      "# documentation abtou DB_TXN_NOSYNC at:\n"
+      "#\n"
+      "#   http://www.sleepycat.com/docs/api_c/env_set_flags.html\n"
+      "#\n"
+      "set_flags DB_TXN_NOSYNC\n";
 
     SVN_ERR (svn_io_file_open (&dbconfig_file, dbconfig_file_name,
                                APR_WRITE | APR_CREATE, APR_OS_DEFAULT,

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org