Posted to users@subversion.apache.org by Oliver Jowett <ol...@opencloud.com> on 2004/07/17 01:32:14 UTC

'svnadmin load' & database sync options

Hi all,

I'm currently looking at converting our large (~350 MB) CVS repository to 
subversion, learning subversion along the way.

cvs2svn happily produces a dumpfile containing ~14000 transactions:

> -rw-r--r--    1 oliver   ocstaff  708319348 Jul 16 18:53 cvs2svn-dump

Loading it via 'svnadmin load' is hideously slow, taking almost 10 hours:

> oliver@cyclone:~/svn-test$ svnadmin create repo-sync
> oliver@cyclone:~/svn-test$ time svnadmin load -q repo-sync <cvs2svn-dump
> 
> real    561m59.668s
> user    14m1.379s
> sys     2m4.799s

Ok, so I'll use --bdb-txn-nosync:

> oliver@cyclone:~/svn-test$ svnadmin create --bdb-txn-nosync repo-no-sync
> oliver@cyclone:~/svn-test$ time svnadmin load -q repo-no-sync <cvs2svn-dump
> 
> real    146m49.972s
> user    13m3.273s
> sys     1m36.818s

Better but still very disk-bound. Some digging with lsof/strace showed 
that some fsync() calls are still done on the DB log files.
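
For anyone wanting to repeat the strace check, here is a rough sketch of how the sync calls could be counted per file. It assumes a reasonably recent strace (the -y option, which prints the path behind each file descriptor, is an assumption about the installed version), and the log file name fsync.log and the helper name count_syncs are made up for illustration.

```shell
# Hypothetical helper: count fsync()/fdatasync() calls per file in an strace log.
# Assumes the log was produced with something like:
#   strace -f -y -e trace=fsync,fdatasync -o fsync.log \
#       svnadmin load -q repo-no-sync <cvs2svn-dump
count_syncs() {
    # With -y, strace prints each descriptor as fd</path/to/file>;
    # keep only the path, then tally identical paths.
    grep -E '(fsync|fdatasync)\(' "$1" |
        sed -E 's/.*\([0-9]+<([^>]*)>.*/\1/' |
        sort | uniq -c | sort -rn
}
```

Running `count_syncs fsync.log` lists the most frequently synced files first, which should make it obvious whether the DB log files are the ones being hit.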

I experimented a bit with other DB options and ended up with this:

> oliver@cyclone:~/svn-test$ svnadmin create --bdb-txn-nosync repo-no-log
> oliver@cyclone:~/svn-test$ echo "set_flags DB_TXN_NOT_DURABLE" >>repo-no-log/db/DB_CONFIG 
> oliver@cyclone:~/svn-test$ svnadmin recover repo-no-log
> Please wait; recovering the repository may take some time...
> 
> Recovery completed.
> The latest repos revision is 0.
> oliver@cyclone:~/svn-test$ time svnadmin load -q repo-no-log <cvs2svn-dump
> 
> real    26m40.620s
> user    12m40.711s
> sys     1m9.318s

That's more like what I originally expected!
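
As a quick sanity check on the three timings above, the following sketch converts the time(1) figures to seconds and works out the speedups, plus CPU time (user + sys) as a share of wall-clock time. The helper names to_secs, speedup and cpu_pct are invented for illustration; only the numbers come from the runs above.

```shell
# Convert a time(1) value such as 561m59.668s to seconds.
to_secs() {
    echo "$1" | awk -F'[ms]' '{ printf "%.3f\n", $1 * 60 + $2 }'
}

# speedup BASE_REAL NEW_REAL: how many times faster the new run is.
speedup() {
    awk -v a="$(to_secs "$1")" -v b="$(to_secs "$2")" \
        'BEGIN { printf "%.1f\n", a / b }'
}

# cpu_pct REAL USER SYS: CPU time as a percentage of wall-clock time.
cpu_pct() {
    awk -v r="$(to_secs "$1")" -v u="$(to_secs "$2")" -v s="$(to_secs "$3")" \
        'BEGIN { printf "%.0f\n", 100 * (u + s) / r }'
}

speedup 561m59.668s 146m49.972s            # 3.8  (sync -> nosync)
speedup 561m59.668s 26m40.620s             # 21.1 (sync -> not durable)
cpu_pct 561m59.668s 14m1.379s 2m4.799s     # 3
cpu_pct 146m49.972s 13m3.273s 1m36.818s    # 10
cpu_pct 26m40.620s 12m40.711s 1m9.318s     # 52
```

The CPU share climbing from about 3% to about 52% is consistent with the first two runs spending most of their time waiting on the disk.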

The system these all ran on (cyclone) is a dual Athlon/MP 2800+, 2GB 
RAM. The OS is Debian stable with a 2.6.5 Linux kernel, and subversion 
is 1.0.5 as packaged in Debian unstable:

> ||/ Name           Version        Description
> +++-==============-==============-============================================
> ii  subversion     1.0.5-1        Advanced version control system (aka. svn)
> ii  libdb4.2       4.2.52-16      Berkeley v4.2 Database Libraries [runtime]

The subversion repositories are on an ext3 filesystem on a commodity IDE 
disk with the disk's write-caching disabled.

So, some questions:

1) Is using DB_TXN_NOT_DURABLE during the initial load a sane thing to 
do? I don't care about recovery from failures during the load at all -- 
I'd just restart from scratch if something did go wrong.
2) Is it normal for fsync() to still be called when --bdb-txn-nosync is in use?
3) Is an option to use DB_TXN_NOT_DURABLE for the duration of a 
'svnadmin load' a good idea?

-O

---------------------------------------------------------------------
To unsubscribe, e-mail: users-unsubscribe@subversion.tigris.org
For additional commands, e-mail: users-help@subversion.tigris.org

Re: 'svnadmin load' & database sync options

Posted by Oliver Jowett <ol...@opencloud.com>.
Replying to myself..

Oliver Jowett wrote:

>> oliver@cyclone:~/svn-test$ svnadmin create --bdb-txn-nosync repo-no-log
>> oliver@cyclone:~/svn-test$ echo "set_flags DB_TXN_NOT_DURABLE" >>repo-no-log/db/DB_CONFIG
>> oliver@cyclone:~/svn-test$ svnadmin recover repo-no-log
>> Please wait; recovering the repository may take some time...
>>
>> Recovery completed.
>> The latest repos revision is 0.

Don't do this, not even just for the initial load. It seems you can't 
safely turn off DB_TXN_NOT_DURABLE once set; 'svnadmin recover' is not 
happy.

I found a better solution in the end: put the new repository on a 
memory-based filesystem (e.g. tmpfs). fsync() is then essentially free 
so 'svnadmin load' is fast. Once it's done, copy the resulting 
repository somewhere persistent.

Doing that reduced my load time from 2.5 hours to 15 minutes.
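
A minimal sketch of the tmpfs approach described above, not the exact commands that were run. It assumes /dev/shm is a tmpfs mount with enough free space to hold the whole repository; the function name load_via_tmpfs, the DRY_RUN switch and the argument order are all made up for illustration.

```shell
# Sketch only: load a dump into a repository on tmpfs, then copy it to disk.
# Usage: load_via_tmpfs DUMPFILE DEST [SCRATCH]
# Set DRY_RUN=1 to print the commands instead of running them.
load_via_tmpfs() {
    dump=$1
    dest=$2
    scratch=${3:-/dev/shm/svn-load}   # assumed tmpfs location

    if [ -n "$DRY_RUN" ]; then
        echo "svnadmin create $scratch"
        echo "svnadmin load -q $scratch <$dump"
        echo "cp -a $scratch $dest"
        echo "rm -rf $scratch"
        return 0
    fi

    # Create and populate the repository entirely in memory.
    svnadmin create "$scratch" || return 1
    svnadmin load -q "$scratch" <"$dump" || return 1

    # Copy the finished repository somewhere persistent, then free the memory.
    cp -a "$scratch" "$dest" || return 1
    rm -rf "$scratch"
}
```

For example, `DRY_RUN=1 load_via_tmpfs cvs2svn-dump ~/svn-test/repo` prints the four steps without touching anything. Nothing in the scratch directory survives a reboot or crash, so the load should be treated as disposable until the copy has finished.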

-O
