Posted to users@subversion.apache.org by Jerome Lacoste <la...@frisurf.no> on 2004/02/01 12:08:09 UTC

[converting scarab] high repository size after cvs2svn

Hi,

The scarab project [1] has decided to give SVN a go.

I played with cvs2svn yesterday. My current problem is the repository
size which is about 8 times higher than the corresponding CVS one. 




I am using subversion 0.33 which is the current version in Debian
unstable.
> grep LastChangedRevision `which cvs2svn`
# $LastChangedRevision: 7729 $

A newer version, probably 0.37, should come really soon in the
repository according to the maintainer [2]. I will upgrade ASAIC.

To convert, I did the following:

# cvs/scarab contains the scarab cvs module

> mkdir svn
> svnadmin create ./svn
> cvs2svn -v -s ./svn cvs/scarab

I first made a dry-run which took 45 seconds to complete.
I then made the full run, which took a little over 3 hours to
complete (Pentium 4 2.6 GHz, 512 MB RAM), at over 90% CPU usage.

I didn't test the repository yet, but my results are disturbing:

jerome@dolcevita> du -sk cvs svn
235784  cvs
1991752 svn

In other words, subversion uses much more space than cvs in the
repository (> 8 times).

I couldn't find any reference to such an issue in the issue system.

I am completely new to subversion so I perhaps made a mistake and missed
an option/config.

Perhaps 0.37 will help, as 0.33 still uses BDB 4.1 in Debian. 0.37 will
use 4.2; should that help?

I would be happy if someone could shed some light on this issue.

Cheers,

Jerome

[1] http://scarab.tigris.org/
[2] http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=222353

Re: [converting scarab] high repository size after cvs2svn

Posted by Andreas Jellinghaus <aj...@dungeon.inka.de>.
debian has db 4.2.52 in testing, so it's easy to install.
compiling subversion yourself is then quite easy,
as subversion already has all libraries it needs in its tar file.

if you want to compile subversion with apache:
compile the apr and apr-util found in subversion first,
then apache, pointing it to those installations
(apache has them in its tar file, too, but they aren't good enough),
and then compile subversion with the already installed apr, apr-util
and apache (apxs-config) so you get a mod_dav_svn.

it's easier than this sounds, so give it a try.

using 0.37 is a very good idea. 

you will need to either dump and restore the subversion repository
for upgrading from 0.33, or - if you didn't do any changes to it so far
- simply start once more with an empty repository and import the cvs
tree again. with 0.37 it will not be that big.

if there is a problem with cvs2svn, there is an alternative written
in perl called refinecvs that might help you.


if you have an old db installed in /usr, and a new one somewhere else,
then pointing svn to the new one will make it compile, but result in a
broken svn linked to both db libraries. don't do that. simply 
apt-get install libdb4.2-dev
(maybe also libdb4.2++-dev ?) so you only have development files for the
latest 4.2.52 db on your system and you will be fine.

Good luck!

Regards, Andreas


---------------------------------------------------------------------
To unsubscribe, e-mail: users-unsubscribe@subversion.tigris.org
For additional commands, e-mail: users-help@subversion.tigris.org

Re: [converting scarab] high repository size after cvs2svn

Posted by Corrin Lakeland <la...@cs.otago.ac.nz>.
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On Mon, 02 Feb 2004 01:08, Jerome Lacoste wrote:
> Hi,
>
> The scarab project [1] has decided to give SVN a go.
>
> I played with cvs2svn yesterday. My current problem is the repository
> size which is about 8 times higher than the corresponding CVS one.

The problem is old log files, not the main repository.

To fix it, use svnadmin list-unused-dblogs REPOS_PATH | xargs rm

Alternatively, don't worry about it.

Corrin
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.3 (GNU/Linux)

iD8DBQFAHVsTi5A0ZsG8x8cRArFEAJoCnlr0jSpJlS+7raY80w0+ss5KiACfZQQ4
0l/lZxLckriOMIxTng+H58M=
=pgnw
-----END PGP SIGNATURE-----


Re: [converting scarab] high repository size after cvs2svn

Posted by Branko Čibej <br...@xbc.nu>.
John Szakmeister wrote:

>I highly recommend using BDB 4.2.50.  You'll get some speed 
>improvements, and automatic log removal which makes it a little easier to 
>maintain.
>  
>
But note that just using BDB 4.2 (the latest is 4.2.52, not 4.2.50)
without upgrading to the latest Subversion will _not_ give you automatic
log file removal.

-- 
Brane Čibej   <br...@xbc.nu>   http://www.xbc.nu/brane/


Re: [converting scarab] high repository size after cvs2svn

Posted by Ben Collins-Sussman <su...@collab.net>.
On Sun, 2004-02-01 at 16:26, Jerome Lacoste wrote:

> SVN repository is now 'only' 60% bigger, and it's more acceptable, but
> still disappointing :)
> 

Please don't confuse Subversion and cvs2svn.py.  They're separate
projects.  If you had been using Subversion for all Scarab development
from day one, I can pretty much guarantee your repository would be the
same size, and most likely smaller if you're storing binaries. 

It's well known that cvs2svn has bugs, is still being actively worked
on, and has not been "blessed" for widespread use yet.  It's nowhere
near 1.0 quality.  The problem it's trying to solve, particularly when
it comes to "deducing" branches and tags, is incredibly difficult.  At
the moment, cvs2svn.py often results in a set of commits that are far,
far more complex and inefficient than what humans really would have
done.

If cvs2svn is critical to you, check back in about a month.  A few
people have recently started working on it very hard, and it should be
in much better shape in a few weeks.

Alternately, you can do what Subversion did in August 2001:  just leave
your history behind in a CVS repository... i.e. just import the latest
scarab tree into an empty SVN repository and keep going.  It's not
really all that inconvenient.  "Switching to SVN" doesn't have to imply
"migrating all history."
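
A minimal sketch of that "import only the latest tree" approach follows. It assumes the `svn import PATH URL` argument order of Subversion 1.x, which may differ in older releases; the function name, paths, and log message are made up for illustration.

```shell
# Sketch: start a fresh repository that contains only the latest tree.
# The tree should be a clean export without CVS/ directories, e.g. one
# produced by `cvs export -r HEAD -d scarab-latest scarab`.
import_latest_tree() {
    tree=$1     # directory holding the exported source tree
    repos=$2    # absolute path where the new repository is created
    svnadmin create "$repos" &&
    svn import "$tree" "file://$repos/trunk" -m "Initial import of latest tree"
}

# Example (hypothetical paths):
#   import_latest_tree ./scarab-latest /home/jerome/svn
```

The old CVS repository stays available read-only for anyone who needs the earlier history.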






Re: [converting scarab] high repository size after cvs2svn

Posted by Jerome Lacoste <la...@frisurf.no>.
On Sun, 2004-02-01 at 13:31, John Szakmeister wrote:
> On Sunday 01 February 2004 07:08, Jerome Lacoste wrote:
> > Hi,
> >
> > The scarab project [1] has decided to give SVN a go.
> >
> > I played with cvs2svn yesterday. My current problem is the repository
> > size which is about 8 times higher than the corresponding CVS one

Thanks to all those who answered.

I cleaned up the log files and it helped a lot.

jerome@dolcevita> svnadmin list-unused-dblogs svn | xargs rm
jerome@dolcevita> du -sk cvs svn
235784  cvs
382980  svn

SVN repository is now 'only' 60% bigger, and it's more acceptable, but
still disappointing :)
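
For reference, the ratios behind these figures can be checked with a small shell sketch. The function name size_ratio is made up for illustration; the inputs are the du -sk values quoted in this thread.

```shell
# Print how many times larger size B is than size A (both in KB), to two decimals.
size_ratio() {
    awk -v a="$1" -v b="$2" 'BEGIN { printf "%.2f\n", b / a }'
}

size_ratio 235784 1991752   # before removing unused log files: prints 8.45
size_ratio 235784 382980    # after removing unused log files: prints 1.62
```

The second ratio corresponds to a repository roughly 62% larger than the CVS one.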

I will give the whole conversion a retry with 0.37.

Thanks!

J

Re: [converting scarab] high repository size after cvs2svn

Posted by John Szakmeister <jo...@szakmeister.net>.
On Sunday 01 February 2004 07:08, Jerome Lacoste wrote:
> Hi,
>
> The scarab project [1] has decided to give SVN a go.
>
> I played with cvs2svn yesterday. My current problem is the repository
> size which is about 8 times higher than the corresponding CVS one.
>
> I am using subversion 0.33 which is the current version in Debian
> unstable.
>
> [snip]
> A newer version, probably 0.37, should come really soon in the
> repository according to the maintainer [2]. I will upgrade ASAIC.
>
> To convert, I did the following:
>
> [snip]
> jerome@dolcevita> du -sk cvs svn
> 235784  cvs
> 1991752 svn
>
> In other words, subversion uses much more space than cvs in the
> repository (> 8 times).
>
> I couldn't find any reference to such an issue in the issue system.
>
> I am completely new to subversion so I perhaps made a mistake and missed
> an option/config.
>
> Perhaps 0.37 will help, as 0.33 still uses BDB 4.1 in Debian. 0.37 will
> use 4.2; should that help?
>
> I would be happy if someone could shed some light on this issue.

There are several things that you should be aware of.  First, did you clean 
the log files after converting the repository?  While cvs2svn does its 
thing, it commits to the repository, leaving behind log files.  You need to 
use 'svnadmin list-unused-dblogs' to get a list of the unused log files that 
you can safely delete.  BDB 4.2.x takes care of this problem by automatically 
removing log files itself.  I imagine that you didn't do this, and that's why 
your repository looks so large.
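
The cleanup described above can be sketched as a small shell function. This is only an illustration that assumes a Berkeley DB-backed repository; the function name purge_unused_dblogs and the example path are hypothetical, while `svnadmin list-unused-dblogs` is the subcommand named in this thread.

```shell
# Sketch: delete the Berkeley DB log files that svnadmin reports as unused.
purge_unused_dblogs() {
    repos=$1
    # svnadmin prints one removable log file path per line.
    svnadmin list-unused-dblogs "$repos" | while IFS= read -r logfile; do
        rm -f -- "$logfile"
    done
}

# Example (hypothetical path), comparing the size before and after:
#   du -sk ./svn
#   purge_unused_dblogs ./svn
#   du -sk ./svn
```

Reading the list line by line instead of piping it to xargs keeps paths that contain spaces intact.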

Second, stay away from BDB 4.1.x.  We've seen database problems from people 
that have been using that version.  Apparently there are some issues with 
their shared memory implementation that cause sporadic repository 
corruption.  I highly recommend using BDB 4.2.50.  You'll get some speed 
improvements, and automatic log removal which makes it a little easier to 
maintain.

Finally, our latest version is always the best.  You'll probably find little 
change in cvs2svn, but there have been a number of user-visible changes since 
0.33.0 in the command line tool.  Be careful if you take this step, though: 
you will need to 'dump' the repository and 'load' it after you upgrade.  An 
example of how to do this is in the FAQ at:

 http://subversion.tigris.org/project_faq.html#dumpload
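
As a rough sketch of that dump/load cycle (the function names and paths here are placeholders, and the FAQ entry above remains the authoritative procedure):

```shell
# Step 1, with the OLD svnadmin still installed:
# write the whole repository history to a portable dump file.
dump_repos() {
    svnadmin dump "$1" > "$2"
}

# Step 2, after upgrading Subversion:
# create an empty repository and replay the dump into it.
load_repos() {
    svnadmin create "$1" && svnadmin load "$1" < "$2"
}

# Example (hypothetical paths):
#   dump_repos ./svn scarab.dump
#   ... upgrade Subversion ...
#   load_repos ./svn-new scarab.dump
```

Keeping the old repository untouched until the load has finished leaves a fallback if anything goes wrong.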

I hope that answers some of your questions!

-John

