Posted to dev@subversion.apache.org by Marko Macek <Ma...@gmx.net> on 2002/11/07 21:02:59 UTC

[PATCH] cvs2svn.py - new test release

Hi!

This is my newest test patch for cvs2svn with tags and branches support.

ChangeLog (relative to rev 3685):
1) includes a bug fix that prevents commits on the same file from
being combined
2) remembers the branch for each file in the .*revs files
3) remembers the tags and branch points for each file in the .*revs files
4) NEW: a new pass to determine branch dependencies (only handles
    trees, not DAGs for now)
5) NEW: branches are copied recursively, starting from the trunk
    (or the vendor branch); see the sketch after this list
6) NEW: unlike before, we now copy the tags and branch points in a
    single new revision (still file-by-file copy though - work is in
    progress to optimize this).
    This revision now has the "svn:author" and "svn:log" properties set.
7) NEW: --vendor=vendor-branch-tag-name to start the conversion
    from the vendor branch.
    TODO: the trunk should be a copy of the vendor branch start tag,
    not a new checkin.
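
As an illustration of items 4 and 5, here is a minimal sketch of the
parent-first copy ordering that a tree of branch dependencies allows. This
is not the patch's actual code; the branch names and the copy_branch()
helper are made up for illustration:

    # Hypothetical map of branch -> the branch it sprouts from (a tree, not a DAG).
    # 'vendor-branch', 'release-1', etc. are made-up names, not ones from the patch.
    branch_parent = {
        'vendor-branch': None,
        'trunk': 'vendor-branch',
        'release-1': 'trunk',
        'release-1-fixes': 'release-1',
    }

    def copy_branch(branch):
        # Stand-in for the real work of copying the branch's files in the
        # svn filesystem; here it only prints what would be copied.
        print("copying branch " + branch)

    def copy_with_parents(branch, done):
        # Copy a branch only after the branch it was created from is copied.
        if branch is None or branch in done:
            return
        copy_with_parents(branch_parent[branch], done)
        copy_branch(branch)
        done[branch] = 1

    done = {}
    for b in branch_parent.keys():
        copy_with_parents(b, done)

In a tree every branch has a single parent, which is what makes this simple
recursion sufficient; that is the "trees, not DAGs" restriction above.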

Please test and report any problems, especially with the branch
conversion.

I have tried converting the emacs repository, and it seems that
subversion gets really slow with several tens of thousands of revisions.

	Mark

Re: [PATCH] cvs2svn.py - new test release

Posted by Shun-ichi GOTO <go...@taiyo.co.jp>.
Hi, Mark,

Thanks for your great patch!
I'm happy to see the vendor branch support.

>>>>> at Thu, 07 Nov 2002 22:02:59 +0100,
>>>>> Marko Macek <Ma...@gmx.net> said,
>
> This is my newest test patch for cvs2svn with tags and branches support.

I'm having two minor problems with cvs2svn when importing an existing
large repository.

  1. If revision 1.1 (of an RCS file) is in the 'dead' state (from
     cvs admin -o1.1, perhaps?), cvs2svn fails.

  2. I came across a strange CVS repository: an RCS file exists both in
     the directory and in the Attic. I don't know how this situation
     arises (maybe a difference between cvs program versions?), and I
     don't know how we should treat it. At least, the current cvs2svn
     handles both copies, so a deletion and a change might occur in one
     transaction.

I have attached sample RCS files to reproduce these two cases.
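
For readers without the attachments, the first case (a 'dead' first
revision) looks roughly like this inside a ,v file. This is only an
illustrative sketch with a made-up date, author, and log message, not one
of the attached samples:

    head    1.1;
    access;
    symbols;
    locks; strict;
    comment @# @;

    1.1
    date    2002.11.07.12.00.00;    author goto;    state dead;
    branches;
    next    ;

    desc
    @@

    1.1
    log
    @revision 1.1 is in the dead state
    @
    text
    @@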

## I have also attached the very ad-hoc patch I used.

--- Regards,
 Shun-ichi Goto  <go...@taiyo.co.jp>
   R&D Group, TAIYO Corp., Tokyo, JAPAN

Re: cvs2svn.py - converting large repository

Posted by Daniel Berlin <db...@dberlin.org>.
Note the 'ouch' marks below.

Try adding "set_cachesize 0 64000000 1" to the DB_CONFIG in the db dir of 
the repo, and see if it makes it go faster.
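
For reference, a minimal sketch of that change; the path below is the
emacs repository used later in this thread, so adjust it for your own
repository. The three arguments to set_cachesize are gigabytes, bytes, and
the number of cache regions, so this asks for a single cache of roughly
64 MB. (The statistics below show the default cache is only about 257 KB,
i.e. roughly 64 pages of 4 KB, which explains the poor hit rates.)

    # /repo/svn/emacs/db/DB_CONFIG  (repository path taken from this thread)
    # set_cachesize <gbytes> <bytes> <ncaches>: one cache region of about 64 MB
    set_cachesize 0 64000000 1

Berkeley DB only applies the cache size when the environment regions are
created, so the new value may not take effect until the environment is
recreated (for example by running Berkeley DB's db_recover in that
directory).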


On Fri, 8 Nov 2002, Marko Macek wrote:

> Daniel Berlin wrote:
> 
> > db_stat -m the resulting database and send the output to me (or the 
> > mailing list).
> > This will tell us whether it's subversion, or whether you need to tune 
> > the various DB_CONFIG parameters (like cache size).
> 
> These are the new statistics with an svn build just a few days old and
> with my latest cvs2svn.
> 
> $ db_stat -m
> 257KB 768B      Total cache size (263936 bytes).
> 1       Number of caches.
> 270336  Pool individual cache size.
> 59M     Requested pages found in the cache (74%).

			^^^^^
	Ouch!

> 0       Requested pages mapped into the process' address space.
> 21M     Requested pages not found in the cache.
		^^^^^^^^^^^^^ 
		Ouch



> 161113  Pages created in the cache.
> 21M     Pages read into the cache.
> 4150615 Pages written from the cache to the backing file.
> 18M     Clean pages forced from the cache.
	^^^^^^^^^^^^^
		Ouch

> 3923095 Dirty pages forced from the cache.
> 0       Dirty buffers written by trickle-sync thread.
> 64      Current clean buffer count.
> 0       Current dirty buffer count.
> 67      Number of hash buckets used for page location.
> 92M     Total number of times hash chains searched for a page.
> 9       The longest hash chain searched for a page.
> 137M    Total number of hash buckets examined for page location.
> 236M    The number of region locks granted without waiting.
> 2       The number of region locks granted after waiting.
> =-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
> Pool File: strings
> 4096    Page size.
> 44M     Requested pages found in the cache (69%).	
			^^^^^^^^^^^^
			Ouch

> 0       Requested pages mapped into the process' address space.
> 20M     Requested pages not found in the cache.
> 155254  Pages created in the cache.
> 20M     Pages read into the cache.
> 4004768 Pages written from the cache to the backing file.
> =-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
> Pool File: representations
> 4096    Page size.
> 14M     Requested pages found in the cache (95%).
> 0       Requested pages mapped into the process' address space.
> 758317  Requested pages not found in the cache.
> 5859    Pages created in the cache.
> 758316  Pages read into the cache.
> 145847  Pages written from the cache to the backing file.
> 
> 
> Hmm, there are fewer pool files than before...
> 
> I still get this error after the conversion:
> 
> $ svn log file:///repo/svn/emacs
> svn: Berkeley DB error
> svn: Berkeley DB error while reading node revision for filesystem 
> /repo/svn/emacs/db:
> Cannot allocate memory
> 
> $ svnadmin youngest /repo/svn/emacs
> 49869
> 
> And I also got this error at the end of the conversion, during the
> conversion of a tag (only fs.copy is part of the commit):
> 
> 
> tag gerd_dbe to /branches/gerd_dbe/lisp/ChangeLog from 
> /trunk/lisp/ChangeLog revision 39204
> Traceback (most recent call last):
>    File "/home/mark/svnwork/final-3/cvs2svn.py", line 947, in ?
>      main()
>    File "/home/mark/svnwork/final-3/cvs2svn.py", line 944, in main
>      util.run_app(convert, ctx, start_pass=start_pass)
>    File 
> "/root/rpms/tmp/subversion-2002110320-0/usr/lib/python2.2/site-packages/svn/util.py", 
> line 38, in run_app
>    File "/home/mark/svnwork/final-3/cvs2svn.py", line 874, in convert
>      _passes[i](ctx)
>    File "/home/mark/svnwork/final-3/cvs2svn.py", line 744, in pass5
>      do_copy(ctx, t_fs, 1, branch_copies, copy_from, copies_done, 
> branches_done)
>    File "/home/mark/svnwork/final-3/cvs2svn.py", line 794, in do_copy
>      conflicts, new_rev = fs.commit_txn(txn)
> RuntimeError: Commit succeeded, deltification failed
> 
> The conversion took (with fsync disabled):
> 
> real    428m18.067s
> user    397m43.871s
> sys     18m39.643s
> Fri Nov  8 14:45:21 CET 2002
> 
> Mark
> 
> 



Re: cvs2svn.py - converting large repository

Posted by Marko Macek <Ma...@gmx.net>.
Daniel Berlin wrote:

> db_stat -m the resulting database and send the output to me (or the 
> mailing list).
> This will tell us whether it's subversion, or whether you need to tune 
> the various DB_CONFIG parameters (like cache size).

These are the new statistics with an svn build just a few days old and
with my latest cvs2svn.

$ db_stat -m
257KB 768B      Total cache size (263936 bytes).
1       Number of caches.
270336  Pool individual cache size.
59M     Requested pages found in the cache (74%).
0       Requested pages mapped into the process' address space.
21M     Requested pages not found in the cache.
161113  Pages created in the cache.
21M     Pages read into the cache.
4150615 Pages written from the cache to the backing file.
18M     Clean pages forced from the cache.
3923095 Dirty pages forced from the cache.
0       Dirty buffers written by trickle-sync thread.
64      Current clean buffer count.
0       Current dirty buffer count.
67      Number of hash buckets used for page location.
92M     Total number of times hash chains searched for a page.
9       The longest hash chain searched for a page.
137M    Total number of hash buckets examined for page location.
236M    The number of region locks granted without waiting.
2       The number of region locks granted after waiting.
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
Pool File: strings
4096    Page size.
44M     Requested pages found in the cache (69%).
0       Requested pages mapped into the process' address space.
20M     Requested pages not found in the cache.
155254  Pages created in the cache.
20M     Pages read into the cache.
4004768 Pages written from the cache to the backing file.
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
Pool File: representations
4096    Page size.
14M     Requested pages found in the cache (95%).
0       Requested pages mapped into the process' address space.
758317  Requested pages not found in the cache.
5859    Pages created in the cache.
758316  Pages read into the cache.
145847  Pages written from the cache to the backing file.


Hmm, there are fewer pool files than before...

I still get this error after the conversion:

$ svn log file:///repo/svn/emacs
svn: Berkeley DB error
svn: Berkeley DB error while reading node revision for filesystem 
/repo/svn/emacs/db:
Cannot allocate memory

$ svnadmin youngest /repo/svn/emacs
49869

And I also got this error at the end of the conversion, during the
conversion of a tag (only fs.copy is part of the commit):


tag gerd_dbe to /branches/gerd_dbe/lisp/ChangeLog from 
/trunk/lisp/ChangeLog revision 39204
Traceback (most recent call last):
   File "/home/mark/svnwork/final-3/cvs2svn.py", line 947, in ?
     main()
   File "/home/mark/svnwork/final-3/cvs2svn.py", line 944, in main
     util.run_app(convert, ctx, start_pass=start_pass)
   File 
"/root/rpms/tmp/subversion-2002110320-0/usr/lib/python2.2/site-packages/svn/util.py", 
line 38, in run_app
   File "/home/mark/svnwork/final-3/cvs2svn.py", line 874, in convert
     _passes[i](ctx)
   File "/home/mark/svnwork/final-3/cvs2svn.py", line 744, in pass5
     do_copy(ctx, t_fs, 1, branch_copies, copy_from, copies_done, 
branches_done)
   File "/home/mark/svnwork/final-3/cvs2svn.py", line 794, in do_copy
     conflicts, new_rev = fs.commit_txn(txn)
RuntimeError: Commit succeeded, deltification failed

The conversion took (with fsync disabled):

real    428m18.067s
user    397m43.871s
sys     18m39.643s
Fri Nov  8 14:45:21 CET 2002

Mark



Re: [PATCH] cvs2svn.py - new test release

Posted by Daniel Berlin <db...@dberlin.org>.

On Thu, 7 Nov 2002, Marko Macek wrote:

> Hi!
> 
> This is my newest test patch for cvs2svn with tags and branches support.
> 
> ChangeLog (relative to rev 3685):
> 1) includes a bug fix that prevents commits on the same file from
> being combined
> 2) remembers the branch for each file in the .*revs files
> 3) remembers the tags and branch points for each file in the .*revs files
> 4) NEW: a new pass to determine branch dependencies (only handles
>     trees, not DAGs for now)
> 5) NEW: branches are copied recursively, starting from the trunk
>     (or the vendor branch)
> 6) NEW: unlike before, we now copy the tags and branch points in a
>     single new revision (still file-by-file copy though - work is in
>     progress to optimize this).
>     This revision now has the "svn:author" and "svn:log" properties set.
> 7) NEW: --vendor=vendor-branch-tag-name to start the conversion
>     from the vendor branch.
>     TODO: the trunk should be a copy of the vendor branch start tag,
>     not a new checkin.
> 
> Please test and report any problems, especially with the branch
> conversion.
> 
> I have tried converting the emacs repository, and it seems that
> subversion gets really slow with several tens of thousands of revisions.

db_stat -m the resulting database and send the output to me (or the 
mailing list).
This will tell us whether it's subversion, or whether you need to tune 
the various DB_CONFIG parameters (like cache size).
> 
> 	Mark
> 

