Posted to dev@subversion.apache.org by Nathan Sharp <sp...@phoenix-int.com> on 2002/12/18 14:02:07 UTC
Large repository crashing cvs2svn.py and possible memory leak?
I'm currently testing out subversion on a test box here (RedHat 8.0) and
have been very impressed so far. I'm quite excited to employ it on a
real project, but am having problems importing our large CVS repository
into it. Our CVS repository goes back to 1996 and is 2.2GB on disk.
The cvs2svn.py script successfully runs through the first three passes
but fails with a segmentation fault in the fourth pass. The repository
it generates is valid up until the point it crashes (somewhere in 1999)
and it doesn't seem to crash at the same spot if I re-run it. One thing
I noticed is that the script consumes more and more memory as it runs;
it was up to almost 400MB the last time I checked before it crashed. I
suspect it may simply be running the machine out of memory (it isn't a
very powerful box; it's just for testing), but I don't have any hard
evidence of that.
I'm running:
svn HEAD as of a couple of days ago
python 2.2.1-17 RPM
swig 1.3.16 from tarball
viewcvs HEAD as of a couple of days ago
Berkeley DB 4.0.14-14 RPM
Any advice for debugging this? I ran the script w/ a -v and the reports
it generates seem O.K. up until it crashes, at which point the output
ends abruptly mid-line w/ no other errors. The shell reports the
segmentation fault. No core file is generated.
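One way to test the out-of-memory theory is to watch the script's resident
memory while it runs. A minimal sketch (the polling loop is generic; a short
sleep stands in for the real ./cvs2svn.py invocation, which is an assumption):

```shell
#!/bin/sh
# Poll the resident set size (RSS, in KB) of a running process.
# 'sleep 3' stands in for the real './cvs2svn.py ...' run.
sleep 3 &
pid=$!
last_rss=""
while kill -0 "$pid" 2>/dev/null; do
    r=$(ps -o rss= -p "$pid" 2>/dev/null || true)
    if [ -n "$r" ]; then
        last_rss=$r
        echo "pid $pid rss ${r} KB"
    fi
    sleep 1
done
echo "last sample before exit: ${last_rss} KB"
```

If the RSS climbs steadily toward the machine's physical memory before the
crash, that would support the leak hypothesis.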
Thanks again!
Nathan
---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org
Re: [PATCH] Re: Large repository crashing cvs2svn.py and possible
memory leak?
Posted by Nathan Sharp <sp...@phoenix-int.com>.
Sorry for the long delay getting back. I finally got around to
compiling in your patch and it doesn't appear to help - although I'm now
getting "obstructed updates" even after doing a clean co, so I haven't
been able to run a job to completion. It is still taking up nearly
100MB of memory and growing fast before it stops. It seems to me that
the leak is related to reading the local working-copy files, not to
accessing the repository, and that whether I use a local file:/// or
http:// URL for the repository makes no difference. Win32 vs. Unix also
doesn't seem to matter. These operations have given me trouble:
cvs2svn.py using local repository
pset whether fs or http (but that doesn't access the repos...)
commit whether fs or http.
These operations seem to have at most a very small leak, small enough
not to matter even for my 2GB repository:
revert
cleanup
co
I haven't had a chance yet to try the mmacek branch. I'll see if I can
get to it this weekend (as well as trying the latest head).
Thanks for the reply! Hopefully we can resolve this soon.
Nathan
P.S. Regarding the obstructed updates: yes, I've tried an svnadmin
recover as well as an svn cleanup. Neither helps. I think the problem
is related to the memory leak, though, because if I re-run the failing
command on just the file that failed, it works. It only fails when
doing a large recursive call that sets a lot of things at once.
Marko Macek wrote:
> Nathan Sharp wrote:
>
>> I am about 2 days old to subversion, I wasn't even aware that there
>> was a branch with anything I might be interested in. As Donald said,
>> yes, I am using the trunk.
>> I experimented further (thanks to some help I got on the IRC channel)
>> and think I have a workaround now. After running passes 1-3 (and
>> forcibly preventing 4 from running), I took the cvs2svn-data.s-revs
>> file and chopped it into files with 20,000 lines each. By running
>> the script on pass 4 on each file in order, I seem to be able to run
>> successfully, which proves that the problem a) is a memory leak and
>> b) was failing because it ran my system out of memory. The only
>> negative effect of what I did is that right where I split the files
>> (since I just did exactly 20k line files and didn't manually split
>> the files up at a natural commit break) I will end up with a commit
>> which is split in two, which is minor enough for me not to worry about it.
>> The general belief I heard on the IRC channel is that the memory leak
>> is probably in the swig bindings and not in the cvs2svn.py script
>> itself. I'd be happy to help in any way possible if someone wants to
>> try and fix it.
>>
> Please try the following patch to subversion.
>
[PATCH] Re: Large repository crashing cvs2svn.py and possible memory
leak?
Posted by Marko Macek <Ma...@gmx.net>.
Nathan Sharp wrote:
> I am about 2 days old to subversion, I wasn't even aware that there
> was a branch with anything I might be interested in. As Donald said,
> yes, I am using the trunk.
> I experimented further (thanks to some help I got on the IRC channel)
> and think I have a workaround now. After running passes 1-3 (and
> forcibly preventing 4 from running), I took the cvs2svn-data.s-revs
> file and chopped it into files with 20,000 lines each. By running the
> script on pass 4 on each file in order, I seem to be able to run
> successfully, which proves that the problem a) is a memory leak and b)
> was failing because it ran my system out of memory. The only negative
> effect of what I did is that right where I split the files (since I
> just did exactly 20k line files and didn't manually split the files up
> at a natural commit break) I will end up with a commit which is split
> in two, which is minor enough for me not to worry about it.
> The general belief I heard on the IRC channel is that the memory leak
> is probably in the swig bindings and not in the cvs2svn.py script
> itself. I'd be happy to help in any way possible if someone wants to
> try and fix it.
>
Please try the following patch to subversion.
Index: subversion/libsvn_fs/bdb/nodes-table.c
===================================================================
--- subversion/libsvn_fs/bdb/nodes-table.c (revision 4167)
+++ subversion/libsvn_fs/bdb/nodes-table.c (working copy)
@@ -146,6 +146,7 @@
"successor id `%s' (for `%s') already exists in filesystem %s",
new_id_str->data, id_str->data, fs->path);
}
+ if (err) svn_error_clear(err);
/* Return the new node revision ID. */
*successor_p = new_id;
If you can, please also test /branches/cvs2svn-mmacek from the
subversion repository (it has some bugfixes in addition to basic branch
and tag conversion support).
Mark
Re: Large repository crashing cvs2svn.py and possible memory leak?
Posted by Nathan Sharp <ns...@phoenix-int.com>.
I am about 2 days old to subversion, I wasn't even aware that there was
a branch with anything I might be interested in. As Donald said, yes, I
am using the trunk.
I experimented further (thanks to some help I got on the IRC channel)
and think I have a workaround now. After running passes 1-3 (and
forcibly preventing 4 from running), I took the cvs2svn-data.s-revs file
and chopped it into files with 20,000 lines each. By running the script
on pass 4 on each file in order, I seem to be able to run successfully,
which proves that the problem a) is a memory leak and b) was failing
because it ran my system out of memory. The only negative effect of
what I did is that right where I split the files (since I just did
exactly 20k line files and didn't manually split the files up at a
natural commit break) I will end up with a commit which is split in two,
which is minor enough for me not to worry about it.
The general belief I heard on the IRC channel is that the memory leak is
probably in the swig bindings and not in the cvs2svn.py script itself.
I'd be happy to help in any way possible if someone wants to try and fix it.
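The chunking workaround above can be sketched roughly as follows. This is a
sketch under assumptions: a generated file stands in for cvs2svn-data.s-revs,
and the commented-out pass-4 invocation and its flags are guesses, not the
script's actual interface:

```shell
#!/bin/sh
# Split the pass-3 output into 20,000-line pieces, then run pass 4 on
# each piece in order so that memory use stays bounded per run.
seq 1 50000 > s-revs.demo            # stand-in for cvs2svn-data.s-revs
split -l 20000 s-revs.demo chunk-    # -> chunk-aa chunk-ab chunk-ac
for f in chunk-*; do
    echo "would run pass 4 on $f ($(wc -l < "$f") lines)"
    # A real run would be something along these lines (hypothetical):
    #   cp "$f" cvs2svn-data.s-revs && ./cvs2svn.py -p 4 ...
done
```

Splitting at a fixed line count rather than at commit boundaries is what
produces the one split commit mentioned above.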
Nathan
Branko Čibej wrote:
>Which cvs2svn.py script are you using? The one from thr trunk, or the
>one from /branches/cvs2svn-mmacek?
>
>Nathan Sharp wrote:
>
>>I'm currently testing out subversion on a test box here (RedHat 8.0)
>>and have been very impressed so far. I'm quite excited to employ it
>>on a real project, but am having problems importing our large CVS
>>repository into it. Our CVS repository goes back to 1996 and is 2.2GB
>>on disk. The cvs2svn.py script successfully runs through the first
>>three passes but fails with a segmentation fault in the fourth pass.
>>The repository it generates is valid up until the point it crashes
>>(somewhere in 1999) and it doesn't seem to crash at the same spot if I
>>re-run it. One thing I noticed was that as the script runs, it takes
>>increasingly more and more memory as it goes, was up to almost 400Meg
>>last I checked before it crashed. I'm suspicious that perhaps it just
>>runs the computer out of memory (it isn't a real powerful box, it is
>>just for testing), but I don't have any real evidence to that fact.
>>I'm running:
>>svn HEAD as of a couple of days ago
>>python 2.2.1-17 RPM
>>swig 1.3.16 from tarball
>>viewcvs HEAD as of a couple of days ago
>>Berkeley DB 4.0.14-14 RPM
>>
>>
>>Any advice for debugging this? I ran the script w/ a -v and the
>>reports it generates seem O.K. up until it crashes, at which point the
>>output ends abruptly mid-line w/ no other errors. The shell reports
>>the segmentation fault. No core file is generated.
>>Thanks again!
>> Nathan
Re: Large repository crashing cvs2svn.py and possible memory leak?
Posted by Branko Čibej <br...@xbc.nu>.
Which cvs2svn.py script are you using? The one from the trunk, or the
one from /branches/cvs2svn-mmacek?
Nathan Sharp wrote:
> I'm currently testing out subversion on a test box here (RedHat 8.0)
> and have been very impressed so far. I'm quite excited to employ it
> on a real project, but am having problems importing our large CVS
> repository into it. Our CVS repository goes back to 1996 and is 2.2GB
> on disk. The cvs2svn.py script successfully runs through the first
> three passes but fails with a segmentation fault in the fourth pass.
> The repository it generates is valid up until the point it crashes
> (somewhere in 1999) and it doesn't seem to crash at the same spot if I
> re-run it. One thing I noticed was that as the script runs, it takes
> increasingly more and more memory as it goes, was up to almost 400Meg
> last I checked before it crashed. I'm suspicious that perhaps it just
> runs the computer out of memory (it isn't a real powerful box, it is
> just for testing), but I don't have any real evidence to that fact.
> I'm running:
> svn HEAD as of a couple of days ago
> python 2.2.1-17 RPM
> swig 1.3.16 from tarball
> viewcvs HEAD as of a couple of days ago
> Berkeley DB 4.0.14-14 RPM
>
>
> Any advice for debugging this? I ran the script w/ a -v and the
> reports it generates seem O.K. up until it crashes, at which point the
> output ends abruptly mid-line w/ no other errors. The shell reports
> the segmentation fault. No core file is generated.
> Thanks again!
> Nathan
--
Brane Čibej <br...@xbc.nu> http://www.xbc.nu/brane/