Posted to dev@subversion.apache.org by "Marcelo E. Magallon" <ma...@bigfoot.com> on 2002/04/12 20:34:36 UTC

svn eats all memory, OOM killer kicks in

Hi,

 while playing with subversion I found this problem.  I'm using the
 Window Maker sources as a testbed.

 * I create a repository

    svnadmin create ~/tmp/wmaker/svn

 * I checkout this

    cd ~/tmp/wmaker/work
    svn co file:///home/marcelo/tmp/wmaker/svn
    cd svn

 * I create some directories here

    svn mkdir trunk
    svn mkdir tags
    svn mkdir branches
    svn mkdir branches/debian
    svn mkdir branches/upstream

 * I introduce some files in the working copy

    cd branches/upstream
    tar xvzf $DEPOT/wmaker_0.64.0.orig.tar.gz
    mv WindowMaker-0.64.0/* .
    rmdir WindowMaker-0.64.0
    svn add --recursive *

 * I check in

    cd ../..
    svn ci -m "Window Maker 0.64.0"
    Adding ...
    Adding ...
    Adding ...
    Transfering .................................................
    .............................................................
    [...]
    ...................Killed

 * A second check-in attempt produces this:

    Adding    branches/upstream/util/wmsetup.c

    svn_error: #21068 : <File already exists in revision>
      Commit failed (details follow):

    svn_error: #21068 : <File already exists in revision>
      file already exists: filesystem `/home/marcelo/tmp/wmaker/svn/db', transaction `3', path `branches/upstream/util/wmsetup.c'

 At this point I assume the repository has been corrupted.  How can I
 verify or deny this?

 My box has 256 MB RAM + 128 MB swap.  About 300+ MB are free before the
 check-in.  The files being checked-in total 20+ MB.

 I have compiled subversion myself from a fresh checkout (r1677).  BDB
 is 4.0.14; neon is 0.19.3; APR is 2.0.35.  This is Linux, 2.4.18 + XFS
 patches.  Subversion was compiled using gcc version 2.95.4 20011002
 (Debian prerelease), configured like this:

        ./configure CC="cc -g"\
                --prefix=/usr \
                --mandir=\$${prefix}/share/man \
                --infodir=\$${prefix}/share/info \
                --with-berkeley-db=/usr \
                --with-apr-libs=/usr/lib/apache2 --enable-static \
                --with-apxs=$(APXS) \
                --with-neon=/usr \
                --enable-debug

 (that's cut'n'pasted from a makefile used to compile subversion)

 Thanks,

 Marcelo

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org

Re: svn eats all memory, OOM killer kicks in

Posted by "Marcelo E. Magallon" <ma...@bigfoot.com>.
>> cmpilato@collab.net writes:

 > I am fixing our memory usage right now (actually, this fix is done,
 > I'm just recompiling and testing).  Watch the svn@ list for the
 > commit.

 Thanks!

 Marcelo


Re: svn eats all memory, OOM killer kicks in

Posted by "Kirby C. Bohling" <kb...@birddog.com>.

Karl Fogel wrote:
> "Kirby C. Bohling" <kb...@birddog.com> writes:
> 
>>>Maybe we can think of some way to test for killability in the client?
>>>
>>No, that isn't exactly a portable solution (won't work on Windows).  I
>>don't know of anything else that gives you "killable" tests that are
>>repeatable.  Just an idea if you really want to try that.  The worst
>>thing about it is the warts in the code with randomFailure.  We ended
>>up moving it to an extremely low level library that everybody
>>referenced and left it there.  Then commented the heck out of it.  I
>>would guess that pool creation would be my chosen place in the SVN
>>code.
>>
> 
> We don't have to do random failures -- we can have chosen failure
> points, set off by a special command-line flag (for example).
> 
>    $ svn --fail-point=LABEL ...
> 
> Then the tests would be repeatable.  (Of course, we can use random
> processes to help choose the points.)
> 
> We'd still have to have those conditional exits scattered throughout
> the code, but it would be worth it IMHO.
> 
> -K

We did random to avoid finding them by hand (lazy programmers, I know 
it's shocking), and essentially did stress testing to show it worked 
(not much as proofs go).  The only bothersome piece about labelling the 
failure points is that sometimes I want a point to fail only after the 
hundredth iteration of passing it.  We found that lots of failure 
conditions were handled reasonably well during early stages, but nested 
failures deep in the belly of recursive calls weren't handled correctly.

You can do that either by using gdb and a script or by extending the 
syntax of --fail-point a bit.  If it were me, I would make the label a 
function call so you can set a breakpoint on it for debugging, without 
using --fail-point on the command line, and let the process continue 
running.  It also makes the failure points a bit cleaner in the code.

We did a similar dance of generating random error codes on malloc, 
putting wrappers around important calls that would cause plausible 
failures at random, to check that all of the error conditions were 
handled correctly as opposed to the program just dying.

Food for thought along those lines.  You guys are sharper than I am, so 
I'll be interested to see how you solve the problem.


	Kirby



Re: svn eats all memory, OOM killer kicks in

Posted by Karl Fogel <kf...@newton.ch.collab.net>.
"Kirby C. Bohling" <kb...@birddog.com> writes:
> > Maybe we can think of some way to test for killability in the client?
>
> No, that isn't exactly a portable solution (won't work on Windows).  I
> don't know of anything else that gives you "killable" tests that are
> repeatable.  Just an idea if you really want to try that.  The worst
> thing about it is the warts in the code with randomFailure.  We ended
> up moving it to an extremely low level library that everybody
> referenced and left it there.  Then commented the heck out of it.  I
> would guess that pool creation would be my chosen place in the SVN
> code.

We don't have to do random failures -- we can have chosen failure
points, set off by a special command-line flag (for example).

   $ svn --fail-point=LABEL ...

Then the tests would be repeatable.  (Of course, we can use random
processes to help choose the points.)

We'd still have to have those conditional exits scattered throughout
the code, but it would be worth it IMHO.

-K



Re: svn eats all memory, OOM killer kicks in

Posted by "Kirby C. Bohling" <kb...@birddog.com>.

Greg Stein wrote:
> On Mon, Apr 15, 2002 at 11:01:43AM -0500, cmpilato@collab.net wrote:
> 
>>...
>>So, while your commit succeeded in the repository, your client was
>>Killed before the working copy was fully modified to reflect that it
>>*knew* your commit had succeeded (thus the reason why some of your
>>files were still marked for addition, and some weren't).  Well, that's
>>just a bogus place to be from a working copy standpoint, as you can
>>imagine.
>>
> 
> But even if it crashed, it *should* be okay, right? The WC would still show
> items to commit, the user would try it, and then get a conflict. An 'update'
> would then merge in the changes (already there!) and all would be happy.
> 
> The thing is, the client should be able to die (kill -9) at any point, and
> things should work except for a possible need to use 'svn cleanup'.
> 
> Maybe we can think of some way to test for killability in the client?
> 
> Cheers,
> -g
> 
> 


Last time I saw anybody do anything of that nature, there was a special
compile flag that caused a raise(9) (pick your signal: 15, 9, 11, etc.) a
certain (fairly low) percentage of the time.  The random seed was the time
it was compiled, so any single compile was repeatable.

Essentially foo -V, or strings foo would give you the random seed for 
any given binary.  So the tests can be made repeatable if you want them 
to be (assuming you have the same random number generator as the bug 
reporter).  Put the randomFailure() around the code you want to test for 
failure modes.

Then make the client do stuff until it fails, and then make sure the 
resulting mess is recoverable.  If you want extra bonus points, it 
should print out the function that is calling it, possibly a backtrace. 
Now that I think about it, it would probably be easiest to raise signal 
11 (SIGSEGV) so it would dump core, and you could get a backtrace of the 
failure condition out of that.

No, that isn't exactly a portable solution (won't work on Windows).  I 
don't know of anything else that gives you "killable" tests that are 
repeatable.  Just an idea if you really want to try that.  The worst 
thing about it is the warts in the code with randomFailure.  We ended up 
moving it to an extremely low level library that everybody referenced 
and left it there.  Then commented the heck out of it.  I would guess 
that pool creation would be my chosen place in the SVN code.

This requires hand testing to really be useful, but I don't see many 
ways around that.


	Kirby




Re: svn eats all memory, OOM killer kicks in

Posted by Greg Stein <gs...@lyra.org>.
On Mon, Apr 15, 2002 at 11:01:43AM -0500, cmpilato@collab.net wrote:
>...
> So, while your commit succeeded in the repository, your client was
> Killed before the working copy was fully modified to reflect that it
> *knew* your commit had succeeded (thus the reason why some of your
> files were still marked for addition, and some weren't).  Well, that's
> just a bogus place to be from a working copy standpoint, as you can
> imagine.

But even if it crashed, it *should* be okay, right? The WC would still show
items to commit, the user would try it, and then get a conflict. An 'update'
would then merge in the changes (already there!) and all would be happy.

The thing is, the client should be able to die (kill -9) at any point, and
things should work except for a possible need to use 'svn cleanup'.

Maybe we can think of some way to test for killability in the client?

Cheers,
-g

-- 
Greg Stein, http://www.lyra.org/


Re: svn eats all memory, OOM killer kicks in

Posted by cm...@collab.net.
"Marcelo E. Magallon" <ma...@bigfoot.com> writes:

> >> cmpilato@collab.net writes:
> 
>  > >  At this point I assume the repository has been corrupted.  How can I
>  > >  verify or deny this?
>  > 
>  > You could use the `svnadmin' and `svnlook' tools to see what your
>  > youngest revision is, what paths exist in it, etc.  That would be
>  > really useful information.

I'm pretty sure I know what happened to you.  The new commit process'
memory usage wasn't soooooo bad, up to the point just after the commit
succeeded -- then things took a turn for the REALLY BAD.  Between the
time the last `.' gets printed after "Transmitting..." and the time
`Committed revision XXX' shows up, your working copy is being updated
to reflect that all those added files are a) no longer added, but
regular working copy files, and b) have a revision number of XXX.  The
problem is that this process itself (which we refer to as "revision
bumping") was eating up gobs of memory.

So, while your commit succeeded in the repository, your client was
Killed before the working copy was fully modified to reflect that it
*knew* your commit had succeeded (thus the reason why some of your
files were still marked for addition, and some weren't).  Well, that's
just a bogus place to be from a working copy standpoint, as you can
imagine.

I am fixing our memory usage right now (actually, this fix is done,
I'm just recompiling and testing).  Watch the svn@ list for the commit.


Re: svn eats all memory, OOM killer kicks in

Posted by "Marcelo E. Magallon" <ma...@bigfoot.com>.
>> cmpilato@collab.net writes:

 > >  At this point I assume the repository has been corrupted.  How can I
 > >  verify or deny this?
 > 
 > You could use the `svnadmin' and `svnlook' tools to see what your
 > youngest revision is, what paths exist in it, etc.  That would be
 > really useful information.

 Hmm... ok...

[46 ysabell:~/tmp/wmaker/work/svn] svn ci -m "test"
Adding    branches/upstream/src/DI.h

svn_error: #21068 : <File already exists in revision>
  Commit failed (details follow):

svn_error: #21068 : <File already exists in revision>
  file already exists: filesystem `/home/marcelo/tmp/wmaker/svn/db', transaction `2', path `/branches/upstream/src/DI.h'

[50 ysabell:~/tmp/wmaker/work/svn] svnadmin lscr /home/marcelo/tmp/wmaker/svn /branches/upstream/src/DI.h
1

[55 ysabell:~/tmp/wmaker/work/svn] svnadmin lstxns --long /home/marcelo/tmp/wmaker/svn
Txn 1:
Created: Sun 14 Apr 2002 09:51:08.211195 (day 104, dst 1, gmt_off 007200)
Author: marcelo
Log (4 bytes):
test
==========================================
 branches/ <1.1>
  upstream/ <2.1>
   src/ <568.1>
    DI.h <569.1> [7515]

Txn 2:
Created: Sun 14 Apr 2002 09:52:44.540481 (day 104, dst 1, gmt_off 007200)
Author: marcelo
Log (4 bytes):
test
==========================================
 branches/ <1.1>
  upstream/ <2.1>
   src/ <568.1>
    DI.h <569.1> [7515]

(I hope I didn't trim too much information)

[58 ysabell:~/tmp/wmaker/work/svn] svnlook /home/marcelo/tmp/wmaker/svn tree
/ 
 branches/ 
  upstream/ 
   src/ 
    DI.h 

[61 ysabell:~/tmp/wmaker/work/svn] svn status | grep DI.h
A      ./branches/upstream/src/DI.h

[62 ysabell:~/tmp/wmaker/work/svn] svn status | grep ^A | wc -l
    194

[63 ysabell:~/tmp/wmaker/work/svn] find -name .svn -prune -o -type f -print | wc -l
    733

If I'm reading the output of those last three commands correctly, some
files were recorded as added to the repository and some are still
scheduled for addition.

If I check out everything from the repository I get:

[73 ysabell:~/tmp/wmaker/work/tmp] find -name .svn -prune -o -type f -print | wc -l
    733
[77 ysabell:~/tmp/wmaker/work] diff -ruN -x .svn svn/ tmp/svn/
[78 ysabell:~/tmp/wmaker/work] 

where svn/ contains the original checked-in files and tmp/svn/ is the
fresh checkout.

There don't seem to be any differences between the working copy where I
performed the check-in and a newly checked-out copy.  It seems the
repository is ok but the working copy was left in an inconsistent state.
An 'svn update' in the original working copy spins on:

select(0, NULL, NULL, NULL, {0, 1000})  = 0 (Timeout)
select(0, NULL, NULL, NULL, {0, 2000})  = 0 (Timeout)
select(0, NULL, NULL, NULL, {0, 4000})  = 0 (Timeout)
select(0, NULL, NULL, NULL, {0, 8000})  = 0 (Timeout)
select(0, NULL, NULL, NULL, {0, 16000}) = 0 (Timeout)
select(0, NULL, NULL, NULL, {0, 32000}) = 0 (Timeout)
select(0, NULL, NULL, NULL, {0, 64000}) = 0 (Timeout)
select(0, NULL, NULL, NULL, {0, 128000}) = 0 (Timeout)
select(0, NULL, NULL, NULL, {0, 256000}) = 0 (Timeout)
select(0, NULL, NULL, NULL, {0, 512000}) = 0 (Timeout)
select(0, NULL, NULL, NULL, {1, 0})     = 0 (Timeout)
select(0, NULL, NULL, NULL, {1, 0})     = 0 (Timeout)
select(0, NULL, NULL, NULL, {1, 0})     = 0 (Timeout)
select(0, NULL, NULL, NULL, {1, 0})     = 0 (Timeout)
select(0, NULL, NULL, NULL, {1, 0})     = 0 (Timeout)
...

I hope this is useful,

Marcelo


Re: svn eats all memory, OOM killer kicks in

Posted by cm...@collab.net.
"Marcelo E. Magallon" <ma...@bigfoot.com> writes:

>  * I check in
> 
>     cd ../..
>     svn ci -m "Window Maker 0.64.0"
>     Adding ...
>     Adding ...
>     Adding ...
>     Transfering .................................................
>     .............................................................
>     [...]
>     ...................Killed

I would not be surprised if you just plain ran outta memory.  On the
very top of my current TODO list is the task of examining the
performance of our commit system, which I just rewrote a couple of
weeks ago with (admittedly) no real attention paid to performance.

>  * A second check-in attempt produces this:
> 
>     Adding    branches/upstream/util/wmsetup.c
> 
>     svn_error: #21068 : <File already exists in revision>
>       Commit failed (details follow):
> 
>     svn_error: #21068 : <File already exists in revision>
>       file already exists: filesystem `/home/marcelo/tmp/wmaker/svn/db', transaction `3', path `branches/upstream/util/wmsetup.c'

Hm...if the commit failed, then this shouldn't have happened.

>  At this point I assume the repository has been corrupted.  How can I
>  verify or deny this?

You could use the `svnadmin' and `svnlook' tools to see what your
youngest revision is, what paths exist in it, etc.  That would be
really useful information.
