Posted to dev@subversion.apache.org by Greg Stein <gs...@lyra.org> on 2001/08/29 04:44:43 UTC

latest status

The Karl/Mike fix for svndiff stream parsing seemed to have done the trick.
While they reported "it didn't", my testing has shown that it *did*. The
repro script that I posted a little while ago now succeeds.

Given that, I began running the mass-commit script again. We still have the
two remaining problems:

* WC locks bugging up the commits
* the DB_INCOMPLETE thang

The latter is actually posing a significant problem now. I kept running out
of disk space on my box (meager disk for the moment). So I went and started
tossing the DB logs while the script was running. But it didn't seem to
help...

I also noticed the Apache memory slowly increasing. Dunno if it was simply
representing the high-water mark of memory usage, or if there is an actual
problem in there, but it was there.

That led me to investigate Apache's memory usage a bit. I figured maybe it
was opening some anonymous memory maps, thus consuming disk space, and also
reflecting in the memory page. Well... a simple "ls -l /proc/PID#/fd" ought
to show me the mmaps.

Oops! What I found instead was a number of file descriptors pointing to the
DB files. These leaked file descriptors also referenced the log files which
I was trying to delete... Of course, the system still referred to them, so
they were still using disk space.
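
The effect described above (a deleted file that keeps consuming disk space while a descriptor is still open) can be reproduced with a small POSIX sketch. This is only an illustration, not code from Subversion or Apache; the helper names open_then_unlink and size_behind_fd and the /tmp path are made up for the example.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: create a temporary file, write nbytes of zeros,
   then unlink it while keeping the descriptor open.
   Returns the open descriptor, or -1 on failure. */
int open_then_unlink(size_t nbytes)
{
    char path[] = "/tmp/leakdemoXXXXXX";
    char buf[4096] = {0};
    size_t left = nbytes;
    int fd = mkstemp(path);

    if (fd < 0)
        return -1;

    while (left > 0) {
        size_t chunk = left < sizeof buf ? left : sizeof buf;
        ssize_t written = write(fd, buf, chunk);
        if (written < 0) {
            close(fd);
            unlink(path);
            return -1;
        }
        left -= (size_t)written;
    }

    /* The name disappears here, but the data stays allocated
       for as long as fd remains open. */
    if (unlink(path) != 0) {
        close(fd);
        return -1;
    }
    return fd;
}

/* Hypothetical helper: report the size of the file behind fd and store
   its link count in *nlink. Returns -1 on failure. */
long long size_behind_fd(int fd, long *nlink)
{
    struct stat st;

    if (fstat(fd, &st) != 0)
        return -1;
    if (nlink != NULL)
        *nlink = (long)st.st_nlink;
    return (long long)st.st_size;
}
```

After open_then_unlink(8192), size_behind_fd still reports the full size with a link count of zero; the space is only released once the descriptor is closed, which matches the logs going "poof" after the graceful restart.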

[ the neat part is that I "gracefully" restarted Apache in the middle of the
  mass-commit. it tossed all those file descriptors and the mass-commit
  didn't even flinch while Apache was restarting (actually, clients never
  see apache go down when it does a graceful restart, so I wasn't
  surprised... but cool nonetheless) ]

After the graceful restart, the logs went "poof" and the disk space was back
to normal.

I am now suspecting the leaked descriptors are because we bail out during
the DB->close routines. When one of them returns DB_INCOMPLETE, I think the
logic just bails out (gotta go look).

Regardless of what is happening, we have to get those descriptors closed. We
can't be leaking descriptors on the server like that. So... that is my
current task.

Does anybody have any particular information on receiving DB_INCOMPLETE from
a db->close operation? When we get that from txn_checkpoint(), we sleep for
a second and then try again. Should we do the same for a close? Should we
simply ignore it and move along?
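
For reference, the sleep-and-retry approach mentioned for txn_checkpoint() could look roughly like the sketch below if it were applied to some other operation. It does not call Berkeley DB: DEMO_INCOMPLETE, demo_op_fn, retry_while_incomplete and demo_busy_op are all hypothetical stand-ins, and the retry limit is an assumption that the description above does not mention.

```c
#include <unistd.h>

/* Hypothetical stand-in for an "incomplete, try again" status;
   this is NOT the real value of DB_INCOMPLETE. */
#define DEMO_INCOMPLETE (-1000)

/* Hypothetical operation type: returns 0 on success, DEMO_INCOMPLETE
   when it could not finish, or another nonzero value on a real error. */
typedef int (*demo_op_fn)(void *baton);

/* Run op until it reports something other than DEMO_INCOMPLETE,
   sleeping delay_seconds between attempts and giving up after
   max_tries attempts. Returns the last status seen. */
int retry_while_incomplete(demo_op_fn op, void *baton,
                           int max_tries, unsigned delay_seconds)
{
    int rc = DEMO_INCOMPLETE;
    int attempt;

    for (attempt = 0; attempt < max_tries; attempt++) {
        rc = op(baton);
        if (rc != DEMO_INCOMPLETE)
            return rc;              /* success or a real error */
        if (attempt + 1 < max_tries)
            sleep(delay_seconds);   /* give the other writer time */
    }
    return rc;
}

/* Fake operation for demonstration: reports DEMO_INCOMPLETE until the
   counter behind baton reaches zero, then succeeds. */
int demo_busy_op(void *baton)
{
    int *remaining = baton;

    if (*remaining > 0) {
        (*remaining)--;
        return DEMO_INCOMPLETE;
    }
    return 0;
}
```

Whether retrying is appropriate for a close is exactly the open question here; a handle that has already been released cannot simply be closed a second time, so this pattern may only make sense for operations that are safe to repeat.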

I'll be trying some of these things, but hints from other people would give
me a nice head start.

Cheers,
-g

-- 
Greg Stein, http://www.lyra.org/

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org

Re: latest status

Posted by Greg Stein <gs...@lyra.org>.
On Wed, Aug 29, 2001 at 12:56:20PM +0200, Sander Striker wrote:
> > On Tue, Aug 28, 2001 at 09:44:43PM -0700, Greg Stein wrote:
>...
> > Thus: simultaneous access is causing DB_INCOMPLETE (not a surprise, given
> > the Berkeley docs), the DB_INCOMPLETE is messing up our close, and we are
> > leaking.
> 
> Yes, that is what is happening.  The svn_fs_close_fs ends up calling
> cleanup_fs_apr (via the apr_pool_destroy of the fs pool).  This in turn
> calls cleanup_fs.  This call has a series of SVN_ERR wrapped db->close
> calls.  Of course, when another thread is writing to the db, DB_INCOMPLETE
> is returned and cleanup_fs bails out.

Right.

> The docs state DB_INCOMPLETE can be safely ignored in db->close and
> db->sync.  These urls might be helpful (and important regarding corruption/
> recovery):
> http://www.sleepycat.com/docs/ref/program/errorret.html#DB_INCOMPLETE
> http://www.sleepycat.com/docs/api_c/db_close.html

Seen 'em long ago :-)

> With this in mind there are 2 solutions for the problem.  The first is
> to tell the db not to flush (so no DB_INCOMPLETE can occur).  The second
> is to _try_ to flush but ignore DB_INCOMPLETE when flushing doesn't work.

We'll take the second. We do use transactions, so even if we didn't sync, we
wouldn't lose any information in the event of a crash. However, recovery
will be easier if we try to sync changes.

Cheers,
-g

-- 
Greg Stein, http://www.lyra.org/


Re: [SVN-DEV] Re: latest status

Posted by kf...@collab.net.
"C. Scott Ananian" <ca...@lesser-magoo.lcs.mit.edu> writes:
> incidentally: is cvs2svn done? (didn't appear so from contents of
> tools/cvs2svn)  and if not, how is svn planning on maintaining change
> history when shifting to self-hosting?

cvs2svn is not done.  It's a pretty big task (think branches).  Though
Greg Stein got a good start on it, we decided it was too much of a
momentum sink to do in parallel with all the M3 work.

We're not porting the history over.  We're just leaving the old CVS
repository up so people can examine the history, and posting on the
web site a generated ChangeLog covering all of the CVS history.

Every project deserves a fresh start, right? :-)

-Karl


Re: [SVN-DEV] Re: latest status

Posted by cm...@collab.net.
"C. Scott Ananian" <ca...@lesser-magoo.lcs.mit.edu> writes:

> incidentally: is cvs2svn done? (didn't appear so from contents of
> tools/cvs2svn)  and if not, how is svn planning on maintaining change
> history when shifting to self-hosting?

The short answers:
   no, and
   we're not.

:-)

We debated this for quite some time, and decided to let CVS history
live forever in CVS history, and begin afresh when we self-host with
SVN history.


Re: [SVN-DEV] Re: latest status

Posted by "C. Scott Ananian" <ca...@lesser-magoo.lcs.mit.edu>.
On Wed, 29 Aug 2001, Greg Stein wrote:

> More and varied testing is, of course, still called for before we shift to
> self-hosting. At this point, we do not have any known showstoppers. That
> means we have 5 left ;-)

incidentally: is cvs2svn done? (didn't appear so from contents of
tools/cvs2svn)  and if not, how is svn planning on maintaining change
history when shifting to self-hosting?
 --s

Chechnya shotgun Flintlock operative strategic COBRA JUDY non-violent protest 
class struggle Khaddafi operation Pakistan Ft. Meade Delta Force Morwenstow 
              ( http://lesser-magoo.lcs.mit.edu/~cananian )
 --
 "These students are going to have to find out what law and order is
 all about."  -- Brig. General Robert Canterbury, Noon, May 4, 1970,
 minutes before his troops shot 13 unarmed Kent State students, killing 4.
 --
            [http://www.cs.cmu.edu/~dst/DeCSS/Gallery/]
#!/usr/bin/perl -w
# 526-byte qrpff, Keith Winstein and Marc Horowitz <si...@mit.edu>
# MPEG 2 PS VOB file on stdin -> descrambled output on stdout
# arguments: title key bytes in least to most-significant order
$_='while(read+STDIN,$_,2048){$a=29;$c=142;if((@a=unx"C*",$_)[20]&48){$h=5;
$_=unxb24,join"",@b=map{xB8,unxb8,chr($_^$a[--$h+84])}@ARGV;s/...$/1$&/;$d=
unxV,xb25,$_;$b=73;$e=256|(ord$b[4])<<9|ord$b[3];$d=$d>>8^($f=($t=255)&($d
>>12^$d>>4^$d^$d/8))<<17,$e=$e>>8^($t&($g=($q=$e>>14&7^$e)^$q*8^$q<<6))<<9
,$_=(map{$_%16or$t^=$c^=($m=(11,10,116,100,11,122,20,100)[$_/16%8])&110;$t
^=(72,@z=(64,72,$a^=12*($_%16-2?0:$m&17)),$b^=$_%64?12:0,@z)[$_%8]}(16..271))
[$_]^(($h>>=8)+=$f+(~$g&$t))for@a[128..$#a]}print+x"C*",@a}';s/x/pack+/g;eval



RE: latest status

Posted by Sander Striker <st...@apache.org>.
Hi,

> On Tue, Aug 28, 2001 at 09:44:43PM -0700, Greg Stein wrote:
>>...
>> Given that, I began running the mass-commit script again. We
>> still have the two remaining problems:
>>
>> * WC locks bugging up the commits
>> * the DB_INCOMPLETE thang
>>
>> The latter is actually posing a significant problem now. I kept
>> running out
>>...
>> Oops! What I found instead was a number of file descriptors
>> pointing to the DB files. These leaked file descriptors also
>> referenced the log files which I was trying to delete... Of
>> course, the system still referred to them, so they were still
>> using disk space.
>>...
>> I am now suspecting the leaked descriptors are because we bail
>> out during the DB->close routines. When one of them returns
>> DB_INCOMPLETE, I think the logic just bails out (gotta go look).
>>
>> Regardless of what is happening, we have to get those descriptors
>> closed. We can't be leaking descriptors on the server like that.
>> So... that is my current task.
>
> I just added a "sleep 1" after each commit in the mass-commit script. This
> prevented all occurrences of the DB_INCOMPLETE error. Further, it prevented
> the descriptor leak.
>
> [ basically, it gave a pause to let Apache clean up the previous request
>   before processing the next commit request ]
>
> Thus: simultaneous access is causing DB_INCOMPLETE (not a surprise, given
> the Berkeley docs), the DB_INCOMPLETE is messing up our close, and we are
> leaking.

Yes, that is what is happening.  The svn_fs_close_fs ends up calling
cleanup_fs_apr (via the apr_pool_destroy of the fs pool).  This in turn
calls cleanup_fs.  This call has a series of SVN_ERR wrapped db->close
calls.  Of course, when another thread is writing to the db, DB_INCOMPLETE
is returned and cleanup_fs bails out.

The docs state DB_INCOMPLETE can be safely ignored in db->close and
db->sync.  These urls might be helpful (and important regarding corruption/
recovery):
http://www.sleepycat.com/docs/ref/program/errorret.html#DB_INCOMPLETE
http://www.sleepycat.com/docs/api_c/db_close.html

With this in mind there are 2 solutions for the problem.  The first is
to tell the db not to flush (so no DB_INCOMPLETE can occur).  The second
is to _try_ to flush but ignore DB_INCOMPLETE when flushing doesn't work.

Solution 1:
--- libsvn_fs/fs.c      Tue Aug 28 13:08:41 2001
+++ libsvn_fs/fs.c~     Wed Aug 29 12:40:24 2001
@@ -91,7 +91,7 @@
       char *msg = apr_psprintf (fs->pool, "closing `%s' database", name);

       *db_ptr = 0;
-      SVN_ERR (DB_WRAP (fs, msg, db->close (db, 0)));
+      SVN_ERR (DB_WRAP (fs, msg, db->close (db, DB_NOSYNC)));
     }

   return SVN_NO_ERROR;

Solution 2:
--- libsvn_fs/fs.c      Tue Aug 28 13:08:41 2001
+++ libsvn_fs/fs.c~     Wed Aug 29 12:28:13 2001
@@ -89,9 +89,18 @@
     {
       DB *db = *db_ptr;
       char *msg = apr_psprintf (fs->pool, "closing `%s' database", name);
+      int db_err;

       *db_ptr = 0;
-      SVN_ERR (DB_WRAP (fs, msg, db->close (db, 0)));
+      db_err = db->close(db, 0);
+
+      /* According to the Berkeley documentation it is safe to
+         ignore DB_INCOMPLETE on db->close and db->sync.
+      */
+      if (db_err == DB_INCOMPLETE)
+          db_err = 0;
+
+      SVN_ERR (DB_WRAP (fs, msg, db_err));
     }

   return SVN_NO_ERROR;
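
Both patches touch a single close call. As a rough sketch of the second approach applied across several handles, the code below closes every handle, treats the "incomplete" status as success, and reports the first real error only after all handles have been released. It does not use Berkeley DB or the Subversion sources: demo_handle, demo_close, close_all_handles and DEMO_INCOMPLETE are hypothetical names, and closing the remaining handles after a real error is an assumption that goes beyond what either patch does.

```c
#include <stddef.h>

/* Hypothetical stand-in for an "incomplete flush" status;
   this is NOT the real value of DB_INCOMPLETE. */
#define DEMO_INCOMPLETE (-1000)

/* Hypothetical handle: close_result is what the fake close reports. */
typedef struct demo_handle {
    int is_open;
    int close_result;
} demo_handle;

/* Fake close: the handle is released whatever status is returned. */
int demo_close(demo_handle *h)
{
    h->is_open = 0;
    return h->close_result;
}

/* Close every open handle instead of returning at the first failure.
   DEMO_INCOMPLETE is ignored; the first other nonzero status is
   remembered and returned once all handles have been closed. */
int close_all_handles(demo_handle *handles, size_t count)
{
    int first_err = 0;
    size_t i;

    for (i = 0; i < count; i++) {
        int rc;

        if (!handles[i].is_open)
            continue;

        rc = demo_close(&handles[i]);
        if (rc == DEMO_INCOMPLETE)
            rc = 0;                 /* flush did not finish; not fatal */
        if (rc != 0 && first_err == 0)
            first_err = rc;         /* keep going, report it later */
    }
    return first_err;
}
```

The point of the loop is that an early return on any one status leaves the later handles open, which is the descriptor leak seen during the mass-commit run.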

> While I'm going to look into this tomorrow, I'll note that it is not really
> a showstopper for M3. The database is not damaged, content is fine, and
> further processing is fine. While we may leak descriptors and hold open
> logs, it is very easy to have Apache recycle the processes to clean these
> up. I *do* expect to get the close mostly fixed up, but I also want to
> ensure that people have it in the proper perspective.
>
>
> More and varied testing is, of course, still called for before we shift to
> self-hosting. At this point, we do not have any known showstoppers. That
> means we have 5 left ;-)
>
> Cheers,
> -g

Keep it up, M3 seems really close now :)

Sander



Re: latest status

Posted by Greg Stein <gs...@lyra.org>.
On Tue, Aug 28, 2001 at 09:44:43PM -0700, Greg Stein wrote:
>...
> Given that, I began running the mass-commit script again. We still have the
> two remaining problems:
> 
> * WC locks bugging up the commits
> * the DB_INCOMPLETE thang
> 
> The latter is actually posing a significant problem now. I kept running out
>...
> Oops! What I found instead was a number of file descriptors pointing to the
> DB files. These leaked file descriptors also referenced the log files which
> I was trying to delete... Of course, the system still referred to them, so
> they were still using disk space.
>...
> I am now suspecting the leaked descriptors are because we bail out during
> the DB->close routines. When one of them returns DB_INCOMPLETE, I think the
> logic just bails out (gotta go look).
> 
> Regardless of what is happening, we have to get those descriptors closed. We
> can't be leaking descriptors on the server like that. So... that is my
> current task.

I just added a "sleep 1" after each commit in the mass-commit script. This
prevented all occurrences of the DB_INCOMPLETE error. Further, it prevented
the descriptor leak.

[ basically, it gave a pause to let Apache clean up the previous request
  before processing the next commit request ]

Thus: simultaneous access is causing DB_INCOMPLETE (not a surprise, given
the Berkeley docs), the DB_INCOMPLETE is messing up our close, and we are
leaking.


While I'm going to look into this tomorrow, I'll note that it is not really
a showstopper for M3. The database is not damaged, content is fine, and
further processing is fine. While we may leak descriptors and hold open
logs, it is very easy to have Apache recycle the processes to clean these
up. I *do* expect to get the close mostly fixed up, but I also want to
ensure that people have it in the proper perspective.


More and varied testing is, of course, still called for before we shift to
self-hosting. At this point, we do not have any known showstoppers. That
means we have 5 left ;-)

Cheers,
-g

-- 
Greg Stein, http://www.lyra.org/
