Posted to users@subversion.apache.org by Matt Pounsett <ma...@cira.ca> on 2004/09/07 23:21:29 UTC
Lockups on large initial import (was: Re: Failures importing binaries)
Okay.. I finally got the time to look at this again, and here's what
I've got.
Running Joe Orton's subversion-1.0.6 and mod_dav_svn-1.0.6 RPMs on
RedHat Edge Server 3.x (fully updated). The repository in question is
accessed via https using Apache (RedHat's httpd-2.0.46-38.ent). The
https certificate is self-signed (probably irrelevant, but what the
hay). I've changed the default LANG environment variable from
en_US.UTF-8 to just en_US, and have removed the "AddDefaultCharset
UTF-8" which was set by default in the httpd.conf ... I did this to
remove problems we were having with ISO-8859-1 encoded files which our
web developer's tools output.
The web site in question is approximately 1.4GB of data spread over
~17000 files. It's a mixture of HTML, a small number of images
(decoration and buttons, nothing big), Word docs, PDFs and raw text.
The largest file is 18M, and the smallest is 0 bytes.
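For anyone comparing their own data set against these figures, the file count and size range can be gathered with standard tools. What follows is only a minimal sketch, assuming a POSIX shell with find, wc and awk available; the function name tree_stats is made up for illustration and is not part of Subversion.

```shell
# tree_stats DIR: print the number of regular files under DIR, their total
# size in bytes, and the smallest and largest file size.
# (Hypothetical helper for illustration only.)
tree_stats() {
    # Emit one byte count per file, then summarise the list with awk.
    find "$1" -type f -exec sh -c 'for f in "$@"; do wc -c < "$f"; done' sh {} + |
        awk 'NR == 1 { min = max = $1 + 0 }
             {
                 size = $1 + 0
                 total += size
                 if (size < min) min = size
                 if (size > max) max = size
             }
             END { printf "files=%d total=%.0f smallest=%.0f largest=%.0f\n", NR, total, min, max }'
}
```

Running it against the directory to be imported gives a single summary line that can be quoted in a report like the one above.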
I've tried doing my initial import using two basic methods. Both
methods fail consistently, but exactly how they fail seems to vary.
First:
svnadmin create
svn import
This method fails in one of two ways:
1) svn locks up. An strace I ran on one iteration showed it was
waiting for a write() call to complete. Apache logs a PUT and
PROPPATCH for the second-to-last file displayed by the svn client. In
this state, svn doesn't appear to respond to any keyboard input, and so
far after 20 minutes it hasn't timed out and died on its own. strace
shows it attempting to handle a ^C, but the sighandler doesn't cause it
to exit. So far I've only been able to get svn to exit by sending it a
SIGKILL.
2) svn dies with an error like the following:
svn: PUT of
/web-www/!svn/wrk/9384d977-87e3-0310-9c18-e8a479a54b6e/www/fr/webcast/
2002/2002.05.28/fcir020528-avs/msh-jm.htm: Could not read status line:
Connection reset by peer (https://svn.cira.ca)
Regardless of which way it dies, this method leaves a large number of
log.NNNN files in the db directory (anywhere from dozens to thousands,
depending on where in the import it died), and causes the repository to
be unusable. Subsequent attempts to access the repository for any type
of transaction result in a timeout after several minutes.. it must be
deleted and re-created with svnadmin.
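Since the hung client ignores ^C and does not time out by itself, unattended test runs can at least be bounded from the outside. The following is a minimal sketch, assuming GNU coreutils' timeout is installed; run_with_deadline is a hypothetical wrapper name and the 5-second grace period is an arbitrary choice.

```shell
# run_with_deadline SECONDS COMMAND [ARGS...]
# Hypothetical helper, not part of svn. timeout sends SIGTERM once SECONDS
# have elapsed and, because of -k 5, follows up with SIGKILL 5 seconds later
# if the process is still alive (as a client that ignores signals would be).
run_with_deadline() {
    limit=$1
    shift
    timeout -k 5 "$limit" "$@"
}
```

For example, run_with_deadline 1200 svn import <srcdir> <url> -m "initial import" would give up after 20 minutes; timeout reports an expired deadline with exit status 124. This only frees the terminal and does nothing for the wedged repository.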
Second:
svnadmin create
svn checkout
cp <orig_source> <workdir>
svn add *
svn commit
The commit makes it past all the "Adding.." lines and into Transmitting
file data. This runs for some period of time, then appears to hang for
a while (one strace showed it waiting on a select() call), and
eventually dies with an error like the following:
svn: Commit failed (details follow):
svn: At least one property change failed; repository is unchanged
The last entry logged by Apache in this case is a successful PUT.
I've also received a timeout message here.. though my tests today
haven't produced one so I haven't got a capture of the exact text.
This also appears to leave the database in an unusable state.
Sometimes this leaves only a single log.NNNN file, sometimes a couple
dozen.
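To put a number on the leftover log files after each failed run, something like the following could be used. It is a sketch only: count_bdb_logs is a hypothetical helper name, and it assumes the Berkeley DB log files sit in the db directory of the repository, as described above.

```shell
# count_bdb_logs REPO: count the log.* files in REPO/db.
# (Hypothetical helper for illustration only.)
count_bdb_logs() {
    repo=$1
    count=0
    for f in "$repo"/db/log.*; do
        # When nothing matches, the glob is left as a literal string,
        # so only count entries that really exist as files.
        [ -f "$f" ] && count=$((count + 1))
    done
    echo "$count"
}
```

Recording this count together with the last file name shown by the client would make it easier to compare failed runs with each other.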
Of note in all this is that I can regularly import this web site using
a file:// URL... however, putting the resulting repository behind
Apache and trying to do a checkout results in a similar set of
failures.
When the db gets locked up, I've tried running svnadmin recover on it,
and that locks up on an fcntl64() call on locks/db.lock.
Because this works fine using only svn with a file:// URL, and because
this web server allows me to read and write many large files using DAV,
I'm inclined to think the problem is in mod_dav_svn somewhere.
I'm at a bit of an impasse now. I can't think of anything else to try
or test. Is there any other information I can provide, or specific
troubleshooting tasks I can perform to help one of the developers track
this down?
Thanks (especially for reading all the way to here)..
Matt
Matt Pounsett                 CIRA - Canadian Internet Registration Authority
Technical Support Programmer  350 Sparks Street, Suite 1110
matt.pounsett@cira.ca         Ottawa, Ontario, Canada
613.237.5335 ext. 231         http://www.cira.ca
---------------------------------------------------------------------
To unsubscribe, e-mail: users-unsubscribe@subversion.tigris.org
For additional commands, e-mail: users-help@subversion.tigris.org
Re: Lockups on large initial import
Posted by Matt Pounsett <ma...@cira.ca>.
Another self-followup:
On Sep 10, 2004, at 21:00, Matt Pounsett wrote:
> I re-ran my tests today using svnserve instead of Apache, and had no
> problem. I moved back to Apache and got this during the Transferring
> Data phase:
> svn: Commit failed (details follow):
> svn: At least one property change failed; repository is unchanged
>
> ... and the db is now locked up.
>
> I ran your script over an http connection and got a failure as well.
>
> Are any of the developers watching this thread? This seems pretty
> consistently repeatable.. is there any further troubleshooting you'd
> like us to do to help track down the problem?
More info on this particular issue.
I managed to catch another "At least one property change failed" error
this morning, and here's what I found. It appears to me as if the
problem has something to do with mod_dav_svn failing to remove db locks
as Apache expires its children and HTTP requests get passed off to new
children.
After getting the error, I checked on my httpd processes, and found a
few of them waiting for futex calls to return. Now here's where it
gets interesting... (note for completeness that I'm the only one
accessing this httpd at the moment, and I'm only running one svn client
at a time):
% lsof __db*
COMMAND   PID USER   FD  TYPE DEVICE   SIZE    NODE NAME
httpd   12883 apache mem REG  9,0     16384 3597498 __db.005
httpd   12883 apache mem REG  9,0    917504 3597497 __db.004
httpd   12883 apache mem REG  9,0    327680 3597496 __db.003
httpd   12883 apache mem REG  9,0    278528 3597495 __db.002
httpd   12883 apache mem REG  9,0     16384 3597494 __db.001
httpd   12886 apache mem REG  9,0     16384 3597498 __db.005
httpd   12886 apache mem REG  9,0    917504 3597497 __db.004
httpd   12886 apache mem REG  9,0    327680 3597496 __db.003
httpd   12886 apache mem REG  9,0    278528 3597495 __db.002
httpd   12886 apache mem REG  9,0     16384 3597494 __db.001
httpd   12889 apache mem REG  9,0     16384 3597498 __db.005
httpd   12889 apache mem REG  9,0    917504 3597497 __db.004
httpd   12889 apache mem REG  9,0    327680 3597496 __db.003
httpd   12889 apache mem REG  9,0    278528 3597495 __db.002
httpd   12889 apache mem REG  9,0     16384 3597494 __db.001
% strace -p 12883 -p 12886 -p 12889
Process 12883 attached - interrupt to quit
Process 12886 attached - interrupt to quit
Process 12889 attached - interrupt to quit
[pid 12883] futex(0xb6eca2b0, FUTEX_WAIT, 2, NULL <unfinished ...>
[pid 12886] futex(0xb6f0d5c8, FUTEX_WAIT, 2, NULL <unfinished ...>
[pid 12889] futex(0xb6f0d5c8, FUTEX_WAIT, 2, NULL <unfinished ...>
Process 12883 detached
Process 12886 detached
Process 12889 detached
So the only httpd processes accessing the database are all waiting for
some other process to release a lock.
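The check above (which processes have the __db.* region files mapped, and which of them are parked in a futex wait) can be repeated mechanically on later failures. The following is a rough sketch that only filters text in the format shown above; db_region_pids and futex_waiters are hypothetical names, not existing tools.

```shell
# Both helpers are hypothetical; they only filter text read from stdin.

# List the unique PIDs that have Berkeley DB region files (__db.NNN) open,
# given lsof output on stdin (PID is column 2, NAME is the last column).
db_region_pids() {
    awk '$NF ~ /__db\.[0-9]+$/ { print $2 }' | sort -u
}

# List the unique PIDs that strace shows blocked in an unfinished futex wait,
# given lines such as "[pid 12883] futex(..., FUTEX_WAIT, ... <unfinished ...>".
futex_waiters() {
    awk '/futex\(.*FUTEX_WAIT.*<unfinished/ {
             pid = $2
             gsub(/[^0-9]/, "", pid)   # "12883]" -> "12883"
             print pid
         }' | sort -u
}
```

Typical use would be lsof __db* | db_region_pids in the db directory, and piping strace's output (which goes to stderr, so 2>&1 is needed) into futex_waiters. If both lists are identical, every process attached to the database is waiting on a lock, which is the situation described here.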
Matt Pounsett Canadian Internet Registration Authority
Technical Support Programmer 350 Sparks Street, Suite 1110
matt.pounsett@cira.ca Ottawa, Ontario, Canada
613.237.5335 ext. 231 http://www.cira.ca
Re: Lockups on large initial import
Posted by Matt Pounsett <ma...@cira.ca>.
On Sep 08, 2004, at 04:27, Alex R. Mosteo wrote:
> Matt Pounsett wrote:
>
> [big and interesting snip]
>
>> Because this works fine using only svn with a file:// URL, and
>> because this web server allows me to read and write many large files
>> using DAV, I'm inclined to think the problem is in mod_dav_svn
>> somewhere.
>> I'm at a bit of an impasse now. I can't think of anything else to
>> try or test. Is there any other information I can provide, or
>> specific troubleshooting tasks I can perform to help one of the
>> developers track this down?
>
> I agree with your conclusions, but must be noted that someone else
> posted the other day a description that could match this problem using
> svnserve. So maybe it is a race condition inside BDB or something
> else.
I re-ran my tests today using svnserve instead of Apache, and had no
problem. I moved back to Apache and got this during the Transferring
Data phase:
svn: Commit failed (details follow):
svn: At least one property change failed; repository is unchanged
... and the db is now locked up.
I ran your script over an http connection and got a failure as well.
Are any of the developers watching this thread? This seems pretty
consistently repeatable.. is there any further troubleshooting you'd
like us to do to help track down the problem?
Matt Pounsett                 CIRA - Canadian Internet Registration Authority
Technical Support Programmer  350 Sparks Street, Suite 1110
matt.pounsett@cira.ca         Ottawa, Ontario, Canada
613.237.5335 ext. 231         http://www.cira.ca
Re: Lockups on large initial import
Posted by "Alex R. Mosteo" <al...@mosteo.com>.
Alex R. Mosteo wrote:
> Lee Merrill wrote:
>
>> Just FYI, I am using RedHat 7.3, neon 0.24-7-1, Apache 2.0.50, and
>> Berkeley DB 4.2.52.NC.
>
>
> Thanks, Lee. We have several differences:
>
> svn 1.0.6 and BDB 4.1.25 on Mandrake 10
>
> These are the highest versions prepackaged for M10. It seems the culprit
> is one of these.
>
> If I get the time will upgrade manually to BDB 4.2 (after a dump ;) and
> retest. I'll inform if so.
Could this patch on the Berkeley DB homepage have something to do with it?
"Long-running applications can hang in the Berkeley DB cache.
Apply the following patch to the db-4.2.52 release."
I'll try with and without the patch.
Re: Lockups on large initial import
Posted by "Alex R. Mosteo" <al...@mosteo.com>.
Lee Merrill wrote:
> Just FYI, I am using RedHat 7.3, neon 0.24-7-1, Apache 2.0.50, and
> Berkeley DB 4.2.52.NC.
Thanks, Lee. We have several differences:
svn 1.0.6 and BDB 4.1.25 on Mandrake 10
These are the highest versions prepackaged for M10. It seems the culprit
is one of these.
If I get the time I will upgrade manually to BDB 4.2 (after a dump ;) and
retest. I'll report back if so.
Kind regards,
A. Mosteo.
Re: Lockups on large initial import
Posted by Lee Merrill <Le...@bustech.com>.
Hi Alex,
I tried your script with Subversion 1.1.0, and it worked fine with both
Berkeley DB and the new fsfs filesystem, using http access. So maybe this
has been fixed, either in Subversion or in the versions of
Apache/Berkeley/Linux/etc. that I am using.
Just FYI, I am using RedHat 7.3, neon 0.24-7-1, Apache 2.0.50, and
Berkeley DB 4.2.52.NC.
Lee
> My experience has been the same, with both failure types, also using
> Apache.
>
> I agree with your conclusions, but must be noted that someone else
> posted the other day a description that could match this problem using
> svnserve. So maybe it is a race condition inside BDB or something else.
>
> I wanted to do a test but was unable, maybe you want to try it: to use
> a fsfs repository.
>
>
--
+=========================================================
+ Lee Merrill lee@bustech.com 919-866-2008
+=========================================================
Unless otherwise stated, any views presented in this email are solely those of the author and do not necessarily represent those of the company.
Re: Lockups on large initial import
Posted by "Alex R. Mosteo" <al...@mosteo.com>.
Matt Pounsett wrote:
[big and interesting snip]
> Because this works fine using only svn with a file:// URL, and because
> this web server allows me to read and write many large files using DAV,
> I'm inclined to think the problem is in mod_dav_svn somewhere.
>
> I'm at a bit of an impasse now. I can't think of anything else to try
> or test. Is there any other information I can provide, or specific
> troubleshooting tasks I can perform to help one of the developers track
> this down?
Thanks Matt for putting it together so clearly. My experience has been
the same, with both failure types, also using Apache.
I agree with your conclusions, but it must be noted that someone else
posted a description the other day that could match this problem using
svnserve. So maybe it is a race condition inside BDB or something else.
There is one test I wanted to do but was unable to, so maybe you want to
try it: use a fsfs repository. I've been unable to compile the latest RC,
so I'm stuck on 1.0.6 until an .rpm is available. If you do this test,
please report on it.
I'll repost my test script in case someone else wants to stress their
configuration. I suppose that with minor changes it could be used with
svnserve or directly using file://...
(Note that it needs plenty of disk space available).
----8<-----------
#!/bin/sh

# test_data is a folder with randomly created files
rm -rf test_data
mkdir test_data

filesize=10000 # Adjust this file size to vary the test
x=$((100000000 / filesize)) # number of files, about 100 MB in total

echo "Creating binaries..."
# (( )) is a bash extension, so use a POSIX test to keep /bin/sh working
while [ "$x" -gt 0 ]; do
    head -c "$filesize" /dev/urandom > "test_data/random-$x.bin"
    x=$((x - 1))
    echo "Remaining to be created: $x..."
done

# svn-test is the folder with the repository
sudo rm -rf svn-test
sudo mkdir svn-test
sudo svnadmin create svn-test
sudo chown -R nobody:nobody svn-test # nobody is my apache user

echo "Beginning import..."
svn import test_data http://your.server.here/svn-test/test_data -m ""

echo "Verifying..."
sudo svnadmin verify svn-test

echo "Done."
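Adapting the script to svnserve or file://, as suggested before the script, mostly comes down to changing the URL given to svn import. The following is a rough sketch under stated assumptions: import_url is a hypothetical helper, the svn:// form assumes svnserve exposes the repository under the same name as Apache does, and the file:// form needs an absolute repository path.

```shell
# import_url SCHEME HOST REPO: print the import URL for the test_data folder.
# (Hypothetical helper; HOST is ignored for file://, where REPO must be an
# absolute path such as /srv/svn-test.)
import_url() {
    scheme=$1
    host=$2
    repo=$3
    case $scheme in
        http|https|svn)
            echo "$scheme://$host/$repo/test_data" ;;
        file)
            echo "file://$repo/test_data" ;;
        *)
            echo "unsupported scheme: $scheme" >&2
            return 1 ;;
    esac
}
```

With this in place the import line could read svn import test_data "$(import_url file "" "$PWD/svn-test")" -m "", which makes it easy to run the same data set through each access method and see which ones fail.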