Posted to dev@subversion.apache.org by Rick Jones <ri...@hp.com> on 2007/08/13 20:59:24 UTC

lstat after rmdir bug?

It would seem I've had a few cases where 1.4.2 will attempt to lstat a 
directory after it has rmdir'ed that directory.  These seem to correlate 
completely with error messages such as:

raj@tardy:~/netperf2_trunk$ svn commit -m "initial stab at measure CPU 
but only confidence on result change
 > "
Sending        src/netlib.c
Sending        src/netsh.c
Sending        src/netsh.h
Sending        src/nettest_bsd.c
Transmitting file data ....svn: Commit failed (details follow):
svn: MERGE request failed on '/svn/netperf2/trunk/src'
svn: Can't read directory '/svn/netperf2/db/transactions/127-1.txn': 
Partial results are valid but processing is incomplete

reported by the client (also 1.4.2) on a commit.  In this case the
server is PA-RISC Debian "testing" and the client is x86 Debian testing.
I have tried to compile the 1.4.4 bits from unstable, but on PA-RISC
that Debian package will not compile.  Of course I've no idea if what
I'm seeing is something already fixed in 1.4.4 - my perusal of the
release notes, while providing some intriguing entries, found nothing
that appeared to be an exact match.

It appears that this happens most readily with a "large" commit rather
than a "small" commit, where small is, say, a one-liner into something like:

http://www.netperf.org/svn/sandbox/trunk/foo

This has had a discussion (of sorts) over in "users" with the title:

"Can't read directory" - what is the way out?

That thread begins in the archives at:

http://subversion.tigris.org/servlets/ReadMsg?list=users&msgNo=68894

I'm left wondering if what I've encountered is a server bug, or 
something else.  Your thoughts on that prior to my trying to file a bug 
would be great.

rick jones

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org

Re: lstat after rmdir bug?

Posted by Rick Jones <ri...@hp.com>.
Thanks for taking the closer look at the trace.  WRT your questions:

>  - Server's distro, libc version, apr version, subversion version
>    (and whether each is a vanilla or distro-provided version).

Debian:

www:/svn/netperf2# cat /etc/debian_version
lenny/sid

At the time the traces were taken, everything was from "testing", which had svn
1.4.2mumble.  To see whether it was an already-fixed problem, a bit of "unstable"
was brought in to get 1.4.4.  Even after 1.4.4mumble was installed, the problem
persists.

Everything is distro-provided.  Here are the apache bits:

www:/etc/apache2/mods-enabled# dpkg -l | grep -i apache
ii  apache2                        2.2.3-5                             Next generation, scalable, extendable web se
rc  apache2-common                 2.0.54-5                            next generation, scalable, extendable web se
ii  apache2-mpm-worker             2.2.3-5                             High speed threaded model for Apache HTTPD
ii  apache2-threaded-dev           2.2.3-5                             development headers for apache2
ii  apache2-utils                  2.2.3-5                             utility programs for webservers
ii  apache2.2-common               2.2.3-5                             Next generation, scalable, extendable web se
rc  libapache2-mod-mime-xattr      0.3-2                               Apache2 module to get MIME info from filesys
ii  libapache2-svn                 1.4.2dfsg1-2                        Subversion server modules for Apache
ii  libapr0                        2.0.55-4                            the Apache Portable Runtime
ii  libapr1                        1.2.7-8.2                           The Apache Portable Runtime Library
ii  libapr1-dev                    1.2.7-8.2                           The Apache Portable Runtime Library - Develo
ii  libaprutil1                    1.2.7+dfsg-2+b1                     The Apache Portable Runtime Utility Library
ii  libaprutil1-dev                1.2.7+dfsg-2+b1                     The Apache Portable Runtime Utility Library

and the libc bits:

www:/etc/apache2/mods-enabled# dpkg -l | grep -i libc
ii  glibc-doc                      2.6-2                               GNU C Library: Documentation
ii  libc6                          2.6-2                               GNU C Library: Shared libraries
ii  libc6-dbg                      2.6-2                               GNU C Library: Libraries with debugging symb
ii  libc6-dev                      2.6-2                               GNU C Library: Development Libraries and Hea

and the subversion bits:

www:/etc/apache2/mods-enabled# dpkg -l | grep -i subv
ii  libapache2-svn                 1.4.2dfsg1-2                        Subversion server modules for Apache
ii  libsvn0                        1.2.3dfsg1-3                        shared libraries used by Subversion (aka. sv
ii  libsvn1                        1.4.4dfsg1-1                        Shared libraries used by Subversion
ii  python2.3-subversion           1.2.3dfsg1-3                        python modules for interfacing with Subversi
ii  subversion                     1.4.4dfsg1-1                        Advanced version control system
ii  subversion-tools               1.4.2dfsg1-2                        Assorted tools related to Subversion


>  - Which apache modules the server is running.

I believe it would be these:

www:/etc/apache2/mods-enabled# ls
alias.load            autoindex.conf  dav_fs.load   negotiation.load
auth_basic.load       autoindex.load  dav_svn.conf  setenvif.load
authn_file.load       cgi.load        dav_svn.load  status.load
authz_default.load    cgid.conf       dir.conf      userdir.conf
authz_groupfile.load  cgid.load       dir.load      userdir.load
authz_host.load       dav.load        env.load
authz_user.load       dav_fs.conf     mim.load

>  - Whether the error is consistently repeatable, and whether there's any
>    pattern to the failures, or whether everyone's seeing them.

It is consistently repeatable in the netperf2 and netperf4 repositories.  It
does not appear to be _consistently_ repeatable in the sandbox repository.  For
example, I am able to update "foo" in sandbox with some one-liner changes, but
when I tried to add those traces, I got the error.

thanks,

rick

> 
> Regards,
> Malcolm


Re: lstat after rmdir bug?

Posted by Malcolm Rowe <ma...@farside.org.uk>.
On Fri, Aug 17, 2007 at 06:34:33PM +0100, Malcolm Rowe wrote:
> Hi Rick,
> 
> On Mon, Aug 13, 2007 at 01:59:24PM -0700, Rick Jones wrote:
> >  It would seem I've had a few cases where 1.4.2 will attempt to lstat a 
> >  Transmitting file data ....svn: Commit failed (details follow):
> >  svn: MERGE request failed on '/svn/netperf2/trunk/src'
> >  svn: Can't read directory '/svn/netperf2/db/transactions/127-1.txn': Partial 
> >  results are valid but processing is incomplete
> > 
> >  reported by the client (also 1.4.2) on a commit.  In this case the server is 
> >  PA-RISC Debian "testing" and the client is x86 Debian testing.    I have 

Looks like this might be this problem:
http://www.mail-archive.com/debian-hppa@lists.debian.org/msg05468.html
(pointing to a problem in either glibc or the kernel).

Unfortunately I can't work out from the thread referenced whether the
patch is correct or has made it anywhere - it seemed to devolve into a
discussion about glibc build problems.

Regards,
Malcolm

Re: lstat after rmdir bug?

Posted by Malcolm Rowe <ma...@farside.org.uk>.
Hi Rick,

On Mon, Aug 13, 2007 at 01:59:24PM -0700, Rick Jones wrote:
>  It would seem I've had a few cases where 1.4.2 will attempt to lstat a 
>  directory after it has rmdir'ed that directory.  These seem to correlate 
>  completely with error messages such as:
> 
>  raj@tardy:~/netperf2_trunk$ svn commit -m "initial stab at measure CPU but 
>  only confidence on result change
>  > "
>  Sending        src/netlib.c
>  Sending        src/netsh.c
>  Sending        src/netsh.h
>  Sending        src/nettest_bsd.c
>  Transmitting file data ....svn: Commit failed (details follow):
>  svn: MERGE request failed on '/svn/netperf2/trunk/src'
>  svn: Can't read directory '/svn/netperf2/db/transactions/127-1.txn': Partial 
>  results are valid but processing is incomplete
> 
>  reported by the client (also 1.4.2) on a commit.  In this case the server is 
>  PA-RISC Debian "testing" and the client is x86 Debian testing.    I have 
>  tried to compile the 1.4.4 bits from unstable, but on PA-RISC that Debian 
>  package will not compile.  Of course I've no idea if what I'm seeing is 
>  something already fixed in 1.4.4 - my perusal of the release notes, while 
>  providing some intriguing entries, found nothing that appeared to be an 
>  exact match.
> 

Preethi (cc'd) sent me an strace off-list, and David Anderson and I took
a look at it this morning.

So, the main problem is this part of the strace.  This is right at the
end of the commit process, just as we're removing the old transaction
directory:

getdents(13, {{d_ino=1355922, d_off=12, d_reclen=12, d_name="."}
{d_ino=1254356, d_off=24, d_reclen=16, d_name=".."} {d_ino=1355964,
d_off=52, d_reclen=20, d_name="node.0.0"} {d_ino=1356276, d_off=68,
d_reclen=20, d_name="rev-lock"} {d_ino=1356278, d_off=84, d_reclen=20,
d_name="changes"} {d_ino=1356280, d_off=116, d_reclen=20,
d_name="next-ids"} {d_ino=1356282, d_off=132, d_reclen=20,
d_name="node.3.0"} {d_ino=1356283, d_off=160, d_reclen=28,
d_name="node.0.0.children"} {d_ino=1356284, d_off=180, d_reclen=20,
d_name="node.1t.0"} {d_ino=1356285, d_off=4096, d_reclen=28,
d_name="node.3.0.children"}}, 4096) = 204
lstat64("/svn/netperf2/db/transactions/131-1.txn/.", ...) = 0
lstat64("/svn/netperf2/db/transactions/131-1.txn/..", ...) = 0
lstat64("/svn/netperf2/db/transactions/131-1.txn/node.0.0", ...) = 0
unlink("/svn/netperf2/db/transactions/131-1.txn/node.0.0") = 0
  [ ditto for rev-lock, changes, next-ids, node.3.0 ]
lstat64("/svn/netperf2/db/transactions/131-1.txn/node.0.0.children",
...) = 0
unlink("/svn/netperf2/db/transactions/131-1.txn/node.0.0.children") = 0
lstat64("/svn/netperf2/db/transactions/131-1.txn/node.1t.0children",
0x419dbe88) = -1 ENOENT (No such file or directory)

That last filename should be "node.1t.0", as you can see from the
results of the getdents() syscall.  Since APR just concatenates the
dirname and dentry name it got from readdir() and passes it to lstat(),
I'm failing to see where the problem could be coming from - unless
there's a bug in PA-RISC's APR or libc.
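
As an aside on the mangled name: "node.1t.0children" has exactly the
length of the previous entry, "node.0.0.children", and reads like
"node.1t.0" written over the front of it with no terminating NUL.  The C
sketch below only demonstrates that string arithmetic.  The helper names
are hypothetical, and whether anything like this actually happens in
libc, the kernel or APR on PA-RISC is an assumption, not something the
strace establishes.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copy `name` into `buf` WITHOUT writing the
 * terminating NUL, so whatever followed in `buf` stays in place. */
void overlay_without_nul(char *buf, size_t bufsize, const char *name)
{
    size_t len = strlen(name);
    assert(len < bufsize);
    memcpy(buf, name, len);   /* deliberately no buf[len] = '\0' */
}

/* Hypothetical demo: start from the previous entry's name and overlay
 * the next entry's name on top of it.  The result is left in `out`. */
void demo_stale_name(char *out, size_t outsize)
{
    assert(outsize >= sizeof "node.0.0.children");
    strcpy(out, "node.0.0.children");               /* previous dentry name */
    overlay_without_nul(out, outsize, "node.1t.0"); /* next dentry name */
}
```

After demo_stale_name() the buffer holds "node.1t.0children", the same
string that appears in the failing lstat64() call.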

After we get APR_INCOMPLETE from apr_dir_read(), we return from the
commit with that error, abort the transaction (which successfully -- at
least, in the strace I saw -- deletes all the remaining files and the
dir) and finally we return the APR_INCOMPLETE error back to the client.

The client responds with a DELETE of the activity, which is the part
that's doing the final lstat() of the (now non-existent) transaction
directory -- but that error is expected and ignored.

I'm a bit stuck at this point, but there are some things that it would
be useful to find out:

 - Server's distro, libc version, apr version, subversion version
   (and whether each is a vanilla or distro-provided version).
 - Which apache modules the server is running.
 - Whether the error is consistently repeatable, and whether there's any
   pattern to the failures, or whether everyone's seeing them.
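
One more data point that might help separate libc from APR and
Subversion: the readdir()-then-lstat() sequence from the strace can be
exercised on its own.  The following is a minimal sketch using plain
POSIX calls rather than APR.  The function name is hypothetical, and it
deliberately leaves out the unlink() calls that are interleaved in the
real trace, so a clean run would not prove the problem is absent.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <dirent.h>
#include <errno.h>
#include <limits.h>
#include <stdio.h>
#include <string.h>
#include <sys/stat.h>

/* Hypothetical helper: walk `dirpath` with readdir(), build "dir/name"
 * for each entry and lstat() it.  Returns the number of entries whose
 * lstat() failed, or -1 if the directory could not be opened. */
int count_lstat_failures(const char *dirpath)
{
    assert(dirpath != NULL);

    DIR *dir = opendir(dirpath);
    if (dir == NULL)
        return -1;

    int failures = 0;
    struct dirent *entry;
    while ((entry = readdir(dir)) != NULL) {
        char path[PATH_MAX];
        int n = snprintf(path, sizeof path, "%s/%s", dirpath, entry->d_name);
        if (n < 0 || (size_t)n >= sizeof path) {
            fprintf(stderr, "path too long: %s/%s\n", dirpath, entry->d_name);
            failures++;
            continue;
        }

        struct stat st;
        if (lstat(path, &st) != 0) {
            /* An ENOENT here for a name readdir() just returned would be
             * the same symptom as in the strace. */
            fprintf(stderr, "lstat(\"%s\") failed: %s\n", path, strerror(errno));
            failures++;
        }
    }

    closedir(dir);
    return failures;
}
```

Called from a small main() on a quiet directory on the affected server,
a non-zero return would suggest the name mangling happens below
Subversion; a zero return is inconclusive.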

Regards,
Malcolm

Re: lstat after rmdir bug?

Posted by Rick Jones <ri...@hp.com>.
This same error message keeps happening even after updating the server 
to a 1.4.4mumble from Debian.

At this point, beyond the previously mentioned system call traces, what
else should I try to gather for the bug report?

rick jones
