Posted to dev@httpd.apache.org by Bill Stoddard <bi...@wstoddard.com> on 2001/06/29 23:24:06 UTC

mod_file_cache broken on Windows

Or perhaps serving a cached file is tickling a bug in the bucket code.  To recreate, load
a file into the cache (use CacheFile) and pound on the server requesting the cached file
with apachebench.  Seg fault in apr_brigade_cleanup but the stack is completely trashed so
I can't see the calling chain.

Bill


Re: how to unsubscribe (was Re: mod_file_cache broken on Windows)

Posted by Cliff Woolley <cl...@yahoo.com>.
On Sun, 1 Jul 2001, ITT INC. Montréal wrote:

> how to unsubscribe from this news list
> Thanks
> Albert LASRY

Instructions are included in the headers of every message sent to the
list.  They say that to unsubscribe, you should send an email to:

new-httpd-unsubscribe@apache.org

--Cliff

--------------------------------------------------------------
   Cliff Woolley
   cliffwoolley@yahoo.com
   Charlottesville, VA



Re: mod_file_cache broken on Windows

Posted by "ITT INC. Montréal" <al...@p2i.net>.
how to unsubscribe from this news list
Thanks
Albert LASRY

----- Original Message -----
From: Bill Stoddard <bi...@wstoddard.com>
To: <ne...@apache.org>
Sent: Sunday, July 01, 2001 11:13 AM
Subject: Re: mod_file_cache broken on Windows


> Fixed...
>
> > So the problem is
> > related to one of the following:
> >
>
> <snip>
>
> >
> > 2. fields set in the cached apr_file_t that are not set in the new apr_file_t created by the
> > apr_os_file_blah() trick file.
> > Since we are (should be :-) on the sendfile path, I wouldn't expect to get into the XTHREAD code
> > in APR, but it could be that we are because of some of the settings in the cached apr_file_t.
>
> The bug was introduced by the Win32 large file support in APR. Specifically, we were using the event
> handle in the cached apr_file_t to do overlapped i/o on a socket. The really bad mojo was that
> multiple threads were using this same handle. Backed out this portion of the large file support
> patch and can serve files out of the cache w/o segfaulting. I'll do more testing tomorrow.
>
> Bill


Re: mod_file_cache broken on Windows

Posted by Bill Stoddard <bi...@wstoddard.com>.
Fixed...

> So the problem is
> related to one of the following:
>

<snip>

>
> 2. fields set in the cached apr_file_t that are not set in the new apr_file_t created by the
> apr_os_file_blah() trick file.
> Since we are (should be :-) on the sendfile path, I wouldn't expect to get into the XTHREAD code
> in APR, but it could be that we are because of some of the settings in the cached apr_file_t.

The bug was introduced by the Win32 large file support in APR. Specifically, we were using the event
handle in the cached apr_file_t to do overlapped i/o on a socket. The really bad mojo was that
multiple threads were using this same handle. Backed out this portion of the large file support
patch and can serve files out of the cache w/o segfaulting. I'll do more testing tomorrow.
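
[Editor's sketch] The fix described above amounts to not sharing one per-operation event handle between threads. The following is a minimal illustration of that idea in portable C with pthreads. It is not the APR or Win32 code: `cached_file`, `io_event`, `worker`, `serve_cached` and `run_workers` are hypothetical names, and the "I/O" is only simulated. The cached descriptor is shared read-only, while each worker owns its own completion slot.

```c
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-in for the per-operation event used by overlapped I/O. */
typedef struct {
    int signaled;   /* set when the simulated I/O completes */
    int owner_id;   /* which worker the completion belongs to */
} io_event;

/* Hypothetical stand-in for the cached file: shared, read-only after setup. */
typedef struct {
    int fd;
} cached_file;

typedef struct {
    const cached_file *file;  /* shared across workers */
    io_event event;           /* private to this worker */
    int id;
    int fd_used;
} worker;

/* Each worker signals only its own event, so a completion can never be
 * delivered to, or reset by, another thread. */
static void *serve_cached(void *arg)
{
    worker *w = arg;
    w->event.signaled = 0;
    w->fd_used = w->file->fd;   /* read-only use of the shared handle */
    w->event.owner_id = w->id;  /* simulated I/O completes on the private event */
    w->event.signaled = 1;
    return NULL;
}

/* Runs up to 16 workers against one cached file and returns how many
 * finished with their own event signaled under their own id. */
static int run_workers(const cached_file *file, int n)
{
    pthread_t tids[16];
    worker ws[16];
    int started = 0, ok = 0;

    if (n > 16)
        n = 16;
    for (int i = 0; i < n; i++) {
        ws[i].file = file;
        ws[i].id = i;
        ws[i].fd_used = -1;
        ws[i].event.signaled = 0;
        ws[i].event.owner_id = -1;
        if (pthread_create(&tids[i], NULL, serve_cached, &ws[i]) != 0)
            break;
        started++;
    }
    for (int i = 0; i < started; i++) {
        pthread_join(tids[i], NULL);
        if (ws[i].event.signaled && ws[i].event.owner_id == ws[i].id
            && ws[i].fd_used == file->fd)
            ok++;
    }
    return ok;
}
```

Calling `run_workers(&file, 8)` should report all eight workers as consistent, because nothing writable is shared. Moving `io_event` into `cached_file` would recreate the kind of sharing the message above blames for the crash.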

Bill



Re: mod_file_cache broken on Windows

Posted by Cliff Woolley <cl...@yahoo.com>.
On Sat, 30 Jun 2001, Bill Stoddard wrote:

> Your patch calls apr_bucket_file_create with the cached file in the
> pconf pool.

Right.

> If I do the apr_os_file_get()/apr_os_file_put() trick to
> put the fd into an apr_file_t allocated out of the request pool before
> calling apr_bucket_file_create(), everything works (with HTTP/1.0
> non-keep alive request).  It is still broken for keep alive requests
> of course, which I know is the problem you are trying to fix...

Not surprising... that's basically taking us back to before my patch.
Good sanity check, though.

> The seg fault only happens when I am sending in multiple concurrent
> requests.  ab -n 100 -c 1 server/cached_file.html works.
> ab -n 100 -c >1 server/cached_file.html seg faults every time. These are
> HTTP/1.0 non-keepalive requests and no additional content filters are
> being installed, so we should never attempt to read from the file (i.e.,
> we should always use sendfile).  So the problem is related to one of the
> following:

See, I'm just not seeing that behavior.  I've always done my tests with
   ab -n 10000 -c 100 cached_file.html

No segfaults for me.  <scratching head>

I'll keep pounding on it, though, and look into your suggestions
some more...

--Cliff





Re: mod_file_cache broken on Windows

Posted by Bill Stoddard <bi...@wstoddard.com>.
Your patch calls apr_bucket_file_create with the cached file in the pconf pool.  If I do the
apr_os_file_get()/apr_os_file_put() trick to put the fd into an apr_file_t allocated out of the
request pool before calling apr_bucket_file_create(), everything works (with HTTP/1.0 non-keep alive
request).  It is still broken for keep alive requests of course, which I know is the problem you are
trying to fix...

The seg fault only happens when I am sending in multiple concurrent requests.  ab -n 100 -c 1
server/cached_file.html works. ab -n 100 -c >1 server/cached_file.html seg faults every time. These
are HTTP/1.0 non-keepalive requests and no additional content filters are being installed, so we
should never attempt to read from the file (i.e., we should always use sendfile).  So the problem is
related to one of the following:

1. that the file passed in on apr_bucket_file_create() is allocated out of the pconf pool...
If I had time, I would trace how the pconf pool in the apr_file_t is being used on the sendfile
path, especially during request cleanup.

2. fields set in the cached apr_file_t that are not set in the new apr_file_t created by the
apr_os_file_blah() trick file.
Since we are (should be :-) on the sendfile path, I wouldn't expect to get into the XTHREAD code in
APR, but it could be that we are because of some of the settings in the cached apr_file_t.
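
[Editor's sketch] Hypothesis 2 above turns on what a freshly built wrapper does and does not inherit from the cached one. The model below is plain C and deliberately does not use the APR API. `file_wrapper` is a hypothetical struct, not the real apr_file_t layout, and `wrapper_os_get`/`wrapper_os_put` only mimic the shape of the get/put trick: the OS descriptor is carried over, and every other field starts from zero.

```c
#include <string.h>

/* Hypothetical model of a file wrapper; NOT the real apr_file_t layout. */
typedef struct {
    int os_fd;          /* underlying OS descriptor */
    const char *owner;  /* stands in for the pool that owns the wrapper */
    int xthread;        /* stands in for a cross-thread flag */
    void *event;        /* stands in for per-handle overlapped-I/O state */
} file_wrapper;

/* Analogue of "get": expose the OS descriptor held by a wrapper. */
static int wrapper_os_get(const file_wrapper *w)
{
    return w->os_fd;
}

/* Analogue of "put": build a fresh wrapper around an existing descriptor.
 * Only the descriptor is carried over; all other fields start zeroed. */
static file_wrapper wrapper_os_put(int os_fd, const char *owner)
{
    file_wrapper w;
    memset(&w, 0, sizeof w);
    w.os_fd = os_fd;
    w.owner = owner;
    return w;
}
```

In this model a request-scoped wrapper made with `wrapper_os_put(wrapper_os_get(&cached), "request")` shares the descriptor but none of the cached wrapper's flags or event state. That difference is the one the second hypothesis points at.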

Bill

> On Fri, 29 Jun 2001, Cliff Woolley wrote:
>
> > Bummer.  Especially because that's the exact sort of thing I did when
> > testing it myself on Unix, and I didn't get any segfaults.  To help narrow
> > in on the problem, please try changing APR_HAS_XTHREAD_FILES to 0 in
> > apr.hw and try your test again.  That will help us figure out where the
> > problem is.
>
> So I'm pushing on the code... I have found some behaviour I can't explain
> yet, but I still haven't gotten it to segfault, so I'm guessing we're
> talking about two different problems.  What I'm seeing at the moment is
> that if I request the very *first* file on my list of cached files, it
> happens to have file descriptor #2.  When I request it and it gets served
> with sendfile(), I end up getting the beginning of my error log served as
> the request!  It seems to be the right number of bytes, just the wrong
> file.  That sucks.  I'll try to figure this out... in the meanwhile, I'm
> still curious to hear what you see when you switch to
> APR_HAS_XTHREAD_FILES=0.
>
> Thanks,
> Cliff


Re: mod_file_cache broken on Windows

Posted by Cliff Woolley <cl...@yahoo.com>.
On Sat, 30 Jun 2001, Cliff Woolley wrote:

> talking about two different problems.  What I'm seeing at the moment is
> that if I request the very *first* file on my list of cached files, it
> happens to have file descriptor #2.  When I request it and it gets served
> with sendfile(), I end up getting the beginning of my error log served as
> the request!

Uggghhhhh... why oh why won't gdb on a threaded process on Linux produce
sane and/or reproducible results?  Anyway, I've been unable to reproduce
this behavior.  I switched to prefork, and it went away... switched back
to threaded, and it still works.  I guess I'll blame gdb for doing
something weird as I stepped through the code and then I'll blame IE for
caching the resulting bogus page and refusing to let go of it.

I'll keep playing with it, but for now, it seems fine on my end.

--Cliff





Re: mod_file_cache broken on Windows

Posted by Cliff Woolley <cl...@yahoo.com>.
On Fri, 29 Jun 2001, Cliff Woolley wrote:

> Bummer.  Especially because that's the exact sort of thing I did when
> testing it myself on Unix, and I didn't get any segfaults.  To help narrow
> in on the problem, please try changing APR_HAS_XTHREAD_FILES to 0 in
> apr.hw and try your test again.  That will help us figure out where the
> problem is.

So I'm pushing on the code... I have found some behaviour I can't explain
yet, but I still haven't gotten it to segfault, so I'm guessing we're
talking about two different problems.  What I'm seeing at the moment is
that if I request the very *first* file on my list of cached files, it
happens to have file descriptor #2.  When I request it and it gets served
with sendfile(), I end up getting the beginning of my error log served as
the request!  It seems to be the right number of bytes, just the wrong
file.  That sucks.  I'll try to figure this out... in the meanwhile, I'm
still curious to hear what you see when you switch to
APR_HAS_XTHREAD_FILES=0.
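
[Editor's sketch] When a descriptor appears to serve the wrong file, as in the symptom described above, one cheap sanity check on a POSIX system is to compare the file behind the descriptor with the file at the expected path. The helper below is a debugging aid, not something from the patch under discussion, and `fd_matches_path` is a hypothetical name. It relies only on the standard `fstat()` and `stat()` calls.

```c
#include <sys/types.h>
#include <sys/stat.h>

/* Returns 1 if descriptor fd refers to the same file as path, 0 if it
 * refers to a different file, and -1 if either lookup fails. */
static int fd_matches_path(int fd, const char *path)
{
    struct stat by_fd, by_path;

    if (fstat(fd, &by_fd) != 0 || stat(path, &by_path) != 0)
        return -1;
    /* Device number plus inode number identify a file on POSIX systems. */
    return by_fd.st_dev == by_path.st_dev && by_fd.st_ino == by_path.st_ino;
}
```

Calling this just before the descriptor is handed to sendfile() would show whether descriptor #2 really refers to the cached file or to the error log.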

Thanks,
Cliff




Re: mod_file_cache broken on Windows

Posted by Cliff Woolley <cl...@yahoo.com>.
On Fri, 29 Jun 2001, Cliff Woolley wrote:

> > Or perhaps serving a cached file is tickling a bug in the bucket code.
> > To recreate, load a file into the cache (use CacheFile) and pound on
> > the server requesting the cached file with apachebench.  Seg fault in
> > apr_brigade_cleanup but the stack is completely trashed so I can't see
> > the calling chain.

It occurred to me when I woke up this morning that it bears mentioning
again that the part of my patch that added apr_file_flags_get() to APR
would have broken binary compatibility for apr_file_t's.  So just to be
sure, you did try a rebuild all, right?
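
[Editor's sketch] The binary-compatibility concern above can be shown with two struct layouts. This assumes, without confirmation from the thread, that the patch changed the layout of the file structure, for example by adding a flags field. `file_v1` and `file_v2` are hypothetical and are not the real apr_file_t. Code compiled against the old layout would read a field at the wrong offset in an object created by code compiled against the new one, which is why everything has to be rebuilt.

```c
#include <stddef.h>

/* Hypothetical "before" layout, as seen by a module built earlier. */
typedef struct {
    int fd;
    int eof_hit;
} file_v1;

/* Hypothetical "after" layout: a flags field inserted ahead of eof_hit. */
typedef struct {
    int fd;
    int flags;
    int eof_hit;
} file_v2;

/* Returns 1 when a stale object file built against v1 would disagree
 * with v2 about where eof_hit lives or how large the struct is. */
static int layout_changed(void)
{
    return offsetof(file_v1, eof_hit) != offsetof(file_v2, eof_hit)
        || sizeof(file_v1) != sizeof(file_v2);
}
```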

--Cliff




Re: mod_file_cache broken on Windows

Posted by Cliff Woolley <cl...@yahoo.com>.
On Fri, 29 Jun 2001, Bill Stoddard wrote:

> Or perhaps serving a cached file is tickling a bug in the bucket code.
> To recreate, load a file into the cache (use CacheFile) and pound on
> the server requesting the cached file with apachebench.  Seg fault in
> apr_brigade_cleanup but the stack is completely trashed so I can't see
> the calling chain.

Bummer.  Especially because that's the exact sort of thing I did when
testing it myself on Unix, and I didn't get any segfaults.  To help narrow
in on the problem, please try changing APR_HAS_XTHREAD_FILES to 0 in
apr.hw and try your test again.  That will help us figure out where the
problem is.  I strongly suspect that it's in APR's Win32 xthread support,
but proof would be good.
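
[Editor's sketch] The experiment suggested above is a compile-time switch. In a real build the value comes from APR's own header (the message names apr.hw), so the definition below exists only to let the fragment stand alone, and `file_io_path` is a hypothetical helper that reports which branch was compiled in.

```c
/* In a real build this macro is supplied by APR's headers; it is defined
 * here only so the sketch compiles on its own. */
#ifndef APR_HAS_XTHREAD_FILES
#define APR_HAS_XTHREAD_FILES 0
#endif

/* Hypothetical helper: reports which code path was compiled in. */
static const char *file_io_path(void)
{
#if APR_HAS_XTHREAD_FILES
    return "xthread";
#else
    return "plain";
#endif
}
```

Because the choice is made by the preprocessor, changing the value only takes effect after a full rebuild of everything that includes the header.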

Thanks,
Cliff
