Posted to dev@httpd.apache.org by Dale Ghent <da...@elemental.org> on 2001/11/11 03:11:02 UTC

Two apache/2.0.29-dev problems

I'm running HEAD as of earlier this afternoon on Solaris 8+sendfile, with
the worker mpm.

I'm seeing two problems:

One is that the httpd process (one, sometimes both, of the children I have
started) will spin up, eating CPU. A truss of these busy processes shows
that some threads are stuck in some sort of read() loop on one or
more sockets, and all the read()s are returning 0 bytes. This persists
until I kill the processes.

The second issue is that httpd seems to hang in ap_lingering_close()
while shutting down (after an 'apachectl stop' is issued). It'll eventually
die when I send a few SIGTERMs to it, and it leaves a core file behind with
the following stack trace:




#0  0xff348ee0 in apr_pool_clear (a=0x3d6620) at apr_pools.c:957
957         free_blocks(a->first->h.next);
(gdb) where
#0  0xff348ee0 in apr_pool_clear (a=0x3d6620) at apr_pools.c:957
#1  0x97160 in core_output_filter (f=0x3c8910, b=0x0) at core.c:3217
#2  0x90108 in ap_pass_brigade (next=0x3c8910, bb=0x499a58)
    at util_filter.c:276
#3  0x8ea64 in ap_lingering_close (dummy=0x3c8698) at connection.c:175
#4  0xff348d44 in run_cleanups (c=0x3c88d8) at apr_pools.c:833
#5  0xff348ec8 in apr_pool_clear (a=0x3c8598) at apr_pools.c:949
#6  0xff348f24 in apr_pool_destroy (a=0x3c8598) at apr_pools.c:995
#7  0x8294c in worker_thread (thd=0x18dd10, dummy=0x3c8598) at
worker.c:723
#8  0xff343048 in dummy_worker (opaque=0x18dd10) at thread.c:122

(gdb) where full
#0  0xff348ee0 in apr_pool_clear (a=0x3d6620) at apr_pools.c:957
No locals.
#1  0x97160 in core_output_filter (f=0x3c8910, b=0x0) at core.c:3217
        rv = 0
        c = (conn_rec *) 0x3c8698
        ctx = (core_output_filter_ctx_t *) 0x3c8950
#2  0x90108 in ap_pass_brigade (next=0x3c8910, bb=0x499a58)
    at util_filter.c:276
        e = (apr_bucket *) 0x499a58
#3  0x8ea64 in ap_lingering_close (dummy=0x3c8698) at connection.c:175
        dummybuf =
"\000\000\000;\000\000\000\024\000\000\000\n\000\000\000\n\000\000\000e\000\000\000\006\000\000\0019\000\000\000\000ÿÿ¹°\000<±°\000<¥°\000=:h\000\000\000\013ÿ5¥\234\000=%ð\000<¥°\000\000\000\000\000I¯
\000\000\000é\000\000\000\000\000<\206\230ÿ5¥\234\000%J\030\000==`üpYÜ\000\000\000\004\000\000\000\n\000=>7üpYx\000\004T°\000=8Ø\000\000\000+\000\000ÿ\000\000\000\000\000ÿ3ç \000<\206Ü\000<h\000<\206Ü\000\000\000\000\000\000\000\000\000<\205\230\000<\211\b\000\000\000H\000%G\020\000==`\000=<Ð\000=<\200\000=:P\000\000\000\000ÿ7Ï$"...
        nbytes = 512
        rc = 3967248
        total_linger_time = 0
#4  0xff348d44 in run_cleanups (c=0x3c88d8) at apr_pools.c:833
No locals.
#5  0xff348ec8 in apr_pool_clear (a=0x3c8598) at apr_pools.c:949
No locals.
#6  0xff348f24 in apr_pool_destroy (a=0x3c8598) at apr_pools.c:995
        blok = (union block_hdr *) 0x3c8598
#7  0x8294c in worker_thread (thd=0x18dd10, dummy=0x3c8598) at
worker.c:723
        process_slot = 0
        thread_slot = 17
        csd = (apr_socket_t *) 0x3c85c8
        ptrans = (apr_pool_t *) 0x3c8598
        rv = 3966360
#8  0xff343048 in dummy_worker (opaque=0x18dd10) at thread.c:122
No locals.
(gdb)



/dale

Re: apparent mod_cgid bug Re: Two apache/2.0.29-dev problems

Posted by Dale Ghent <da...@elemental.org>.

On Sun, 11 Nov 2001, Justin Erenkrantz wrote:

| On Sun, Nov 11, 2001 at 08:30:48AM -0800, Brian Pane wrote:
| > I think this bit of cgid_handler() is the problem:
| >            /* Soak up all the script output */
| >            while (apr_file_gets(argsbuffer, HUGE_STRING_LEN, tempsock)
| > > 0) {
| >                continue;
| >            }
|
| Yup, that is completely bogus.  I just committed a fix that should
| make apr_file_gets loop while apr_file_gets returns APR_SUCCESS.
| To make it clearer, I also changed all of the apr_file_gets calls
| to check APR_SUCCESS explicitly.
|
| Dale, please let me know if this fixes it.  If so, we may want to
| bump the tag on this file for 2.0.28.  -- justin

So far, so good. I've been running the new code for 15 minutes now, and a
run-away process hasn't occurred yet (one would've by now).

To add to the 2.0.28/2.0.29 discussion, this bug reared its head only
after I made what was in the CVS tree my production server on port 80,
replacing 1.3.20, thus exposing it to way more traffic.

I never saw this occurrence whilst testing the CVS tree by itself on port
8080. My point is that I think 2.0.16 is rather dated, and Apache 2 would
really benefit from a release of the latest code base.

A lot of things have been changed and fixed between 2.0.16 and now, and I
think we'd rather spend the time fixing contemporary bugs, rather than
having users out on the net report bugs in 2.0.16 that have long since
been addressed.

/dale

Re: 2.0.29? (Re: apparent mod_cgid bug Re: Two apache/2.0.29-dev problems)

Posted by Cliff Woolley <cl...@yahoo.com>.

On Mon, 12 Nov 2001, Bill Stoddard wrote:

> > Why not just start over?  Because then we pick up whatever brokenness
> > has crept into CVS in the mean time.  Bill S. says Win32 won't build at
> > the moment with current HEAD.

Yeah, some generally weird things are happening on HEAD.  Sticking with
2.0.28 is the way to go.

+1 for the cgid fix.

> I am for bumping the mod_cgid changes and rolling 2.0.28. We can make
> the tarball available today for voting on beta.

++1.

--Cliff


--------------------------------------------------------------
   Cliff Woolley
   cliffwoolley@yahoo.com
   Charlottesville, VA

Re: 2.0.29? (Re: apparent mod_cgid bug Re: Two apache/2.0.29-dev problems)

Posted by Bill Stoddard <bi...@wstoddard.com>.

> Cliff Woolley wrote:
> >
> > On Sun, 11 Nov 2001, Justin Erenkrantz wrote:
> >
> > > Dale, please let me know if this fixes it.  If so, we may want to
> > > bump the tag on this file for 2.0.28.  -- justin
> >
> > We're rapidly approaching the point in time where it would be better to
> > start on 2.0.29 than to keep bumping tags on 2.0.28 -- I don't know about
> > you all, but I'm starting to lose track of what changes have been tagged
> > in since the original 2.0.28 and which ones haven't.
>
> Basically, 2.0.28 has got OtherBill's fix to request.c to reduce the
> number of stats, plus some minor stuff, on top of the original tag.  I
> would like to push Justin's mod_cgid fix into 2.0.28, re-roll, then make
> the tarballs public, and vote for beta.  This won't even need additional
> testing on daedalus, because daedalus uses mod_cgi.
>
> Why not just start over?  Because then we pick up whatever brokenness
> has crept into CVS in the mean time.  Bill S. says Win32 won't build at
> the moment with current HEAD.
>

My bad. Win32 is building fine. I accidentally hosed awk on my machine, which caused all
sorts of weird build problems.

I am for bumping the mod_cgid changes and rolling 2.0.28. We can make the tarball
available today for voting on beta.

Bill

Re: 2.0.29? (Re: apparent mod_cgid bug Re: Two apache/2.0.29-dev problems)

Posted by Greg Ames <gr...@remulak.net>.

Cliff Woolley wrote:
> 
> On Sun, 11 Nov 2001, Justin Erenkrantz wrote:
> 
> > Dale, please let me know if this fixes it.  If so, we may want to
> > bump the tag on this file for 2.0.28.  -- justin
> 
> We're rapidly approaching the point in time where it would be better to
> start on 2.0.29 than to keep bumping tags on 2.0.28 -- I don't know about
> you all, but I'm starting to lose track of what changes have been tagged
> in since the original 2.0.28 and which ones haven't.

Basically, 2.0.28 has got OtherBill's fix to request.c to reduce the
number of stats, plus some minor stuff, on top of the original tag.  I
would like to push Justin's mod_cgid fix into 2.0.28, re-roll, then make
the tarballs public, and vote for beta.  This won't even need additional
testing on daedalus, because daedalus uses mod_cgi.

Why not just start over?  Because then we pick up whatever brokenness
has crept into CVS in the mean time.  Bill S. says Win32 won't build at
the moment with current HEAD.

Greg

Re: 2.0.29? (Re: apparent mod_cgid bug Re: Two apache/2.0.29-dev problems)

Posted by Greg Stein <gs...@lyra.org>.

On Sun, Nov 11, 2001 at 02:21:29PM -0500, Joshua Slive wrote:
> 
> 
> > From: Cliff Woolley [mailto:cliffwoolley@yahoo.com]
> 
> > IMO, it'd be better to tag 2.0.29 right now with a pretty stable tree
> > that's by all accounts better than 2.0.28.  It's that or get bug reports
> > for the next n weeks about things we've already fixed.  Just a thought.
> 
> I can't argue with that, except to point out that we've been saying exactly
> that for every release attempt for months.  So why not release 2.0.28, tag
> 2.0.29, and release that if it turns out okay.  A release every other day is
> better than no release for months.

Agreed. It is absolutely insane that we haven't made a release. Just call it
alpha and get it out there. If you want to call it beta, then fine. But
continuing to hold up releases to "just wanna fix this next bug" is wrong.
Our releases are *always* going to have bugs. We just want to hold if it is
a critical bug that a *lot* of people will see. If it doesn't link for some
small set of users... too bad. But everybody else can enjoy a new beta.

> Of course, someone needs to do the actual work of roll+release, which is
> obviously non-trivial.  Since I can't volunteer to do it myself at the
> moment, I'll just shut up ;-)

Me neither :-(

Cheers,
-g

-- 
Greg Stein, http://www.lyra.org/

Re: 2.0.29? (Re: apparent mod_cgid bug)

Posted by Cliff Woolley <cl...@yahoo.com>.

On Mon, 12 Nov 2001, Greg Ames wrote:

> > > >  - fix segfault in prefork
>
> wasn't ever a problem in 2.0.28

I just figured that out like half an hour ago.  <sigh>  Oh well, all the
better...

--Cliff

--------------------------------------------------------------
   Cliff Woolley
   cliffwoolley@yahoo.com
   Charlottesville, VA

Re: 2.0.29? (Re: apparent mod_cgid bug)

Posted by Greg Ames <gr...@remulak.net>.

Cliff Woolley wrote:
 
> > >  - fix infinite loop in mod_cgid

just bumped into 2.0.28.  I'll straighten out the CHANGES file (PITA)
then roll.

> > >  - fix segfault in prefork

wasn't ever a problem in 2.0.28

Greg

Re: 2.0.29? (Re: apparent mod_cgid bug Re: Two apache/2.0.29-dev problems)

Posted by Cliff Woolley <cl...@yahoo.com>.

On Sun, 11 Nov 2001, Ryan Bloom wrote:

> >  - fix infinite loop in mod_cgid
> >  - fix segfault in prefork

> >  - fix build problems on Win32 (mktemp)
> >  - fix multithreading problem on BSDi (-D_REENTRANT)
> >  - fix file cleanup problems in apr_proc_create

> Stop tagging 2.0.28.  Half the changes above have nothing to do with bugs
> in the server.  They are new features or improvements in the code.

Of course.  I never said the whole thing should be retagged.  Those first
two above would be really nice to have fixed, and the next three would be
kinda nice, but aren't crucial AFAIK.  But in any event, with or without
these things,

> Just release 2.0.28 already.

+1.

--Cliff

--------------------------------------------------------------
   Cliff Woolley
   cliffwoolley@yahoo.com
   Charlottesville, VA

Re: 2.0.29? (Re: apparent mod_cgid bug Re: Two apache/2.0.29-dev problems)

Posted by Ryan Bloom <rb...@covalent.net>.

On Sunday 11 November 2001 11:39 am, Cliff Woolley wrote:
> On Sun, 11 Nov 2001, Cliff Woolley wrote:
> > IMO, it'd be better to tag 2.0.29 right now with a pretty stable tree
> > that's by all accounts better than 2.0.28.  It's that or get bug reports
> > for the next n weeks about things we've already fixed.  Just a thought.
>
> Scratch that.  I see 2.0.28 was never officially rolled (even though some
> of us saw a candidate tarball).  So retag 2.0.28 all you want, Greg, or
> don't.  Anyway, forget I mentioned the 2.0.29 thing.
>
> FYI, I just checked, and these are the changes that have been committed
> since 2.0.28 was last tagged:
>
>  - fix infinite loop in mod_cgid
>  - fix segfault in prefork
>  - fix build problems on Win32 (mktemp)
>  - fix multithreading problem on BSDi (-D_REENTRANT)
>  - fix file cleanup problems in apr_proc_create
>  - fix BeOS User/Group problem
>  - include for memcpy in apr/user/unix/userinfo.c
>  - fail if shared modules without mod_so
>  - warn if a module is loaded twice
>  - various performance optimizations
>  - ap_lingering_close changes
>  - apr-util dbm changes
>  - add debian layout
>  - various doc fixes

Stop tagging 2.0.28.  Half the changes above have nothing to do with bugs
in the server.  They are new features or improvements in the code.

If we bump the 2.0.28 tag for all of these, we would be better off just rolling 2.0.29.
Just release 2.0.28 already.

Ryan

______________________________________________________________
Ryan Bloom				rbb@apache.org
Covalent Technologies			rbb@covalent.net
--------------------------------------------------------------

Re: 2.0.29? (Re: apparent mod_cgid bug Re: Two apache/2.0.29-dev problems)

Posted by Cliff Woolley <cl...@yahoo.com>.

On Sun, 11 Nov 2001, Cliff Woolley wrote:

> IMO, it'd be better to tag 2.0.29 right now with a pretty stable tree
> that's by all accounts better than 2.0.28.  It's that or get bug reports
> for the next n weeks about things we've already fixed.  Just a thought.

Scratch that.  I see 2.0.28 was never officially rolled (even though some
of us saw a candidate tarball).  So retag 2.0.28 all you want, Greg, or
don't.  Anyway, forget I mentioned the 2.0.29 thing.

FYI, I just checked, and these are the changes that have been committed
since 2.0.28 was last tagged:

 - fix infinite loop in mod_cgid
 - fix segfault in prefork
 - fix build problems on Win32 (mktemp)
 - fix multithreading problem on BSDi (-D_REENTRANT)
 - fix file cleanup problems in apr_proc_create
 - fix BeOS User/Group problem
 - include for memcpy in apr/user/unix/userinfo.c
 - fail if shared modules without mod_so
 - warn if a module is loaded twice
 - various performance optimizations
 - ap_lingering_close changes
 - apr-util dbm changes
 - add debian layout
 - various doc fixes

--Cliff

--------------------------------------------------------------
   Cliff Woolley
   cliffwoolley@yahoo.com
   Charlottesville, VA

RE: 2.0.29? (Re: apparent mod_cgid bug Re: Two apache/2.0.29-dev problems)

Posted by Joshua Slive <jo...@slive.ca>.


> From: Cliff Woolley [mailto:cliffwoolley@yahoo.com]

> IMO, it'd be better to tag 2.0.29 right now with a pretty stable tree
> that's by all accounts better than 2.0.28.  It's that or get bug reports
> for the next n weeks about things we've already fixed.  Just a thought.

I can't argue with that, except to point out that we've been saying exactly
that for every release attempt for months.  So why not release 2.0.28, tag
2.0.29, and release that if it turns out okay.  A release every other day is
better than no release for months.

Of course, someone needs to do the actual work of roll+release, which is
obviously non-trivial.  Since I can't volunteer to do it myself at the
moment, I'll just shut up ;-)

Joshua.

Re: 2.0.29? (Re: apparent mod_cgid bug Re: Two apache/2.0.29-dev problems)

Posted by Cliff Woolley <cl...@yahoo.com>.

On Sun, 11 Nov 2001, Ryan Bloom wrote:

> The question we need to ask is simple.  Is 2.0.28 better than 2.0.16?
> If so, then release it.  If not, then don't.

Absolutely.  But,

> Get 2.0.28 released.  I'm about to make some major changes to the
> core, and it could destabilize 2.0.29 for a few days.

IMO, it'd be better to tag 2.0.29 right now with a pretty stable tree
that's by all accounts better than 2.0.28.  It's that or get bug reports
for the next n weeks about things we've already fixed.  Just a thought.

--Cliff

--------------------------------------------------------------
   Cliff Woolley
   cliffwoolley@yahoo.com
   Charlottesville, VA

Re: 2.0.29? (Re: apparent mod_cgid bug Re: Two apache/2.0.29-dev problems)

Posted by Ryan Bloom <rb...@covalent.net>.

On Sunday 11 November 2001 10:00 am, Joshua Slive wrote:
> > From: Cliff Woolley [mailto:cliffwoolley@yahoo.com]
> >
> > Still, these rapid-fire tags are good, because they're forcing us to hunt
> > down these pesky bugs.  We're starting to converge on something actually
> > stable.  :)
>
> Sure, but it's only people on this list who are testing.  If we actually
> released something, we would have the help of many others.
>
> Do you know that more than 500 people downloaded 2.0.16 this morning?  All
> those people are wasting their time!

The question we need to ask is simple.  Is 2.0.28 better than 2.0.16?  If so, then
release it.  If not, then don't.

We can keep holding back releases until we think we have a GA quality
release.  That won't mean that we can release it as a GA release.  We can't
do that until it is being used on real live sites other than apache.org.

Get 2.0.28 released.  I'm about to make some major changes to the core, and
it could destabilize 2.0.29 for a few days.

Ryan
______________________________________________________________
Ryan Bloom				rbb@apache.org
Covalent Technologies			rbb@covalent.net
--------------------------------------------------------------

RE: 2.0.29? (Re: apparent mod_cgid bug Re: Two apache/2.0.29-dev problems)

Posted by Joshua Slive <jo...@slive.ca>.

> From: Cliff Woolley [mailto:cliffwoolley@yahoo.com]

> Still, these rapid-fire tags are good, because they're forcing us to hunt
> down these pesky bugs.  We're starting to converge on something actually
> stable.  :)

Sure, but it's only people on this list who are testing.  If we actually
released something, we would have the help of many others.

Do you know that more than 500 people downloaded 2.0.16 this morning?  All
those people are wasting their time!

Joshua.

2.0.29? (Re: apparent mod_cgid bug Re: Two apache/2.0.29-dev problems)

Posted by Cliff Woolley <cl...@yahoo.com>.

On Sun, 11 Nov 2001, Justin Erenkrantz wrote:

> Dale, please let me know if this fixes it.  If so, we may want to
> bump the tag on this file for 2.0.28.  -- justin

We're rapidly approaching the point in time where it would be better to
start on 2.0.29 than to keep bumping tags on 2.0.28 -- I don't know about
you all, but I'm starting to lose track of what changes have been tagged
in since the original 2.0.28 and which ones haven't.

Still, these rapid-fire tags are good, because they're forcing us to hunt
down these pesky bugs.  We're starting to converge on something actually
stable.  :)

--Cliff

--------------------------------------------------------------
   Cliff Woolley
   cliffwoolley@yahoo.com
   Charlottesville, VA

Re: apparent mod_cgid bug Re: Two apache/2.0.29-dev problems

Posted by Justin Erenkrantz <je...@ebuilt.com>.

On Sun, Nov 11, 2001 at 08:30:48AM -0800, Brian Pane wrote:
> I think this bit of cgid_handler() is the problem:
>            /* Soak up all the script output */
>            while (apr_file_gets(argsbuffer, HUGE_STRING_LEN, tempsock) 
> > 0) {
>                continue;
>            }

Yup, that is completely bogus.  I just committed a fix that should
make the loop keep calling apr_file_gets only while it returns
APR_SUCCESS.  To make it clearer, I also changed all of the
apr_file_gets calls to check APR_SUCCESS explicitly.

Dale, please let me know if this fixes it.  If so, we may want to
bump the tag on this file for 2.0.28.  -- justin
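
The shape of the fix described above can be sketched in plain C. This is
only an illustration, not the committed change: sketch_gets, drain_lines
and the SKETCH_* codes are hypothetical stand-ins built on stdio's fgets,
assuming a reader that returns a status code (0 for success) rather than a
byte count.

```c
#include <assert.h>
#include <stdio.h>

/* Stand-in status codes (assumptions): 0 means success; the EOF value
 * is just some positive number chosen for illustration. */
enum { SKETCH_SUCCESS = 0, SKETCH_EOF = 1 };

/* Hypothetical wrapper: returns a status code instead of a byte count,
 * which is the calling convention discussed in the thread. */
static int sketch_gets(char *buf, int len, FILE *in)
{
    assert(buf != NULL && len > 0 && in != NULL);
    return fgets(buf, len, in) ? SKETCH_SUCCESS : SKETCH_EOF;
}

/* Drain the stream, comparing explicitly against the success code.
 * Returns how many reads succeeded before the first non-success status. */
static int drain_lines(FILE *in)
{
    char buf[128];
    int reads = 0;

    while (sketch_gets(buf, (int)sizeof buf, in) == SKETCH_SUCCESS) {
        reads++;
    }
    return reads;
}
```

Because the loop condition names the success code, any other status,
including end-of-file, ends the loop. Calling drain_lines on a stream
holding a few short lines returns the number of lines and then stops.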

apparent mod_cgid bug Re: Two apache/2.0.29-dev problems

Posted by Brian Pane <bp...@pacbell.net>.

Dale Ghent wrote:
[...]

>A pstack showed that LWP 7 (thread 22) was doing this:
>
>-----------------  lwp# 7 / thread# 22  --------------------
> ff11a814 _read    (466b40, fc803968, fc80186c, 0, 0, 0) + c
> ff33ea48 apr_file_gets (1, fc805967, 466b40, 466b40, 64206865, 61646572)
>+ 38
> 0006e478 cgid_handler (0, 0, 0, ffffbc00, 0, 3df838) + 4c8
>
[...]

I think this bit of cgid_handler() is the problem:
            /* Soak up all the script output */
            while (apr_file_gets(argsbuffer, HUGE_STRING_LEN, tempsock) > 0) {
                continue;
            }

Your truss output showed read returning 0.  apr_file_gets should be
returning APR_EOF in this case.  But APR_EOF > 0, so we'll be stuck
in that loop forever.

--Brian

Re: Two apache/2.0.29-dev problems

Posted by Dale Ghent <da...@elemental.org>.

On Sat, 10 Nov 2001, Brian Pane wrote:

| And if that doesn't work, /usr/proc/bin/pstack {pid} will show
| stack traces for all the LWPs

ah, good point. I keep forgetting about those nifty /usr/proc/bin
programs.

Anyway, a truss on the run-away process showed this:

/7:     read(82, 0xFC803968, 1)                         = 0
/7:     read(82, 0xFC803968, 1)                         = 0
/7:     read(82, 0xFC803968, 1)                         = 0
/7:     read(82, 0xFC803968, 1)                         = 0
/7:     read(82, 0xFC803968, 1)                         = 0
/7:     read(82, 0xFC803968, 1)                         = 0
/7:     read(82, 0xFC803968, 1)                         = 0
/7:     read(82, 0xFC803968, 1)                         = 0
/7:     read(82, 0xFC803968, 1)                         = 0
/7:     read(82, 0xFC803968, 1)                         = 0
/7:     read(82, 0xFC803968, 1)                         = 0
/7:     read(82, 0xFC803968, 1)                         = 0
/7:     read(82, 0xFC803968, 1)                         = 0
/7:     read(82, 0xFC803968, 1)                         = 0
/7:     read(82, 0xFC803968, 1)                         = 0
/7:     read(82, 0xFC803968, 1)                         = 0
/7:     read(82, 0xFC803968, 1)                         = 0
/7:     read(82, 0xFC803968, 1)                         = 0
....
....
and so on.
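
As background for reading that truss output: on POSIX systems a read()
that returns 0 signals end-of-file, and retrying it just returns 0 again.
A minimal sketch, assuming a pipe whose write end has been closed stands
in for the socket; count_zero_reads is a hypothetical helper, not
something from httpd or APR.

```c
#include <assert.h>
#include <unistd.h>

/* Reads one byte at a time from fd, attempts times, and counts how many
 * of those reads returned 0 (end-of-file). */
static int count_zero_reads(int fd, int attempts)
{
    char byte;
    int zeros = 0;

    assert(attempts >= 0);
    for (int i = 0; i < attempts; i++) {
        if (read(fd, &byte, 1) == 0) {
            zeros++;
        }
    }
    return zeros;
}
```

If the read end of a pipe is passed in after the write end has been
closed, every attempt counts as a zero-byte read, which is the same
pattern as the repeated "= 0" lines above.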

A pstack showed that LWP 7 (thread 22) was doing this:

-----------------  lwp# 7 / thread# 22  --------------------
 ff11a814 _read    (466b40, fc803968, fc80186c, 0, 0, 0) + c
 ff33ea48 apr_file_gets (1, fc805967, 466b40, 466b40, 64206865, 61646572) + 38
 0006e478 cgid_handler (0, 0, 0, ffffbc00, 0, 3df838) + 4c8
 00084b28 ap_run_handler (3db280, 0, 3df8c0, 3df838, 1e, 3dbaa8) + 3c
 00084ff4 ap_invoke_handler (3db280, 143400, 0, fefe0d24, 3dba5c, fefe0d37) + 1c
 0005a8fc ap_process_request (3db280, c8, 4, 3db280, fffffff8, 467e38) + 68
 00056918 ap_process_http_connection (467c68, 567f4, 177a50, 167994, 183500, 1409e0) + 124
 0008e9a8 ap_run_process_connection (467c68, 467c68, 467b98, 10, 0, 140de8) + 3c
 000823ec process_socket (467b68, 467b98, 10, 10, 0, 0) + 90
 00082930 worker_thread (176c00, 0, 82888, 0, 0, 0) + a8
 ff343040 dummy_worker (18dcf8, ff155d18, 0, 5, 1, fe401000) + c
 ff05bad0 _thread_start (18dcf8, 0, 0, 0, 0, 0) + 40

Other threads were still functioning, handling requests and so forth,
albeit a bit slowly.

Eventually, the httpd process died with a SIGSEGV and left a core file,
the stack of which is here:

------------------------------------------------------------------
#0  0xff348ee0 in apr_pool_clear (a=0x2ab778) at apr_pools.c:957
957         free_blocks(a->first->h.next);
(gdb) where
#0  0xff348ee0 in apr_pool_clear (a=0x2ab778) at apr_pools.c:957
#1  0x97160 in core_output_filter (f=0x21bf20, b=0x0) at core.c:3217
#2  0x90108 in ap_pass_brigade (next=0x21bf20, bb=0x2892e0)
    at util_filter.c:276
#3  0x8ea64 in ap_lingering_close (dummy=0x21bca8) at connection.c:175
#4  0xff348d44 in run_cleanups (c=0x21bee8) at apr_pools.c:833
#5  0xff348ec8 in apr_pool_clear (a=0x21bba8) at apr_pools.c:949
#6  0xff348f24 in apr_pool_destroy (a=0x21bba8) at apr_pools.c:995
#7  0x8294c in worker_thread (thd=0x18db78, dummy=0x21bba8) at
worker.c:723
#8  0xff343048 in dummy_worker (opaque=0x18db78) at thread.c:122
(gdb) info thread
  4 LWP    2          0xff11ad54 in _signotifywait () from
/usr/lib/libc.so.1
  3 LWP    1          0xff119590 in _poll () from /usr/lib/libc.so.1
  2 LWP    4          0xff11b394 in ___lwp_cond_wait () from
/usr/lib/libc.so.1
* 1 LWP    3          0xff348ee0 in apr_pool_clear (a=0x2ab778)
    at apr_pools.c:957


HTH, /dale

Re: Two apache/2.0.29-dev problems

Posted by Brian Pane <bp...@pacbell.net>.

Justin Erenkrantz wrote:

>On Sat, Nov 10, 2001 at 09:21:48PM -0500, Dale Ghent wrote:
>
>>On Sat, 10 Nov 2001, Justin Erenkrantz wrote:
>>
>>| Can you reproduce this?  Would it be possible to attach to this via
>>| gdb and get a backtrace?  -- justin
>>
>>I tried that, but gdb is showing me only the main thread that's running
>>poll(). Is there a way to get it to show all threads?
>>
>
>To list all running threads: info thread
>To switch to a specific thread context: thread <#>
>

And if that doesn't work, /usr/proc/bin/pstack {pid} will show
stack traces for all the LWPs

--Brian

Re: Two apache/2.0.29-dev problems

Posted by Justin Erenkrantz <je...@ebuilt.com>.

On Sat, Nov 10, 2001 at 09:21:48PM -0500, Dale Ghent wrote:
> On Sat, 10 Nov 2001, Justin Erenkrantz wrote:
> 
> | Can you reproduce this?  Would it be possible to attach to this via
> | gdb and get a backtrace?  -- justin
> 
> I tried that, but gdb is showing me only the main thread that's running
> poll(). Is there a way to get it to show all threads?

To list all running threads: info thread
To switch to a specific thread context: thread <#>

HTH.  -- justin

Re: Two apache/2.0.29-dev problems

Posted by Dale Ghent <da...@elemental.org>.

On Sat, 10 Nov 2001, Justin Erenkrantz wrote:

| Can you reproduce this?  Would it be possible to attach to this via
| gdb and get a backtrace?  -- justin

I tried that, but gdb is showing me only the main thread that's running
poll(). Is there a way to get it to show all threads?

/dale

Re: Two apache/2.0.29-dev problems

Posted by Justin Erenkrantz <je...@ebuilt.com>.

On Sat, Nov 10, 2001 at 09:11:02PM -0500, Dale Ghent wrote:
> 
> I'm running HEAD as of earlier this afternoon on Solaris 8+sendfile, with
> the worker mpm.
> 
> I'm seeing two problems:
> 
> one, is that the httpd process (one, sometimes both of the children I have
> started) will spin up, eating CPU. A truss of these busy processes shows
> that there are some threads stuck in some sort of read() loop on one or
> more sockets, and all the read()s are returning 0 bytes. This persists
> until I kill the processes.

Can you reproduce this?  Would it be possible to attach to this via 
gdb and get a backtrace?  -- justin