Posted to dev@httpd.apache.org by Cliff Woolley <cl...@yahoo.com> on 2001/11/12 09:53:26 UTC

segfault with worker, ap_lingering_close

If you run the httpd-test limits.t test with the worker MPM (on Linux at
least), you'll see that it hangs trying to perform subtest #2, though the
test doesn't hang with prefork.  It was kind of rough coaxing gdb into
telling me what was going on (we all know how well
linux+gdb+multithreading get along :-/ ), but anyway, here's the
backtrace:

Program received signal SIGSEGV, Segmentation fault.
0x4003bfe0 in apr_pool_clear (a=0x8183ed4) at apr_pools.c:957
957         free_blocks(a->first->h.next);
(gdb) bt
#0  0x4003bfe0 in apr_pool_clear (a=0x8183ed4) at apr_pools.c:957
#1  0x80bee97 in core_output_filter (f=0x817a214, b=0x0) at core.c:3217
#2  0x80b8b65 in ap_pass_brigade (next=0x817a214, bb=0x817a264)
    at util_filter.c:276
#3  0x80b77ac in ap_flush_conn (c=0x8179f84) at connection.c:138
#4  0x80b7805 in ap_lingering_close (dummy=0x8179f84) at connection.c:175
#5  0x4003be2a in run_cleanups (c=0x817a244) at apr_pools.c:833
#6  0x4003bfbf in apr_pool_clear (a=0x8179e84) at apr_pools.c:949
#7  0x4003c02c in apr_pool_destroy (a=0x8179e84) at apr_pools.c:995
#8  0x80ad9dd in worker_thread (thd=0x815273c, dummy=0x81d9ad8)
    at worker.c:723
#9  0x40036cbe in dummy_worker (opaque=0x815273c) at thread.c:122
#10 0x401d9065 in pthread_start_thread (arg=0xbf3ffc00) at manager.c:274


--Cliff

--------------------------------------------------------------
   Cliff Woolley
   cliffwoolley@yahoo.com
   Charlottesville, VA




Re: segfault with worker, ap_lingering_close

Posted by Ryan Bloom <rb...@covalent.net>.
On Monday 12 November 2001 06:15 am, Dale Ghent wrote:
> On Mon, 12 Nov 2001, Cliff Woolley wrote:
> | If you run the httpd-test limits.t test with the worker MPM (on Linux at
> | least), you'll see that it hangs trying to perform subtest #2, though the
> | test doesn't hang with prefork.  It was kind of rough coaxing gdb into
> | telling me what was going on (we all know how well
> | linux+gdb+multithreading get along :-/ ), but anyway, here's the
> | backtrace:
>
> yeah, I've been seeing this with httpd+worker on solaris 8. I included a
> stack trace of it in my second mail to the list yesterday entitled "Two
> apache/2.0.29-dev problems"

My best guess is that the socket's cleanup occasionally runs before
lingering close is called.  I'll be reordering that code today, which
may solve the problem.

Ryan

______________________________________________________________
Ryan Bloom				rbb@apache.org
Covalent Technologies			rbb@covalent.net
--------------------------------------------------------------
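
For readers following the thread, the ordering problem guessed at above can
be sketched with a toy model. This is not the APR or httpd code: every name
below (toy_pool, toy_conn, toy_run and so on) is hypothetical, and the sketch
assumes only that a pool runs its registered cleanups newest-first. It shows
how a lingering-close cleanup that runs after the socket's own cleanup ends
up touching a resource that is already gone.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model only: none of these names are APR or httpd APIs. */

#define MAX_CLEANUPS 8

typedef void (*cleanup_fn)(void *data);

typedef struct {
    cleanup_fn fn[MAX_CLEANUPS];
    void *data[MAX_CLEANUPS];
    size_t count;
} toy_pool;

typedef struct {
    int open;              /* 1 while the socket is still usable */
    int flushed_ok;        /* lingering close ran on an open socket */
    int used_after_close;  /* lingering close ran on a closed socket */
} toy_conn;

/* Append a cleanup; toy_pool_clear() runs them in reverse order. */
static void toy_pool_register(toy_pool *p, cleanup_fn fn, void *data)
{
    assert(p->count < MAX_CLEANUPS);
    p->fn[p->count] = fn;
    p->data[p->count] = data;
    p->count++;
}

/* Run all cleanups newest-first (assumed LIFO order), emptying the pool. */
static void toy_pool_clear(toy_pool *p)
{
    while (p->count > 0) {
        p->count--;
        p->fn[p->count](p->data[p->count]);
    }
}

/* Stands in for the socket's own cleanup. */
static void toy_socket_cleanup(void *data)
{
    toy_conn *c = data;
    c->open = 0;
}

/* Stands in for a lingering close that needs the socket to flush output. */
static void toy_lingering_close(void *data)
{
    toy_conn *c = data;
    if (c->open) {
        c->flushed_ok = 1;
    } else {
        c->used_after_close = 1;  /* the real code would crash about here */
    }
}

/* Returns 1 if the lingering close ran while the socket was still open. */
int toy_run(int linger_registered_last)
{
    toy_pool p;
    toy_conn c = { 1, 0, 0 };

    p.count = 0;
    if (linger_registered_last) {
        toy_pool_register(&p, toy_socket_cleanup, &c);
        toy_pool_register(&p, toy_lingering_close, &c);
    } else {
        toy_pool_register(&p, toy_lingering_close, &c);
        toy_pool_register(&p, toy_socket_cleanup, &c);
    }
    toy_pool_clear(&p);
    return c.flushed_ok && !c.used_after_close;
}
```

Calling toy_run(1) registers the lingering close last, so it runs first and
sees an open socket; toy_run(0) reverses the registration and the lingering
close finds the socket already closed. Whether the real fix is a matter of
registration order or of calling lingering close explicitly before the pool
is destroyed is not settled in this thread.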

Re: segfault with worker, ap_lingering_close

Posted by Dale Ghent <da...@elemental.org>.
On Mon, 12 Nov 2001, Cliff Woolley wrote:

| On Mon, 12 Nov 2001, Dale Ghent wrote:
|
| > yeah, I've been seeing this with httpd+worker on solaris 8. I included a
| > stack trace of it in my second mail to the list yesterday entitled "Two
| > apache/2.0.29-dev problems"
|
| Ahh, so I see.  I didn't make the connection before.  :-/

If it helps any, I just got a core this morning that was caused by this
(it seems to be the only reason I'm getting cores right now).  Here's a
full stack trace:

#0  0xff349024 in apr_pool_clear (a=0x3c6cd8) at apr_pools.c:957
957         free_blocks(a->first->h.next);

(gdb) where full
#0  0xff349024 in apr_pool_clear (a=0x3c6cd8) at apr_pools.c:957
No locals.
#1  0x9723c in core_output_filter (f=0x3bf000, b=0x0) at core.c:3217
        rv = 0
        c = (conn_rec *) 0x3bed88
        ctx = (core_output_filter_ctx_t *) 0x3bf040
#2  0x901e4 in ap_pass_brigade (next=0x3bf000, bb=0x2897f8)
    at util_filter.c:276
        e = (apr_bucket *) 0x2897f8
#3  0x8eb40 in ap_lingering_close (dummy=0x3bed88) at connection.c:175
        dummybuf =
"\000\000\0006\000\000\000\013\000\000\000\f\000\000\000\n\000\000\000e\000\000\000\001\000\000\001;\000\000\000\000ÿÿ¹°\000<\027¸\000<\fÐ\000\000\000\000\000\000\000\000ÿ5¦ô\000<L°\000<\f \000\000\000\000\0003H\210\000\000\031\217\000\000\000\000\000;í\210ÿ5¦ô\000\efÈ\000<TPý`¹Ü\000\000\000\004\000\000\000\n\000<U\eý`¹x\000\004U\220\000<O\210\000\000\000D\000\000ÿ\000\000\000\000\000ÿ3è\000\000\000\000\000ÿ3Ú\224\000;íÌ\000\000\000\000\000\000\000\000\000;ì\210\000;ïø\000\000\000H\000\ecÀ\000<TP\000<SÀ\000<Sp\000\035yX",
'\000' <repeats 33 times>...
        nbytes = 512
        rc = 3928064
        total_linger_time = 0
#4  0xff348e88 in run_cleanups (c=0x3befc8) at apr_pools.c:833
No locals.
#5  0xff34900c in apr_pool_clear (a=0x3bec88) at apr_pools.c:949
No locals.
#6  0xff349068 in apr_pool_destroy (a=0x3bec88) at apr_pools.c:995
        blok = (union block_hdr *) 0x3bec88
#7  0x82a28 in worker_thread (thd=0x18dce8, dummy=0x3bec88)
    at worker.c:723
        process_slot = 0
        thread_slot = 6
        csd = (apr_socket_t *) 0x3becb8
        ptrans = (apr_pool_t *) 0x3bec88
        rv = 3927176
#8  0xff34312c in dummy_worker (opaque=0x18dce8) at thread.c:122
No locals.
(gdb)



Re: segfault with worker, ap_lingering_close

Posted by Cliff Woolley <cl...@yahoo.com>.
On Mon, 12 Nov 2001, Dale Ghent wrote:

> yeah, I've been seeing this with httpd+worker on solaris 8. I included a
> stack trace of it in my second mail to the list yesterday entitled "Two
> apache/2.0.29-dev problems"

Ahh, so I see.  I didn't make the connection before.  :-/

Thanks,
--Cliff


--------------------------------------------------------------
   Cliff Woolley
   cliffwoolley@yahoo.com
   Charlottesville, VA



Re: segfault with worker, ap_lingering_close

Posted by Dale Ghent <da...@elemental.org>.
On Mon, 12 Nov 2001, Cliff Woolley wrote:

|
| If you run the httpd-test limits.t test with the worker MPM (on Linux at
| least), you'll see that it hangs trying to perform subtest #2, though the
| test doesn't hang with prefork.  It was kind of rough coaxing gdb into
| telling me what was going on (we all know how well
| linux+gdb+multithreading get along :-/ ), but anyway, here's the
| backtrace:

yeah, I've been seeing this with httpd+worker on solaris 8. I included a
stack trace of it in my second mail to the list yesterday entitled "Two
apache/2.0.29-dev problems"

/dale