Posted to dev@httpd.apache.org by Andre Breiler <an...@rd.bbc.co.uk> on 2003/03/02 23:25:01 UTC

ap2 , parent process in worker mpm dies under load

Hi,

the ap2 (2.0.43) parent process dies (but the children don't) under load.
This is with worker mpm on solaris 8 (multiprocessor).

I wonder if anyone has the same problem.

What I have tried so far is:
- compile with cc & gcc
- use different thread implementations
- 32 & 64 bit
All seem to die with a SEGV or SIGBUS due to the fact that after returning
from a function call the registers have wrong values.

Is there anything I should look for?

Thanx,
 Andre'
PS: cores, binaries, config etc. are available



Re: ap2 , parent process in worker mpm dies under load

Posted by Andre Breiler <an...@rd.bbc.co.uk>.
Hi

On Mon, 3 Mar 2003, Jeff Trawick wrote:

> Andre Breiler wrote:
> >
> > On Sun, 2 Mar 2003, Jeff Trawick wrote:
> >
> > >Andre Breiler wrote:
> > >
> > >
> > >>the ap2 (2.0.43) parent process dies (but the children don't) under load.
> > >>This is with worker mpm on solaris 8 (multiprocessor).
> 
> ...
> 
> > --- snip 1 ---
> > program terminated by signal BUS (Bus Error)
> > 0xffffffff7ee2c880:
> > Current function is ap_wait_or_timeout
> >   222       rv = apr_proc_wait_all_procs(ret, exitcode, status, APR_NOWAIT, p);
> > (/tool/lang9.1/SUNWspro/bin/../WS6U2/bin/sparcv9/dbx) where
> >   [1] 0xffffffff7ee2c880(0xffffffff0000080e, 0xffffffff00000836, 0xffffffff0000083a, 0x1, 0x100133e58, 0x0), at 0xffffffff7ee2c87f
> > =>[2] ap_wait_or_timeout(status = 0xffffffff0000083a, exitcode = 0xffffffff00000836, ret = 0xffffffff0000080e, p = 0x100133e58), line 222 in "mpm_common.c"
> > dbx: warning: invalid frame pointer
> > --- snap core.httpd.323.u0 ---
> 
> Can you reproduce the problem in 32-bit mode?

Yes, but I don't have any 32-bit cores left, so if you need them I'll try to
get some.

> If you run pstack on the core, does it display something similar?

Yes:
--- snip ---
core '../cores/core.httpd.323.u0' of 323:       /usr/local/apache2/bin/httpd -f /etc/httpd2.conf
 ffffffff7ee2c880 apr_proc_wait_all_procs (ffffffff0000080e, ffffffff00000836, ffffffff0000083a, 1, 100133e58, 0) + 50
 00000001000802f0 ap_wait_or_timeout (ffffffff0000083a, ffffffff00000836, ffffffff0000080e, 100133e58, 100167508, 0) + b8
 0000000100063430 server_main_loop (3c0000003f, 410000003d, 3e00000003, 0, 700000043, 800000002) + 98
--- snap ---

Thanx,
 Andre'
-- 
Andre' Breiler              | Tel: +44 (0) 1628 407777
BBC Internet Services       | URL: http://support.bbc.co.uk
Maiden House, Vanwell Road  |
Maidenhead, SL6 4UB         | Mail me if possible. And use a Subject line.


Re: ap2 , parent process in worker mpm dies under load

Posted by Jeff Trawick <tr...@attglobal.net>.
Andre Breiler wrote:

> Hi,
>
> On Sun, 2 Mar 2003, Jeff Trawick wrote:
>
>
> >Andre Breiler wrote:
> >
> >
> >>the ap2 (2.0.43) parent process dies (but the children don't) under load.
> >>This is with worker mpm on solaris 8 (multiprocessor).

...

> --- snip 1 ---
> program terminated by signal BUS (Bus Error)
> 0xffffffff7ee2c880:
> Current function is ap_wait_or_timeout
>   222       rv = apr_proc_wait_all_procs(ret, exitcode, status, APR_NOWAIT, p);
> (/tool/lang9.1/SUNWspro/bin/../WS6U2/bin/sparcv9/dbx) where
>   [1] 0xffffffff7ee2c880(0xffffffff0000080e, 0xffffffff00000836, 0xffffffff0000083a, 0x1, 0x100133e58, 0x0), at 0xffffffff7ee2c87f
> =>[2] ap_wait_or_timeout(status = 0xffffffff0000083a, exitcode = 0xffffffff00000836, ret = 0xffffffff0000080e, p = 0x100133e58), line 222 in "mpm_common.c"
> dbx: warning: invalid frame pointer
> --- snap core.httpd.323.u0 ---

Can you reproduce the problem in 32-bit mode?

If you run pstack on the core, does it display something similar?

Thanks!



Re: ap2 , parent process in worker mpm dies under load

Posted by Andre Breiler <an...@rd.bbc.co.uk>.
Hi,

On Sun, 2 Mar 2003, Jeff Trawick wrote:

> Andre Breiler wrote:
> 
> > the ap2 (2.0.43) parent process dies (but the children don't) under load.
> > This is with worker mpm on solaris 8 (multiprocessor).
> 
> ...
> 
> > All seem to die with a SEGV or SIGBUS due to the fact that after returning
> > from a function call the registers have wrong values.
> 
> Can you post backtraces from the core dumps?

Sure:
--- snip 1 ---
program terminated by signal BUS (Bus Error)
0xffffffff7ee2c880:     <bad address 0xffffffff7ee2c880>
Current function is ap_wait_or_timeout
  222       rv = apr_proc_wait_all_procs(ret, exitcode, status, APR_NOWAIT, p);
(/tool/lang9.1/SUNWspro/bin/../WS6U2/bin/sparcv9/dbx) where 
  [1] 0xffffffff7ee2c880(0xffffffff0000080e, 0xffffffff00000836, 0xffffffff0000083a, 0x1, 0x100133e58, 0x0), at 0xffffffff7ee2c87f
=>[2] ap_wait_or_timeout(status = 0xffffffff0000083a, exitcode = 0xffffffff00000836, ret = 0xffffffff0000080e, p = 0x100133e58), line 222 in "mpm_common.c"
dbx: warning: invalid frame pointer
--- snap core.httpd.323.u0 ---

--- snip core.httpd.27942.u0 ---
program terminated by signal SEGV (Segmentation Fault)
0xffffffff7ee2c880:     <bad address 0xffffffff7ee2c880>
Current function is ap_wait_or_timeout
  222       rv = apr_proc_wait_all_procs(ret, exitcode, status, APR_NOWAIT, p);
(/tool/lang9.1/SUNWspro/bin/../WS6U2/bin/sparcv9/dbx) where
current thread: t@1
  [1] 0xffffffff7ee2c880(0x97ffff9b8, 0x97ffff9e0, 0x97ffff9e4, 0x1, 0x100133e58, 0x0), at 0xffffffff7ee2c87f
=>[2] ap_wait_or_timeout(status = 0x97ffff9e4, exitcode = 0x97ffff9e0, ret = 0x97ffff9b8, p = 0x100133e58), line 222 in "mpm_common.c"
dbx: warning: invalid frame pointer
--- snap core.httpd.27942.u0 ---

--- snip core.httpd.26969.u0 ---
program terminated by signal SEGV (Segmentation Fault)
Current function is server_main_loop
 1645           perform_idle_server_maintenance();
(/tool/lang9.1/SUNWspro/bin/../WS6U2/bin/sparcv9/dbx) where
current thread: t@1
=>[1] server_main_loop(remaining_children_to_start = 0), line 1645 in "worker.c"
  [2] ap_mpm_run(_pconf = 0x100133e58, plog = 0x10015dfa8, s = 0x1001634e8), line 1743 in "worker.c"
  [3] main(argc = 3, argv = 0xffffffff7ffffcf8), line 643 in "main.c"
--- snap core.httpd.26969.u0 ---

--- snip core.httpd.18016.u0 ---
program terminated by signal BUS (Bus Error)
Current function is server_main_loop
 1645           perform_idle_server_maintenance();
(/tool/lang9.1/SUNWspro/bin/../WS6U2/bin/sparcv9/dbx) where
current thread: t@1
=>[1] server_main_loop(remaining_children_to_start = 0), line 1645 in "worker.c"
  [2] ap_mpm_run(_pconf = 0x100133e58, plog = 0x10015dfa8, s = 0x100167508), line 1743 in "worker.c"
  [3] main(argc = 3, argv = 0xffffffff7ffffd08), line 643 in "main.c"
--- snap core.httpd.18016.u0 ---


The other cores follow the same pattern.
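
For context on the frame at line 222 of mpm_common.c in the traces above:
the parent calls apr_proc_wait_all_procs() with APR_NOWAIT, i.e. it polls
for any exited child without blocking. The sketch below shows that idea
with plain POSIX waitpid() and WNOHANG. It is only an illustration under
that assumption, not the APR or httpd code, and reap_one_child_nowait is a
hypothetical name made up for this example.

```c
#include <errno.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper (not part of httpd or APR): poll once for any
 * exited child without blocking.
 *
 * Returns  1 if a child was reaped (outputs are filled in),
 *          0 if children exist but none has exited yet,
 *         -1 on error (for example ECHILD when there are no children).
 */
int reap_one_child_nowait(pid_t *pid_out, int *code_out, int *signaled_out)
{
    int status = 0;
    pid_t pid;

    /* Retry if the call is interrupted by a signal. */
    do {
        pid = waitpid(-1, &status, WNOHANG);
    } while (pid == -1 && errno == EINTR);

    if (pid == 0) {
        return 0;               /* nothing to reap right now */
    }
    if (pid == -1) {
        return -1;              /* errno describes the failure */
    }

    *pid_out = pid;
    if (WIFEXITED(status)) {
        *signaled_out = 0;
        *code_out = WEXITSTATUS(status);   /* normal exit code */
    } else if (WIFSIGNALED(status)) {
        *signaled_out = 1;
        *code_out = WTERMSIG(status);      /* terminating signal number */
    } else {
        *signaled_out = 0;
        *code_out = -1;                    /* unexpected state */
    }
    return 1;
}
```

A parent loop would call this once per iteration and, when it returns 0,
go on to its periodic maintenance work instead of blocking.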

Bye Andre'


Re: ap2 , parent process in worker mpm dies under load

Posted by Jeff Trawick <tr...@attglobal.net>.
Andre Breiler wrote:

> Hi,
>
> the ap2 (2.0.43) parent process dies (but the children don't) under load.
> This is with worker mpm on solaris 8 (multiprocessor).

...

> All seem to die with a SEGV or SIGBUS due to the fact that after returning
> from a function call the registers have wrong values.

Can you post backtraces from the core dumps?