Posted to dev@httpd.apache.org by Colm MacCárthaigh <co...@Redbrick.DCU.IE> on 2002/06/19 10:26:46 UTC

2.0.38-39 lockup problem ?

I haven't tracked down a cause yet, but this happens with the 2.0.38
tarball, the 2.0.39 one, and the current HEAD. I'm running Solaris
2.8 on sparc:

colmmacc@prodigy (~) $ uname -a 
SunOS prodigy 5.8 Generic_108528-13 sun4u sparc SUNW,Ultra-4

It's a nice and beefy E450. I was running 2.0.36 with my patches for
mod_cgid and mod_include applied (now in 2.0.39), with no problems whatsoever.
I used:

./configure --prefix=/local/web \
--enable-ssl \
--with-ssl=/usr/local/ssl \
--enable-module=unique_id \
--enable-module=env \
--enable-mods-shared=MAX \
--enable-rewrite \
--enable-info --enable-proxy --enable-proxy-ftp --enable-proxy-http \
--enable-cgid \
--enable-suexec --with-suexec-caller=www --with-suexec-docroot=/local/web \
--with-mpm=worker \
--with-suexec-bin=/local/web/bin/suexec \
--with-suexec-logfile=/local/web/logs/suexec_log \
--with-suexec-userdir=public_html \
--with-suexec-uidmin=100 --with-suexec-gidmin=100 --with-suexec-umask=077


first of all, with worker:

Compiled in modules:
  core.c
  mod_access.c
  mod_auth.c
  mod_include.c
  mod_log_config.c
  mod_env.c
  mod_setenvif.c
  mod_proxy.c
  proxy_connect.c
  proxy_ftp.c
  proxy_http.c
  mod_ssl.c
  worker.c
  http_core.c
  mod_mime.c
  mod_status.c
  mod_autoindex.c
  mod_asis.c
  mod_info.c
  mod_suexec.c
  mod_cgid.c
  mod_negotiation.c
  mod_dir.c
  mod_imap.c
  mod_actions.c
  mod_userdir.c
  mod_alias.c
  mod_rewrite.c
  mod_so.c

load averages:  3.40,  3.47,  3.45                                                         07:19:47
217 processes: 177 sleeping, 2 running, 2 zombie, 34 stopped, 2 on cpu     
CPU states:  0.0% idle, 91.2% user,  8.8% kernel,  0.0% iowait,  0.0% swap
Memory: 2048M real, 31M free, 2009M swap in use, 1524M swap free

   PID USERNAME THR PRI NICE  SIZE   RES STATE    TIME    CPU COMMAND
 18978 www        5   0    0  474M  415M cpu/3   91:43 45.04% httpd
  9847 www        4   0    0 1126M  961M run    279:03 44.73% httpd

This happened within a few hours of having it running. I took a look in
/proc/NNNN/fd and it didn't have anything too big open; truss -p revealed a
very quick stream of brk() calls.

in case it was a worker mpm problem, recompiled last night with prefork

Compiled in modules:
  core.c
  mod_access.c
  mod_auth.c
  mod_include.c
  mod_log_config.c
  mod_env.c
  mod_setenvif.c
  mod_proxy.c
  proxy_connect.c
  proxy_ftp.c
  proxy_http.c
  mod_ssl.c
  prefork.c
  http_core.c
  mod_mime.c
  mod_status.c
  mod_autoindex.c
  mod_asis.c
  mod_info.c
  mod_suexec.c
  mod_cgi.c
  mod_negotiation.c
  mod_dir.c
  mod_imap.c
  mod_actions.c
  mod_userdir.c
  mod_alias.c
  mod_rewrite.c
  mod_so.c

and :

load averages: 19.79, 21.14, 19.19                                                       01:43:00
291 processes: 229 sleeping, 19 running, 3 zombie, 38 stopped, 2 on cpu       
CPU states:  0.0% idle, 95.9% user,  4.1% kernel,  0.0% iowait,  0.0% swap
Memory: 2048M real, 729M free, 1126M swap in use, 2393M swap free

   PID USERNAME THR PRI NICE  SIZE   RES STATE    TIME    CPU COMMAND
 28884 www        3  22    0   52M   47M run      4:12  6.89% httpd
 17682 www        3  23    0   52M   47M run      4:12  6.57% httpd 
   511 www        3  22    0   52M   47M run      4:09  6.56% httpd
 29712 www        3  32    0   52M   48M cpu/1    4:16  6.50% httpd
 23149 www        3  23    0   52M   47M run      4:11  6.45% httpd
 26294 www        3  32    0   53M   48M run      4:16  6.42% httpd
   522 www        3  23    0   52M   47M run      4:13  6.41% httpd
   518 www        3  23    0   52M   47M run      4:13  6.38% httpd
 26313 www        3  32    0   53M   49M run      4:18  6.37% httpd
 29663 www        3  33    0   52M   47M run      4:10  6.35% httpd
   515 www        3  23    0   53M   48M run      4:19  6.34% httpd
   514 www        3  32    0   52M   47M run      4:13  6.33% httpd
   521 www        3  32    0   53M   47M run      4:15  6.17% httpd
   519 www        3  32    0   52M   47M run      4:14  6.07% httpd
   520 www        3  32    0   51M   46M run      4:08  6.07% httpd


Once again, truss reveals a whole big pile of brk() calls. I wasn't around
to kill these, so I didn't get to see much more.

So I'm now rebuilding without SSL, because I had some SSL-related problems
months ago. This is just to make the list aware that there may be issues. Is
anyone else on Solaris seeing anything like this? It seems odd; 2.0.36 was
running without any problems at all.

I don't think disabling SSL will fix it, so I'm going to wait for it
again, and try to gather as much as I can.

-- 
colmmacc@redbrick.dcu.ie        PubKey: colmmacc+pgp@redbrick.dcu.ie  
Web:                                 http://devnull.redbrick.dcu.ie/ 

Re: 2.0.38-39 lockup problem ?

Posted by Bill Stoddard <bi...@wstoddard.com>.
Just a hunch but you might try compiling in mod_cgid (with worker) from 2.0.36.

Bill



Re: 2.0.38-39 lockup problem ?

Posted by Colm MacCárthaigh <co...@Redbrick.DCU.IE>.
On Wed, Jun 19, 2002 at 05:09:28PM +0100, Colm MacCárthaigh wrote:
> In the meantime I'm continuing to trace and living without a significant
> portion of vhosts.

recursion, ouch.

OK, the problem is that after all of the rewriting, Apache hasn't found
anything, or a vhost, so it goes to return /error/HTTP_BAD_REQUEST.html.var.
This request gets reinserted and gets caught by the original rules; once
again the host still isn't there, triggering /error/HTTP_BAD_REQUEST.html.var
again, and so on and so forth.
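
A minimal sketch of the loop described above, as a toy model rather than anything taken from httpd. The function names, the `guard` option, and the `max_hops` cut-off are assumptions made for illustration; the affected server had no such cut-off, which is why it spun.

```python
import re

ERROR_DOC = "/error/HTTP_BAD_REQUEST.html.var"


def rule_matches(uri, known_host, guard=False):
    """Toy stand-in for the rewrite rules: fire on any URI when the Host is unknown."""
    if known_host:
        return False
    if guard and uri == ERROR_DOC:
        # Hypothetical exemption: leave the error document alone.
        return False
    return re.match(r"^(.+)$", uri) is not None


def serve(uri, known_host, guard=False, max_hops=10):
    """Follow internal redirects; return (hops, final_uri), final_uri None on give-up."""
    hops = 0
    while rule_matches(uri, known_host, guard):
        if hops == max_hops:
            return hops, None  # artificial limit, only present in this model
        uri = ERROR_DOC        # the failed rewrite is answered with the error document
        hops += 1
    return hops, uri


if __name__ == "__main__":
    print(serve("/", known_host=False))              # never settles
    print(serve("/", known_host=False, guard=True))  # settles on the error document
```

With the exemption in place the model takes a single hop to the error document and stops; without it, every hop is caught by the same rule again.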


*goes to mangle rewrite rules*

Anyone any idea why this suddenly happened with 2.0.39?

--
colmmacc@redbrick.dcu.ie        PubKey: colmmacc+pgp@redbrick.dcu.ie  
Web:                                 http://devnull.redbrick.dcu.ie/ 

Re: 2.0.38-39 lockup problem ?

Posted by Colm MacCárthaigh <co...@Redbrick.DCU.IE>.
On Wed, Jun 19, 2002 at 11:13:48AM -0400, Cliff Woolley wrote:
> On Wed, 19 Jun 2002, [iso-8859-1] Colm MacCárthaigh wrote:
> 
> > will do, I've abandoned the rebuild without ssl, I just want this
> > to happen again. Generally takes 3-4 hours to pop up, expect feedback
> > later.
> 
> FWIW, I haven't seen anything like this on 2.0.38 or 2.0.39 on icarus.
> Not that a problem doesn't exist, just as a data point.

OK, I've isolated the problem by using netcat to see what was going to
the server. The problem happens when the server gets queried with a Host: header
that it doesn't know about.

Turned off virtual hosts, didn't help. Turned off mod_rewrite, problem
went away. I'm redirecting some things based on Host.

Isolated to the following lines :

RewriteCond   %{HTTP_HOST}                      !^(.*)\,redbrick(.*)$ [NC]
RewriteRule   ^(.+)$                             %{HTTP_HOST}$1     [C]
RewriteRule   ^(.*):([0-9]+)/(.*)$              $1/$3            [C]
RewriteRule   ^(.+)/~(.*)$			http://www.redbrick.dcu.ie/~$2 [R,L]


well, more specifically, the second of those lines.

I also have:

RewriteMap lowercase int:tolower

RewriteCond   %{HTTP_HOST}                 ^.*\.redbrick\.dcu\.ie$ [NC]
RewriteCond   %{HTTP_HOST}                 !^w*\.redbrick\.dcu\.ie$   [NC]
RewriteCond   %{HTTP_HOST}                 !^enigma\.redbrick\.dcu\.ie$   [NC]
RewriteCond   %{HTTP_HOST}                 !^prodigy\.redbrick\.dcu\.ie$   [NC]
RewriteCond   %{HTTP_HOST}                 !^lists\.redbrick\.dcu\.ie$   [NC]
RewriteCond   %{HTTP_HOST}                 !^mailman\.redbrick\.dcu\.ie$   [NC]
RewriteRule   ^(.+)                        ${lowercase:%{HTTP_HOST}}$1          [C]

at the start of the same rewrites config file.

Just 

RewriteCond   %{HTTP_HOST}                      !^(.*)\,redbrick(.*)$ [NC]
RewriteRule   ^(.+)$                             %{HTTP_HOST}$1     [C]

on their own will cause the problem. Changing the rule to:

RewriteRule banana	monkey

makes it go away, so it isn't the condition. Tried the following rules:

RewriteRule   ^(.+)$                             %{HTTP_HOST}$1
RewriteRule   ^(.*)$                             %{HTTP_HOST}$1
RewriteRule   (.*)                               %{HTTP_HOST}$1
RewriteRule   .*                             	 %{HTTP_HOST}${REQUEST_URI}

all produce the problem; nothing suspicious is revealed by turning
rewrite logging on/up.

In the meantime I'm continuing to trace and living without a significant
portion of vhosts.

-- 
colmmacc@redbrick.dcu.ie        PubKey: colmmacc+pgp@redbrick.dcu.ie  
Web:                                 http://devnull.redbrick.dcu.ie/ 

Re: 2.0.38-39 lockup problem ?

Posted by Cliff Woolley <jw...@virginia.edu>.
On Wed, 19 Jun 2002, [iso-8859-1] Colm MacCárthaigh wrote:

> will do, I've abandoned the rebuild without ssl, I just want this
> to happen again. Generally takes 3-4 hours to pop up, expect feedback
> later.

FWIW, I haven't seen anything like this on 2.0.38 or 2.0.39 on icarus.
Not that a problem doesn't exist, just as a data point.

--Cliff


Re: 2.0.38-39 lockup problem ?

Posted by Colm MacCárthaigh <co...@Redbrick.DCU.IE>.
On Wed, Jun 19, 2002 at 01:32:00AM -0700, Justin Erenkrantz wrote:
> On Wed, Jun 19, 2002 at 09:26:46AM +0100, Colm MacCárthaigh wrote:
> > 
> > I haven't tracked down a cause yet, but this happens with the 2.0.38
> > tarball, the 2.0.39 one, and the current HEAD. I'm running Solaris
> > 2.8 on sparc, 
> 
> If you can, please run pstack on the running processes and report
> back where the process is.  -- justin

will do, I've abandoned the rebuild without ssl, I just want this
to happen again. Generally takes 3-4 hours to pop up, expect feedback
later.

-- 
colmmacc@redbrick.dcu.ie        PubKey: colmmacc+pgp@redbrick.dcu.ie  
Web:                                 http://devnull.redbrick.dcu.ie/ 

Re: 2.0.38-39 lockup problem ?

Posted by "Paul J. Reder" <re...@remulak.net>.
Sorry, didn't mean to sow seeds of confusion, that problem was fixed by
an SSL specific fix (as Ryan mentioned). The backtrace just looked
really familiar, and the scenario sounded very close. I was just
wondering if there were a higher level problem that was being tickled in
different ways. I just thought there was no sense fixing each of the
symptoms if there is a bigger problem, but Ryan pointed out that it
only *looked* the same. So, never mind...

Greg Ames wrote:

> "Paul J. Reder" wrote:
> 
>>This looks exactly like the problem that Allan and I ran into when you
>>tried to send a request to http://foo.bar.org:443 (i.e. insecure request
>>over the secure port). It tried to generate an error and went into an
>>infinite loop. 
>>
> 
> Can you try that with current HEAD and let us know what happens?
> 
> Thanks,
> Greg
> 
> 
> 


-- 
Paul J. Reder
-----------------------------------------------------------
"The strength of the Constitution lies entirely in the determination of each
citizen to defend it.  Only if every single citizen feels duty bound to do
his share in this defense are the constitutional rights secure."
-- Albert Einstein



RE: 2.0.38-39 lockup problem ?

Posted by Ryan Bloom <rb...@covalent.net>.
> From: gregames [mailto:gregames] On Behalf Of Greg Ames
> 
> "Paul J. Reder" wrote:
> >
> > This looks exactly like the problem that Allan and I ran into when you
> > tried to send a request to http://foo.bar.org:443 (i.e. insecure request
> > over the secure port). It tried to generate an error and went into an
> > infinite loop.
> 
> Can you try that with current HEAD and let us know what happens?

While the problems were similar, they were not caused by the same code,
and the SSL problem would not have been touched at all by this patch.
The problem with mod_ssl was fixed a while ago, and it never touched the
ap_die code.  Basically, in mod_ssl's post_read_request phase, we check
for a flag to determine if the request was HTTP over the HTTPS port.
The problem is that if we find the flag, then we do an internal
redirect, which also runs the post_read_request function.  Because the
flag wasn't cleared, we just returned another error, which caused
another internal_redirect, which ran the same phase, etc.
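
The mechanism described above can be sketched as a small model. This is not mod_ssl code: the `http_on_https` attribute, the function names, and the `limit` cut-off are hypothetical, and clearing the flag is shown only as one plausible way to break the cycle, not as the fix that was actually applied.

```python
class Request:
    def __init__(self, http_on_https):
        # Hypothetical flag: set when plain HTTP arrives on the HTTPS port.
        self.http_on_https = http_on_https


def post_read_request(req, clear_flag):
    """Model hook: return True when it wants an internal redirect to an error page."""
    if req.http_on_https:
        if clear_flag:
            req.http_on_https = False  # assumption: clearing stops the repeat
        return True
    return False


def count_redirects(req, clear_flag, limit=20):
    """Each internal redirect re-runs the hook; count redirects up to `limit`."""
    redirects = 0
    while redirects < limit and post_read_request(req, clear_flag):
        redirects += 1
    return redirects


if __name__ == "__main__":
    print(count_redirects(Request(True), clear_flag=False))  # hits the limit
    print(count_redirects(Request(True), clear_flag=True))   # one redirect only
```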

Ryan



Re: 2.0.38-39 lockup problem ?

Posted by Greg Ames <gr...@apache.org>.
"Paul J. Reder" wrote:
> 
> This looks exactly like the problem that Allan and I ran into when you
> tried to send a request to http://foo.bar.org:443 (i.e. insecure request
> over the secure port). It tried to generate an error and went into an
> infinite loop. 

Can you try that with current HEAD and let us know what happens?

Thanks,
Greg

Re: 2.0.38-39 lockup problem ?

Posted by "Paul J. Reder" <re...@remulak.net>.
This looks exactly like the problem that Allan and I ran into when you
tried to send a request to http://foo.bar.org:443 (i.e. insecure request
over the secure port). It tried to generate an error and went into an
infinite loop. That special case was fixed by removing the SSL request
handler from the loop. This may be a bigger problem than that specific
case (or this one).

I seem to recall that the error gets created as a new request in the
ap_internal_redirect case. This seemed to cause the code to process it
from scratch. This prompted the same error to occur, etc.

Perhaps something needs to be set up or tagged differently so that the
error doesn't go back through all the same code that generated the
error condition in the first place. Just a clueless guess on my part.

Paul J. Reder

Greg Ames wrote:

> Colm MacCárthaigh wrote:
> 
> 
>>also; anyone looking to replicate, I can produce it from a vanilla
>>install by adding:
>>
>>RewriteCond     %{HTTP_HOST}    !monkey$        [NC]
>>RewriteRule     ^(.+)$          banana
>>
>>at the bottom of the standard httpd.conf , a request with a host
>>header anything other than monkey will hang the server. I've
>>just rebuilt  2.0.36 and tested against the same vanilla config
>>and it doesn't happen.
>>
> 
> OK, with this config I can easily duplicate the problem on .39 but not .36. 
> ap_die lost its recursive error defense mechanism.  
> 
> We get there because mod_rewrite sees that there is a rule that maps any uri
> (including the error document uri) to "banana", then calls
> ap_os_is_path_absolute("banana").  It isn't absolute, so mod_rewrite sets a 400
> error and bails.  In .36, the browser gets the canned error strings that
> indicate recursive 400 errors.  In .39, we get:
> 
> #0  ap_die (type=400, r=0x860ddb8) at http_request.c:130
> #1  0x08066d59 in ap_internal_redirect (
>     new_uri=0x80e36c0 "/error/HTTP_BAD_REQUEST.html.var", r=0x85e2d10)
>     at http_request.c:482
> #2  0x08066675 in ap_die (type=400, r=0x85e2d10) at http_request.c:187
> #3  0x08066d59 in ap_internal_redirect (
>     new_uri=0x80e36c0 "/error/HTTP_BAD_REQUEST.html.var", r=0x85b9af8)
>     at http_request.c:482
> #4  0x08066675 in ap_die (type=400, r=0x85b9af8) at http_request.c:187
> #5  0x08066d59 in ap_internal_redirect (
>     new_uri=0x80e36c0 "/error/HTTP_BAD_REQUEST.html.var", r=0x858e590)
>     at http_request.c:482
> 
> etc etc
> 
> I just committed a fix to http_protocol.c::ap_die which backs out part of rev
> 1.145.  This restores the recursive error protection which is pretty important
> IMO.  We can rework the fix to 1.145 if we need to.
> 
> Greg
> 
> 
> 


-- 
Paul J. Reder
-----------------------------------------------------------
"The strength of the Constitution lies entirely in the determination of each
citizen to defend it.  Only if every single citizen feels duty bound to do
his share in this defense are the constitutional rights secure."
-- Albert Einstein



Re: 2.0.38-39 lockup problem ?

Posted by Greg Ames <gr...@apache.org>.
Colm MacCárthaigh wrote:

> also; anyone looking to replicate, I can produce it from a vanilla
> install by adding:
> 
> RewriteCond     %{HTTP_HOST}    !monkey$        [NC]
> RewriteRule     ^(.+)$          banana
> 
> at the bottom of the standard httpd.conf , a request with a host
> header anything other than monkey will hang the server. I've
> just rebuilt  2.0.36 and tested against the same vanilla config
> and it doesn't happen.

OK, with this config I can easily duplicate the problem on .39 but not .36. 
ap_die lost its recursive error defense mechanism.  

We get there because mod_rewrite sees that there is a rule that maps any uri
(including the error document uri) to "banana", then calls
ap_os_is_path_absolute("banana").  It isn't absolute, so mod_rewrite sets a 400
error and bails.  In .36, the browser gets the canned error strings that
indicate recursive 400 errors.  In .39, we get:

#0  ap_die (type=400, r=0x860ddb8) at http_request.c:130
#1  0x08066d59 in ap_internal_redirect (
    new_uri=0x80e36c0 "/error/HTTP_BAD_REQUEST.html.var", r=0x85e2d10)
    at http_request.c:482
#2  0x08066675 in ap_die (type=400, r=0x85e2d10) at http_request.c:187
#3  0x08066d59 in ap_internal_redirect (
    new_uri=0x80e36c0 "/error/HTTP_BAD_REQUEST.html.var", r=0x85b9af8)
    at http_request.c:482
#4  0x08066675 in ap_die (type=400, r=0x85b9af8) at http_request.c:187
#5  0x08066d59 in ap_internal_redirect (
    new_uri=0x80e36c0 "/error/HTTP_BAD_REQUEST.html.var", r=0x858e590)
    at http_request.c:482

etc etc

I just committed a fix to http_protocol.c::ap_die which backs out part of rev
1.145.  This restores the recursive error protection which is pretty important
IMO.  We can rework the fix to 1.145 if we need to.
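
The recursive error protection described above can be illustrated with a toy model. It is a sketch of the idea only, not the httpd source: the `Req` class, the `guarded` switch, and the `max_depth` cut-off are assumptions, and the check on the previous request's status is a simplification of whatever ap_die actually tests.

```python
ERROR_DOC = "/error/HTTP_BAD_REQUEST.html.var"


class Req:
    def __init__(self, uri, prev=None):
        self.uri = uri
        self.prev = prev   # the request that redirected to this one, if any
        self.status = 200


def handle(req, guarded, depth=0, max_depth=50):
    """Every URI fails with 400 here, mimicking the catch-all 'banana' rule."""
    return die(400, req, guarded, depth, max_depth)


def die(status, req, guarded, depth, max_depth):
    req.status = status
    # The defence: an error raised while serving an error document is not
    # redirected again; a canned response is sent instead.
    if guarded and req.prev is not None and req.prev.status != 200:
        return ("canned", depth)
    if depth >= max_depth:
        return ("runaway", depth)  # artificial stop, only present in this model
    return handle(Req(ERROR_DOC, prev=req), guarded, depth + 1, max_depth)


if __name__ == "__main__":
    print(handle(Req("/"), guarded=True))   # stops after one redirect
    print(handle(Req("/"), guarded=False))  # recurses until the model's cut-off
```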

Greg

Re: 2.0.38-39 lockup problem ?

Posted by Colm MacCárthaigh <co...@Redbrick.DCU.IE>.
On Wed, Jun 19, 2002 at 07:13:37PM +0100, Colm MacCárthaigh wrote:
> No idea, but I can tell you that it's definitely the mod_rewrite stuff
> that's causing the problem, I can produce it on demand now, by taking
> out the line.

also; anyone looking to replicate, I can produce it from a vanilla
install by adding:


RewriteCond     %{HTTP_HOST}    !monkey$        [NC]
RewriteRule     ^(.+)$          banana

at the bottom of the standard httpd.conf , a request with a host
header anything other than monkey will hang the server. I've
just rebuilt  2.0.36 and tested against the same vanilla config
and it doesn't happen. 

-- 
colmmacc@redbrick.dcu.ie        PubKey: colmmacc+pgp@redbrick.dcu.ie  
Web:                                 http://devnull.redbrick.dcu.ie/ 

Re: 2.0.38-39 lockup problem ?

Posted by Colm MacCárthaigh <co...@Redbrick.DCU.IE>.
On Wed, Jun 19, 2002 at 01:47:20PM -0400, Greg Ames wrote:
> Do you mean it never creates a corefile when you kill it?

I got beaten to restarting the server, so never got a chance to kill
it myself.

> If so, could you gcore it or attach to it with a debugger while it's looping?  

well I've fixed the problem by adding:

RewriteCond   %{REQUEST_URI} !/error/HTTP_BAD_REQUEST.html.var

but I've done it, I can create cores if you really want them. 
To save mail, the relevant details are here:

  output of pstack and gdb 
    http://redbrick.dcu.ie/~colmmacc/share/typescript	

  the rewrite log for the request
   http://redbrick.dcu.ie/~colmmacc/share/rewrite.log 

  The relevant configuration file
   http://redbrick.dcu.ie/~colmmacc/share/redbrick_rewrites.conf

The query I tried was:

GET / HTTP/1.0
Host: madeupcrap
\r\n

I cut the log short, because that one request logged over 1.3Mb of crud.
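
For anyone wanting to send the same query without netcat, here is a minimal sketch. The function names, the server address passed in, and the timeout are assumptions; against an affected build the read would be expected to time out rather than return a response.

```python
import socket


def build_request(host_header, path="/"):
    """Build an HTTP/1.0 request carrying an arbitrary Host header."""
    return f"GET {path} HTTP/1.0\r\nHost: {host_header}\r\n\r\n".encode("ascii")


def send_request(server, port, host_header, timeout=5.0):
    """Send the request and return the raw response bytes.

    Raises a timeout error if the server accepts the connection but never answers.
    """
    with socket.create_connection((server, port), timeout=timeout) as sock:
        sock.sendall(build_request(host_header))
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:
                break
            chunks.append(data)
    return b"".join(chunks)


if __name__ == "__main__":
    # Only prints the bytes that would be sent; call send_request() with a
    # test server's address to actually issue the query.
    print(build_request("madeupcrap"))
```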

> Obviously there's
> out-of-control recursion going on, but the pstack output looks a bit suspect
> (ap_internal_redirect never calls itself directly, and where's mod_rewrite?).

No idea, but I can tell you that it's definitely the mod_rewrite stuff
that's causing the problem, I can produce it on demand now, by taking
out the line.

-- 
colmmacc@redbrick.dcu.ie        PubKey: colmmacc+pgp@redbrick.dcu.ie  
Web:                                 http://devnull.redbrick.dcu.ie/ 

Re: 2.0.38-39 lockup problem ?

Posted by Greg Ames <gr...@apache.org>.
Colm MacCárthaigh wrote:
> 
> On Wed, Jun 19, 2002 at 02:42:54PM +0100, Colm MacCárthaigh wrote:
> > I haven't killed it just yet, and if anyone has anything else for
> > me to try within the next few minutes mail me. After that I'll
> > kill it, but with SEGV, and keep a corefile.
> 
> well it's dead now, but no corefile. It got nailed by a server
> restart first. Did get this though:

Do you mean it never creates a corefile when you kill it?  If so, could you
gcore it or attach to it with a debugger while it's looping?  Obviously there's
out-of-control recursion going on, but the pstack output looks a bit suspect
(ap_internal_redirect never calls itself directly, and where's mod_rewrite?).

Greg

Re: 2.0.38-39 lockup problem ?

Posted by Colm MacCárthaigh <co...@Redbrick.DCU.IE>.
On Wed, Jun 19, 2002 at 02:42:54PM +0100, Colm MacCárthaigh wrote:
> I haven't killed it just yet, and if anyone has anything else for
> me to try within the next few minutes mail me. After that I'll
> kill it, but with SEGV, and keep a corefile.

well it's dead now, but no corefile. It got nailed by a server
restart first. Did get this though:

root@prodigy (~) # pfiles 28021
28021:  /local/web/bin/httpd -k start -DSSL
  Current rlimit: 256 file descriptors
   0: S_IFCHR mode:0666 dev:85,0 ino:413987 uid:0 gid:0 rdev:13,2
      O_RDONLY
   1: S_IFCHR mode:0666 dev:85,0 ino:413987 uid:0 gid:0 rdev:13,2
      O_WRONLY
   2: S_IFREG mode:0644 dev:85,3 ino:413843 uid:0 gid:0 size:33349027
      O_RDWR|O_APPEND
   3: S_IFSOCK mode:0666 dev:229,0 ino:42073 uid:0 gid:0 size:0
      O_RDWR
        sockname: AF_INET6 ::  port: 80
   4: S_IFDOOR mode:0444 dev:234,0 ino:59569 uid:0 gid:0 size:0
      O_RDONLY|O_LARGEFILE FD_CLOEXEC  door to nscd[180]
   5: S_IFSOCK mode:0666 dev:229,0 ino:56304 uid:0 gid:0 size:0
      O_RDWR
        sockname: AF_INET6 ::  port: 443
   6: S_IFREG mode:0644 dev:85,3 ino:413840 uid:0 gid:0 size:5691392
      O_WRONLY|O_APPEND
   7: S_IFREG mode:0644 dev:85,3 ino:413840 uid:0 gid:0 size:5691392
      O_WRONLY|O_APPEND
   8: S_IFREG mode:0644 dev:85,3 ino:413902 uid:0 gid:0 size:0
      O_WRONLY|O_APPEND
   9: S_IFIFO mode:0000 dev:230,0 ino:24137914 uid:0 gid:0 size:1
      O_RDWR|O_NONBLOCK
  10: S_IFIFO mode:0000 dev:230,0 ino:24137914 uid:0 gid:0 size:0
      O_RDWR
  11: S_IFREG mode:0644 dev:85,3 ino:413843 uid:0 gid:0 size:33349027
      O_RDWR|O_APPEND
  12: S_IFREG mode:0644 dev:85,3 ino:413900 uid:0 gid:0 size:801
      O_RDWR|O_APPEND
  13: S_IFSOCK mode:0666 dev:229,0 ino:570 uid:0 gid:0 size:0
      O_RDWR|O_NONBLOCK
        sockname: AF_INET6 ::ffff:136.206.15.10  port: 80
        peername: AF_INET6 ::ffff:194.170.1.130  port: 48364
root@prodigy (~) # grep 194.170.1.130 ~www/logs/* 
root@prodigy (~) # netstat -an | grep 194.170.1.130
136.206.15.10.80     194.170.1.130.48364   1024      0 24616      0 ESTABLISHED

So I can't even tell what URI was being requested. 

-- 
colmmacc@redbrick.dcu.ie        PubKey: colmmacc+pgp@redbrick.dcu.ie  
Web:                                 http://devnull.redbrick.dcu.ie/ 

Re: 2.0.38-39 lockup problem ?

Posted by Colm MacCárthaigh <co...@Redbrick.DCU.IE>.
On Wed, Jun 19, 2002 at 01:32:00AM -0700, Justin Erenkrantz wrote:
> On Wed, Jun 19, 2002 at 09:26:46AM +0100, Colm MacCárthaigh wrote:
> > 
> > I haven't tracked down a cause yet, but this happens with the 2.0.38
> > tarball, the 2.0.39 one, and the current HEAD. I'm running Solaris
> > 2.8 on sparc, 
> 
> If you can, please run pstack on the running processes and report
> back where the process is.  -- justin

right here we go, happened again, prefork mpm, so mod_cgi. 

load averages:  2.07,  1.94,  1.72 14:38:42
468 processes: 421 sleeping, 2 running, 2 zombie, 41 stopped, 2 on cpu
CPU states: 37.9% idle, 56.3% user,  4.5% kernel,  1.3% iowait,  0.0% swap
Memory: 2048M real, 1284M free, 578M swap in use, 2944M swap free

   PID USERNAME THR PRI NICE  SIZE   RES STATE    TIME    CPU COMMAND
 28021 www        3   0    0   19M   14M cpu/1    0:37 38.95% httpd


apologies for the length, output of a pstack, and some of a truss -p
is included.

Script started on Wed Jun 19 14:38:59 2002
root@prodigy (~) # 
root@prodigy (~) # 
root@prodigy (~) # 
root@prodigy (~) # pstack 28021
28021:	/local/web/bin/httpd -k start -DSSL
-----------------  lwp# 1 / thread# 1  --------------------
 fee4f06c strcasecmp (10c7f88, 10dd280, 1ea2f0, 40c, ffbeb9fc, ffbeb980) + 38
 00057a3c rename_original_env (10c7f88, 10748e0, 1e9c58, 91c4a112, 1eaac8, 10c7cb0) + 68
 00057b58 internal_internal_redirect (132270, 1074480, 0, 1, 132270, c3129) + fc
 00057e6c ap_internal_redirect (132270, 1074480, 190, 1, 132270, c3129) + 8
 00057eb8 ap_internal_redirect (1074480, 1022068, 190, 1, 132270, c3129) + 54
 00057eb8 ap_internal_redirect (1022068, fd0f28, 190, 1, 132270, c3129) + 54
 00057eb8 ap_internal_redirect (fd0f28, f810c0, 190, 1, 132270, c3129) + 54
 00057eb8 ap_internal_redirect (f810c0, f32258, 190, 1, 132270, c3129) + 54
 00057eb8 ap_internal_redirect (f32258, ee46e8, 190, 1, 132270, c3129) + 54
 00057eb8 ap_internal_redirect (ee46e8, e98288, 190, 1, 132270, c3129) + 54
 00057eb8 ap_internal_redirect (e98288, e4d160, 190, 1, 132270, c3129) + 54
 00057eb8 ap_internal_redirect (e4d160, e030b8, 190, 1, 132270, c3129) + 54
 00057eb8 ap_internal_redirect (e030b8, db9ca0, 190, 1, 132270, c3129) + 54
 00057eb8 ap_internal_redirect (db9ca0, d71e00, 190, 1, 132270, c3129) + 54
 00057eb8 ap_internal_redirect (d71e00, d2b490, 190, 1, 132270, c3129) + 54
 00057eb8 ap_internal_redirect (d2b490, ce5bb0, 190, 1, 132270, c3129) + 54
 00057eb8 ap_internal_redirect (ce5bb0, ca1828, 190, 1, 132270, c3129) + 54
 00057eb8 ap_internal_redirect (ca1828, c5e168, 190, 1, 132270, c3129) + 54
 00057eb8 ap_internal_redirect (c5e168, c1bbc8, 190, 1, 132270, c3129) + 54
 00057eb8 ap_internal_redirect (c1bbc8, bdaac8, 190, 1, 132270, c3129) + 54
 00057eb8 ap_internal_redirect (bdaac8, b9ad68, 190, 1, 132270, c3129) + 54
 00057eb8 ap_internal_redirect (b9ad68, b5bd70, 190, 1, 132270, c3129) + 54
 00057eb8 ap_internal_redirect (b5bd70, b1df50, 190, 1, 132270, c3129) + 54
 00057eb8 ap_internal_redirect (b1df50, ae1130, 190, 1, 132270, c3129) + 54
 00057eb8 ap_internal_redirect (ae1130, aa5350, 190, 1, 132270, c3129) + 54
 00057eb8 ap_internal_redirect (aa5350, a6a520, 190, 1, 132270, c3129) + 54
 00057eb8 ap_internal_redirect (a6a520, a308d8, 190, 1, 132270, c3129) + 54
 00057eb8 ap_internal_redirect (a308d8, 9f7db0, 190, 1, 132270, c3129) + 54
 00057eb8 ap_internal_redirect (9f7db0, 9bff50, 190, 1, 132270, c3129) + 54
 00057eb8 ap_internal_redirect (9bff50, 989548, 190, 1, 132270, c3129) + 54
 00057eb8 ap_internal_redirect (989548, 953848, 190, 1, 132270, c3129) + 54
 00057eb8 ap_internal_redirect (953848, 91ecb0, 190, 1, 132270, c3129) + 54
 00057eb8 ap_internal_redirect (91ecb0, 8eb148, 190, 1, 132270, c3129) + 54
 00057eb8 ap_internal_redirect (8eb148, 8b6700, 190, 1, 132270, c3129) + 54
 00057eb8 ap_internal_redirect (8b6700, 885ad0, 190, 1, 132270, c3129) + 54
 00057eb8 ap_internal_redirect (885ad0, 855cc8, 190, 1, 132270, c3129) + 54
 00057eb8 ap_internal_redirect (855cc8, 826fa8, 190, 1, 132270, c3129) + 54
 00057eb8 ap_internal_redirect (826fa8, 7f8ef0, 190, 1, 132270, c3129) + 54
 00057eb8 ap_internal_redirect (7f8ef0, 7cbaa0, 190, 1, 132270, c3129) + 54
 00057eb8 ap_internal_redirect (7cbaa0, 79f738, 190, 1, 132270, c3129) + 54
 00057eb8 ap_internal_redirect (79f738, 774380, 190, 1, 132270, c3129) + 54
 00057eb8 ap_internal_redirect (774380, 74a150, 190, 1, 132270, c3129) + 54
 00057eb8 ap_internal_redirect (74a150, 720b90, 190, 1, 132270, c3129) + 54
 00057eb8 ap_internal_redirect (720b90, 6f8420, 190, 1, 132270, c3129) + 54
 00057eb8 ap_internal_redirect (6f8420, 6d0b50, 190, 1, 132270, c3129) + 54
 00057eb8 ap_internal_redirect (6d0b50, 6a9ed0, 190, 1, 132270, c3129) + 54
 00057eb8 ap_internal_redirect (6a9ed0, 683cf0, 190, 1, 132270, c3129) + 54
 00057eb8 ap_internal_redirect (683cf0, 65e888, 190, 1, 132270, c3129) + 54
 00057eb8 ap_internal_redirect (65e888, 63a128, 190, 1, 132270, c3129) + 54
 00057eb8 ap_internal_redirect (63a128, 616868, 190, 1, 132270, c3129) + 54
 00057eb8 ap_internal_redirect (616868, 5f3bf8, 190, 1, 132270, c3129) + 54
 00057eb8 ap_internal_redirect (5f3bf8, 5d1b70, 190, 1, 132270, c3129) + 54
 00057eb8 ap_internal_redirect (5d1b70, 5b05d0, 190, 1, 132270, c3129) + 54
 00057eb8 ap_internal_redirect (5b05d0, 58fe80, 190, 1, 132270, c3129) + 54
 00057eb8 ap_internal_redirect (58fe80, 570248, 190, 1, 132270, c3129) + 54
 00057eb8 ap_internal_redirect (570248, 5510b8, 190, 1, 132270, c3129) + 54
 00057eb8 ap_internal_redirect (5510b8, 532d60, 190, 1, 132270, c3129) + 54
 00057eb8 ap_internal_redirect (532d60, 515710, 190, 1, 132270, c3129) + 54
 00057eb8 ap_internal_redirect (515710, 4f86d8, 190, 1, 132270, c3129) + 54
 00057eb8 ap_internal_redirect (4f86d8, 4dc258, 190, 1, 132270, c3129) + 54
 00057eb8 ap_internal_redirect (4dc258, 4c0b98, 190, 1, 132270, c3129) + 54
 00057eb8 ap_internal_redirect (4c0b98, 4a6280, 190, 1, 132270, c3129) + 54
 00057eb8 ap_internal_redirect (4a6280, 48c218, 190, 1, 132270, c3129) + 54
 00057eb8 ap_internal_redirect (48c218, 472d18, 190, 1, 132270, c3129) + 54
 00057eb8 ap_internal_redirect (472d18, 45a460, 190, 1, 132270, c3129) + 54
 00057eb8 ap_internal_redirect (45a460, 4420f0, 190, 1, 132270, c3129) + 54
 00057eb8 ap_internal_redirect (4420f0, 42a940, 190, 1, 132270, c3129) + 54
 00057eb8 ap_internal_redirect (42a940, 413cb0, 190, 1, 132270, c3129) + 54
 00057eb8 ap_internal_redirect (413cb0, 3fd810, 190, 1, 132270, c3129) + 54
 00057eb8 ap_internal_redirect (3fd810, 3e7c00, 190, 1, 132270, c3129) + 54
 00057eb8 ap_internal_redirect (3e7c00, 3d2a98, 190, 1, 132270, c3129) + 54
 00057eb8 ap_internal_redirect (3d2a98, 3be498, 190, 1, 132270, c3129) + 54
 00057eb8 ap_internal_redirect (3be498, 3aa740, 190, 1, 132270, c3129) + 54
 00057eb8 ap_internal_redirect (3aa740, 3973e8, 190, 1, 132270, c3129) + 54
 00057eb8 ap_internal_redirect (3973e8, 384960, 190, 1, 132270, c3129) + 54
 00057eb8 ap_internal_redirect (384960, 372918, 190, 1, 132270, c3129) + 54
 00057eb8 ap_internal_redirect (372918, 361188, 190, 1, 132270, c3129) + 54
 00057eb8 ap_internal_redirect (361188, 350038, 190, 1, 132270, c3129) + 54
 00057eb8 ap_internal_redirect (350038, 33f7c0, 190, 1, 132270, c3129) + 54
 00057eb8 ap_internal_redirect (33f7c0, 32f780, 190, 1, 132270, c3129) + 54
 00057eb8 ap_internal_redirect (32f780, 31ff78, 190, 1, 132270, c3129) + 54
 00057eb8 ap_internal_redirect (31ff78, 311048, 190, 1, 132270, c3129) + 54
 00057eb8 ap_internal_redirect (311048, 3026c8, 190, 1, 132270, c3129) + 54
 00057eb8 ap_internal_redirect (3026c8, 2f33e8, 190, 1, 132270, c3129) + 54
 00057eb8 ap_internal_redirect (2f33e8, 2e6430, 190, 1, 132270, c3129) + 54
 00057eb8 ap_internal_redirect (2e6430, 2d9cc0, 190, 1, 132270, c3129) + 54
 00057eb8 ap_internal_redirect (2d9cc0, 2cdb20, 190, 1, 132270, c3129) + 54
 00057eb8 ap_internal_redirect (2cdb20, 2c20a8, 190, 1, 132270, c3129) + 54
 00057eb8 ap_internal_redirect (2c20a8, 2b6e30, 190, 1, 132270, c3129) + 54
 00057eb8 ap_internal_redirect (2b6e30, 2ac050, 190, 1, 132270, c3129) + 54
 00057eb8 ap_internal_redirect (2ac050, 2a1a70, 190, 1, 132270, c3129) + 54
 00057eb8 ap_internal_redirect (2a1a70, 297bb8, 190, 1, 132270, c3129) + 54
 00057eb8 ap_internal_redirect (297bb8, 28e2d0, 190, 1, 132270, c3129) + 54
 00057eb8 ap_internal_redirect (28e2d0, 284fa0, 190, 1, 132270, c3129) + 54
 00057eb8 ap_internal_redirect (284fa0, 27c288, 190, 1, 132270, c3129) + 54
 00057eb8 ap_internal_redirect (27c288, 273b28, 190, 1, 132270, c3129) + 54
 00057eb8 ap_internal_redirect (273b28, 26b998, 190, 1, 132270, c3129) + 54
 00057eb8 ap_internal_redirect (26b998, 263e48, 190, 1, 132270, c3129) + 54
 00057eb8 ap_internal_redirect (263e48, 25c7b0, 190, 1, 132270, c3129) + 54
 00057eb8 ap_internal_redirect (25c7b0, 2555b8, 190, 1, 132270, c3129) + 54
 00057eb8 ap_internal_redirect (2555b8, 24e958, 190, 1, 132270, c3129) + 54
 00057eb8 ap_internal_redirect (24e958, 2481b8, 190, 1, 132270, c3129) + 54
 00057eb8 ap_internal_redirect (2481b8, 241f28, 190, 1, 132270, c3129) + 54
 00057eb8 ap_internal_redirect (241f28, 23c108, 190, 1, 132270, c3129) + 54
 00057eb8 ap_internal_redirect (23c108, 2367b8, 190, 1, 132270, c3129) + 54
 00057eb8 ap_internal_redirect (2367b8, 231330, 190, 1, 132270, c3129) + 54
 00057eb8 ap_internal_redirect (231330, 22c250, 190, 1, 132270, c3129) + 54
 00057eb8 ap_internal_redirect (22c250, 227a00, 190, 1, 132270, c3129) + 54
 00057eb8 ap_internal_redirect (227a00, 222768, 190, 1, 132270, c3129) + 54
 00057eb8 ap_internal_redirect (222768, 21e758, 190, 1, 132270, c3129) + 54
 00057eb8 ap_internal_redirect (21e758, 21ab58, 190, 1, 132270, c3129) + 54
 00057eb8 ap_internal_redirect (21ab58, 217280, 190, 1, 132270, c3129) + 54
 00057eb8 ap_internal_redirect (217280, 213d08, 190, 1, 132270, c3129) + 54
 00057eb8 ap_internal_redirect (213d08, 210b00, 190, 1, 132270, c3129) + 54
 00057eb8 ap_internal_redirect (210b00, 20db98, 190, 1, 132270, c3129) + 54
 00057eb8 ap_internal_redirect (20db98, 20af68, 190, 1, 132270, c3129) + 54
 00057eb8 ap_internal_redirect (20af68, 2085c8, 190, 1, 132270, c3129) + 54
 00057eb8 ap_internal_redirect (2085c8, 205eb0, 190, 1, 132270, c3129) + 54
 00057eb8 ap_internal_redirect (205eb0, 203a30, 190, 1, 132270, c3129) + 54
 00057eb8 ap_internal_redirect (203a30, 2017f0, 190, 1, 132270, c3129) + 54
 00057eb8 ap_internal_redirect (2017f0, 1fb878, 190, 1, 132270, c3129) + 54
 00057eb8 ap_internal_redirect (1fb878, 1f57e8, 190, 1, 132270, c3129) + 54
 00057eb8 ap_internal_redirect (1f57e8, 1f3ea8, 190, 1, 132270, c3129) + 54
 00057eb8 ap_internal_redirect (1f3ea8, 1f6718, 190, 1, 132270, c3129) + 54
 00057eb8 ap_internal_redirect (1f6718, 1f3118, 190, 1, 132270, c3129) + 54
 00057eb8 ap_internal_redirect (1f3118, 1f1cc0, 190, 1, 132270, c3129) + 54
 00057eb8 ap_internal_redirect (1f1cc0, 1e6980, 190, 1, 132270, c3129) + 54
 00057eb8 ap_internal_redirect (1e6980, 1ef698, 190, 1, 132270, c3129) + 54
 00057eb8 ap_internal_redirect (1ef698, 1ee710, 190, 1, 132270, c3129) + 54
 00057eb8 ap_internal_redirect (1ee710, 1ed7f8, 190, 1, 132270, c3129) + 54
 00057eb8 ap_internal_redirect (1ed7f8, 1eca20, 190, 1, 132270, c3129) + 54
 00057eb8 ap_internal_redirect (1eca20, 1ebcf0, 190, 1, 132270, c3129) + 54
 00057eb8 ap_internal_redirect (1ebcf0, 1eb020, 190, 1, 132270, c3129) + 54
 00057eb8 ap_internal_redirect (1eb020, 1e9aa0, 190, bfc00, 4, a8000) + 54
 00057984 ap_process_request (1e9aa0, c8, 4, 1e9aa0, 0, 4) + 3c
 00052e5c ap_process_http_connection (1dd280, 52d98, c3050, 4, 1dd294, 1e3a50) + c4
 0007ba08 ap_run_process_connection (1dd280, 1dd1b0, 1dd1b0, 4, 1db1f8, 1e3a50) + 3c
 0006f4ac child_main (4, 1, c2400, 4e2e, bec00, c3400) + 35c
 0006f68c make_child (0, 4, a, 40, 4, 9) + f0
 0006f8c8 perform_idle_server_maintenance (cd3e0, 1, ffbef798, cd3e0, d0b88, a4cf0) + 168
 0006ff14 ap_mpm_run (0, c2000, 0, bec00, bec00, a6800) + 578
 000755c4 main     (d0b88, cb458, a69b8, a69c8, 0, 0) + 4c0
 0002cfd0 _start   (0, 0, 0, 0, 0, 0) + 5c
-----------------  lwp# 2 / thread# 2  --------------------
 fee9ba18 signotifywait ()
 fed9ed90 _dynamiclwps (fedbe000, ff0c3c71, ff2f03b4, a654bc4, ff051958, 0) + 1c
 feda206c thr_yield (0, 0, 0, 0, 0, 0) + 8c
-----------------  lwp# 3 / thread# 3  --------------------
 fee9c0c8 lwp_sema_wait (fdf0de30)
 fed99af4 _park    (fdf0de30, fedbe000, 0, fdf0dd70, 24d54, 0) + 114
 fed997bc _swtch   (fdf0dd70, fdf0dd70, fedbe000, 5, 1000, 1) + 424
 fed9ddf8 _reap_wait (fedc29e8, 204e4, 0, fedbe000, 0, 0) + 38
 fed9db50 _reaper  (fedbee38, fed85d10, fedc29e8, fedbee10, 0, fe400000) + 38
 fedab730 _thread_start (0, 0, 0, 0, 0, 0) + 40
root@prodigy (~) # truss -p 28021
brk(0x011C97E0)					= 0
brk(0x011CB7E0)					= 0
brk(0x011CB7E0)					= 0
brk(0x011CD7E0)					= 0
brk(0x011CD7E0)					= 0
brk(0x011D17E0)					= 0
brk(0x011D17E0)					= 0
brk(0x011D37E0)					= 0
brk(0x011D37E0)					= 0
brk(0x011D57E0)					= 0
brk(0x011D57E0)					= 0
brk(0x011D77E0)					= 0
brk(0x011D77E0)					= 0
brk(0x011D97E0)					= 0
brk(0x011D97E0)					= 0
brk(0x011DB7E0)					= 0
brk(0x011DB7E0)					= 0
brk(0x011DD7E0)					= 0
brk(0x011DD7E0)					= 0
brk(0x011DF7E0)					= 0
brk(0x011DF7E0)					= 0
brk(0x011E17E0)					= 0
brk(0x011E17E0)					= 0
brk(0x011E37E0)					= 0
brk(0x011E37E0)					= 0
brk(0x011E57E0)					= 0
brk(0x011E57E0)					= 0
brk(0x011E77E0)					= 0
brk(0x011E77E0)					= 0
brk(0x011E97E0)					= 0
brk(0x011E97E0)					= 0
brk(0x011EB7E0)					= 0
brk(0x011EB7E0)					= 0
brk(0x011ED7E0)					= 0
brk(0x011ED7E0)					= 0
brk(0x011EF7E0)					= 0
brk(0x011EF7E0)					= 0
brk(0x011F17E0)					= 0
brk(0x011F17E0)					= 0
brk(0x011F37E0)					= 0
brk(0x011F37E0)					= 0
brk(0x011F57E0)					= 0
brk(0x011F57E0)					= 0
brk(0x011F77E0)					= 0
brk(0x011F77E0)					= 0
brk(0x011F97E0)					= 0
brk(0x011F97E0)					= 0
brk(0x011FB7E0)					= 0
brk(0x011FB7E0)					= 0
brk(0x011FD7E0)					= 0
brk(0x011FD7E0)					= 0
brk(0x011FF7E0)					= 0
brk(0x011FF7E0)					= 0
brk(0x012017E0)					= 0
brk(0x012017E0)					= 0
brk(0x012037E0)					= 0
brk(0x012037E0)					= 0
brk(0x012057E0)					= 0
brk(0x012057E0)					= 0
brk(0x012077E0)					= 0
brk(0x012077E0)					= 0
brk(0x012097E0)					= 0
brk(0x012097E0)					= 0
brk(0x0120B7E0)					= 0
brk(0x0120B7E0)					= 0
brk(0x0120D7E0)					= 0
brk(0x0120D7E0)					= 0
brk(0x0120F7E0)					= 0
brk(0x0120F7E0)					= 0
brk(0x012117E0)					= 0
brk(0x012117E0)					= 0
brk(0x012137E0)					= 0
brk(0x012137E0)					= 0
brk(0x012157E0)					= 0
brk(0x012157E0)					= 0
brk(0x012177E0)					= 0
brk(0x012177E0)					= 0
brk(0x012197E0)					= 0
brk(0x012197E0)					= 0
brk(0x0121B7E0)					= 0
brk(0x0121B7E0)					= 0
brk(0x0121D7E0)					= 0
brk(0x0121D7E0)					= 0
brk(0x0121F7E0)					= 0
brk(0x0121F7E0)					= 0
brk(0x012217E0)					= 0
brk(0x012217E0)					= 0
brk(0x012237E0)					= 0
brk(0x012237E0)					= 0
brk(0x012257E0)					= 0
brk(0x012257E0)					= 0
brk(0x012277E0)					= 0
brk(0x012277E0)					= 0
brk(0x012297E0)					= 0
brk(0x012297E0)					= 0
brk(0x0122B7E0)					= 0
brk(0x0122B7E0)					= 0
brk(0x0122D7E0)					= 0
brk(0x0122D7E0)					= 0
brk(0x0122F7E0)					= 0
brk(0x0122F7E0)					= 0
brk(0x012317E0)					= 0
brk(0x012317E0)					= 0
brk(0x012337E0)					= 0
brk(0x012337E0)					= 0
brk(0x012357E0)					= 0
brk(0x012357E0)					= 0
brk(0x012377E0)					= 0
brk(0x012377E0)					= 0
brk(0x012397E0)					= 0
brk(0x012397E0)					= 0
brk(0x0123B7E0)					= 0
brk(0x0123B7E0)					= 0
brk(0x0123D7E0)					= 0
brk(0x0123D7E0)					= 0
brk(0x0123F7E0)					= 0
brk(0x0123F7E0)					= 0
brk(0x012417E0)					= 0
brk(0x012417E0)					= 0
brk(0x012437E0)					= 0
brk(0x012437E0)					= 0
brk(0x012457E0)					= 0
brk(0x012457E0)					= 0
brk(0x012477E0)					= 0
brk(0x012477E0)					= 0
brk(0x012497E0)					= 0
brk(0x012497E0)					= 0
brk(0x0124B7E0)					= 0
brk(0x0124B7E0)					= 0
brk(0x0124D7E0)					= 0
brk(0x0124D7E0)					= 0
brk(0x0124F7E0)					= 0
brk(0x0124F7E0)					= 0
brk(0x012517E0)					= 0
brk(0x012517E0)					= 0
brk(0x012537E0)					= 0
brk(0x012537E0)					= 0
brk(0x012557E0)					= 0
brk(0x012557E0)					= 0
brk(0x012577E0)					= 0
brk(0x012577E0)					= 0
brk(0x012597E0)					= 0
brk(0x012597E0)					= 0
brk(0x0125B7E0)					= 0
brk(0x0125B7E0)					= 0
brk(0x0125D7E0)					= 0
brk(0x0125D7E0)					= 0
brk(0x0125F7E0)					= 0
brk(0x0125F7E0)					= 0
brk(0x012617E0)					= 0
brk(0x012617E0)					= 0
brk(0x012637E0)					= 0
brk(0x012637E0)					= 0
brk(0x012657E0)					= 0
brk(0x012657E0)					= 0
brk(0x012677E0)					= 0
brk(0x012677E0)					= 0
brk(0x012697E0)					= 0
brk(0x012697E0)					= 0
brk(0x0126B7E0)					= 0
brk(0x0126B7E0)					= 0
brk(0x0126D7E0)					= 0
brk(0x0126D7E0)					= 0
brk(0x0126F7E0)					= 0
brk(0x0126F7E0)					= 0
brk(0x012717E0)					= 0
brk(0x012717E0)					= 0
brk(0x012737E0)					= 0
brk(0x012737E0)					= 0
brk(0x012757E0)					= 0
brk(0x012757E0)					= 0
brk(0x012777E0)					= 0
brk(0x012777E0)					= 0
brk(0x012797E0)					= 0
brk(0x012797E0)					= 0
brk(0x0127B7E0)					= 0
brk(0x0127B7E0)					= 0
brk(0x0127D7E0)					= 0
brk(0x0127D7E0)					= 0
brk(0x0127F7E0)					= 0
brk(0x0127F7E0)					= 0
brk(0x012837E0)					= 0
brk(0x012837E0)					= 0
brk(0x012857E0)					= 0
brk(0x012857E0)					= 0
brk(0x012877E0)					= 0
brk(0x012877E0)					= 0
brk(0x012897E0)					= 0
brk(0x012897E0)					= 0
brk(0x0128B7E0)					= 0
brk(0x0128B7E0)					= 0
brk(0x0128D7E0)					= 0
brk(0x0128D7E0)					= 0
brk(0x0128F7E0)					= 0
brk(0x0128F7E0)					= 0
brk(0x012917E0)					= 0
brk(0x012917E0)					= 0
brk(0x012937E0)					= 0
^Croot@prodigy (~) # exit

script done on Wed Jun 19 14:39:18 2002

I haven't killed it just yet, so if anyone has anything else for
me to try within the next few minutes, mail me. After that I'll
kill it with SEGV and keep the core file.
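
The pstack trace above is long mainly because one frame repeats. A
minimal sketch, assuming the pstack output has been saved as text, of
collapsing consecutive identical frames into counts. The names
collapse_frames and FRAME_RE are made up for illustration and are not
part of any Solaris or httpd tool. It does not separate threads, so
identical frames on either side of an lwp boundary would be merged.

```python
import re
from itertools import groupby

# Matches a pstack frame line such as:
#  00057eb8 ap_internal_redirect (1eca20, 1ebcf0, 190, ...) + 54
# Thread separator lines ("----- lwp# 2 ...") do not match and are skipped.
FRAME_RE = re.compile(r"^\s*[0-9a-f]+\s+(\S+)\s*\(")


def collapse_frames(pstack_text):
    """Return (function_name, consecutive_count) pairs, top of stack first."""
    names = []
    for line in pstack_text.splitlines():
        m = FRAME_RE.match(line)
        if m:
            names.append(m.group(1))
    return [(name, len(list(group))) for name, group in groupby(names)]


# A few frames copied from the trace above, used only as sample input.
sample = """\
 00057eb8 ap_internal_redirect (1eca20, 1ebcf0, 190, 1, 132270, c3129) + 54
 00057eb8 ap_internal_redirect (1ebcf0, 1eb020, 190, 1, 132270, c3129) + 54
 00057eb8 ap_internal_redirect (1eb020, 1e9aa0, 190, bfc00, 4, a8000) + 54
 00057984 ap_process_request (1e9aa0, c8, 4, 1e9aa0, 0, 4) + 3c
 0006f4ac child_main (4, 1, c2400, 4e2e, bec00, c3400) + 35c
"""

for name, count in collapse_frames(sample):
    print(f"{count:6d}  {name}")
```

Run against the full trace, this would show how deep the
ap_internal_redirect recursion goes without paging through every frame.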

-- 
colmmacc@redbrick.dcu.ie        PubKey: colmmacc+pgp@redbrick.dcu.ie  
Web:                                 http://devnull.redbrick.dcu.ie/ 

Re: 2.0.38-39 lockup problem ?

Posted by Justin Erenkrantz <je...@apache.org>.
On Wed, Jun 19, 2002 at 09:26:46AM +0100, Colm MacCárthaigh wrote:
> 
> I haven't tracked down a cause yet, but this happens with the 2.0.38
> tarball, the 2.0.39 one, and the current HEAD. I'm running Solaris
> 2.8 on sparc, 

If you can, please run pstack on the running processes and report
back where each process is stuck.  -- justin