Posted to users@httpd.apache.org by Scott Severtson <ss...@digitalmeasures.com> on 2010/06/23 23:30:59 UTC

[users@httpd] Solaris 10/x64 worker graceful restart problem

On Solaris 10 u8, HTTPD 2.2.15 occasionally has one child process hang 
during a graceful restart.

Symptoms:
1. At debug-level logging, the error log shows:
[Wed Jun 23 14:38:21 2010] [debug] worker.c(1083): the listener thread didn't exit

I understand this message is not a major issue in itself 
(https://issues.apache.org/bugzilla/show_bug.cgi?id=9011), but it 
provides some insight into execution.

2. pstack of the hanging child shows the main thread is hanging while 
shutting down worker threads:

-----------------  lwp# 1 / thread# 1  --------------------
  fffffd7fff06cdea lwp_wait (3, fffffd7fffdff964)
  fffffd7fff063eee _thrp_join () + 3e
  fffffd7fff0640cc pthread_join () + 1c
  fffffd7fff27b195 apr_thread_join () + 25
  0000000000470a19 join_workers () + e9
  0000000000470de3 child_main () + 353
  0000000000471137 make_child () + 147
  0000000000471a6e ap_mpm_run () + 8be
  000000000042fd81 main () + 8b1
  000000000042f08c _start () + 6c
-----------------  lwp# 3 / thread# 3  --------------------
  fffffd7fff067527 lwp_park (0, 0, 0)
  fffffd7fff0610b9 cond_wait_queue () + 59
  fffffd7fff061647 _cond_wait () + 57
  fffffd7fff061676 cond_wait () + 26
  fffffd7fff0616b9 pthread_cond_wait () + 9
  0000000000472cc2 ap_queue_pop () + 72
  000000000047032d worker_thread () + 11d
  fffffd7fff06727b _thr_setup () + 5b
  fffffd7fff0674b0 _lwp_start ()
-----------------  lwp# 4 / thread# 4  --------------------
  fffffd7fff067527 lwp_park (0, 0, 0)
  fffffd7fff0610b9 cond_wait_queue () + 59
  fffffd7fff061647 _cond_wait () + 57
  fffffd7fff061676 cond_wait () + 26
  fffffd7fff0616b9 pthread_cond_wait () + 9
  0000000000472cc2 ap_queue_pop () + 72
  000000000047032d worker_thread () + 11d
  fffffd7fff06727b _thr_setup () + 5b
  fffffd7fff0674b0 _lwp_start ()

---SNIP---
...lots more threads in lwp_park(0, 0, 0)...
---SNIP---

-----------------  lwp# 28 / thread# 28  --------------------
  fffffd7fff06ce2a lwp_mutex_timedlock (fffffd7ffeee0000, 0)
  fffffd7fff05fb78 mutex_lock_internal () + 328
  fffffd7fff05ff62 mutex_lock_impl () + 112
  fffffd7fff06002b mutex_lock () + b
  fffffd7fff26e5a5 proc_mutex_proc_pthread_acquire () + 15
  000000000046ff4c listener_thread () + 3bc
  fffffd7fff06727b _thr_setup () + 5b
  fffffd7fff0674b0 _lwp_start ()
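
Since most of the threads were snipped above, a tally by innermost frame may 
be easier to read than the full dump. The following is a minimal sketch, 
assuming pstack output in the layout shown here (an "lwp# N / thread# N" 
header followed by "address function (args)" lines). The names top_frames 
and summarize are hypothetical helpers, not part of any Solaris or Apache 
tool.

```python
import re
from collections import Counter

# Header line such as: "-----  lwp# 3 / thread# 3  -----"
HEADER = re.compile(r"^-+\s+lwp#\s*(\d+)\s*/\s*thread#\s*(\d+)\s+-+$")
# Frame line such as: "  fffffd7fff067527 lwp_park (0, 0, 0)"
FRAME = re.compile(r"^\s*[0-9a-fA-F]+\s+(\w+)\s*\(")


def top_frames(pstack_text):
    """Map each lwp id to the function named in its first listed frame."""
    result = {}
    current = None
    for line in pstack_text.splitlines():
        header = HEADER.match(line.strip())
        if header:
            current = int(header.group(1))
            continue
        frame = FRAME.match(line)
        # Record only the first frame seen after each header.
        if frame and current is not None and current not in result:
            result[current] = frame.group(1)
    return result


def summarize(pstack_text):
    """Count how many threads are sitting in each innermost function."""
    return Counter(top_frames(pstack_text).values())
```

Feeding the saved pstack text to summarize() returns a Counter keyed by 
function name, so threads parked in lwp_park can be separated at a glance 
from the few that are blocked elsewhere.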

It appears that join_workers() is hanging on a call to 
apr_thread_join(...), in line 1104 of worker.c.
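
One possible reading of the trace, offered as an assumption and not checked 
against worker.c: the worker threads parked in ap_queue_pop() are only woken 
once the listener gets past the accept mutex, so a listener stuck in 
proc_mutex_proc_pthread_acquire() leaves the main thread's join waiting 
indefinitely. The sketch below models that dependency with Python's 
threading module. It is an illustration only, not httpd or APR code, and 
every name in it (run_model, accept_mutex, queue_cond) is hypothetical.

```python
import threading


def run_model(hold_seconds=0.2):
    """Model a join that cannot finish while the listener waits on a mutex."""
    accept_mutex = threading.Lock()      # stand-in for the accept mutex
    queue_cond = threading.Condition()   # stand-in for the worker queue condvar
    state = {"terminated": False}

    def listener():
        # Blocks for as long as someone else holds the accept mutex.
        with accept_mutex:
            pass
        # Only after that does the listener tell the workers to stop.
        with queue_cond:
            state["terminated"] = True
            queue_cond.notify_all()

    def worker():
        # Waits on the condition variable until the queue is terminated.
        with queue_cond:
            while not state["terminated"]:
                queue_cond.wait()

    accept_mutex.acquire()  # simulate another holder of the mutex
    worker_t = threading.Thread(target=worker, daemon=True)
    listener_t = threading.Thread(target=listener, daemon=True)
    worker_t.start()
    listener_t.start()

    # A bounded join stands in for the join that hangs in the trace.
    worker_t.join(timeout=hold_seconds)
    stuck = worker_t.is_alive()

    # Releasing the mutex lets the listener proceed and wake the worker.
    accept_mutex.release()
    worker_t.join(timeout=5)
    listener_t.join(timeout=5)
    finished = not (worker_t.is_alive() or listener_t.is_alive())
    return stuck, finished
```

In this model run_model() returns (True, True): the worker is still alive 
while the mutex is held, and both threads exit once it is released. Whether 
the real child behaves this way depends on who holds the accept mutex 
during the graceful restart, which the trace alone does not show.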


HTTPD was compiled with Solaris's default GCC (3.4.3), with the 
following flags:

CFLAGS="-O3 -m64 -march=athlon64"
LDFLAGS="-R$INSTALL_SSL/lib -L$INSTALL_SSL/lib"
./configure -C \
                 --prefix=$INSTALL \
                 --enable-mods-shared="deflate expires headers proxy proxy-ajp proxy-balancer proxy-connect proxy-http rewrite ssl usertrack dav status log-config logio" \
                 --with-ssl=$INSTALL_SSL \
                 --with-mpm=worker \
                 --enable-nonportable-atomics


Any thoughts? Is there any other information I can provide to help 
diagnose this issue?


Many thanks,
Scott Severtson
