Posted to users@httpd.apache.org by Scott Severtson <ss...@digitalmeasures.com> on 2010/06/23 23:30:59 UTC
[users@httpd] Solaris 10/x64 worker graceful restart problem
On Solaris 10 u8, HTTPD 2.2.15 occasionally has one child process hang
during a graceful restart.
Symptoms:
1. At debug-level logging, the error log shows:
[Wed Jun 23 14:38:21 2010] [debug] worker.c(1083): the listener thread didn't exit
I understand this is not a major issue in itself
(https://issues.apache.org/bugzilla/show_bug.cgi?id=9011), but it gives
some insight into what the child was doing when it hung.
2. pstack of the hanging child shows the main thread is hanging while
shutting down worker threads:
----------------- lwp# 1 / thread# 1 --------------------
fffffd7fff06cdea lwp_wait (3, fffffd7fffdff964)
fffffd7fff063eee _thrp_join () + 3e
fffffd7fff0640cc pthread_join () + 1c
fffffd7fff27b195 apr_thread_join () + 25
0000000000470a19 join_workers () + e9
0000000000470de3 child_main () + 353
0000000000471137 make_child () + 147
0000000000471a6e ap_mpm_run () + 8be
000000000042fd81 main () + 8b1
000000000042f08c _start () + 6c
----------------- lwp# 3 / thread# 3 --------------------
fffffd7fff067527 lwp_park (0, 0, 0)
fffffd7fff0610b9 cond_wait_queue () + 59
fffffd7fff061647 _cond_wait () + 57
fffffd7fff061676 cond_wait () + 26
fffffd7fff0616b9 pthread_cond_wait () + 9
0000000000472cc2 ap_queue_pop () + 72
000000000047032d worker_thread () + 11d
fffffd7fff06727b _thr_setup () + 5b
fffffd7fff0674b0 _lwp_start ()
----------------- lwp# 4 / thread# 4 --------------------
fffffd7fff067527 lwp_park (0, 0, 0)
fffffd7fff0610b9 cond_wait_queue () + 59
fffffd7fff061647 _cond_wait () + 57
fffffd7fff061676 cond_wait () + 26
fffffd7fff0616b9 pthread_cond_wait () + 9
0000000000472cc2 ap_queue_pop () + 72
000000000047032d worker_thread () + 11d
fffffd7fff06727b _thr_setup () + 5b
fffffd7fff0674b0 _lwp_start ()
---SNIP---
...lots more threads in lwp_park(0, 0, 0)...
---SNIP---
----------------- lwp# 28 / thread# 28 --------------------
fffffd7fff06ce2a lwp_mutex_timedlock (fffffd7ffeee0000, 0)
fffffd7fff05fb78 mutex_lock_internal () + 328
fffffd7fff05ff62 mutex_lock_impl () + 112
fffffd7fff06002b mutex_lock () + b
fffffd7fff26e5a5 proc_mutex_proc_pthread_acquire () + 15
000000000046ff4c listener_thread () + 3bc
fffffd7fff06727b _thr_setup () + 5b
fffffd7fff0674b0 _lwp_start ()
It appears that join_workers() is hanging on a call to
apr_thread_join(...), in line 1104 of worker.c.
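
To make the shape of the trace above easier to reason about, here is a
minimal, generic pthread sketch: workers blocked in pthread_cond_wait()
inside a queue pop, and a main thread that joins them. This is not httpd's
code. The names demo_queue, demo_queue_pop, demo_queue_term, demo_worker and
run_demo are hypothetical and exist only for illustration. What it shows is
that the join returns only once every waiting worker has been woken with a
termination flag set; if that wake-up were skipped or missed, the joining
thread would block much like lwp# 1 does in the pstack output.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

#define DEMO_MAX_WORKERS 64

/* Hypothetical stand-in for a worker queue; NOT httpd's own queue type. */
typedef struct {
    pthread_mutex_t lock;
    pthread_cond_t  not_empty;
    int             pending;     /* number of queued items */
    bool            terminated;  /* set once at shutdown */
} demo_queue;

/* Block until an item is available or the queue is terminated.
 * Returns true if an item was taken, false once the queue is
 * terminated and empty. */
static bool demo_queue_pop(demo_queue *q)
{
    bool got = false;

    pthread_mutex_lock(&q->lock);
    while (q->pending == 0 && !q->terminated) {
        /* Workers park here, like the lwp_park frames in the trace. */
        pthread_cond_wait(&q->not_empty, &q->lock);
    }
    if (q->pending > 0) {
        q->pending--;
        got = true;
    }
    pthread_mutex_unlock(&q->lock);
    return got;
}

/* Mark the queue terminated and wake every waiting worker. */
static void demo_queue_term(demo_queue *q)
{
    pthread_mutex_lock(&q->lock);
    q->terminated = true;
    pthread_cond_broadcast(&q->not_empty);
    pthread_mutex_unlock(&q->lock);
}

static void *demo_worker(void *arg)
{
    demo_queue *q = arg;

    while (demo_queue_pop(q)) {
        /* A real worker would process the item here. */
    }
    return NULL;
}

/* Start nworkers threads, terminate the queue, and join them all.
 * Returns the number of threads joined, or -1 for a bad argument. */
int run_demo(int nworkers)
{
    pthread_t  tids[DEMO_MAX_WORKERS];
    demo_queue q;
    int        i, joined = 0;

    if (nworkers < 1 || nworkers > DEMO_MAX_WORKERS) {
        return -1;
    }
    pthread_mutex_init(&q.lock, NULL);
    pthread_cond_init(&q.not_empty, NULL);
    q.pending = 0;
    q.terminated = false;

    for (i = 0; i < nworkers; i++) {
        int rc = pthread_create(&tids[i], NULL, demo_worker, &q);
        assert(rc == 0);
        (void)rc;
    }

    /* Without this call, the joins below would never return. */
    demo_queue_term(&q);

    for (i = 0; i < nworkers; i++) {
        if (pthread_join(tids[i], NULL) == 0) {
            joined++;
        }
    }
    pthread_cond_destroy(&q.not_empty);
    pthread_mutex_destroy(&q.lock);
    return joined;
}
```

Calling run_demo(4) from a small main() and linking with the pthread
library should return 4. Removing the demo_queue_term() call reproduces a
join that never completes.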
HTTPD was compiled with Solaris's default GCC (3.4.3), with the
following flags:
CFLAGS="-O3 -m64 -march=athlon64"
LDFLAGS="-R$INSTALL_SSL/lib -L$INSTALL_SSL/lib"
./configure -C \
--prefix=$INSTALL \
--enable-mods-shared="deflate expires headers proxy proxy-ajp proxy-balancer proxy-connect proxy-http rewrite ssl usertrack dav status log-config logio" \
--with-ssl=$INSTALL_SSL \
--with-mpm=worker \
--enable-nonportable-atomics
Any thoughts? Is there any other information I can provide to help diagnose
this issue?
Many thanks,
Scott Severtson