Posted to bugs@httpd.apache.org by bu...@apache.org on 2008/02/11 23:47:06 UTC

DO NOT REPLY [Bug 42580] - Timeout when restarting or stopping Apache

http://issues.apache.org/bugzilla/show_bug.cgi?id=42580





------- Additional Comments From basant.kukreja@sun.com  2008-02-11 14:47 -------
Created an attachment (id=21511)
 --> (http://issues.apache.org/bugzilla/attachment.cgi?id=21511&action=view)
Patch

With the following settings:

Listen 192.168.21.1:81
Listen 192.168.22.1:81

<IfModule worker.c>
ListenBackLog	  50000
StartServers	     2
ThreadLimit	   5
ThreadsPerChild    5
MinSpareThreads    1
MaxSpareThreads    1
ThreadsPerChild    5
MaxClients	  10
MaxRequestsPerChild  0
</IfModule>

I was getting the following error in my error log:
[Mon Feb 11 13:33:52 2008] [error] (70007)The timeout specified has expired: apr_pollset_poll : (listen)

While debugging, I found that the listener thread was receiving a SIGHUP signal
while blocked in port_getn, where it waits for new connections. truss output
confirmed this:

5541/1: 	read(6, 0xFFFFFD7FFFDFF38B, 1)	(sleeping...)
5541/7: 	lwp_park(0x00000000, 0) 	(sleeping...)
5541/6: 	lwp_park(0x00000000, 0) 	(sleeping...)
5541/5: 	lwp_park(0x00000000, 0) 	(sleeping...)
5541/4: 	lwp_park(0x00000000, 0) 	(sleeping...)
5541/3: 	lwp_park(0x00000000, 0) 	(sleeping...)
5541/8: 	port_getn(12, 0x00539D30, 2, 1, 0x00000000) (sleeping...)
5543/1: 	read(6, 0xFFFFFD7FFFDFF38B, 1)	(sleeping...)
5543/3: 	lwp_park(0x00000000, 0) 	(sleeping...)
5543/4: 	lwp_park(0x00000000, 0) 	(sleeping...)
5543/5: 	lwp_park(0x00000000, 0) 	(sleeping...)
5543/6: 	lwp_park(0x00000000, 0) 	(sleeping...)
5543/7: 	lwp_park(0x00000000, 0) 	(sleeping...)
5543/8: 	fcntl(11, F_SETLKW, 0xFFFFFD7FFF241E10) (sleeping...)
5539/1: 	pollsys(0xFFFFFD7FFFDFF2E0, 0, 0xFFFFFD7FFFDFF3A0, 0x00000000) = 0
5539/1: 	write(7, " !", 1)				= 1
5541/1: 	read(6, " !", 1)				= 1
5541/1: 	lwp_wait(2, 0xFFFFFD7FFFDFF31C) 		= 0
5541/1: 	lwp_kill(8, SIGHUP)				= 0
5541/1: 	lwp_kill(8, SIG#0)				= 0
5541/8: 	    Received signal #1, SIGHUP, in port_getn() [caught]
5541/8: 	      siginfo: SIGHUP pid=5541 uid=101 code=-1
5541/8: 	port_getn(12, 0x00539D30, 2, 1, 0x00000000)	Err#4 EINTR
5541/8: 	lwp_sigmask(SIG_SETMASK, 0xFFBEE007, 0x0000FFF7) = 0xFFBFFEFF [0x0000FFFF]

Note that 5541/8 is the listener thread and 5541/1 is the thread that sent the
SIGHUP signal.
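The EINTR seen in the trace can be reproduced in miniature. The following is
only a sketch under stated assumptions: it uses poll() as a portable stand-in
for port_getn (which is Solaris-specific), a forked helper process in place of
thread 5541/1, and a hypothetical function name. It shows that a caught SIGHUP
makes a blocking wait return with errno set to EINTR rather than a timeout.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <poll.h>
#include <signal.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

/* Empty handler: the signal is "caught", as in the truss output. */
static void on_sighup(int signo) { (void)signo; }

/*
 * Hypothetical helper, not part of httpd or APR.
 * Blocks in poll() for up to 5 seconds while a helper process sends SIGHUP
 * after about 100 ms. Returns the errno left by poll(), 0 if poll() simply
 * timed out, or -1 if the setup failed.
 */
int errno_after_interrupted_poll(void)
{
    struct sigaction sa;
    memset(&sa, 0, sizeof(sa));
    sa.sa_handler = on_sighup;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = 0;
    if (sigaction(SIGHUP, &sa, NULL) != 0)
        return -1;

    pid_t parent = getpid();
    pid_t child = fork();
    if (child < 0)
        return -1;
    if (child == 0) {
        /* Helper: give the parent time to block, then signal it. */
        struct timespec delay = { 0, 100 * 1000 * 1000 };
        nanosleep(&delay, NULL);
        kill(parent, SIGHUP);
        _exit(0);
    }

    int rc = poll(NULL, 0, 5000);   /* no descriptors: pure blocking wait */
    int saved = (rc < 0) ? errno : 0;
    waitpid(child, NULL, 0);        /* reap the helper */
    return saved;
}
```

Calling errno_after_interrupted_poll() is expected to yield EINTR, which
corresponds to the "Err#4 EINTR" line for port_getn above.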

I then investigated why one of my processes was dying. I found the following
in the perform_idle_server_maintenance function in worker.c:

    if (idle_thread_count > max_spare_threads) {
	/* Kill off one child */
	ap_mpm_pod_signal(pod, TRUE);
	idle_spawn_rate = 1;
	..
Debugging showed that idle_thread_count was 10 and max_spare_threads was 6, so
this branch was taken. Hence the first thread was sending the SIGHUP signal to
shut the process down.
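The numbers can be checked with a small sketch. It assumes, based on the
documented behaviour of the worker MPM, that MaxSpareThreads is raised to at
least MinSpareThreads + ThreadsPerChild, which would explain why the effective
limit was 6 although the configuration says 1. The function names are
hypothetical and are not taken from worker.c.

```c
#include <assert.h>

/*
 * Hypothetical helper: effective upper bound on idle threads.
 * ASSUMPTION: worker raises MaxSpareThreads to at least
 * MinSpareThreads + ThreadsPerChild.
 */
int effective_max_spare(int min_spare, int max_spare, int threads_per_child)
{
    assert(threads_per_child > 0);
    int lower_bound = min_spare + threads_per_child;
    return max_spare < lower_bound ? lower_bound : max_spare;
}

/* Mirrors the condition quoted from perform_idle_server_maintenance. */
int would_kill_child(int idle_thread_count, int max_spare_threads)
{
    return idle_thread_count > max_spare_threads;
}
```

With MinSpareThreads 1, MaxSpareThreads 1 and ThreadsPerChild 5,
effective_max_spare() gives 6. Two idle children with 5 threads each give
idle_thread_count = 10, and 10 > 6, so a child is told to exit. With
MaxSpareThreads 10 the limit is 10, and 10 > 10 is false.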

When I increased MaxSpareThreads to 10, the error disappeared, because the
worker process no longer tried to kill itself:
<IfModule worker.c>
ListenBackLog	  50000
StartServers	     2
ThreadLimit	   5
ThreadsPerChild    5
MinSpareThreads    1
MaxSpareThreads    10
ThreadsPerChild    5
MaxClients	  10
MaxRequestsPerChild  0
</IfModule>

The error message is misleading, and it may or may not appear, depending on
timing. We get this error only if the listener thread has already reached the
point where it is blocked in port_getn. It seems to me that the server was
deliberately killing one of the child processes.

The cause of the problem is that the apr_pollset_poll function doesn't
distinguish between EINTR and ETIME. apr_pollset_poll contains ...
	if (errno == ETIME || errno == EINTR) {
	    rv = APR_TIMEUP;
	}

In worker.c, we do expect EINTR as a normal result, as shown below:
-------------- worker.c --------------------------
		rv = apr_pollset_poll(pollset, -1, &numdesc, &pdesc);
		if (rv != APR_SUCCESS) {
		    if (APR_STATUS_IS_EINTR(rv)) {
			continue;
		    }
--------------------------------------------------

If we treat EINTR as a known error and return APR_EINTR, as in the suggested
patch, the error disappears.
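The effect of that change can be sketched without APR. This is not the
attached patch, only a self-contained illustration. The sketch_status values,
both map_errno_* functions and listen_loop are hypothetical names. The loop
imitates the worker.c excerpt above, and a scripted array of errno values
stands in for successive poll attempts.

```c
#include <assert.h>
#include <errno.h>

#ifndef ETIME
#define ETIME ETIMEDOUT   /* fallback for platforms without ETIME */
#endif

/* Hypothetical stand-ins for APR status codes. */
enum sketch_status { SKETCH_SUCCESS = 0, SKETCH_TIMEUP, SKETCH_EINTR, SKETCH_OTHER };

/* Behaviour described in the report: both errors collapse into a timeout. */
enum sketch_status map_errno_current(int err)
{
    if (err == ETIME || err == EINTR)
        return SKETCH_TIMEUP;
    return SKETCH_OTHER;
}

/* Behaviour proposed in the report: EINTR keeps its own status. */
enum sketch_status map_errno_patched(int err)
{
    if (err == EINTR)
        return SKETCH_EINTR;
    if (err == ETIME)
        return SKETCH_TIMEUP;
    return SKETCH_OTHER;
}

/*
 * Listener-style loop: retries when the wait was interrupted and returns any
 * other status to the caller. errs[i] is the errno of attempt i (0 = success).
 * *attempts receives the number of attempts made.
 */
enum sketch_status listen_loop(const int *errs, int n,
                               enum sketch_status (*map)(int), int *attempts)
{
    assert(errs != 0 && n > 0 && attempts != 0);
    for (int i = 0; i < n; i++) {
        *attempts = i + 1;
        if (errs[i] == 0)
            return SKETCH_SUCCESS;
        enum sketch_status rv = map(errs[i]);
        if (rv == SKETCH_EINTR)
            continue;          /* interrupted: just poll again */
        return rv;             /* anything else is reported as an error */
    }
    return SKETCH_EINTR;       /* every scripted attempt was interrupted */
}
```

With the script { EINTR, 0 }, the current mapping stops after one attempt with
a timeout status, which matches the misleading log entry. The patched mapping
retries and succeeds on the second attempt.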
