You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@httpd.apache.org by Alexis Huxley <ah...@gmx.net> on 2004/02/21 12:00:25 UTC

"(97)Address family not supported by protocol" causes "disk ticking"?

I'm running apache 2.1dev, and I've posted to the -user list with
no response, so now I'm a bit stuck as to where next to try, so I'm
posting here.  Apologies if this is the wrong place.

About once every three weeks it happens that the hard disk of the
machine where I run apache2 starts audibly "ticking" - i.e. it starts
making some sort of non-cached access (maybe the log writing below?) at
a rate of once per second.

The server could still serve other pages without problem, so there
was no DoS.

The first few times it happened I rebooted, assuming the disk or
controller had got its knickers in a twist. The next times I shutdown
all of the locally added init.d scripts which fixed it. The next
time I shut down half of those init.d scripts which worked. Next
time half of that half, and so on until eventually I traced it down
to the httpd daemon, which was yesterday!

I checked the apache logs and I see this in error_log:

        [Mon Feb 16 23:35:33 2004] [warn] (97)Address family not supported by protocol: get socket to connect to listener
        [Mon Feb 16 23:35:34 2004] [warn] (97)Address family not supported by protocol: get socket to connect to listener
        [Mon Feb 16 23:35:35 2004] [warn] (97)Address family not supported by protocol: get socket to connect to listener
        [Mon Feb 16 23:35:36 2004] [warn] (97)Address family not supported by protocol: get socket to connect to listener
        [Mon Feb 16 23:35:37 2004] [warn] (97)Address family not supported by protocol: get socket to connect to listener
        ...

The first entry above was the first entry caused yesterday's 'incident',
but the log entries went on - at one second intervals - until I killed 
httpd.

I wondered if some access had triggered it so I checked the access_log
and immediately before 23:35:33 I have the following (sorry for very
long lines!):

        cache-mtc-aa06.proxy.aol.com - - [16/Feb/2004:23:35:16 +0100] "GET /~alexis/STABLE/hiking/tegernschlier/ HTTP/1.0" 200 8755 "http://www.google.com/search?q=Schliersee&hl=en&lr=&ie=UTF-8&start=310&sa=N" "Mozilla/4.0 (compatible; MSIE 6.0; AOL 8.0; Windows 98)"
        cache-mtc-ak04.proxy.aol.com - - [16/Feb/2004:23:35:18 +0100] "GET /~alexis/STABLE/css/standard.css HTTP/1.0" 200 5017 "http://dione.no-ip.org/~alexis/STABLE/hiking/tegernschlier/" "Mozilla/4.0 (compatible; MSIE 6.0; AOL 8.0; Windows 98)"
        cache-mtc-ab10.proxy.aol.com - - [16/Feb/2004:23:35:21 +0100] "GET /~alexis/STABLE/images/d_arrow_7rows.gif HTTP/1.0" 200 242 "http://dione.no-ip.org/~alexis/STABLE/hiking/tegernschlier/" "Mozilla/4.0 (compatible; MSIE 6.0; AOL 8.0; Windows 98)"
        cache-mtc-ab10.proxy.aol.com - - [16/Feb/2004:23:35:21 +0100] "GET /~alexis/STABLE/images/d_arrow_2rows.gif HTTP/1.1" 200 111 "http://dione.no-ip.org/~alexis/STABLE/hiking/tegernschlier/" "Mozilla/4.0 (compatible; MSIE 6.0; AOL 8.0; Windows 98)"
        cache-mtc-aa08.proxy.aol.com - - [16/Feb/2004:23:35:21 +0100] "GET /~alexis/STABLE/hiking/tegernschlier/images/tn-map.jpg HTTP/1.0" 200 3005 "http://dione.no-ip.org/~alexis/STABLE/hiking/tegernschlier/" "Mozilla/4.0 (compatible; MSIE 6.0; AOL 8.0; Windows 98)"
        cache-mtc-ad09.proxy.aol.com - - [16/Feb/2004:23:35:21 +0100] "GET /~alexis/STABLE/hiking/tegernschlier/images/tn-5_20021110-0981_de-gindelalmschneid-tegernsee_michael-light-snow.jpg HTTP/1.0" 200 2279 "http://dione.no-ip.org/~alexis/STABLE/hiking/tegernschlier/" "Mozilla/4.0 (compatible; MSIE 6.0; AOL 8.0; Windows 98)"
        cache-mtc-ad14.proxy.aol.com - - [16/Feb/2004:23:35:22 +0100] "GET /~alexis/STABLE/hiking/tegernschlier/images/tn-5_20021110-0982_de-gindelalmschneid-tegernsee_michael-medium-snow.jpg HTTP/1.0" 200 1478 "http://dione.no-ip.org/~alexis/STABLE/hiking/tegernschlier/" "Mozilla/4.0 (compatible; MSIE 6.0; AOL 8.0; Windows 98)"
        cache-mtc-ad08.proxy.aol.com - - [16/Feb/2004:23:35:22 +0100] "GET /~alexis/STABLE/hiking/tegernschlier/images/tn-5_20021110-0985_de-gindelalmschneid-gindelalm_michael-heavy-snow.jpg HTTP/1.0" 200 1487 "http://dione.no-ip.org/~alexis/STABLE/hiking/tegernschlier/" "Mozilla/4.0 (compatible; MSIE 6.0; AOL 8.0; Windows 98)"
        cache-mtc-ad13.proxy.aol.com - - [16/Feb/2004:23:35:23 +0100] "GET /~alexis/STABLE/hiking/tegernschlier/images/tn-5_20021110-0987_de-gindelalmschneid-au_sunny-lake.jpg HTTP/1.0" 200 2571 "http://dione.no-ip.org/~alexis/STABLE/hiking/tegernschlier/" "Mozilla/4.0 (compatible; MSIE 6.0; AOL 8.0; Windows 98)"
        cache-mtc-ac08.proxy.aol.com - - [16/Feb/2004:23:35:25 +0100] "GET /~alexis/STABLE/hiking/tegernschlier/images/tn-5_20021110-0989_de-gindelalmschneid-schwaig_view-of-schliersee.jpg HTTP/1.0" 200 2011 "http://dione.no-ip.org/~alexis/STABLE/hiking/tegernschlier/" "Mozilla/4.0 (compatible; MSIE 6.0; AOL 8.0; Windows 98)"
        cache-mtc-ab01.proxy.aol.com - - [16/Feb/2004:23:35:27 +0100] "GET /~alexis/STABLE/images/valid-html401.png HTTP/1.0" 200 2948 "http://dione.no-ip.org/~alexis/STABLE/hiking/tegernschlier/" "Mozilla/4.0 (compatible; MSIE 6.0; AOL 8.0; Windows 98)"
        cache-mtc-ah02.proxy.aol.com - - [16/Feb/2004:23:35:27 +0100] "GET /~alexis/STABLE/images/vcss.png HTTP/1.0" 200 1134 "http://dione.no-ip.org/~alexis/STABLE/hiking/tegernschlier/" "Mozilla/4.0 (compatible; MSIE 6.0; AOL 8.0; Windows 98)"
        cache-mtc-ac02.proxy.aol.com - - [16/Feb/2004:23:35:27 +0100] "GET /~alexis/STABLE/hiking/tegernschlier/images/tn-5_20021110-0991_de-gindelalmschneid-schliersee_lake-view-east.jpg HTTP/1.0" 200 2084 "http://dione.no-ip.org/~alexis/STABLE/hiking/tegernschlier/" "Mozilla/4.0 (compatible; MSIE 6.0; AOL 8.0; Windows 98)"

The pages on this host get hit only a dozen times per hour, so two
accesses with only ten seconds between them (I mean the last thing
in access_log and the first in error_log) probably are caused by the
same client.

I trust apache enough to that I have not found a bug :-) but what
is it? 

I can include the httpd.conf if necessary, but for the moment I'll
just post that I'm running:

        Apache/2.1.0-dev (Unix) DAV/2 SVN/0.36.0+ Server at dione.myth.net Port 80

(dione.myth.net is the name of the host on my home-net, and
dione.no-ip.org is the world-accessible name.)

But I have encountered the same problem since - give or take one minor
release - the time of 2.0.45.

Googling turns up no references to the above error. Has anybody any 
ideas what's going on? Many thanks!

Alexis Huxley
ahuxley@gmx.net

PS I posted the above about a week ago, and since then it has already
   happened again. Here is what was in access_log just before it,
   starting with the first entry made be the host making the accesses
   that trigger the behaviour:

	cache6-midd.server.ntli.net - - [20/Feb/2004:21:27:26 +0100] "GET /~alexis/STABLE/hiking/wank/images/tn-5_20030525-1261_de-wank-panorama_logging-and-lift.jpg HTTP/1.1" 200 3345 "http://dione.no-ip.org/~alexis/STABLE/hiking/wank/" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; Hotbar 4.3.5.0; .NET CLR 1.0.3705)"
	cache6-midd.server.ntli.net - - [20/Feb/2004:21:27:27 +0100] "GET /~alexis/STABLE/hiking/wank/images/tn-5_20030525-1268_de-wank-eckenberg_cloud-lars-paola.jpg HTTP/1.1" 200 1608 "http://dione.no-ip.org/~alexis/STABLE/hiking/wank/" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; Hotbar 4.3.5.0; .NET CLR 1.0.3705)"
	cache6-midd.server.ntli.net - - [20/Feb/2004:21:27:27 +0100] "GET /~alexis/STABLE/hiking/wank/images/tn-5_20030525-1265_de-wank-eckenberg_steps-lars-paola-karin.jpg HTTP/1.1" 200 2787 "http://dione.no-ip.org/~alexis/STABLE/hiking/wank/" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; Hotbar 4.3.5.0; .NET CLR 1.0.3705)"
	cache6-midd.server.ntli.net - - [20/Feb/2004:21:27:26 +0100] "GET /~alexis/STABLE/hiking/wank/images/tn-map.jpg HTTP/1.1" 200 2714 "http://dione.no-ip.org/~alexis/STABLE/hiking/wank/" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; Hotbar 4.3.5.0; .NET CLR 1.0.3705)"
	cache6-midd.server.ntli.net - - [20/Feb/2004:21:27:26 +0100] "GET /~alexis/STABLE/hiking/wank/images/tn-5_20030525-1264_de-wank_karin-paola-lars.jpg HTTP/1.1" 200 2220 "http://dione.no-ip.org/~alexis/STABLE/hiking/wank/" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; Hotbar 4.3.5.0; .NET CLR 1.0.3705)"
	cache6-midd.server.ntli.net - - [20/Feb/2004:21:27:26 +0100] "GET /~alexis/STABLE/images/d_arrow_7rows.gif HTTP/1.1" 200 242 "http://dione.no-ip.org/~alexis/STABLE/hiking/wank/" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; Hotbar 4.3.5.0; .NET CLR 1.0.3705)"
	cache6-midd.server.ntli.net - - [20/Feb/2004:21:27:26 +0100] "GET /~alexis/STABLE/images/d_arrow_2rows.gif HTTP/1.1" 200 111 "http://dione.no-ip.org/~alexis/STABLE/hiking/wank/" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; Hotbar 4.3.5.0; .NET CLR 1.0.3705)"
	cache6-midd.server.ntli.net - - [20/Feb/2004:21:27:27 +0100] "GET /~alexis/STABLE/hiking/wank/images/tn-5_20030525-1272_de-wank-eckenberg_view-of-hoher-fricken-bischof-krottenkopf.jpg HTTP/1.1" 200 2470 "http://dione.no-ip.org/~alexis/STABLE/hiking/wank/" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; Hotbar 4.3.5.0; .NET CLR 1.0.3705)"
	cache6-midd.server.ntli.net - - [20/Feb/2004:21:27:27 +0100] "GET /~alexis/STABLE/hiking/wank/images/tn-5_20030525-1270_de-wank-eckenberg_view-of-esterberg-alm.jpg HTTP/1.1" 200 2171 "http://dione.no-ip.org/~alexis/STABLE/hiking/wank/" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; Hotbar 4.3.5.0; .NET CLR 1.0.3705)"
	cache6-midd.server.ntli.net - - [20/Feb/2004:21:27:27 +0100] "GET /~alexis/STABLE/hiking/wank/images/tn-5_20030525-1274_de-wank-saddle_eckenberg.jpg HTTP/1.1" 200 1756 "http://dione.no-ip.org/~alexis/STABLE/hiking/wank/" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; Hotbar 4.3.5.0; .NET CLR 1.0.3705)"
	cache6-midd.server.ntli.net - - [20/Feb/2004:21:27:27 +0100] "GET /~alexis/STABLE/hiking/wank/images/tn-5_20030525-1277_de-wank-saddle_ameisberg.jpg HTTP/1.1" 200 2931 "http://dione.no-ip.org/~alexis/STABLE/hiking/wank/" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; Hotbar 4.3.5.0; .NET CLR 1.0.3705)"
	cache6-midd.server.ntli.net - - [20/Feb/2004:21:27:27 +0100] "GET /~alexis/STABLE/images/valid-html401.png HTTP/1.1" 200 2948 "http://dione.no-ip.org/~alexis/STABLE/hiking/wank/" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; Hotbar 4.3.5.0; .NET CLR 1.0.3705)"
	cache6-midd.server.ntli.net - - [20/Feb/2004:21:27:27 +0100] "GET /~alexis/STABLE/images/vcss.png HTTP/1.1" 200 1134 "http://dione.no-ip.org/~alexis/STABLE/hiking/wank/" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; Hotbar 4.3.5.0; .NET CLR 1.0.3705)"
	cache6-midd.server.ntli.net - - [20/Feb/2004:21:27:25 +0100] "GET /~alexis/STABLE/hiking/wank/ HTTP/1.0" 200 9304 "XXXX:++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; Hotbar 4.3.5.0; .NET CLR 1.0.3705)"
	cache6-midd.server.ntli.net - - [20/Feb/2004:21:27:30 +0100] "GET /~alexis/STABLE/css/standard.css HTTP/1.1" 200 5017 "http://dione.no-ip.org/~alexis/STABLE/hiking/wank/" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; Hotbar 4.3.5.0; .NET CLR 1.0.3705)"

-- end --


Re: "(97)Address family not supported by protocol" causes "disk ticking"?

Posted by Alexis Huxley <ah...@gmx.net>.
>> 	Process 1008 attached - interrupt to quit
>> 	select(0, NULL, NULL, NULL, {0, 620000}) = 0 (Timeout)
>> 	write(6, "!", 1)                        = 1
>> 	socket(PF_INET6, SOCK_STREAM, IPPROTO_IP) = -1 ENOSYS (Function not implemented)
>> 	socket(PF_INET6, SOCK_STREAM, IPPROTO_IP) = -1 ENOSYS (Function not implemented)
>> 	socket(PF_INET6, SOCK_STREAM, IPPROTO_IP) = -1 EAFNOSUPPORT (Address family not supported by protocol)
>> 	gettimeofday({1077743877, 15040}, NULL) = 0
>> 	write(7, "[Wed Feb 25 22:17:57 2004] [warn"..., 114) = 114
>> 	waitpid(-1, 0xbffffa00, WNOHANG|WUNTRACED) = 0
>> 	select(0, NULL, NULL, NULL, {1, 0})     = 0 (Timeout)
>> 	write(6, "!", 1)                        = 1
>> 	socket(PF_INET6, SOCK_STREAM, IPPROTO_IP) = -1 ENOSYS (Function not implemented)
>
>> 	.....

> aren't responding to the pipe of death.  Could it be that network traffic is 
> totally dried up during these periods?  or could the child processes be in the 

By "totally dried up" do you mean is the host getting fewer requests?
This machine gets about 20-30 requests per hour usually. 

> middle of some long running request?  server-status with ExtendedStatus On would 
> help answer those questions.

Okay, I added this to httpd.conf:

	<Location /server-status>
	    SetHandler server-status
	    Order Deny,Allow
	    Deny from all
	    Allow from 127.0.0.1
	</Location>
	ExtendedStatus on

and restarted httpd.

>  From the ps, it looks like you are using the prefork mpm...is that correct?

As per out-of-the-box httpd.conf, I have:

	<IfModule prefork.c>
	    StartServers          5
	    MinSpareServers       5
	    MaxSpareServers      10
	    MaxClients          150
	    MaxRequestsPerChild   0
	</IfModule>

> I'm guessing you hear the disk ticking once a second like clockwork when this 
> happens.

Exactly.

>> I'll post a bug on Saturday either way unless someone suggest otherwise. 
>
> sure, might as well.

Okay, now I'll wait till Sunday evening or the next occurence, whichever
is first, in order to try to get the info you requested into the bug
report at the same time.

Alexis


Re: "(97)Address family not supported by protocol" causes "disk ticking"?

Posted by gr...@apache.org.
Alexis Huxley wrote:
>>>>        [Mon Feb 16 23:35:33 2004] [warn] (97)Address family not supported by protocol: get socket to connect to listener
>>>>        [Mon Feb 16 23:35:34 2004] [warn] (97)Address family not supported by protocol: get socket to connect to listener
> 
> 
> Ok, it did it again. Here's the 'strace -p' output:
> 
> 	Script started on Wed Feb 25 22:17:44 2004
> 	dione:/tmp# ps fax | grep httpd
> 	 2141 pts/7    S      0:00                      \_ grep httpd
> 	 1008 ?        S      0:00 /usr/server/opt/apache2/bin/httpd -k start
> 	 1009 ?        S      0:00  \_ /usr/server/opt/apache2/bin/httpd -k start

> 	dione:/tmp# strace -p 1008
> 	Process 1008 attached - interrupt to quit
> 	select(0, NULL, NULL, NULL, {0, 620000}) = 0 (Timeout)
> 	write(6, "!", 1)                        = 1
> 	socket(PF_INET6, SOCK_STREAM, IPPROTO_IP) = -1 ENOSYS (Function not implemented)
> 	socket(PF_INET6, SOCK_STREAM, IPPROTO_IP) = -1 ENOSYS (Function not implemented)
> 	socket(PF_INET6, SOCK_STREAM, IPPROTO_IP) = -1 EAFNOSUPPORT (Address family not supported by protocol)
> 	gettimeofday({1077743877, 15040}, NULL) = 0
> 	write(7, "[Wed Feb 25 22:17:57 2004] [warn"..., 114) = 114
> 	waitpid(-1, 0xbffffa00, WNOHANG|WUNTRACED) = 0
> 	select(0, NULL, NULL, NULL, {1, 0})     = 0 (Timeout)
> 	write(6, "!", 1)                        = 1
> 	socket(PF_INET6, SOCK_STREAM, IPPROTO_IP) = -1 ENOSYS (Function not implemented)

> 	.....
> 
> ... and so on.
> 
> I hope that helps. Any ideas? Thanks.

hmmm, looks like the httpd parent is trying to shut down a child process 
gracefully after a decrease in traffic.  write(6, "!", 1) means it's using the 
pipe of death.  Since waitpid() is returning 0, it looks the child processes 
aren't responding to the pipe of death.  Could it be that network traffic is 
totally dried up during these periods?  or could the child processes be in the 
middle of some long running request?  server-status with ExtendedStatus On would 
help answer those questions.

I'm guessing the series of socket calls after that is where we pretend we are a 
browser and try to establish dummy connections in an attempt to wake up a 
potential victim child which could be blocked in accept().  dunno what's wrong 
with the socket() calls, why they say PF_INET6, or why they come in groups of three.

 From the ps, it looks like you are using the prefork mpm...is that correct?

I'm guessing you hear the disk ticking once a second like clockwork when this 
happens.

> I'll post a bug on Saturday either way unless someone suggest otherwise. 

sure, might as well.

Greg


Re: "(97)Address family not supported by protocol" causes "disk ticking"?

Posted by "William A. Rowe, Jr." <wr...@rowe-clan.net>.
At 03:40 PM 2/25/2004, Alexis Huxley wrote:
>>>>         [Mon Feb 16 23:35:33 2004] [warn] (97)Address family not supported by protocol: get socket to connect to listener

the ticking is an unexpected hard flush when APR_APPEND causes
win32 to file lock for write.

Bill



Re: "(97)Address family not supported by protocol" causes "disk ticking"?

Posted by Alexis Huxley <ah...@gmx.net>.
>>>         [Mon Feb 16 23:35:33 2004] [warn] (97)Address family not supported by protocol: get socket to connect to listener
>>>         [Mon Feb 16 23:35:34 2004] [warn] (97)Address family not supported by protocol: get socket to connect to listener

Ok, it did it again. Here's the 'strace -p' output:

	Script started on Wed Feb 25 22:17:44 2004
	dione:/tmp# ps fax | grep httpd
	 2141 pts/7    S      0:00                      \_ grep httpd
	 1008 ?        S      0:00 /usr/server/opt/apache2/bin/httpd -k start
	 1009 ?        S      0:00  \_ /usr/server/opt/apache2/bin/httpd -k start
	 1010 ?        S      0:00  \_ /usr/server/opt/apache2/bin/httpd -k start
	 1011 ?        S      0:00  \_ /usr/server/opt/apache2/bin/httpd -k start
	 1012 ?        S      0:00  \_ /usr/server/opt/apache2/bin/httpd -k start
	 1013 ?        S      0:00  \_ /usr/server/opt/apache2/bin/httpd -k start
	 1059 ?        S      0:00  \_ /usr/server/opt/apache2/bin/httpd -k start
	 1060 ?        S      0:00  \_ /usr/server/opt/apache2/bin/httpd -k start
	 1080 ?        S      0:00  \_ /usr/server/opt/apache2/bin/httpd -k start
	 1081 ?        S      0:00  \_ /usr/server/opt/apache2/bin/httpd -k start
	 1082 ?        S      0:00  \_ /usr/server/opt/apache2/bin/httpd -k start
	 1332 ?        S      0:00  \_ /usr/server/opt/apache2/bin/httpd -k start
	dione:/tmp# strace -p 1008
	Process 1008 attached - interrupt to quit
	select(0, NULL, NULL, NULL, {0, 620000}) = 0 (Timeout)
	write(6, "!", 1)                        = 1
	socket(PF_INET6, SOCK_STREAM, IPPROTO_IP) = -1 ENOSYS (Function not implemented)
	socket(PF_INET6, SOCK_STREAM, IPPROTO_IP) = -1 ENOSYS (Function not implemented)
	socket(PF_INET6, SOCK_STREAM, IPPROTO_IP) = -1 EAFNOSUPPORT (Address family not supported by protocol)
	gettimeofday({1077743877, 15040}, NULL) = 0
	write(7, "[Wed Feb 25 22:17:57 2004] [warn"..., 114) = 114
	waitpid(-1, 0xbffffa00, WNOHANG|WUNTRACED) = 0
	select(0, NULL, NULL, NULL, {1, 0})     = 0 (Timeout)
	write(6, "!", 1)                        = 1
	socket(PF_INET6, SOCK_STREAM, IPPROTO_IP) = -1 ENOSYS (Function not implemented)
	socket(PF_INET6, SOCK_STREAM, IPPROTO_IP) = -1 ENOSYS (Function not implemented)
	socket(PF_INET6, SOCK_STREAM, IPPROTO_IP) = -1 EAFNOSUPPORT (Address family not supported by protocol)
	gettimeofday({1077743878, 64645}, NULL) = 0
	write(7, "[Wed Feb 25 22:17:58 2004] [warn"..., 114) = 114
	waitpid(-1, 0xbffffa00, WNOHANG|WUNTRACED) = 0
	select(0, NULL, NULL, NULL, {1, 0})     = 0 (Timeout)
	write(6, "!", 1)                        = 1
	socket(PF_INET6, SOCK_STREAM, IPPROTO_IP) = -1 ENOSYS (Function not implemented)
	socket(PF_INET6, SOCK_STREAM, IPPROTO_IP) = -1 ENOSYS (Function not implemented)
	socket(PF_INET6, SOCK_STREAM, IPPROTO_IP) = -1 EAFNOSUPPORT (Address family not supported by protocol)
	gettimeofday({1077743879, 118433}, NULL) = 0
	write(7, "[Wed Feb 25 22:17:59 2004] [warn"..., 114) = 114
	waitpid(-1, 0xbffffa00, WNOHANG|WUNTRACED) = 0
	select(0, NULL, NULL, NULL, {1, 0})     = 0 (Timeout)
	.....

... and so on.

I hope that helps. Any ideas? Thanks.

I'll post a bug on Saturday either way unless someone suggest otherwise. 

Alexis

PS The frequency that this is happening at is definitely going up.


Re: "(97)Address family not supported by protocol" causes "disk ticking"?

Posted by Alexis Huxley <ah...@gmx.net>.
>> About once every three weeks it happens that the hard disk of the
>> machine where I run apache2 starts audibly "ticking" - i.e. it starts
>> making some sort of non-cached access (maybe the log writing below?) at
>> a rate of once per second.

>>         [Mon Feb 16 23:35:33 2004] [warn] (97)Address family not supported by protocol: get socket to connect to listener
>>         [Mon Feb 16 23:35:34 2004] [warn] (97)Address family not supported by protocol: get socket to connect to listener

> OK, what platform are you on, what Listen directives are you using, and

	Linux dione 2.4.16 #1 Sat Mar 9 19:04:14 CET 2002 i686 GNU/Linux

It's a Debian 'testing' box, but apache2 is locally compiled because
I'm running Subversion too.

The only listen directive is a 'Listen 80', not nested inside anything.

( I've posted the complete httpd.conf at 
http://dione.no-ip.org/~alexis/tmp-nopasswd/ticking-httpd.conf. The
changes made to the out-of-the-box httpd.conf are documented at 
http://tinyurl.com/2ngjx )

> are you using IPv6 on the box at all?  It would be useful to run "strace

No, not running v6 - or at least:

	dione$ telnet ::1 
	Trying ::1...
	telnet: Unable to connect to remote host: Address family not supported by protocol

so I presume that means I'm not (right? 'telnet 127.0.0.1' gives a normal
'connection' refused' 'cos telnetd not running).

> -p" on the pid of the parent once the server gets into this state.

Okay; the 'strace -p' will take some more time, because I have no way 
of provoking this behaviour; I just have to wait for it to happen.

Is there any other info I can provide before it happens, or anything
else I should collect when it does? If there please let me know, and
if there isn't I'll keep quiet until I've collected said information
and then post/cc again.

Thanks for the assistance ...

Alexis


Re: "(97)Address family not supported by protocol" causes "disk ticking"?

Posted by Joe Orton <jo...@redhat.com>.
Hi,

On Sat, Feb 21, 2004 at 11:00:25AM +0000, Alexis Huxley wrote:
...
> About once every three weeks it happens that the hard disk of the
> machine where I run apache2 starts audibly "ticking" - i.e. it starts
> making some sort of non-cached access (maybe the log writing below?) at
> a rate of once per second.
> 
> The server could still serve other pages without problem, so there
> was no DoS.
...
> I checked the apache logs and I see this in error_log:
> 
>         [Mon Feb 16 23:35:33 2004] [warn] (97)Address family not supported by protocol: get socket to connect to listener
>         [Mon Feb 16 23:35:34 2004] [warn] (97)Address family not supported by protocol: get socket to connect to listener
...

OK, what platform are you on, what Listen directives are you using, and
are you using IPv6 on the box at all?  It would be useful to run "strace
-p" on the pid of the parent once the server gets into this state.

Regards,

joe

Re: "(97)Address family not supported by protocol" causes "disk ticking"?

Posted by Joe Orton <jo...@redhat.com>.
On Sat, Feb 21, 2004 at 11:00:25AM +0000, Alexis Huxley wrote:
> I'm running apache 2.1dev, and I've posted to the -user list with
> no response, so now I'm a bit stuck as to where next to try, so I'm
> posting here.  Apologies if this is the wrong place.

The bug is that the POD code is doing a name lookup on the first bound
address, which doesn't seem to make much sense since the addresses of
bound listeners are determined just before the POD is opened.

It breaks on Linux systems where getaddrinfo will return AF_INET6
addresses but socket(AF_INET6...) fails.  There's also some interaction
with the new alloc_listener logic on HEAD I presume since nobody has
complained about this with 2.0, not quite sure there.

One-liner is below, proper fix is to remove the sa field from ap_pod_t
and always use ap_listeners->bind_addr in its place, which I'll do
unless anyone can think of a good reason to do a name lookup here.

Index: server/mpm_common.c
===================================================================
RCS file: /home/cvs/httpd-2.0/server/mpm_common.c,v
retrieving revision 1.117
diff -u -r1.117 mpm_common.c
--- server/mpm_common.c	25 Feb 2004 11:55:31 -0000	1.117
+++ server/mpm_common.c	4 Mar 2004 16:39:48 -0000
@@ -366,8 +366,7 @@
     apr_file_pipe_timeout_set((*pod)->pod_in, 0);
     (*pod)->p = p;
 
-    apr_sockaddr_info_get(&(*pod)->sa, ap_listeners->bind_addr->hostname,
-                          APR_UNSPEC, ap_listeners->bind_addr->port, 0, p);
+    (*pod)->sa = ap_listeners->bind_addr;
 
     /* close these before exec. */
     apr_file_inherit_unset((*pod)->pod_in);