Posted to users@httpd.apache.org by Travis Whitton <ti...@gmail.com> on 2010/11/29 23:25:47 UTC

[users@httpd] Connection Issues

Hi,

We're experiencing some odd behavior: connections to our website are
taking a long time to establish. We've been running Apache in
production for over three years now and have recently begun
experiencing issues where response times for the server-status page,
static content, and dynamic content slow anywhere from a few seconds
to long enough for the connection to time out.

Initially thinking we might be hitting some hard limits with the OS,
we've thoroughly audited our sysctl variables, tried disabling
iptables and conntrack, and ensured that we're not running out of
ephemeral ports or anything along those lines. Looking at netstat, it
seems we have a pretty large number of connections in TIME_WAIT which
is understandable since this is a high traffic website, but I'm
wondering if this value could indicate we're backlogging on TCP
connections or something along those lines?

[root@RHL073 ipv4]# netstat -an | awk '/^tcp/ {A[$(NF)]++} END {for (I in A) {printf "%5d %s\n", A[I], I}}'
34723 TIME_WAIT
    3 CLOSE_WAIT
  275 FIN_WAIT1
   74 FIN_WAIT2
 8824 ESTABLISHED
  815 SYN_RECV
  102 CLOSING
   30 LAST_ACK
   10 LISTEN

In an effort to tune things, I've tried playing with the TCP timeout
settings a bit, and the response times have improved somewhat. Please
note that I've been testing response times using the loopback
interface to rule out any ethernet hardware issues.

echo 15 > /proc/sys/net/ipv4/tcp_fin_timeout
echo 1 > /proc/sys/net/ipv4/tcp_tw_recycle
echo 1 > /proc/sys/net/ipv4/tcp_tw_reuse

We're running prefork, and have configured the client settings to what
seem to be reasonable limits for our hardware.

<IfModule prefork.c>
StartServers        100
MinSpareServers     100
MaxSpareServers     200
ServerLimit         1500
MaxClients          1500
MaxRequestsPerChild 1000000
</IfModule>

Any help or advice would be greatly appreciated.

-Travis

---------------------------------------------------------------------
The official User-To-User support forum of the Apache HTTP Server Project.
See <URL:http://httpd.apache.org/userslist.html> for more info.
To unsubscribe, e-mail: users-unsubscribe@httpd.apache.org
   "   from the digest: users-digest-unsubscribe@httpd.apache.org
For additional commands, e-mail: users-help@httpd.apache.org


Re: [users@httpd] Connection Issues

Posted by Rainer Jung <ra...@kippdata.de>.

Yes, having lots of connections in TIME_WAIT can have a serious impact
on TCP performance, so I think your approach is reasonable.
Unfortunately (at least that was true a few years ago), Linux does not
support setting a timeout value for TIME_WAIT, as e.g. Solaris does.
The docs about the reuse and recycle switches are also far from detailed.

Are you using HTTP Keep-Alive? Your high ESTABLISHED numbers suggest
that you are. If not, enabling it could reduce the TIME_WAIT numbers
too, but it comes at a price: you would get much higher ESTABLISHED
counts (and thus need even more httpd processes or threads, typically
about 5 times what you see without Keep-Alive).
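
For reference, Keep-Alive in httpd is controlled by three core
directives. A minimal sketch follows; the values are illustrative
assumptions, not recommendations for this site. The timeout is kept
short because with prefork every idle kept-alive connection occupies a
whole child process:

```apache
# Allow several requests per TCP connection.
KeepAlive On
# Close the connection after this many requests (illustrative value).
MaxKeepAliveRequests 100
# Seconds to wait for the next request before closing (illustrative value).
KeepAliveTimeout 2
```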

Regards,

Rainer



Re: [users@httpd] Connection Issues

Posted by Jeroen Geilman <je...@adaptr.nl>.

Forking new children is VERY expensive, compared to the alternatives.

If 1500 concurrent clients is common for your site, consider starting
that many children at startup as well.
MinSpareServers/MaxSpareServers are only meant to handle bursts, not to
define your normal load.
Your settings mean "accept up to 1500 concurrent connections, but only
RUN 300 children when you don't have that many clients".

Since Apache will have to fork up to 1200 children in rapid succession
when the load spikes, this will cause startup throttling after only a
few seconds, which is causing your timeouts.
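
To get a feel for how long such a ramp-up can take, here is a rough
sketch. It assumes that prefork's spawn rate starts at one child per
second, doubles every second, and is capped at 32 children per second.
That model and the function name spawn_seconds are assumptions for
illustration, so check them against your httpd version:

```shell
#!/bin/sh
# Estimate the seconds prefork needs to fork N additional children.
# Assumed model (not taken from this thread): the spawn rate starts at
# 1 child per second, doubles each second, and is capped at 32 per second.
spawn_seconds() {
    needed=$1
    rate=1
    spawned=0
    seconds=0
    while [ "$spawned" -lt "$needed" ]; do
        spawned=$((spawned + rate))
        seconds=$((seconds + 1))
        # Double the rate until the assumed cap of 32 is reached.
        if [ "$rate" -lt 32 ]; then
            rate=$((rate * 2))
        fi
    done
    echo "$seconds"
}

# Children missing when load jumps from ~300 running to 1500 needed.
spawn_seconds 1200   # prints 42
```

Under these assumptions, forking 1200 additional children takes about
42 seconds, during which new connections may have to queue.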

You should change these to AT LEAST 1000 for StartServers, 100 for
MinSpareServers and 200 for MaxSpareServers - if 1500 is your actual
max load, and not a limit you imposed because you think the server
can't handle more.
It can handle many more, if you have the memory for them.
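
Applied to the configuration from the original post, that suggestion
would look roughly like the following. Only StartServers changes; the
other values are carried over unchanged, and 1000 is the lower bound
suggested above, not a measured figure:

```apache
<IfModule prefork.c>
# Start close to the normal load instead of forking during a spike.
StartServers        1000
MinSpareServers     100
MaxSpareServers     200
ServerLimit         1500
MaxClients          1500
MaxRequestsPerChild 1000000
</IfModule>
```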

With 1500 concurrent connections, I would long ago have moved to worker 
combined with proxying dynamic content to a separate prefork instance.
This will optimize memory and resource usage to such an extent that you 
can easily run 5000 clients concurrently.
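
A minimal sketch of such a split setup follows. The thread counts, the
/app/ URL prefix and the backend address 127.0.0.1:8080 are all
hypothetical and only illustrate the shape of the configuration; size
them for your own hardware and URL layout:

```apache
# Front end: worker MPM serves static content directly.
# ServerLimit x ThreadsPerChild must be at least MaxClients (200 x 25 = 5000).
<IfModule worker.c>
StartServers        10
ServerLimit         200
ThreadsPerChild     25
MinSpareThreads     100
MaxSpareThreads     500
MaxClients          5000
</IfModule>

# Requires mod_proxy and mod_proxy_http to be loaded.
# Dynamic requests go to a separate prefork instance (hypothetical address).
ProxyPass        /app/ http://127.0.0.1:8080/app/
ProxyPassReverse /app/ http://127.0.0.1:8080/app/
```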

Worker threads are much more efficient and take far less memory than
prefork children, so they suffer far less from being short-lived
(due to low MaxRequestsPerChild settings).

Unless the majority of these requests are for dynamic content (they 
rarely are), I predict you can increase performance several fold.

-- 
J.

