Posted to bugs@httpd.apache.org by bu...@apache.org on 2017/03/31 14:59:41 UTC
[Bug 60948] New: Large TCP timeout delays hcheck disabling a node
https://bz.apache.org/bugzilla/show_bug.cgi?id=60948
Bug ID: 60948
Summary: Large TCP timeout delays hcheck disabling a node
Product: Apache httpd-2
Version: 2.4.25
Hardware: Sun
OS: Solaris
Status: NEW
Severity: enhancement
Priority: P2
Component: mod_proxy_hcheck
Assignee: bugs@httpd.apache.org
Reporter: michael.renz@kordoba.de
Target Milestone: ---
Created attachment 34892
--> https://bz.apache.org/bugzilla/attachment.cgi?id=34892&action=edit
added new hcconnectiontimeout parameter
Using the latest patched mod_proxy_hcheck (with the patch from bug 60071), I
encountered a problematic situation.
If a node goes down due to a complete failure and is no longer reachable via
TCP/IP, the long Solaris TCP/IP timeout causes mod_proxy_hcheck to DISABLE the
node very late.
mod_proxy_hcheck does not provide a connection-timeout parameter to shorten
this.
On top of that, the thread pool defined via ProxyHCTPsize quickly fills up,
because every available thread ends up waiting for the timeout. The workaround
is to increase ProxyHCTPsize to e.g. 500. But the problem remains that once the
node goes down, it is not DISABLED until the first timeout has been reached.
Solaris has a timeout of about 120s, so the problematic node will still get
requests during this time. These requests will run into the
"connectiontimeout", but this is still not a good situation, as it slows down
many requests.
I have patched (well, mostly copy/paste) mod_proxy_hcheck.c and added a new
parameter called "hcconnectiontimeout". With this new parameter my tests look
good now.
An example configuration would look like this:
SSLProxyEngine On
SSLProxyVerify none
SSLProxyCheckPeerCN off
SSLProxyCheckPeerName off
SSLProxyCheckPeerExpire off
ProxyHCTPsize 400
ProxyHCExpr get {hc('body') =~ /OK/}
ProxyHCTemplate server hcmethod=GET hcexpr=get hcfails=1 hcinterval=2 \
    hcpasses=1 hcuri=/tester
<Proxy balancer://group>
    # hcconnectiontimeout is the new parameter added by the attached patch
    BalancerMember https://192.168.0.2:8080 connectiontimeout=1 \
        hcconnectiontimeout=1 hctemplate=server
    BalancerMember https://192.168.0.3:8080 connectiontimeout=1 \
        hcconnectiontimeout=1 hctemplate=server
</Proxy>
<VirtualHost *:80>
    ProxyPass "/" "balancer://group/" failontimeout=On timeout=2
    ProxyPassReverse "/" "balancer://group/"
</VirtualHost>
I hope this helps someone.
--
You are receiving this mail because:
You are the assignee for the bug.
---------------------------------------------------------------------
To unsubscribe, e-mail: bugs-unsubscribe@httpd.apache.org
For additional commands, e-mail: bugs-help@httpd.apache.org
[Bug 60948] Large TCP timeout delays hcheck disabling a node
Posted by bu...@apache.org.
https://bz.apache.org/bugzilla/show_bug.cgi?id=60948
--- Comment #4 from goldyliang@gmail.com ---
This fix seems to address only the case where there is high latency in the TCP
connection.
However, we have now encountered a case where the back-end server does not
respond to the health check HTTP request, due to an internal issue or resource
exhaustion, after the connection has been established and the request has been
sent. In such a case, the health check from httpd seems to hang there without
any read timeout.
The consequence is that the load balancer is not able to mark a backend as
HCFL, and it seems to block the health checks towards other balancer members
as well. In this case, the health check is not able to function as it should.
Please advise whether my understanding is right. If yes, how can we add a read
timeout for health checks?
[Bug 60948] Large TCP timeout delays hcheck disabling a node
Posted by bu...@apache.org.
https://bz.apache.org/bugzilla/show_bug.cgi?id=60948
--- Comment #3 from jfclere <jf...@gmail.com> ---
It is confusing to have both connectiontimeout and hcconnectiontimeout.
I have committed http://svn.apache.org/viewvc?view=revision&revision=1862014
If that helps, please close the BZ.
[Bug 60948] Large TCP timeout delays hcheck disabling a node
Posted by bu...@apache.org.
https://bz.apache.org/bugzilla/show_bug.cgi?id=60948
--- Comment #7 from Aaron Ogburn <ao...@redhat.com> ---
If you need different timeouts for the hcheck (shorter) and the proxied
requests (longer), I saw that this can be achieved by setting a short
ProxyTimeout for the hcheck and a longer timeout parameter on the
BalancerMembers.
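As a rough illustration of that suggestion, here is a minimal sketch, not a
tested configuration. It assumes that the health check picks up the
server-wide ProxyTimeout as described above, which may depend on the httpd
version in use. The values 5 and 300 are made up for illustration; the
template name, hcuri and addresses are borrowed from the reporter's example.

```apache
SSLProxyEngine On

# Assumption: the health check falls back to this server-wide value,
# so keep it short (seconds).
ProxyTimeout 5

ProxyHCTemplate server hcmethod=GET hcuri=/tester hcinterval=2

<Proxy balancer://group>
    # The per-member timeout (seconds) applies to proxied requests and
    # overrides ProxyTimeout for these workers; 300 is illustrative.
    BalancerMember https://192.168.0.2:8080 timeout=300 hctemplate=server
    BalancerMember https://192.168.0.3:8080 timeout=300 hctemplate=server
</Proxy>
```

Any proxied traffic that does not go through a worker with its own timeout
would also fall back to the short ProxyTimeout, so this only fits setups where
every backend is declared with an explicit timeout.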
[Bug 60948] Large TCP timeout delays hcheck disabling a node
Posted by bu...@apache.org.
https://bz.apache.org/bugzilla/show_bug.cgi?id=60948
--- Comment #2 from Thomas Meyer <th...@m3y3r.de> ---
Hi, any updates on this?
An independent timeout for the health check HTTP request would be really
helpful!
The patch looks okay. Is there anything I can do to get this merged?
[Bug 60948] Large TCP timeout delays hcheck disabling a node
Posted by bu...@apache.org.
https://bz.apache.org/bugzilla/show_bug.cgi?id=60948
Michael Renz <mi...@fsphost.com> changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
  Attachment #34892|0                           |1
        is obsolete|                            |
--- Comment #1 from Michael Renz <mi...@fsphost.com> ---
Created attachment 34893
--> https://bz.apache.org/bugzilla/attachment.cgi?id=34893&action=edit
I forgot to allow it in ProxyHCTemplate; the parameter is now optional.
[Bug 60948] Large TCP timeout delays hcheck disabling a node
Posted by bu...@apache.org.
https://bz.apache.org/bugzilla/show_bug.cgi?id=60948
Christophe JAILLET <ch...@wanadoo.fr> changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
           Keywords|                            |PatchAvailable
[Bug 60948] Large TCP timeout delays hcheck disabling a node
Posted by bu...@apache.org.
https://bz.apache.org/bugzilla/show_bug.cgi?id=60948
--- Comment #6 from goldyliang@gmail.com ---
Thank you for pointing this out, it is helpful. I will test it, and if it
works fine, hopefully it can be in the coming release.
I quickly looked at the patch code; it looks like it tries to use the same
timeout setting as the balancer config. The concern with that is that in some
load balancer configs we have to set a large timeout (like 5 minutes) to fit
some heavy requests which take minutes to complete. For the health check,
however, we really expect it to time out in seconds. It would be better if the
health check timeout could be set separately.
[Bug 60948] Large TCP timeout delays hcheck disabling a node
Posted by bu...@apache.org.
https://bz.apache.org/bugzilla/show_bug.cgi?id=60948
--- Comment #5 from Ruediger Pluem <rp...@apache.org> ---
Looks like the backport of r1889936 is missing. Can you check if
http://svn.apache.org/viewvc/httpd/httpd/trunk/modules/proxy/mod_proxy_hcheck.c?r1=1889936&r2=1889935&pathrev=1889936&view=patch
fixes your issue?