Posted to bugs@httpd.apache.org by bu...@apache.org on 2015/02/02 02:50:42 UTC

[Bug 57520] New: proxy: worker failure disrupts existing connections

https://issues.apache.org/bugzilla/show_bug.cgi?id=57520

            Bug ID: 57520
           Summary: proxy: worker failure disrupts existing connections
           Product: Apache httpd-2
           Version: 2.2.29
          Hardware: PC
                OS: Linux
            Status: NEW
          Keywords: ErrorMessage
          Severity: major
          Priority: P2
         Component: mod_proxy
          Assignee: bugs@httpd.apache.org
          Reporter: dkruglyak@gmail.com
        Depends on: 37770

I have been stress-testing fault tolerance of an Apache load balancer with
Jetty backend being restarted under heavy load.

Our test requests involve downloading large XML documents (in HTTP request
bodies) with heavy custom processing before the response - which means full
requests take between 100 and 1000 ms, depending on system load.

Initially, I was getting errors similar to the ones described in Bug 37770, so
I applied all the proposed fixes in all possible combinations (HTTP 1.0, no
keepalive, initial not pooled, etc.) - however, the same errors as in Bug
37770 kept reappearing from time to time.

Then I noticed that our Jetty servlet occasionally fails to read the HTTP
input stream - encountering and logging an early EOF. It seems to me the ONLY
possible cause of this condition would be httpd forcefully closing the
connection for currently executing requests. This could occur once Jetty has
stopped accepting new requests, but has not yet finished processing existing
ones - it would return an error to httpd, which would mark the worker as
failed and kill existing connections. This condition would be noticeable with
large requests under heavy load.

I wonder if this theory makes sense and if a fix could be developed.


[Bug 57520] proxy: worker failure disrupts existing connections

--- Comment #8 from Yann Ylavic <yl...@gmail.com> ---
(In reply to Yann Ylavic from comment #6)
> But in this scenario there shouldn't be any "Connection reset by peer:
> proxy: error reading status line from remote server", which indicates that
> an established connection was *reset* by jetty with no response at all, no
> 200 nor 503...
> This error won't put the backend in recovery state (the connect() succeeds
> and there is no status to fail on), but with proxy-initial-not-pooled and a
> normal browser as the client, the browser will resend the request without
> notifying the user.

This is wrong; proxy-initial-not-pooled does not help the browser here. It
just prevents mod_proxy from reusing a kept-alive connection on the backend
side when the request comes from a new client connection (i.e. it is the first
one on this connection), precisely because the client does not expect an error
in this case (and reusing a connection whose ttl is above the backend's
KeepAliveTimeout may produce the error).
In particular, proxy-initial-not-pooled should not be used when ttl is lower
than the backend's KeepAliveTimeout, since reusing backend connections is then
not an issue and may even help fault tolerance, as seen in the previous message.

I mixed this up with mod_proxy's (hard-coded) behaviour on the client side
when this error occurs.
Either the request comes from a kept-alive connection, and hence the browser
is expecting that kind of error, so mod_proxy will respond with a closed
connection (no HTTP response at all) so that the browser can recover (resend
the request) without notifying the user.
Or the request is the first one on the client connection, and httpd will
respond with a 502 because the browser can't recover (unexpected error).
Unfortunately this is not something which is controllable in httpd, hence
these errors are not *always* transparent to the user.


[Bug 57520] proxy: worker failure disrupts existing connections

dk <dk...@gmail.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEEDINFO                    |NEW

--- Comment #3 from dk <dk...@gmail.com> ---
1) Yes, Jetty is being restarted gracefully. I am calling server.stop() and
then server.join() to ensure the JVM does not exit until all current requests
finish. All Java-side issues have been sorted out by now.

2) The error that reveals the problem here is a Java exception with "EOF while
reading" the HTTP input stream. This is only possible if the connection is
closed by Apache - the JVM would not do this, since Jetty is being stopped
gracefully and waiting for requests to finish, per above.

3) Yes, the XML is being uploaded to Jetty as the HTTP body. It is read by the
servlet as a raw input stream fed into an XML parser. Per above, the stream
reading sporadically fails with a "premature EOF" exception if Jetty is being
stopped/restarted under heavy load.

4) The HTTP body read never fails unless Jetty is restarted. Both our test
client and the Jetty servlet have been confirmed to function correctly under
all circumstances except the stress test of a Jetty restart under heavy load.

5) Of course I see "disabling worker" messages. That's the point of the test -
to ensure that restarting Jetty under heavy load is graceful and does not
mangle requests in process. Additionally "proxy: error reading status line from
remote server" messages occasionally appear which may also be consistent with
the possibility of httpd killing in-process connections when a worker is
disabled.

6) I tried this with or without keepalives (per instructions in Bug 37770) and
the behavior is consistent. Specifically the tested cases were: 1) 'SetEnv
proxy-nokeepalive 1' 2) keepalive=On on all balancer workers 3) 'KeepAlive 600'
within vhost config
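
Spelled out as config, the three cases were roughly the following (for 3) read
KeepAliveTimeout - the KeepAlive directive itself only takes On/Off):

    # 1) disable HTTP keep-alive towards the backend
    SetEnv proxy-nokeepalive 1

    # 2) TCP keepalive on the backend connections
    BalancerMember http://host1:9999 keepalive=on

    # 3) within the vhost config
    KeepAliveTimeout 600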

Here are sample lines from the httpd error log:

[Sun Feb 01 16:11:48 2015] [error] [client xx.xx.xx.xx] (20014)Internal error:
proxy: error reading status line from remote server host1:xxxx
[Sun Feb 01 16:11:48 2015] [error] [client xx.xx.xx.xx] proxy: Error reading
from remote server returned by /my_uri

[Sun Feb 01 16:40:06 2015] [error] [client xx.xx.xx.xx] (104)Connection reset
by peer: proxy: error reading status line from remote server host1:xxxx
[Sun Feb 01 16:40:06 2015] [error] [client xx.xx.xx.xx] proxy: Error reading
from remote server returned by /my_uri
[Sun Feb 01 16:40:06 2015] [error] [client xx.xx.xx.xx] (104)Connection reset
by peer: proxy: error reading status line from remote server host2:xxxx
[Sun Feb 01 16:40:06 2015] [error] [client xx.xx.xx.xx] proxy: Error reading
from remote server returned by /my_uri
[Sun Feb 01 16:40:06 2015] [error] (111)Connection refused: proxy: HTTP:
attempt to connect to xx.xx.xx.xx:xxxx (host1) failed
[Sun Feb 01 16:40:06 2015] [error] ap_proxy_connect_backend disabling worker
for (host1)
[Sun Feb 01 16:40:26 2015] [error] (111)Connection refused: proxy: HTTP:
attempt to connect to xx.xx.xx.xx:xxxx (host2) failed
[Sun Feb 01 16:40:26 2015] [error] ap_proxy_connect_backend disabling worker
for (host2)


[Bug 57520] proxy: worker failure disrupts existing connections

Yann Ylavic <yl...@gmail.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |ylavic.dev@gmail.com


[Bug 57520] proxy: worker failure disrupts existing connections

--- Comment #9 from dk <dk...@gmail.com> ---
Based on your insistence that the problem must be on the Jetty side, I did an
in-depth investigation.

Indeed, it turned out that Jetty requires a very specific configuration for
graceful shutdown to work - specifically, a working StatisticsHandler.
Originally, I did have this handler created, but it was not properly invoked /
integrated into the handler chain. After fixing this problem (and adding
logging of the StatisticsHandler stats at request start/stop and server
shutdown start/stop times), I found after thorough testing that all
exceptions / errors go away. In the logs I can observe a server shutdown being
invoked in the middle of processing multiple requests, and then those requests
being seen through to graceful completion before the shutdown completes.

Here is the reference for the Jetty issue:
https://bugs.eclipse.org/bugs/show_bug.cgi?id=420142
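
In case it helps someone else, the wiring now looks roughly like this - a
sketch only, assuming Jetty 9.x APIs, with DefaultHandler standing in for our
real servlet context:

    import org.eclipse.jetty.server.Server;
    import org.eclipse.jetty.server.handler.DefaultHandler;
    import org.eclipse.jetty.server.handler.StatisticsHandler;

    public class GracefulJetty {
        public static void main(String[] args) throws Exception {
            Server server = new Server(9999);

            // StatisticsHandler must wrap the application handlers, so that
            // graceful shutdown can track in-flight requests.
            StatisticsHandler stats = new StatisticsHandler();
            stats.setHandler(new DefaultHandler()); // stand-in for app handlers
            server.setHandler(stats);
            server.setStopTimeout(30000L); // up to 30s for requests to drain
            server.start();

            // ... later, when the restart script stops the service:
            server.stop(); // stops accepting; in-flight requests are drained
            server.join(); // block until the server thread exits
        }
    }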

Just for the record, this final test was completed under the following
settings in the VirtualHost:

    SetEnv force-proxy-request-1.0 1
    SetEnv proxy-nokeepalive 1
    SetEnv proxy-initial-not-pooled 1
    RequestHeader unset Expect early

... and this balancer setup:

<Proxy balancer://mybalancer>
    ProxySet failonstatus=502,503 maxattempts=1000
    BalancerMember http://host1:9999 timeout=3600 retry=0 ping=30 disablereuse=on keepalive=on
    BalancerMember http://host2:9999 timeout=3600 retry=0 ping=30 disablereuse=on keepalive=on
    BalancerMember http://host3:9999 timeout=3600 retry=0 ping=30 disablereuse=on keepalive=on
</Proxy>

I intend to leave these settings "as is" for our production system, unless
there is a good reason (and low risk) to restore keep-alives or you might
advise any other changes... I suppose that if all requests have large
payloads, keep-alives won't add much benefit?


[Bug 57520] proxy: worker failure disrupts existing connections

--- Comment #5 from dk <dk...@gmail.com> ---
"Connection refused" messages in the error log are expected since after Jetty
JVM exits it won't be listening on ports while it is restarting. While Jetty is
still shutting down and waiting for existing requests to finish it responds
with 503. In httpd I have failonstatus set to handle that.

Could you recommend the specific pcap or tcpdump settings to capture the
connection info you need? I believe capturing the entire traffic payload would
be way too heavy, and I am not sure if/how I can even post huge logs to this
issue tracker. I have 3 boxes in the cluster/balancer with Jetty/httpd on each
and one client box issuing round-robin stress-test requests - 30 threads at a
time. Could you specify exactly what you want me to do?

As far as ttl/keepalives go - yes, this reported behavior is seen under
"proxy-nokeepalive" too. ttl is the only setting you mentioned that I have
not tried, but I presume it is ignored under the nokeepalive regime?

Perhaps you could suggest a couple of mod_proxy configurations for me to test
with the traffic recorder on? I should note that it might take a few runs
(generating pretty big logs), since the error behavior is quite flaky and
occurs sporadically - depending on exactly how the Jetty restart is timed.


[Bug 57520] proxy: worker failure disrupts existing connections

--- Comment #4 from Yann Ylavic <yl...@gmail.com> ---
(In reply to dk from comment #3)
> 1) Yes, Jetty is being restarted gracefully. I am calling server.stop() and
> then server.join() to ensure the JVM does not exit until all current requests
> finish. All Java-side issues have been sorted out by now.

I don't know much about Jetty, but does it accept new connections while
restarting gracefully? The "Connection refused" entries in the error log seem
to indicate it does not.

> 
> 2) The error that reveals the problem here is a Java exception with "EOF
> while reading" the HTTP input stream. This is only possible if the connection
> is closed by Apache - the JVM would not do this, since Jetty is being stopped
> gracefully and waiting for requests to finish, per above.

Could you provide some network (pcap) traces between the proxy and jetty which
would capture the restart (so that we can figure out what's going on and which
part is closing/resetting the connections)?


> 5) Of course I see "disabling worker" messages. That's the point of the test
> - to ensure that restarting Jetty under heavy load is graceful and does not
> mangle requests in process. Additionally "proxy: error reading status line
> from remote server" messages occasionally appear which may also be
> consistent with the possibility of httpd killing in-process connections when
> a worker is disabled.
> 
> 6) I tried this with or without keepalives (per instructions in Bug 37770)
> and the behavior is consistent. Specifically the tested cases were: 1)
> 'SetEnv proxy-nokeepalive 1' 2) keepalive=On on all balancer workers 3)
> 'KeepAlive 600' within vhost config

Unless "proxy-nokeepalive" (SetEnv) or "disablereuse=on" (ProxyPass' parameter)
are used, the backend (jetty) connections will be reused for
multiple/successive requests.
The time these connections are kept alive between requests (idle) is given by
the parameter "ttl=<seconds>" (ProxyPass'), whereas "keepalive=on" relates to
the TCP-keepalive (not HTTP-keepalive), and KeepAliveTimeout (assuming this is
what you mean with 'KeepAlive 600') applies to client connections (not to
backend's).

Using a ttl= lower than the KeepAliveTimeout configured on the jetty side
would prevent the case where the backend closes the connection while the proxy
is reusing it for a subsequent request, which would lead to the same
"proxy: error reading status line" error, since the proxy wouldn't realize it
until the connection is read (the write would succeed).
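
For example, a sketch in 2.2.x syntax (the 20s value assumes jetty keeps idle
connections for 30s or more):

    # Reuse backend connections, but let mod_proxy expire them after 20s
    # idle - before the backend's own keep-alive timeout can close them.
    BalancerMember http://host1:9999 ttl=20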

This cannot happen with "proxy-nokeepalive", though: backend connections are
not reused, and hence a new one is created for each request.
From mod_proxy's point of view, an "error reading" on a newly established
connection means it has been closed/reset on the backend side (whereas a
response was expected).
That's why I think we need network traces to figure this out...


[Bug 57520] proxy: worker failure disrupts existing connections

--- Comment #6 from Yann Ylavic <yl...@gmail.com> ---
(In reply to dk from comment #5)
> Perhaps you could suggest a couple of mod_proxy configurations for me to test
> with the traffic recorder on? I should note that it might take a few runs
> (generating pretty big logs), since the error behavior is quite flaky and
> occurs sporadically - depending on exactly how the Jetty restart is timed.

Given your load scenario and "sporadic error behavior", it is probably not a
good idea to use a network capture (it would indeed produce a big file, but
above all it would take a huge amount of time to analyse).

You could restrict the captured packets' size with tcpdump -s, or just filter
SYN/FIN/RST (and then find connections terminated prematurely, i.e. those with
low sequence numbers at the end, correlated with error-log timestamps), but
that is still probably quite painful to analyse with high traffic.
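
For example, something like this (interface, snap length and file name are
placeholders to adapt):

    tcpdump -i eth0 -s 96 -w jetty-restart.pcap \
        'host host1 and port 9999 and (tcp[tcpflags] & (tcp-syn|tcp-fin|tcp-rst) != 0)'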

Before that, maybe we can think more about what's happening...

First, please note (as already requested) that the relevant part of your
configuration (<Proxy balancer:...></Proxy>, ProxyPass, ..., or the complete
file) would help determine httpd's expected behaviour.

Otherwise, can you see "child pid <pid> exit signal <sig>..." messages in the
global log file (the one pointed to by the main ErrorLog directive, outside of
any VirtualHost)?
Those would indicate a crash in an httpd child and could explain why some
connections are closed forcibly (by the system) before the request is sent, as
detected on the jetty side.
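
For example (log path illustrative; use the file your main ErrorLog points to):

    grep 'exit signal' /var/log/httpd/error_log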

Other than that, mod_proxy will *never* close any connection without at least
trying to send the request it was created for (barring read failures on the
client side), even if the backend (BalancerMember) has switched to recovery
state in the meantime (because of some simultaneous request's failure).
So this is not something that should/could be fixed (per comment #1); it just
should not happen.

The only reason I see for which jetty could get an EOF (without data) is a
race condition between a connect() timeout on the proxy side and a
(simultaneous) accept() on the jetty side (that would cause "The timeout
specified has expired" instead of "Connection refused" in the log, though).

> "Connection refused" messages in the error log are expected since after
> Jetty JVM exits it won't be listening on ports while it is restarting. While
> Jetty is still shutting down and waiting for existing requests to finish it
> responds with 503. In httpd I have failonstatus set to handle that.

BTW, in both cases ("connection refused" and 503) mod_proxy will put the
backend in recovery state for a period of retry=<seconds> (60 by default),
which is what is intended (IIUC).
So, all the "connection refused" messages should only appear once jetty is
completely down and until jetty is up again, and only when the backend is being
retried (the number of lines should then be the number of simultaneous requests
elected for that backend at that time).
So far, so good.

But in this scenario there shouldn't be any "Connection reset by peer: proxy:
error reading status line from remote server", which indicates that an
established connection was *reset* by jetty with no response at all, no 200
nor 503...
This error won't put the backend in recovery state (the connect() succeeds
and there is no status to fail on), but with proxy-initial-not-pooled and a
normal browser as the client, the browser will resend the request without
notifying the user.

From the log lines you provided in comment #3:
> [Sun Feb 01 16:40:06 2015] [error] [client xx.xx.xx.xx] (104)Connection reset by peer: proxy: error reading status line from remote server host1:xxxx
> [Sun Feb 01 16:40:06 2015] [error] [client xx.xx.xx.xx] proxy: Error reading from remote server returned by /my_uri
> [Sun Feb 01 16:40:06 2015] [error] [client xx.xx.xx.xx] (104)Connection reset by peer: proxy: error reading status line from remote server host2:xxxx
> [Sun Feb 01 16:40:06 2015] [error] [client xx.xx.xx.xx] proxy: Error reading from remote server returned by /my_uri
> [Sun Feb 01 16:40:06 2015] [error] (111)Connection refused: proxy: HTTP: attempt to connect to xx.xx.xx.xx:xxxx (host1) failed
> [Sun Feb 01 16:40:06 2015] [error] ap_proxy_connect_backend disabling worker for (host1)
it suggests that reading errors can arise just before jetty is down (not
connectable anymore), so it seems that the graceful shutdown is missing some
already established connections...
In fact this is also racy: between the time jetty detects/decides there is no
pending connection (no more 503s to send) and the time the listening socket is
really closed, new connections may be handshaken (at the TCP level) by the OS
(up to the listen backlog size).
These connections will typically be reset once the listening socket is really
closed.

> 
> As far as ttl/keepalives go - yes, this reported behavior is seen under
> "proxy-nokeepalive" too. ttl is the only setting you mentioned that I have
> not tried, but I presume it is ignored under the nokeepalive regime?

Correct, I only mentioned it because of your "keepalive=On" + "KeepAlive 600"
configuration from comment #3, and to note that a ttl lower than the backend's
KeepAliveTimeout is probably a better alternative with regard to
performance/resources (since connections to jetty would be reused), and
possibly also with regard to fault tolerance (see below).

That would depend on how jetty handles kept-alive (idle) connections on
graceful restart (in your scenario).
If those are closed immediately, the issue remains that mod_proxy may reuse
connections closed before they can be detected as such on its side.
But if they are closed only after KeepAliveTimeout, while still being answered
with a 503 when some request arrives (in time), everything is fine.

This "mode" may also be better for fault tolerance too (provided "everything is
fine" above), because now mod_proxy will likely reuse already established
connections and should receive 403s on all of them (for all simultaneous
requests), until the recovery state.

Although this is just (my) theoretical thought...


[Bug 57520] proxy: worker failure disrupts existing connections

--- Comment #10 from Yann Ylavic <yl...@gmail.com> ---
(In reply to dk from comment #9)
> Based on your insistence that the problem must be on the Jetty side, I did
> an in-depth investigation.

Sorry if I appeared insistent; that was not my intent.

My goal was more to give an overview of the issues regarding connections
between a proxy and a backend (though I may have muddled that goal by basing
it on a real use case), because these questions keep coming up regularly on
bugzilla, on the mailing lists, and in personal experience.

I think this is racy by nature; under heavy stress no one should expect zero
errors.
And jetty's graceful shutdown is no exception (IMHO): it has to decide when to
stop listening for/accepting clients, but the proxy has no idea about this and
still asks for connections (especially with retry=0)...

> 
> Just for the record, this final test was completed under the following
> settings in the VirtualHost:
> 
>     SetEnv force-proxy-request-1.0 1
>     SetEnv proxy-nokeepalive 1
>     SetEnv proxy-initial-not-pooled 1

Those 3 are redundant; you could keep the second one only.
Redundancy aside, I'd suggest not using force-proxy-request-1.0 unless the
backend really expects HTTP/1.0, or requests with chunked encoding are not
supported by jetty.
It causes chunked request bodies to be spooled to disk before being forwarded
(with a Content-Length).

> 
>     RequestHeader unset Expect early
> 
> ... and this balancer setup:
> 
> <Proxy balancer://mybalancer>
>     ProxySet failonstatus=502,503 maxattempts=1000
>     BalancerMember http://host1:9999 timeout=3600 retry=0 ping=30 disablereuse=on keepalive=on
>     BalancerMember http://host2:9999 timeout=3600 retry=0 ping=30 disablereuse=on keepalive=on
>     BalancerMember http://host3:9999 timeout=3600 retry=0 ping=30 disablereuse=on keepalive=on
> </Proxy>

disablereuse is also redundant with the SetEnvs above.
ping is ajp:// specific in 2.2.x; it is implemented for http:// in 2.4.x, so
you can leave it as is from a migration perspective...
But if the goal is to let jetty handle the Expect header (with the
RequestHeader unset above), that won't work in 2.2.x, and since httpd won't
handle it either (still per the RequestHeader rule), no one will satisfy the
client's expectation...
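
Put together, a trimmed equivalent could look like this (a sketch only,
keeping your timeouts and failonstatus as they are):

    SetEnv proxy-nokeepalive 1

    <Proxy balancer://mybalancer>
        ProxySet failonstatus=502,503 maxattempts=1000
        BalancerMember http://host1:9999 timeout=3600 retry=0 ping=30
        BalancerMember http://host2:9999 timeout=3600 retry=0 ping=30
        BalancerMember http://host3:9999 timeout=3600 retry=0 ping=30
    </Proxy>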

> 
> I intend to leave these settings "as is" for our production system, unless
> there is a good reason (and low risk) to restore keep-alives or you might
> advise any other changes... I suppose that if all requests have large
> payloads, keep-alives won't add much benefit?

Probably, especially if it works "as is"...
I'd personally run a heavier stress test and try to get things to race (but
that's *my* theory ;), and then, if it happens, try the other way...


[Bug 57520] proxy: worker failure disrupts existing connections

Yann Ylavic <yl...@gmail.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|---                         |INVALID

--- Comment #7 from Yann Ylavic <yl...@gmail.com> ---
Since there is nothing we can do on the httpd side, I'm closing this report.

If some investigation turns up an httpd bug, please reopen.
Otherwise, I suggest users@httpd.apache.org for further (support) discussion.


[Bug 57520] proxy: worker failure disrupts existing connections

Yann Ylavic <yl...@gmail.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |NEEDINFO

--- Comment #1 from Yann Ylavic <yl...@gmail.com> ---
(In reply to dk from comment #0)
> I have been stress-testing fault tolerance of an Apache load balancer with
> Jetty backend being restarted under heavy load.

Is Jetty restarted gracefully here?
If not, you probably can't expect zero errors: say a request (including its
body) has been forwarded by mod_proxy while Jetty performs an abortive
restart; nothing can be recovered, and an error will occur on the proxy side
while "reading status line from remote server".

What is the exact error line printed?

> 
> Our test requests involve downloading large XML documents (in HTTP request
> bodies) with heavy custom processing before the response - which means full
> requests take between 100 and 1000 ms, depending on system load.

You mean *uploading* a large XML document to Jetty, right?
And does Jetty then take up to 1s to generate a response, or does the request
take that long to be forwarded to Jetty (not including the response)?

> 
> Initially, I was getting errors similar to the ones described in Bug 37770,
> so I applied all the proposed fixes in all possible combinations (HTTP 1.0,
> no keepalive, initial not pooled, etc.) - however, the same errors as in Bug
> 37770 kept reappearing from time to time.
> 
> Then I noticed that our Jetty servlet occasionally fails to read the HTTP
> input stream - encountering and logging an early EOF. It seems to me the
> ONLY possible cause of this condition would be httpd forcefully closing the
> connection for currently executing requests.

This could also happen if the read fails on the client side while mod_proxy is
forwarding the (large) request.

> This could occur once Jetty has
> stopped accepting new requests, but has not yet finished processing
> existing ones - it would return an error to httpd, which would mark the
> worker as failed and kill existing connections. This condition would be
> noticeable with large requests under heavy load.

If that happens, you should see "disabling worker" messages in the error log;
is that the case?

> 
> I wonder if this theory makes sense and if a fix could be developed.

I think we need at least the error log and the configuration file on the httpd
side, as well as the KeepAlive timeout value (and connection-related ones) on
the Jetty side.


[Bug 57520] proxy: worker failure disrupts existing connections

Yann Ylavic <yl...@gmail.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
         Resolution|INVALID                     |DUPLICATE

--- Comment #12 from Yann Ylavic <yl...@gmail.com> ---
So that a link is made between the two (other than "depends on").

*** This bug has been marked as a duplicate of bug 37770 ***


[Bug 57520] proxy: worker failure disrupts existing connections

--- Comment #14 from dk <dk...@gmail.com> ---
Thanks for all your help!

I just completed the final stress test under the revised settings, and
everything goes through without any errors. So for now I'll consider the issue
solved for us...


[Bug 57520] proxy: worker failure disrupts existing connections

Yann Ylavic <yl...@gmail.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
         Depends on|37770                       |

--- Comment #2 from Yann Ylavic <yl...@gmail.com> ---
Bug 37770 is closed; there is nothing to depend on there right now.


[Bug 57520] proxy: worker failure disrupts existing connections

--- Comment #13 from Yann Ylavic <yl...@gmail.com> ---
(In reply to dk from comment #11)
> Just to confirm - if httpd
> encounters a 503 from Jetty under this regime, would it retry a different
> balancer member or return the 503 to the client (with failonstatus set)? The
> docs are not very clear on whether failonstatus causes requests to be retried
> and/or whether this behavior can be configured...

Yes, any 503 is retried on another BalancerMember, provided at least one is not
in recovery state (forcerecovery=on may be used to force recovery of all the
members otherwise).

However, failonstatus plays no role in retrying; it only sets the statuses for
which the backend is put into recovery state, and only a 503 (or a connect()
error) will be retried (moreover, 503 is implicit, so failonstatus=503 is
redundant too).
With a 502, for example (as per your configuration), the member will be put
into recovery state for retry seconds, but the request will *not* be sent to
another one, and the 502 will reach the client.
Keep in mind that mod_proxy can't decide whether a request can be resent,
particularly for non-idempotent ones (as per the RFC); you wouldn't want a
bank order to be applied twice (it depends on which way, though :), and hence
mod_proxy generally lets the browser/user decide.

Finally, retry=0 probably plays a bad role here: when a request is retried,
all the balancer members are taken into account (and checked against the
recovery period for eligibility), so retry=0 forces eligibility during the
retry, which may lead to the same member being elected again.
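
For example (a sketch; other values as in your configuration):

    # retry=60 (the default) keeps a just-failed member out of the elections
    # for 60 seconds instead of making it immediately eligible again.
    BalancerMember http://host1:9999 timeout=3600 retry=60 ping=30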

> 
> I am going to clean up our proxy settings per your advice. The ones I have
> now were the result of trial and error with various suggestions from
> https://issues.apache.org/bugzilla/show_bug.cgi?id=37770 (e.g. I do not care
> about the Expect header or HTTP 1.0). I am curious how much of the advice in
> that discussion thread is still applicable to 2.2.29 versus what may have
> been fixed / worked around?

The link between the two tickets is made now.

If needed, this discussion should be continued on the users@ mailing list;
Bugzilla is for reporting bugs only, not for support.
I'll be happy to contribute there, but we should stop here (I realize this is
not what I've done so far, but I guess it's enough).


[Bug 57520] proxy: worker failure disrupts existing connections

--- Comment #11 from dk <dk...@gmail.com> ---
Your insistence was very much appreciated; it led me to re-check the Jetty
backend.

Yes, I am not expecting zero errors; I just want to make sure the most obvious
things to increase stability are implemented. In my case I believe retry=0
(with maxattempts=1000) is appropriate, since our Jetty restart script
"staggers" restarts to ensure that only 1/3 of all servers may be down at a
time (we take the IP address modulo 3 to calculate the restart delay
multiplier) - so there is always a server to fall back on. Just to confirm - if
httpd encounters a 503 from Jetty under this regime, would it retry a different
balancer member or return the 503 to the client (with failonstatus set)? The
docs are not very clear on whether failonstatus causes requests to be retried
and/or whether this behavior can be configured...
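
The stagger itself is trivial - an illustrative sketch of the restart wrapper
(the path and the 120-second slot length are examples, not our exact script):

    LAST_OCTET=$(hostname -i | awk -F. '{print $4}')   # last octet of our IP
    sleep $(( (LAST_OCTET % 3) * 120 ))                # stagger by IP mod 3
    /etc/init.d/jetty restart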

I am going to clean up our proxy settings per your advice. The ones I have now
were the result of trial and error with various suggestions from
https://issues.apache.org/bugzilla/show_bug.cgi?id=37770 (e.g. I do not care
about the Expect header or HTTP 1.0). I am curious how much of the advice in
that discussion thread is still applicable to 2.2.29 versus what may have been
fixed / worked around?
