You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@tomcat.apache.org by bu...@apache.org on 2020/01/16 11:12:46 UTC

[Bug 64080] New: Graceful shutdown does not occur for connected clients that have not yet submitted their request payload

https://bz.apache.org/bugzilla/show_bug.cgi?id=64080

            Bug ID: 64080
           Summary: Graceful shutdown does not occur for connected clients
                    that have not yet submitted their request payload
           Product: Tomcat 9
           Version: 9.0.27
          Hardware: PC
            Status: NEW
          Severity: normal
          Priority: P2
         Component: Catalina
          Assignee: dev@tomcat.apache.org
          Reporter: carbattles@hotmail.com
  Target Milestone: -----

Continuing from bug 62515, testing shows that graceful shutdown will occur if:
- the request has reached the servlet
- bindOnInit is set to false on Connector in server.xml
- unloadDelay is set to sufficient high value on Context in context.xml

The key point here is "the request has reached the servlet".

The server will accept a connect SYN/ACK and still terminate the connection,
resulting in error upstream.

Expected behavior for Tomcat here would be to:
1. Either not accept the connect (as it is shutting down)
Or
2. If accepted, wait unloadDelay/some_other_property period of time to receive
the request and respond to it.

To reproduce the error:

SERVER
- servlet with sleep of 30s
- unloadDelay set to 60s
- bindOnInit set to false

Testcase:
1. Use HTTPUrlConnection
urlConn.setDoInput (true);
urlConn.setDoOutput (true);
urlConn.setUseCaches (false);
urlConn.setConnectTimeout(1000); //1 second
urlConn.setReadTimeout(10000); // 10 seconds
urlConn.connect();

At this point a succesful connection to Tomcat server has been established by
the Java Client

2. Shutdown Tomcat server
Tomcat will abort this connection

3. Submit request: 
OutputStreamWriter printout = new OutputStreamWriter(urlConn.getOutputStream
(), charset);

This will throw a IOException (SocketTimeOutException)

-- 
You are receiving this mail because:
You are the assignee for the bug.
---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@tomcat.apache.org
For additional commands, e-mail: dev-help@tomcat.apache.org


[Bug 64080] Graceful shutdown does not occur for connected clients that have not yet submitted their request payload

Posted by bu...@apache.org.
https://bz.apache.org/bugzilla/show_bug.cgi?id=64080

--- Comment #2 from carbattles@hotmail.com ---
>Are you expecting Tomcat to process these requests normally (within some >configured timeout or set of timeouts for each stage) or is cleanly returning a >503 an acceptable response?

In a perfect world, I'd like Tomcat to behave like an enterprise server, eg if
it accepts a connection, it will process the request and submit the response
before shutting down. A timeout should be provided (IMO unloadDelay is
sufficient), for in-flight requests (including the connection stage) to
complete.

If this is not possible, then I would prefer the socket NOT to accept the
connection, eg any connection attempt from the client (after the server
shutdown has started) should result in a connection time out. This is IMO a
much better approach than 503, as it is a more clear cut message to the
upstream client that this server is not available, so either fail over to
another server or report the error message further up stream.

-- 
You are receiving this mail because:
You are the assignee for the bug.
---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@tomcat.apache.org
For additional commands, e-mail: dev-help@tomcat.apache.org


[Bug 64080] Graceful shutdown does not occur for connected clients that have not yet submitted their request payload

Posted by bu...@apache.org.
https://bz.apache.org/bugzilla/show_bug.cgi?id=64080

Mark Thomas <ma...@apache.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|---                         |FIXED

--- Comment #14 from Mark Thomas <ma...@apache.org> ---
Fixed in:
- 10.0.x for 10.0.0-M11 onwards
- 9.0.x for 9.0.41 onwards

Back-port to 8.5.x and earlier isn't practical due to the extent of the
refactoring between 8.5.x and 9.0.x particularly around the Acceptor.

-- 
You are receiving this mail because:
You are the assignee for the bug.
---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@tomcat.apache.org
For additional commands, e-mail: dev-help@tomcat.apache.org


[Bug 64080] Graceful shutdown does not occur for connected clients that have not yet submitted their request payload

Posted by bu...@apache.org.
https://bz.apache.org/bugzilla/show_bug.cgi?id=64080

--- Comment #3 from Christopher Schultz <ch...@christopherschultz.net> ---
(In reply to carbattles from comment #2)
> In a perfect world, I'd like Tomcat to behave like an enterprise server, eg
> if it accepts a connection, it will process the request and submit the
> response before shutting down.

AIUI, this means that connectors cannot be paused. It also means that incoming
requests can indefinitely keep the server running, if each incoming requests
has its own separate timeout.

I think what you are requesting is not possible.

The way to get the behavior you want is to use a reverse proxy in front of the
application server. The proxy can re-try a request if it is dropped by the
back-end server, possibly failing-over to another node.

-- 
You are receiving this mail because:
You are the assignee for the bug.
---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@tomcat.apache.org
For additional commands, e-mail: dev-help@tomcat.apache.org


[Bug 64080] Graceful shutdown does not occur for connected clients that have not yet submitted their request payload

Posted by bu...@apache.org.
https://bz.apache.org/bugzilla/show_bug.cgi?id=64080

--- Comment #15 from Mark Thomas <ma...@apache.org> ---
This has now been back-ported to 8.5.x for 8.5.76 onwards.

Other fixes required 8.5.x to be updated to align with 9.0.x which made this
back-port possible.

-- 
You are receiving this mail because:
You are the assignee for the bug.
---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@tomcat.apache.org
For additional commands, e-mail: dev-help@tomcat.apache.org


[Bug 64080] Graceful shutdown does not occur for connected clients that have not yet submitted their request payload

Posted by bu...@apache.org.
https://bz.apache.org/bugzilla/show_bug.cgi?id=64080

--- Comment #7 from Mark Thomas <ma...@apache.org> ---
I recommend folks take a look at bug 62515 and the associated commits before
commenting further on this bug. It might save us all some time discussing what
is and is not possible and what Tomcat currently does.

Historically, the Tomcat approach has been to shut the server down ASAP,
providing a 503 response or similar (e.g. a timeout signal for async) where the
opportunity allows and allowing the request to complete (with a short timeout)
where there is no opportunity for a 503.

The move to non-blocking opened a gap where it is now possible that an
in-flight request will just see a closed connection.

Fixing the closed connection issue is relatively simple. Switching strategy
from 'provide a 503 as early as possible' to 'allow requests to complete if
possible' is a more invasive change and is not consistent with, historically,
the behaviour Tomcat users have asked for on shutdown. Whether it is possible
to make the shutdown strategy configurable isn't something I've looked at. That
looks to be worth doing to see how invasive the patch to do that would be.

-- 
You are receiving this mail because:
You are the assignee for the bug.
---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@tomcat.apache.org
For additional commands, e-mail: dev-help@tomcat.apache.org


[Bug 64080] Graceful shutdown does not occur for connected clients that have not yet submitted their request payload

Posted by bu...@apache.org.
https://bz.apache.org/bugzilla/show_bug.cgi?id=64080

--- Comment #6 from carbattles@hotmail.com ---
(In reply to Christopher Schultz from comment #5)
> (In reply to carbattles from comment #4)
> > @Christopher Schultz In the previous bug 62515 I mentioned the first thing
> > that should happen, when a shutdown is requested, is that the port should be
> > closed, so no more incoming requests are accepted.
> 
> Closing the port will terminate all the connections, including those
> in-flight.
> 
> > Then in-flight requests should be executed to completion based on timeout.
> 
> Not possible.
> 
> > This bug is about that this is only partially implemented, as the port is
> > partially closed as the first thing, it still accepts connect requests, but
> > fails when the client tries to submit the request payload (eg on a POST).
> 
> There is no such thing as "partially closed". The port is either bound or
> not. When unbinding, everything is lost. So you either wait and possibly
> never close or you close and possibly drop connections in the backlog.

I understand what you are saying, but this is how most enterprise servers does
it.

Quote from WebSphere:
>WebSphere Application Server supports three shutdown modes: stop, stop immediate and terminate (in order of immediacy).
>
>A normal stop begins by preventing new inbound HTTP and IIP requests and then has a quiesce period in which in-flight requests are allowed to complete. The maximum time allowed for these requests to complete is 180 seconds (configurable via a JVM property). This maximum is enforced regardless of whether or not a request is part of a global transaction. At the end of this period the application server components (including any in-process messaging engine) then begin to shutdown.

-- 
You are receiving this mail because:
You are the assignee for the bug.
---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@tomcat.apache.org
For additional commands, e-mail: dev-help@tomcat.apache.org


[Bug 64080] Graceful shutdown does not occur for connected clients that have not yet submitted their request payload

Posted by bu...@apache.org.
https://bz.apache.org/bugzilla/show_bug.cgi?id=64080

--- Comment #12 from carbattles@hotmail.com ---
> If the expectation is that a request (from the client's perspective) will
> not be dropped, then I think you can solve this at the load-balancer with
> retry-based fail-over: the lb takes care of the fail-over and the client
> detects no problems at all.

The problem here is that the loadbalancer is in-house developed using Java, and
the HttpURLConnection class does not distinguish between SocketTimeOutException
for request payload sent and not sent, due to its built-in retry functionality.
Therefore neither the exception nor the message is any indication of whether
the loadbalancer can retry or not.
I can provide links explaining this in detail, if this is of interest. 

To solve this we will have to either handle sockets ourselves, or switch to
another http client instead of HttpUrlConnection

> If you refuse to allow a request to an individual node to be dropped, that
> is a much taller order.
> 
> In my environment, we tell the lb that the nodes are coming down so we avoid
> any of this. The lb allows all in-flight requests to complete and we only
> shut-down the node after that point. No new requests are sent to the target
> node after the lb is told to take it (softly) out of service. The only time
> requests should ever be dropped in this situation is if a node unexpectedly
> goes down. We do this using mod_jk with DISABLED and STOP states.
> mod_proxy_* has these same states and can be used with either AJP or HTTP as
> the communication protocol.

That is plan B, depending on the conclusion for this bug: Do a custom
integration between our build server and the load balancer. 

Even a 503 could be use able, provided that we can retry knowing, the request
was rejected by tomcat.

Who can provide a conclusion on this defect, in terms of next step?

-- 
You are receiving this mail because:
You are the assignee for the bug.
---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@tomcat.apache.org
For additional commands, e-mail: dev-help@tomcat.apache.org


[Bug 64080] Graceful shutdown does not occur for connected clients that have not yet submitted their request payload

Posted by bu...@apache.org.
https://bz.apache.org/bugzilla/show_bug.cgi?id=64080

--- Comment #5 from Christopher Schultz <ch...@christopherschultz.net> ---
(In reply to carbattles from comment #4)
> @Christopher Schultz In the previous bug 62515 I mentioned the first thing
> that should happen, when a shutdown is requested, is that the port should be
> closed, so no more incoming requests are accepted.

Closing the port will terminate all the connections, including those in-flight.

> Then in-flight requests should be executed to completion based on timeout.

Not possible.

> This bug is about that this is only partially implemented, as the port is
> partially closed as the first thing, it still accepts connect requests, but
> fails when the client tries to submit the request payload (eg on a POST).

There is no such thing as "partially closed". The port is either bound or not.
When unbinding, everything is lost. So you either wait and possibly never close
or you close and possibly drop connections in the backlog.

-- 
You are receiving this mail because:
You are the assignee for the bug.
---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@tomcat.apache.org
For additional commands, e-mail: dev-help@tomcat.apache.org


[Bug 64080] Graceful shutdown does not occur for connected clients that have not yet submitted their request payload

Posted by bu...@apache.org.
https://bz.apache.org/bugzilla/show_bug.cgi?id=64080

--- Comment #11 from Christopher Schultz <ch...@christopherschultz.net> ---
(In reply to carbattles from comment #8)
> If it is any help, our setup is 12 tomcat servers, with load balancer in
> front. Our build process will restart the tomcat servers in sequence. We
> expect that not a single request is dropped, however may process slower.

If the expectation is that a request (from the client's perspective) will not
be dropped, then I think you can solve this at the load-balancer with
retry-based fail-over: the lb takes care of the fail-over and the client
detects no problems at all.

If you refuse to allow a request to an individual node to be dropped, that is a
much taller order.

In my environment, we tell the lb that the nodes are coming down so we avoid
any of this. The lb allows all in-flight requests to complete and we only
shut-down the node after that point. No new requests are sent to the target
node after the lb is told to take it (softly) out of service. The only time
requests should ever be dropped in this situation is if a node unexpectedly
goes down. We do this using mod_jk with DISABLED and STOP states. mod_proxy_*
has these same states and can be used with either AJP or HTTP as the
communication protocol.

I'm not sure if this gives you any viable alternatives, here, but I believe it
allows you to accomplish your goal without adding a lot of complexity to the
application server (i.e. Tomcat).

-- 
You are receiving this mail because:
You are the assignee for the bug.
---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@tomcat.apache.org
For additional commands, e-mail: dev-help@tomcat.apache.org


[Bug 64080] Graceful shutdown does not occur for connected clients that have not yet submitted their request payload

Posted by bu...@apache.org.
https://bz.apache.org/bugzilla/show_bug.cgi?id=64080

--- Comment #8 from carbattles@hotmail.com ---
> 'allow requests to complete if
> possible' is a more invasive change and is not consistent with,
> historically, the behaviour Tomcat users have asked for on shutdown. 

Isn't this what unloadDelay is for? My testing shows, this is exactly what it
does, it provides time for requests that have reached the servlet to finish
processing. If the servlets do not finish their processing, and the time runs
out, the in-flight requests are terminated.

I also think this bug is being over-compilicated. I'm not asking for new
functionality, just the shutdown to also respect connections that have already
been established (but not yet reached the servlets) and allow them time to
complete based on unloadDelay.

If it is any help, our setup is 12 tomcat servers, with load balancer in front.
Our build process will restart the tomcat servers in sequence. We expect that
not a single request is dropped, however may process slower. This should have
been fixed in bug 62515, but as I understand Mark, a new bug was introduced
with the introduction of non blocking IO.

-- 
You are receiving this mail because:
You are the assignee for the bug.
---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@tomcat.apache.org
For additional commands, e-mail: dev-help@tomcat.apache.org


[Bug 64080] Graceful shutdown does not occur for connected clients that have not yet submitted their request payload

Posted by bu...@apache.org.
https://bz.apache.org/bugzilla/show_bug.cgi?id=64080

--- Comment #13 from Christopher Schultz <ch...@christopherschultz.net> ---
FYI there is some movement on this.

https://lists.apache.org/thread.html/r980a8ed033a14b417231f1296e52daf07b15f76d565586520e5446f1%40%3Cdev.tomcat.apache.org%3E

-- 
You are receiving this mail because:
You are the assignee for the bug.
---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@tomcat.apache.org
For additional commands, e-mail: dev-help@tomcat.apache.org


[Bug 64080] Graceful shutdown does not occur for connected clients that have not yet submitted their request payload

Posted by bu...@apache.org.
https://bz.apache.org/bugzilla/show_bug.cgi?id=64080

--- Comment #9 from Mark Thomas <ma...@apache.org> ---
(In reply to carbattles from comment #8)
> > 'allow requests to complete if
> > possible' is a more invasive change and is not consistent with,
> > historically, the behaviour Tomcat users have asked for on shutdown. 
> 
> Isn't this what unloadDelay is for? My testing shows, this is exactly what
> it does, it provides time for requests that have reached the servlet to
> finish processing. If the servlets do not finish their processing, and the
> time runs out, the in-flight requests are terminated.

No. unloadDelay applies only once the request has been passed to the Servlet.

There are opportunities to respond with a 503 before the request is passed
to the Servlet and this is what Tomcat does because that is what users have
previously requested.

> I also think this bug is being over-compilicated. I'm not asking for new
> functionality,

Yes you are.

> just the shutdown to also respect connections that have
> already been established (but not yet reached the servlets) and allow them
> time to complete based on unloadDelay.

That *is* new behaviour and runs contrary to the current design. That is what
makes the change more invasive / complicated / risky / etc.

> If it is any help, our setup is 12 tomcat servers, with load balancer in
> front. Our build process will restart the tomcat servers in sequence. We
> expect that not a single request is dropped, however may process slower.
> This should have been fixed in bug 62515, but as I understand Mark, a new
> bug was introduced with the introduction of non blocking IO.

The bug has always existed. Non-blocking I/O just makes it more likely the
connection is closed immediately rather than a 503 response being written
and then the connection being closed.

-- 
You are receiving this mail because:
You are the assignee for the bug.
---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@tomcat.apache.org
For additional commands, e-mail: dev-help@tomcat.apache.org


[Bug 64080] Graceful shutdown does not occur for connected clients that have not yet submitted their request payload

Posted by bu...@apache.org.
https://bz.apache.org/bugzilla/show_bug.cgi?id=64080

--- Comment #4 from carbattles@hotmail.com ---
@Christopher Schultz In the previous bug 62515 I mentioned the first thing that
should happen, when a shutdown is requested, is that the port should be closed,
so no more incoming requests are accepted.

Then in-flight requests should be executed to completion based on timeout.

This bug is about that this is only partially implemented, as the port is
partially closed as the first thing, it still accepts connect requests, but
fails when the client tries to submit the request payload (eg on a POST).

-- 
You are receiving this mail because:
You are the assignee for the bug.
---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@tomcat.apache.org
For additional commands, e-mail: dev-help@tomcat.apache.org


[Bug 64080] Graceful shutdown does not occur for connected clients that have not yet submitted their request payload

Posted by bu...@apache.org.
https://bz.apache.org/bugzilla/show_bug.cgi?id=64080

Mark Thomas <ma...@apache.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Severity|normal                      |enhancement
                 OS|                            |All

--- Comment #1 from Mark Thomas <ma...@apache.org> ---
I'll note that:
- there is a similar situation with requests in the keep-alive state;
- the issue was introduced largely as a result of the switch from blocking IO
to non-blocking IO.

Are you expecting Tomcat to process these requests normally (within some
configured timeout or set of timeouts for each stage) or is cleanly returning a
503 an acceptable response?

Looking at the existing code, processing the requests normally is a much more
invasive change that returning a 503.

A likely issue with a 503 is that clients won't read it until they have sent
the entire request so the Connector's maxSwallowSize configuration becomes a
factor as well.

It looks like changes would be required;
- in the Processor.service() implementations to always process the request on
the first iteration around the processing loop
- in Endpoint.stopInternal() to add a delay / wait between pause() (which stops
the Acceptor) and stopping the Poller (NIO/APR) / closing the active
connections (NIO2)

I'm moving this to an enhancement

-- 
You are receiving this mail because:
You are the assignee for the bug.
---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@tomcat.apache.org
For additional commands, e-mail: dev-help@tomcat.apache.org


[Bug 64080] Graceful shutdown does not occur for connected clients that have not yet submitted their request payload

Posted by bu...@apache.org.
https://bz.apache.org/bugzilla/show_bug.cgi?id=64080

--- Comment #10 from carbattles@hotmail.com ---
I understand the complexity. I look forward to seeing what you think is the way
forward. 

If the conclusion is a 503, then I assume it is guaranteed that the request is
not forwarded to the servlet? eg the loadbalancer can use the 503 as a retry to
another tomcat server? Executing the same request twice is fatal in our setup.

Thank you.

-- 
You are receiving this mail because:
You are the assignee for the bug.
---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@tomcat.apache.org
For additional commands, e-mail: dev-help@tomcat.apache.org