Posted to dev@tomcat.apache.org by bu...@apache.org on 2017/11/29 01:58:29 UTC
[Bug 61831] New: NIO2 connector becomes intermittently unresponsive
after some period of time
https://bz.apache.org/bugzilla/show_bug.cgi?id=61831
Bug ID: 61831
Summary: NIO2 connector becomes intermittently unresponsive
after some period of time
Product: Tomcat 8
Version: 8.0.47
Hardware: All
OS: Linux
Status: NEW
Severity: normal
Priority: P2
Component: Connectors
Assignee: dev@tomcat.apache.org
Reporter: yaolega@gmail.com
Target Milestone: ----
Created attachment 35564
--> https://bz.apache.org/bugzilla/attachment.cgi?id=35564&action=edit
jstack thread dump
We are observing a scenario in which the NIO2 connector on Tomcat becomes
unresponsive after some period of time, while the NIO connector running on the
same host still processes the same requests and serves traffic. Only a server
restart helps in this case.
This issue is intermittent. In our current infrastructure we have a few nodes
behind a load balancer, and it happens from time to time (about once per week)
on each node, so it does not seem to be node- or hardware-specific in our case.
Below is our server.xml:
<Executor name="tomcatThreadPool" namePrefix="catalina-exec-"
          maxThreads="800" minSpareThreads="100"/>

<Connector executor="tomcatServiceThreadPool"
           port="8080"
           protocol="org.apache.coyote.http11.Http11Nio2Protocol"
           connectionTimeout="1000"
           enableLookups="false"
           acceptorThreadCount="1"
           processorCache="800"
           socket.tcpNoDelay="true"
           socket.soKeepAlive="true"
           socket.soLingerOn="false"
           compression="256"
           compressableMimeType="text/html,text/xml,text/plain,application/x-protobuf,application/json,application/javascript"
           URIEncoding="UTF-8" />

<!-- The load balancer terminates SSL connections and
     then forwards them to the following connector as
     normal HTTP (non-secure) requests
-->
<Connector executor="tomcatServiceThreadPool"
           port="8443"
           protocol="org.apache.coyote.http11.Http11NioProtocol"
           connectionTimeout="1000"
           enableLookups="false"
           connectionLinger="-1"
           acceptorThreadCount="20"
           processorCache="800"
           socket.tcpNoDelay="true"
           socket.soKeepAlive="true"
           socket.soLingerOn="false"
           compression="256"
           compressableMimeType="text/html,text/xml,text/plain,application/x-protobuf,application/json,application/javascript"
           URIEncoding="UTF-8" />

<!-- Define an AJP 1.3 Connector on port 8009 -->
<Connector port="8009" protocol="AJP/1.3" redirectPort="8443" />
Also below is an example of the behavior we observe:
curl -verbose 'http://localhost:8080/rs?id=nio2issue'
* About to connect() to localhost port 8080 (#0)
* Trying 127.0.0.1... connected
* Connected to localhost (127.0.0.1) port 8080 (#0)
> GET /rs?id=nio2issue HTTP/1.1
> User-Agent: curl/7.19.7 (x86_64-redhat-linux-gnu) libcurl/7.19.7 NSS/3.15.3 zlib/1.2.3 libidn/1.18 libssh2/1.4.2
> Host: localhost:8080
> Accept: */*
> Referer: rbose
>
* Closing connection #0
* Failure when receiving data from the peer
curl: (56) Failure when receiving data from the peer
at the same time:
curl -i 'http://localhost:8443/rs?id=nio2issue'
HTTP/1.1 302 Found
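The two curl checks above can also be scripted so that both connectors are probed with an explicit timeout. The following is only a minimal sketch: the class name ConnectorProbe, the probe helper, and the 2000 ms timeout are illustrative assumptions, while the ports and the request path are taken from the commands above.

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URL;

// Hypothetical helper, not part of Tomcat: probes a connector over plain HTTP.
final class ConnectorProbe {

    /** Returns the HTTP status code, or -1 if no valid response arrived in time. */
    static int probe(String url, int timeoutMillis) {
        HttpURLConnection conn = null;
        try {
            conn = (HttpURLConnection) new URL(url).openConnection();
            conn.setConnectTimeout(timeoutMillis);
            conn.setReadTimeout(timeoutMillis);
            // Report a redirect status itself instead of following it.
            conn.setInstanceFollowRedirects(false);
            conn.setRequestMethod("GET");
            return conn.getResponseCode();
        } catch (IOException e) {
            // Covers refused connections, resets, and read timeouts alike.
            return -1;
        } finally {
            if (conn != null) {
                conn.disconnect();
            }
        }
    }

    public static void main(String[] args) {
        String[] urls = {
            "http://localhost:8080/rs?id=nio2issue", // NIO2 connector
            "http://localhost:8443/rs?id=nio2issue"  // NIO connector
        };
        for (String url : urls) {
            int status = probe(url, 2000); // timeout value is an assumption
            System.out.println(url + " -> "
                    + (status < 0 ? "no response" : "HTTP " + status));
        }
    }
}
```

Run periodically on each node, a check like this could record the time at which the NIO2 connector stops answering while the NIO connector still does.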
Also, no unusual errors are logged to catalina.out at the time of the incident.
A thread dump from the server is attached.
We observed the same behavior on Tomcat 8.0.18 and upgraded to 8.0.47, the
latest version in the same release line, but it didn't help.
Please let me know what else might be helpful. For now we are keeping one of
the servers in this state so that we can gather data, since the issue is
intermittent and we were not able to reproduce it with a simple load test.
Regards,
Oleg.
--
You are receiving this mail because:
You are the assignee for the bug.
---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@tomcat.apache.org
For additional commands, e-mail: dev-help@tomcat.apache.org
[Bug 61831] NIO2 connector becomes intermittently unresponsive after
some period of time
Posted by bu...@apache.org.
https://bz.apache.org/bugzilla/show_bug.cgi?id=61831
Remy Maucherat <re...@apache.org> changed:
What       | Removed | Added
-----------+---------+-----------
Status     | NEW     | RESOLVED
Resolution | ---     | WORKSFORME
--- Comment #1 from Remy Maucherat <re...@apache.org> ---
The thread dump looks perfect: the acceptor thread is blocking on the accept,
and all threads are idle and ready to execute something. Please investigate on
the user list to get at least some idea of how to reproduce it.
If possible, try to avoid using a custom executor; it makes things more complex
and the benefit is usually not obvious.
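The kind of reading applied to the thread dump above (how many worker threads are idle versus busy) can be approximated in code. This is a rough sketch under stated assumptions: the class ThreadStateSummary and its method are hypothetical, the prefix "catalina-exec-" comes from the namePrefix in the posted server.xml, and the code only inspects the JVM it runs in, so for Tomcat it would have to run inside the server process or be adapted to a remote JMX connection.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;
import java.util.EnumMap;
import java.util.Map;

// Hypothetical helper, not part of Tomcat: summarizes thread states by name prefix.
final class ThreadStateSummary {

    /** Counts live threads whose name starts with the prefix, grouped by state. */
    static Map<Thread.State, Integer> countByState(String namePrefix) {
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        Map<Thread.State, Integer> counts = new EnumMap<>(Thread.State.class);
        for (ThreadInfo info : mx.getThreadInfo(mx.getAllThreadIds())) {
            // getThreadInfo returns null entries for threads that have already died.
            if (info == null || !info.getThreadName().startsWith(namePrefix)) {
                continue;
            }
            counts.merge(info.getThreadState(), 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        // Prefix taken from namePrefix="catalina-exec-" in the posted configuration.
        System.out.println(countByState("catalina-exec-"));
    }
}
```

A pool in which every worker is WAITING or TIMED_WAITING matches the "all threads idle" picture described in the comment; it says nothing about whether new connections are actually being accepted.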
[Bug 61831] NIO2 connector becomes intermittently unresponsive after
some period of time
Posted by bu...@apache.org.
https://bz.apache.org/bugzilla/show_bug.cgi?id=61831
Oleg <ya...@gmail.com> changed:
What       | Removed    | Added
-----------+------------+----------
Resolution | WORKSFORME | ---
Status     | RESOLVED   | REOPENED
--- Comment #2 from Oleg <ya...@gmail.com> ---
Hi,
I realize that the thread dump might look fine, and this is the most confusing
part: even a simple curl command from the same host receives no response from
this connector, and it starts working fine again after a Tomcat restart. At the
same time Tomcat overall looks healthy and the other connector works fine.
Since this happens from time to time on different servers, it does not look
like an OS or hardware issue but something specific to Tomcat's NIO2 connector.
When we send any request to the NIO2 connector in this bad state, no thread is
triggered in Tomcat. Looking into the Tomcat source code, it looks as if a
countdown latch was simply not updated and the service hangs because of this,
but the root cause is still not clear.
So we are curious what additional information we can provide to help
investigate this issue together with the Apache Tomcat dev team.
Also, I'm not sure about your remark about a custom executor: we don't use a
custom one, we just configure the one from Tomcat.
Regards,
Oleg.
[Bug 61831] NIO2 connector becomes intermittently unresponsive after
some period of time
Posted by bu...@apache.org.
https://bz.apache.org/bugzilla/show_bug.cgi?id=61831
Oleg <ya...@gmail.com> changed:
What | Removed | Added
-----+---------+------------------
CC   |         | yaolega@gmail.com
[Bug 61831] NIO2 connector becomes intermittently unresponsive after
some period of time
Posted by bu...@apache.org.
https://bz.apache.org/bugzilla/show_bug.cgi?id=61831
--- Comment #4 from Remy Maucherat <re...@apache.org> ---
Ok, maybe. Let us know if you find some elements demonstrating an issue in
Tomcat.
[Bug 61831] NIO2 connector becomes intermittently unresponsive after
some period of time
Posted by bu...@apache.org.
https://bz.apache.org/bugzilla/show_bug.cgi?id=61831
Piotr <pi...@gmail.com> changed:
What       | Removed  | Added
-----------+----------+---------
Status     | NEEDINFO | RESOLVED
Resolution | ---      | INVALID
--- Comment #3 from Piotr <pi...@gmail.com> ---
We think we have traced it to a Java bug in the asynchronous server socket
implementation.
Please see the following bug report which seems to exhibit a similar issue.
https://bugs.openjdk.java.net/browse/JDK-8172750
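For context, the JDK API in question is java.nio.channels.AsynchronousServerSocketChannel, which the NIO2 connector uses to accept connections. The sketch below does not reproduce the linked JDK bug; it only shows the accept path on a loopback socket, with a timeout so that a stalled accept is reported instead of blocking forever. The class and method names are illustrative assumptions.

```java
import java.net.InetSocketAddress;
import java.net.Socket;
import java.nio.channels.AsynchronousServerSocketChannel;
import java.nio.channels.AsynchronousSocketChannel;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical check, not Tomcat code: exercises one asynchronous accept.
final class AsyncAcceptSketch {

    /** Returns true if a loopback client connection is accepted within the timeout. */
    static boolean acceptsWithin(long timeoutMillis) throws Exception {
        try (AsynchronousServerSocketChannel server = AsynchronousServerSocketChannel.open()) {
            // Port 0 asks the OS for any free ephemeral port.
            server.bind(new InetSocketAddress("127.0.0.1", 0));
            int port = ((InetSocketAddress) server.getLocalAddress()).getPort();

            // Start the accept before the client connects, as an acceptor would.
            Future<AsynchronousSocketChannel> pending = server.accept();

            try (Socket client = new Socket("127.0.0.1", port);
                 AsynchronousSocketChannel accepted =
                         pending.get(timeoutMillis, TimeUnit.MILLISECONDS)) {
                return accepted.isOpen();
            } catch (TimeoutException e) {
                // The accept never completed although the client connected.
                return false;
            }
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println("accepted within timeout: " + acceptsWithin(2000));
    }
}
```

On a healthy JVM the accept completes almost immediately; a false result would point at the accept path rather than at request processing.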
[Bug 61831] NIO2 connector becomes intermittently unresponsive after
some period of time
Posted by bu...@apache.org.
https://bz.apache.org/bugzilla/show_bug.cgi?id=61831
Remy Maucherat <re...@apache.org> changed:
What   | Removed  | Added
-------+----------+---------
Status | REOPENED | NEEDINFO