Posted to dev@tomcat.apache.org by bu...@apache.org on 2016/02/04 18:18:07 UTC

[Bug 58970] New: http NIO connector crash after update from 8.0.27 to 8.0.30

https://bz.apache.org/bugzilla/show_bug.cgi?id=58970

            Bug ID: 58970
           Summary: http NIO connector crash after update from 8.0.27 to
                    8.0.30
           Product: Tomcat 8
           Version: 8.0.30
          Hardware: PC
                OS: Linux
            Status: NEW
          Severity: normal
          Priority: P2
         Component: Connectors
          Assignee: dev@tomcat.apache.org
          Reporter: slash@aceslash.net

Created attachment 33531
  --> https://bz.apache.org/bugzilla/attachment.cgi?id=33531&action=edit
Graph of network connection status during the crash of the connector

==============================
Environment:
Debian 8
Tomcat 8.0.30
Java Oracle JDK 1.8.0_72
Using connector NIO, current connector configuration:
    <Connector port="8001"
        protocol="org.apache.coyote.http11.Http11NioProtocol"
        connectionTimeout="20000"
        acceptorThreadCount="4"
        maxThreads="200"
        maxConnections="1000"
        maxKeepAliveRequests="5000" />
Hardware: several servers, Intel Xeon CPUs with a total of 16 cores (32
threads); around 30 GB of memory per Tomcat instance, using G1GC.
==============================
What is happening:
Before the update, with Tomcat 8.0.27, we had no issues with the NIO
connector: both http and websocket traffic worked fine.
Since the update, the connector "crashes" after several hours of operation: no
requests are processed any more (websocket or http), and trying to access any
application at http://ip:8001/ just hangs. The state of the network sockets
shows that it is clearly not working (graph attached).

The http/NIO connector is used almost exclusively for websocket connections
(the only connections that are not websocket come from our internal connector
checker).

There is also an AJP/APR connector that keeps working fine during that time,
even when the NIO/http connector crashes.

I don't see anything in catalina.out or in the system log.

I know this is difficult to debug with so little information. I have only seen
this issue in production, when there is a large number of connections, never
in test.

Tomcat sits behind an Apache httpd 2.4 proxy. Relevant configuration:
JkMount /APPNAME* server_tomcat1
ProxyPass /APPNAME/realtime/ ws://server.example.net:8001/APPNAME/realtime/
ProxyPassReverse /APPNAME/realtime/ ws://server.example.net:8001/APPNAME/realtime/

-- 
You are receiving this mail because:
You are the assignee for the bug.

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@tomcat.apache.org
For additional commands, e-mail: dev-help@tomcat.apache.org


[Bug 58970] http NIO connector crash after update from 8.0.27 to 8.0.30

Posted by bu...@apache.org.
https://bz.apache.org/bugzilla/show_bug.cgi?id=58970

reda.housnialaoui@gmail.com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEEDINFO                    |NEW

--- Comment #5 from reda.housnialaoui@gmail.com ---
I am sorry, I wasn't clear enough.
Slash and I work at the same company, so I can assure you that the uploaded
thread dump is about this issue.

We have a lot of traffic on AJP and less on http NIO because all non-websocket
traffic goes through httpd mod_jk and then the AJP connector.
Since mod_jk can't deal with websocket connections, the http NIO connector is
there only to handle websocket traffic.

Here is what we do to reproduce the issue systematically:
- From a Node.js application, we try to establish 20 000 Atmosphere connections
using the websocket transport to the app running in Tomcat 8.0.30
- Once we hit the maximum number of connections, we wait about 1 minute
- Then we forcibly kill the Node application and relaunch it to establish
20 000 new Atmosphere connections
- If the http connector is still alive, we repeat the whole operation

It takes about 3 attempts to crash the http connector.
In the end, the Node app is completely stopped and there are no more
connections to the Tomcat http NIO connector, yet the connector is totally
frozen.

From what I have seen when comparing a thread dump of a healthy Tomcat with
one taken while the connector is frozen, all http NIO acceptor threads are in
PARKING status when the connector is frozen.
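
To illustrate how an acceptor thread can end up parked even though no client is connected any more, here is a minimal sketch. It is not Tomcat code: a plain java.util.concurrent.Semaphore stands in for whatever limiter the connector uses to enforce maxConnections, and the class and method names (ParkedAcceptorSketch, startAcceptor, awaitState) are hypothetical. The assumption is that every slot has been leaked, so the acceptor waits for a slot that is never returned.

```java
import java.util.concurrent.Semaphore;

/** Hypothetical illustration; a Semaphore stands in for the connector's limiter. */
final class ParkedAcceptorSketch {

    /** Starts a hypothetical acceptor that needs one free slot per connection. */
    static Thread startAcceptor(Semaphore slots) {
        Thread t = new Thread(() -> {
            try {
                slots.acquire(); // parks here while no slot is free
                // a real acceptor would now accept a socket
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }, "sketch-acceptor");
        t.setDaemon(true);
        t.start();
        return t;
    }

    /** Polls until the thread reaches the wanted state or the timeout elapses. */
    static boolean awaitState(Thread t, Thread.State wanted, long timeoutMillis)
            throws InterruptedException {
        long deadline = System.nanoTime() + timeoutMillis * 1_000_000L;
        while (System.nanoTime() < deadline) {
            if (t.getState() == wanted) {
                return true;
            }
            Thread.sleep(10);
        }
        return t.getState() == wanted;
    }

    public static void main(String[] args) throws InterruptedException {
        Semaphore slots = new Semaphore(0); // assumption: every slot was leaked
        Thread acceptor = startAcceptor(slots);

        boolean parked = awaitState(acceptor, Thread.State.WAITING, 2000);
        System.out.println("acceptor parked: " + parked);

        slots.release(); // returning a single slot lets the acceptor continue
        acceptor.join(2000);
        System.out.println("acceptor finished: " + !acceptor.isAlive());
    }
}
```

In a thread dump, a thread blocked this way is reported as waiting in a park call, which would be consistent with the acceptor state described above if the connection count never drops.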



[Bug 58970] http NIO connector crash after update from 8.0.27 to 8.0.30

Posted by bu...@apache.org.
https://bz.apache.org/bugzilla/show_bug.cgi?id=58970

--- Comment #15 from Remy Maucherat <re...@apache.org> ---
I still don't understand whether this is caused by maxConnections or not. Can
the unlimited setting be tried and/or the connection count be monitored?

Usually unplugging a network cable is the worst test, since the other server
may never actually notice that the network connection is dead.
However, the server connectionTimeout should work, but it doesn't necessarily
apply in all cases (websockets, etc., which is precisely the scenario here).



[Bug 58970] http NIO connector crash after update from 8.0.27 to 8.0.30

Posted by bu...@apache.org.
https://bz.apache.org/bugzilla/show_bug.cgi?id=58970

--- Comment #12 from Réda Housni Alaoui <re...@gmail.com> ---
The dump is too big to be attached.

Here is a link to download it: 
http://s000.tinyupload.com/index.php?file_id=00903516386387493654



[Bug 58970] http NIO connector crash after update from 8.0.27 to 8.0.30

Posted by bu...@apache.org.
https://bz.apache.org/bugzilla/show_bug.cgi?id=58970

Mark Thomas <ma...@apache.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|REOPENED                    |NEEDINFO

--- Comment #14 from Mark Thomas <ma...@apache.org> ---
Do the same reproduction steps still create the issue?

Can you provide a (simple as possible) web application and client we can use to
recreate this problem?



[Bug 58970] http NIO connector crash after update from 8.0.27 to 8.0.30

Posted by bu...@apache.org.
https://bz.apache.org/bugzilla/show_bug.cgi?id=58970

reda.housnialaoui@gmail.com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |reda.housnialaoui@gmail.com

--- Comment #3 from reda.housnialaoui@gmail.com ---
Created attachment 33732
  --> https://bz.apache.org/bugzilla/attachment.cgi?id=33732&action=edit
Thread dump of a tomcat 8.0.30 with http connector frozen

Hello, 

Please find the requested thread dump attached.
It was taken from a Tomcat 8.0.30 with a frozen http NIO connector.

Regards



[Bug 58970] http NIO connector crash after update from 8.0.27 to 8.0.30

Posted by bu...@apache.org.
https://bz.apache.org/bugzilla/show_bug.cgi?id=58970

--- Comment #10 from Remy Maucherat <re...@apache.org> ---
Simply set maxConnections to unlimited (-1) in your configuration and you're
done.



[Bug 58970] http NIO connector crash after update from 8.0.27 to 8.0.30

Posted by bu...@apache.org.
https://bz.apache.org/bugzilla/show_bug.cgi?id=58970

--- Comment #2 from slash@aceslash.net ---
I know it's difficult to debug like this. Unfortunately, I had to roll
production back to 8.0.27 for now to restore our websocket services.

I'll see what I can do to give you relevant logs/thread dump.



[Bug 58970] http NIO connector crash after update from 8.0.27 to 8.0.30

Posted by bu...@apache.org.
https://bz.apache.org/bugzilla/show_bug.cgi?id=58970

--- Comment #13 from Réda Housni Alaoui <re...@gmail.com> ---
The dump comes from a Tomcat 8.0.38 instance with a crashed http connector.



[Bug 58970] http NIO connector crash after update from 8.0.27 to 8.0.30

Posted by bu...@apache.org.
https://bz.apache.org/bugzilla/show_bug.cgi?id=58970

reda.housnialaoui@gmail.com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEEDINFO                    |NEW



[Bug 58970] http NIO connector crash after update from 8.0.27 to 8.0.30

Posted by bu...@apache.org.
https://bz.apache.org/bugzilla/show_bug.cgi?id=58970

Réda Housni Alaoui <re...@gmail.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
         Resolution|FIXED                       |---
             Status|RESOLVED                    |REOPENED

--- Comment #11 from Réda Housni Alaoui <re...@gmail.com> ---
Hello,

We still have the issue on tomcat 8.0.37 and 8.0.38 with the same
configuration.
New jstack attached.



[Bug 58970] http NIO connector crash after update from 8.0.27 to 8.0.30

Posted by bu...@apache.org.
https://bz.apache.org/bugzilla/show_bug.cgi?id=58970

--- Comment #7 from Mark Thomas <ma...@apache.org> ---
The problem is with the current connection count tracking. There are code paths
where this isn't being decremented when a connection closes in error. I'm
currently looking for a reliable way to track the open connection count.
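
As a rough illustration of the kind of leak described here, the sketch below contrasts a handler that only decrements the counter on a clean close with one that decrements on every exit path. This is not the actual Tomcat code or the actual fix: ConnectionCountSketch, Work, handleLeaky and handleSafe are hypothetical names used only for this example.

```java
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical illustration; not Tomcat's actual connection tracking. */
final class ConnectionCountSketch {

    /** Hypothetical stand-in for per-connection work that may fail. */
    interface Work {
        void run() throws Exception;
    }

    private final AtomicInteger open = new AtomicInteger();

    int openCount() {
        return open.get();
    }

    /** Leaky variant: the decrement is skipped when the work throws. */
    void handleLeaky(Work work) {
        open.incrementAndGet();
        try {
            work.run();
            open.decrementAndGet(); // only reached on a clean close
        } catch (Exception e) {
            // error path forgets to decrement, so the slot is never returned
        }
    }

    /** Safe variant: the decrement runs on every exit path. */
    void handleSafe(Work work) {
        open.incrementAndGet();
        try {
            work.run();
        } catch (Exception e) {
            // connection closed in error; nothing else to do in this sketch
        } finally {
            open.decrementAndGet();
        }
    }

    public static void main(String[] args) {
        Work failing = () -> { throw new java.io.IOException("peer reset"); };

        ConnectionCountSketch leaky = new ConnectionCountSketch();
        for (int i = 0; i < 3; i++) {
            leaky.handleLeaky(failing);
        }
        System.out.println("leaky open count: " + leaky.openCount());

        ConnectionCountSketch safe = new ConnectionCountSketch();
        for (int i = 0; i < 3; i++) {
            safe.handleSafe(failing);
        }
        System.out.println("safe open count: " + safe.openCount());
    }
}
```

With the leaky variant, each connection that closes in error permanently consumes one slot, so a counter compared against a fixed limit eventually reaches that limit even when no client is connected.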



[Bug 58970] http NIO connector crash after update from 8.0.27 to 8.0.30

Posted by bu...@apache.org.
https://bz.apache.org/bugzilla/show_bug.cgi?id=58970

Mark Thomas <ma...@apache.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEEDINFO                    |RESOLVED
         Resolution|---                         |WORKSFORME

--- Comment #16 from Mark Thomas <ma...@apache.org> ---
No further response from the OP, no information on how to reproduce this, and
no similar reports from other users.

If you believe you are experiencing this issue or a similar one, please open a
new issue with the steps to reproduce it on a clean install of the latest
7.0.x, 8.0.x, 8.5.x or 9.0.x release.



[Bug 58970] http NIO connector crash after update from 8.0.27 to 8.0.30

Posted by bu...@apache.org.
https://bz.apache.org/bugzilla/show_bug.cgi?id=58970

--- Comment #9 from Réda Housni Alaoui <re...@gmail.com> ---
Thank you for the fix.
When can we expect the 8.0.34 release?
Would it be wise to use the current 8.0.34 snapshot in production?



[Bug 58970] http NIO connector crash after update from 8.0.27 to 8.0.30

Posted by bu...@apache.org.
https://bz.apache.org/bugzilla/show_bug.cgi?id=58970

--- Comment #6 from Réda Housni Alaoui <re...@gmail.com> ---
I don't know if you can see this in the tdump but we are using the JSR356
websocket implementation.



[Bug 58970] http NIO connector crash after update from 8.0.27 to 8.0.30

Posted by bu...@apache.org.
https://bz.apache.org/bugzilla/show_bug.cgi?id=58970

Mark Thomas <ma...@apache.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|---                         |FIXED

--- Comment #8 from Mark Thomas <ma...@apache.org> ---
I (think I) found the root cause. This has been fixed in:
- 9.0.x for 9.0.0.M5
- 8.5.x for 8.5.1
- 8.0.x for 8.0.34
- 7.0.x for 7.0.70



[Bug 58970] http NIO connector crash after update from 8.0.27 to 8.0.30

Posted by bu...@apache.org.
https://bz.apache.org/bugzilla/show_bug.cgi?id=58970

Julien Béti <ju...@beti.name> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |julien@beti.name



[Bug 58970] http NIO connector crash after update from 8.0.27 to 8.0.30

Posted by bu...@apache.org.
https://bz.apache.org/bugzilla/show_bug.cgi?id=58970

Mark Thomas <ma...@apache.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |NEEDINFO

--- Comment #1 from Mark Thomas <ma...@apache.org> ---
Thread dump when the problem occurs and logs leading up to the problem please.

Best guess at this point is that the Poller thread stopped, but without more
information that is nothing more than a wild guess.



[Bug 58970] http NIO connector crash after update from 8.0.27 to 8.0.30

Posted by bu...@apache.org.
https://bz.apache.org/bugzilla/show_bug.cgi?id=58970

Remy Maucherat <re...@apache.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |NEEDINFO

--- Comment #4 from Remy Maucherat <re...@apache.org> ---
The dump looks slightly odd (lots of APR AJP threads, which seem more active to
me than the NIO connector). However, the NIO connector is indeed stuck at its
maximum connection count. The connections have probably been leaked due to the
Atmosphere use, which may or may not be doing bad things.

maxConnections is 10000 and often does not make sense (I disabled it by default
for the NIO2 connector).

So I'll switch it back to need info since there's no proof this is valid (or
the same issue that was originally reported, although I'd say it's likely).
