Posted to users@tomcat.apache.org by Thomas Boniface <th...@stickyads.tv> on 2015/04/20 14:11:21 UTC

File descriptors peaks with latest stable build of Tomcat 7

Hi,

I have tried to find help regarding an issue we are experiencing with our
platform that leads to random file descriptor peaks. This happens more often
under heavy load but can also happen during low-traffic periods.

Our application uses the servlet 3.0 async features and an async connector.
We noticed that a lot of issues regarding the asynchronous features were fixed
between our production version and the latest stable build. We decided to
give it a try to see if it improves things or at least gives clues on what
can cause the issue; unfortunately it did neither.

The file descriptor peaks and application blocking happen frequently with
this version, whereas they only happened rarely with the previous version
(tomcat7 7.0.28-4).

Tomcat is behind an nginx server. The tomcat connector used is configured
as follows:

We use an Nio connector:
<Connector port="8080" protocol="org.apache.coyote.http11.Http11NioProtocol"
      selectorTimeout="1000"
      maxThreads="200"
      maxHttpHeaderSize="16384"
      address="127.0.0.1"
      redirectPort="8443"/>

In catalina.out I can see some broken pipe messages that were not happening
with the previous version.

I compared thread dumps from servers running both the new and the "old"
version of tomcat and both look similar from my standpoint.

My explanation may not be very clear, but I hope this gives an idea of what
we are experiencing. Any pointer would be welcome.

Thomas

Re: File descriptors peaks with latest stable build of Tomcat 7

Posted by Rainer Jung <ra...@kippdata.de>.
Am 20.04.2015 um 17:40 schrieb Thomas Boniface:
> Hi,
>
> Both nginx and tomcat are hosted on the same server when listing the
> connections I see both the connections from nginx to tomcat (the first one
> create) and the one from tomcat to nginx used to reply. I may have
> presented things the bad way though (I'm not too good regarding system
> level).
>
> I do agree the high number of close wait seems strange, I really feel like
> nginx closed the connection before tomcat did (what I think leads to the
> broken pipe expections observed in the catalina.out). In case someone want
> to have a look I uploaded a netstat log here:
> http://www.filedropper.com/netsat

The connection statistics between clients and nginx http port is:

   Count           IP:Port ConnectionState
   45467  178.32.101.62:80 TIME_WAIT
   44745    178.33.42.6:80 TIME_WAIT
   26093    178.33.42.6:80 ESTABLISHED
   25667  178.32.101.62:80 ESTABLISHED
    6898  178.32.101.62:80 FIN_WAIT2
    6723    178.33.42.6:80 FIN_WAIT2
     800  178.32.101.62:80 FIN_WAIT1
     792    178.33.42.6:80 FIN_WAIT1
     712  178.32.101.62:80 LAST_ACK
     656    178.33.42.6:80 LAST_ACK
     234    178.33.42.6:80 SYN_RECV
     232  178.32.101.62:80 SYN_RECV
      18    178.33.42.6:80 CLOSING
       8  178.32.101.62:80 CLOSING
       1    178.33.42.6:80 CLOSE_WAIT
       1        0.0.0.0:80 LISTEN

So there are lots of connections in TIME_WAIT state, which is kind of 
expected for a web server handling lots of short-lived client connections, 
but it slows down the IP stack. There are also quite a lot of established 
connections (about 50,000!), which means that you probably want to check 
whether you can reduce your keep-alive timeout for nginx.
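
For example (the value is purely illustrative, pick one that matches your 
traffic pattern), a lower client keep-alive timeout would go into the nginx 
http or server block along these lines:

   keepalive_timeout  15s;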

The same statistics for the https port:

   Count           IP:Port ConnectionState
    2283 178.32.101.62:443 TIME_WAIT
    2125   178.33.42.6:443 TIME_WAIT
    1585 178.32.101.62:443 ESTABLISHED
    1493   178.33.42.6:443 ESTABLISHED
     484 178.32.101.62:443 FIN_WAIT2
     420   178.33.42.6:443 FIN_WAIT2
      47 178.32.101.62:443 FIN_WAIT1
      46   178.33.42.6:443 FIN_WAIT1
      25 178.32.101.62:443 LAST_ACK
      17   178.33.42.6:443 SYN_RECV
      16 178.32.101.62:443 SYN_RECV
      16   178.33.42.6:443 LAST_ACK
      10   178.33.42.6:443 CLOSING
       4 178.32.101.62:443 CLOSING
       1       0.0.0.0:443 LISTEN

About the same relative picture but only about 5% of the http connection 
counts.

The incoming connection statistics for Tomcat (port 8080) is:

   Count           IP:Port ConnectionState
    8381    127.0.0.1:8080 CLOSE_WAIT
    1650    127.0.0.1:8080 ESTABLISHED
     127    127.0.0.1:8080 SYN_RECV
      65    127.0.0.1:8080 TIME_WAIT
       1   172.16.1.3:8080 LISTEN
       1    127.0.0.1:8080 LISTEN

The many CLOSE_WAIT mean that the remote side (nginx) has already closed 
the connection, but Tomcat has not yet. Probably the idle connection 
timeout / keep-alive timeout for connections between nginx and Tomcat is 
lower on the nginx side than on the Tomcat side.

Interestingly the same connections but viewed from the opposite side of 
the connection (nginx) have totally different statistics:

   Count           IP:Port ConnectionState
   20119    127.0.0.1:8080 SYN_SENT
    4692    127.0.0.1:8080 ESTABLISHED
     488    127.0.0.1:8080 FIN_WAIT2
     122    127.0.0.1:8080 TIME_WAIT
      13    127.0.0.1:8080 FIN_WAIT1

I wonder why we have 4692 established connections here, but only 1650 in 
the table above. In a static situation the numbers should be the same. It 
indicates that there is so much churn that the numbers vary a lot even 
while netstat runs.

We see a lot of SYN_SENT, so nginx wants to open many more connections 
to Tomcat but doesn't get them as quickly as it wants.

Finally there's a bunch of connections to remote web services:

   Count            IP:Port ConnectionState
     286      95.85.3.86:80 CLOSE_WAIT
     255   46.228.164.12:80 ESTABLISHED
     209   188.125.82.65:80 CLOSE_WAIT
     172  176.74.173.230:80 ESTABLISHED
     170   54.171.53.252:80 CLOSE_WAIT
     136   188.125.82.65:80 LAST_ACK
     129      95.85.3.86:80 LAST_ACK
     128  23.212.108.209:80 CLOSE_WAIT
     106  46.137.157.249:80 CLOSE_WAIT
     101    81.19.244.69:80 ESTABLISHED
      86   146.148.30.94:80 CLOSE_WAIT
      83    46.137.83.90:80 CLOSE_WAIT
      80   188.125.82.65:80 ESTABLISHED
      78  37.252.163.221:80 CLOSE_WAIT
      77    46.137.83.90:80 ESTABLISHED
      73  46.137.157.121:80 CLOSE_WAIT
      64    54.246.89.98:80 CLOSE_WAIT
      63  173.194.40.153:80 ESTABLISHED
      61    93.176.80.69:80 ESTABLISHED
      55  23.212.108.198:80 CLOSE_WAIT
      53    54.72.204.78:80 CLOSE_WAIT
      51  37.252.162.230:80 CLOSE_WAIT
      51  173.194.40.154:80 ESTABLISHED
      50  54.247.113.157:80 CLOSE_WAIT
      50   37.252.170.98:80 CLOSE_WAIT
      49  23.212.108.191:80 CLOSE_WAIT
      47   54.154.23.133:80 CLOSE_WAIT
      43  176.34.179.135:80 CLOSE_WAIT
      39   146.148.21.73:80 CLOSE_WAIT
      36   46.137.87.196:80 CLOSE_WAIT
      34  173.194.40.154:80 CLOSE_WAIT
      30   46.137.87.163:80 CLOSE_WAIT
      30    37.252.170.5:80 CLOSE_WAIT
      29  23.212.108.215:80 CLOSE_WAIT
      29   46.228.164.12:80 CLOSE_WAIT
      28    54.77.236.40:80 CLOSE_WAIT
      26  37.252.163.162:80 CLOSE_WAIT
      26  173.194.40.141:80 ESTABLISHED
      25   146.148.5.248:80 CLOSE_WAIT
      25    68.67.152.86:80 CLOSE_WAIT
      25    23.251.130.5:80 CLOSE_WAIT
      23  23.251.133.199:80 CLOSE_WAIT
      23   85.114.159.66:80 CLOSE_WAIT
      23   46.228.164.12:80 FIN_WAIT1
      21  192.158.31.169:80 CLOSE_WAIT
      20  23.212.108.193:80 CLOSE_WAIT
      19  37.252.170.106:80 CLOSE_WAIT
      19   146.148.10.88:80 CLOSE_WAIT
      18   85.114.159.66:80 ESTABLISHED
      15  23.251.141.112:80 CLOSE_WAIT
      14  173.194.40.141:80 CLOSE_WAIT
      13    54.246.89.98:80 ESTABLISHED
      12   46.228.164.12:80 LAST_ACK
       9  23.212.108.193:80 ESTABLISHED
       7  37.252.163.238:80 CLOSE_WAIT
       7  23.251.138.102:80 CLOSE_WAIT
       7  192.158.31.169:80 ESTABLISHED
       7  173.194.40.153:80 CLOSE_WAIT
       7  146.148.11.181:80 CLOSE_WAIT
       7      95.85.3.86:80 ESTABLISHED
       6 146.148.127.205:80 CLOSE_WAIT
       5   37.252.163.26:80 CLOSE_WAIT
       4 146.148.113.130:80 CLOSE_WAIT
       3  146.148.23.202:80 CLOSE_WAIT
       3  146.148.118.41:80 CLOSE_WAIT
       3   37.252.163.26:80 ESTABLISHED
       2    23.251.130.5:80 ESTABLISHED
       1  54.247.113.157:80 ESTABLISHED
       1  23.212.108.209:80 ESTABLISHED
       1  176.34.179.135:80 ESTABLISHED
       1  173.194.40.154:80 TIME_WAIT
       1   5.135.147.172:80 CLOSE_WAIT
       1   37.252.170.96:80 CLOSE_WAIT
       1   37.252.170.69:80 CLOSE_WAIT
       1   37.252.163.26:80 TIME_WAIT
       1    68.67.152.86:80 TIME_WAIT

Again many are in CLOSE_WAIT, meaning here that the remote web service 
has already closed the connection but the local client, which probably 
sits inside your webapp, has not.

All in all the numbers are quite big in many respects. You could try to 
grow your Tomcat connector thread pool, but the question would be 
whether many of those connections are actually busy and whether your 
system (hardware) can cope with that load.
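
Purely as an illustration (the numbers are placeholders, not a 
recommendation), growing the pool means raising maxThreads on the existing 
connector; with the NIO connector the maxConnections attribute is also 
worth a look:

   <Connector port="8080" protocol="org.apache.coyote.http11.Http11NioProtocol"
         selectorTimeout="1000"
         maxThreads="400"
         maxHttpHeaderSize="16384"
         address="127.0.0.1"
         redirectPort="8443"/>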

Regards,

Rainer

---------------------------------------------------------------------
To unsubscribe, e-mail: users-unsubscribe@tomcat.apache.org
For additional commands, e-mail: users-help@tomcat.apache.org


Re: File descriptors peaks with latest stable build of Tomcat 7

Posted by Thomas Boniface <th...@stickyads.tv>.
Hi,

Both nginx and tomcat are hosted on the same server, so when listing the
connections I see both the connections from nginx to tomcat (the first one
created) and the ones from tomcat to nginx used to reply. I may have
presented things the wrong way though (I'm not too good at the system
level).

I do agree the high number of CLOSE_WAIT seems strange; I really feel like
nginx closed the connection before tomcat did (which I think leads to the
broken pipe exceptions observed in catalina.out). In case someone wants
to have a look I uploaded a netstat log here:
http://www.filedropper.com/netsat

Thomas

2015-04-20 17:13 GMT+02:00 André Warnier <aw...@ice-sa.com>:

> Thomas Boniface wrote:
>
>> I did some captures during a peak this morning, I have some lsof and
>> netstat data.
>>
>> It seems to me that most file descriptors used by tomcat are some http
>> connections:
>>
>>  thomas@localhost  ~/ads3/tbo11h12  cat lsof| wc -l
>> 17772
>>  thomas@localhost  ~/ads3/tbo11h12  cat lsof | grep TCP | wc -l
>> 13966
>>
>> (Note that the application also send request to external servers via http)
>>
>>
>> Regarding netstat I did a small script to try to aggregate connections
>> with
>> a human readable name, if my script is right the connections between nginx
>> and tomcat are as follows:
>>
>> tomcat => nginx SYN_RECV 127
>> tomcat => nginx ESTABLISHED 1650
>> tomcat => nginx CLOSE_WAIT 8381
>> tomcat => nginx TIME_WAIT 65
>>
>> nginx => tomcat SYN_SENT 20119
>> nginx => tomcat ESTABLISHED 4692
>> nginx => tomcat TIME_WAIT 122
>> nginx => tomcat FIN_WAIT2 488
>> nginx => tomcat FIN_WAIT1 13
>>
>
> I don't understand the distinction here.  Tomcat should never initiate
> connections *to* nginx, or ?
>
> For personal historical reasons, the high number of connections in
> CLOSE_WAIT state above triggered my interest.  Search Google for : "tcp
> close_wait state meaning"
> Basically, it can mean that the client wants to go away, and closes its
> end of the connection to the server, but the application on the server
> never properly closes the connection to the client. And as long as it
> doesn't, the corresponding connection will remain stuck in the CLOSE_WAIT
> state (and continue to use resources on the server, such as an fd and
> associated resources).
> All that doesn't mean that this is your main issue here, but it's
> something to look into.
>
>
>
>
>> Concerning the other response and the system max number of file, I am not
>> sure this is where our issue lies. The peak itself seems to be a sympton
>> of
>> an issue, tomcat fd are around 1000 almost all the time except when a peak
>> occurs. In such cases it can go up to 10000 or more sometimes.
>>
>> Thomas
>>
>>
>>
>> 2015-04-20 15:41 GMT+02:00 Rainer Jung <ra...@kippdata.de>:
>>
>>  Am 20.04.2015 um 14:11 schrieb Thomas Boniface:
>>>
>>>  Hi,
>>>>
>>>> I have tried to find help regarding an issue we experience with our
>>>> platform leading to random file descriptor peaks. This happens more
>>>> often
>>>> on heavy load but can also happen on low traffic periods.
>>>>
>>>> Our application is using servlet 3.0 async features and an async
>>>> connector.
>>>> We noticed that a lot of issues regarding asynchronous feature were
>>>> fixed
>>>> between our production version and the last stable build. We decided to
>>>> give it a try to see if it improves things or at least give clues on
>>>> what
>>>> can cause the issue; Unfortunately it did neither.
>>>>
>>>> The file descriptor peaks and application blocking happens frequently
>>>> with
>>>> this version when it only happens rarely on previous version (tomcat7
>>>> 7.0.28-4).
>>>>
>>>> Tomcat is behind an nginx server. The tomcat connector used is
>>>> configured
>>>> as follows:
>>>>
>>>> We use an Nio connector:
>>>> <Connector port="8080" protocol="org.apache.coyote.
>>>> http11.Http11NioProtocol"
>>>>        selectorTimeout="1000"
>>>>        maxThreads="200"
>>>>        maxHttpHeaderSize="16384"
>>>>        address="127.0.0.1"
>>>>        redirectPort="8443"/>
>>>>
>>>> In catalina I can see some Broken pipe message that were not happening
>>>> with
>>>> previous version.
>>>>
>>>> I compared thread dumps from server with both the new and "old" version
>>>> of
>>>> tomcat and both look similar from my stand point.
>>>>
>>>> My explanation may not be very clear, but I hope this gives an idea how
>>>> what we are experiencing. Any pointer would be welcomed.
>>>>
>>>>  If the peaks happen long enough and your platforms has the tools
>>> available
>>> you can use lsof to look for what those FDs are - or on Linux looking at
>>> "ls -l /proc/PID/fd/*" (PID is the process PID file) - or on Solaris use
>>> the pfiles command.
>>>
>>> If the result is what is expected, namely that by far the most FDs are
>>> coming from network connections for port 8080, then you can check via
>>> "netstat" in which connection state those are.
>>>
>>> If most are in ESTABLISHED state, then you/we need to further break down
>>> the strategy.
>>>
>>> Regards,
>>>
>>> Rainer
>>>
>>>
>>> ---------------------------------------------------------------------
>>> To unsubscribe, e-mail: users-unsubscribe@tomcat.apache.org
>>> For additional commands, e-mail: users-help@tomcat.apache.org
>>>
>>>
>>>
>>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: users-unsubscribe@tomcat.apache.org
> For additional commands, e-mail: users-help@tomcat.apache.org
>
>

Re: File descriptors peaks with latest stable build of Tomcat 7

Posted by André Warnier <aw...@ice-sa.com>.
Thomas Boniface wrote:
> I did some captures during a peak this morning, I have some lsof and
> netstat data.
> 
> It seems to me that most file descriptors used by tomcat are some http
> connections:
> 
>  thomas@localhost  ~/ads3/tbo11h12  cat lsof| wc -l
> 17772
>  thomas@localhost  ~/ads3/tbo11h12  cat lsof | grep TCP | wc -l
> 13966
> 
> (Note that the application also send request to external servers via http)
> 
> 
> Regarding netstat I did a small script to try to aggregate connections with
> a human readable name, if my script is right the connections between nginx
> and tomcat are as follows:
> 
> tomcat => nginx SYN_RECV 127
> tomcat => nginx ESTABLISHED 1650
> tomcat => nginx CLOSE_WAIT 8381
> tomcat => nginx TIME_WAIT 65
> 
> nginx => tomcat SYN_SENT 20119
> nginx => tomcat ESTABLISHED 4692
> nginx => tomcat TIME_WAIT 122
> nginx => tomcat FIN_WAIT2 488
> nginx => tomcat FIN_WAIT1 13

I don't understand the distinction here.  Tomcat should never initiate connections *to* 
nginx, should it?

For personal historical reasons, the high number of connections in CLOSE_WAIT state above 
triggered my interest.  Search Google for : "tcp close_wait state meaning"
Basically, it can mean that the client wants to go away and closes its end of the 
connection to the server, but the application on the server never properly closes its end 
of the connection. As long as it doesn't, the corresponding connection will remain stuck in 
the CLOSE_WAIT state (and continue to use resources on the server, such as an fd and its 
associated resources).
All that doesn't mean that this is your main issue here, but it's something to look into.


> 
> Concerning the other response and the system max number of file, I am not
> sure this is where our issue lies. The peak itself seems to be a sympton of
> an issue, tomcat fd are around 1000 almost all the time except when a peak
> occurs. In such cases it can go up to 10000 or more sometimes.
> 
> Thomas
> 
> 
> 
> 2015-04-20 15:41 GMT+02:00 Rainer Jung <ra...@kippdata.de>:
> 
>> Am 20.04.2015 um 14:11 schrieb Thomas Boniface:
>>
>>> Hi,
>>>
>>> I have tried to find help regarding an issue we experience with our
>>> platform leading to random file descriptor peaks. This happens more often
>>> on heavy load but can also happen on low traffic periods.
>>>
>>> Our application is using servlet 3.0 async features and an async
>>> connector.
>>> We noticed that a lot of issues regarding asynchronous feature were fixed
>>> between our production version and the last stable build. We decided to
>>> give it a try to see if it improves things or at least give clues on what
>>> can cause the issue; Unfortunately it did neither.
>>>
>>> The file descriptor peaks and application blocking happens frequently with
>>> this version when it only happens rarely on previous version (tomcat7
>>> 7.0.28-4).
>>>
>>> Tomcat is behind an nginx server. The tomcat connector used is configured
>>> as follows:
>>>
>>> We use an Nio connector:
>>> <Connector port="8080" protocol="org.apache.coyote.
>>> http11.Http11NioProtocol"
>>>        selectorTimeout="1000"
>>>        maxThreads="200"
>>>        maxHttpHeaderSize="16384"
>>>        address="127.0.0.1"
>>>        redirectPort="8443"/>
>>>
>>> In catalina I can see some Broken pipe message that were not happening
>>> with
>>> previous version.
>>>
>>> I compared thread dumps from server with both the new and "old" version of
>>> tomcat and both look similar from my stand point.
>>>
>>> My explanation may not be very clear, but I hope this gives an idea how
>>> what we are experiencing. Any pointer would be welcomed.
>>>
>> If the peaks happen long enough and your platforms has the tools available
>> you can use lsof to look for what those FDs are - or on Linux looking at
>> "ls -l /proc/PID/fd/*" (PID is the process PID file) - or on Solaris use
>> the pfiles command.
>>
>> If the result is what is expected, namely that by far the most FDs are
>> coming from network connections for port 8080, then you can check via
>> "netstat" in which connection state those are.
>>
>> If most are in ESTABLISHED state, then you/we need to further break down
>> the strategy.
>>
>> Regards,
>>
>> Rainer
>>
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: users-unsubscribe@tomcat.apache.org
>> For additional commands, e-mail: users-help@tomcat.apache.org
>>
>>
> 


---------------------------------------------------------------------
To unsubscribe, e-mail: users-unsubscribe@tomcat.apache.org
For additional commands, e-mail: users-help@tomcat.apache.org


Re: File descriptors peaks with latest stable build of Tomcat 7

Posted by Thomas Boniface <th...@stickyads.tv>.
I did some captures during a peak this morning; I have some lsof and
netstat data.

It seems to me that most file descriptors used by tomcat are http
connections:

 thomas@localhost  ~/ads3/tbo11h12  cat lsof| wc -l
17772
 thomas@localhost  ~/ads3/tbo11h12  cat lsof | grep TCP | wc -l
13966

(Note that the application also sends requests to external servers via http)


Regarding netstat, I did a small script to try to aggregate connections with
human-readable names. If my script is right, the connections between nginx
and tomcat are as follows (a sketch of this kind of aggregation is shown
right after the list):

tomcat => nginx SYN_RECV 127
tomcat => nginx ESTABLISHED 1650
tomcat => nginx CLOSE_WAIT 8381
tomcat => nginx TIME_WAIT 65

nginx => tomcat SYN_SENT 20119
nginx => tomcat ESTABLISHED 4692
nginx => tomcat TIME_WAIT 122
nginx => tomcat FIN_WAIT2 488
nginx => tomcat FIN_WAIT1 13
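
For reference, a minimal sketch of this kind of aggregation (not the actual 
script), assuming Linux "netstat -ant" output, where a local port of 8080 
marks the tomcat end and a remote port of 8080 marks the nginx end:

 netstat -ant | awk '
     $4 ~ /:8080$/ { print "tomcat => nginx", $6 }   # local port 8080: tomcat end
     $5 ~ /:8080$/ { print "nginx => tomcat", $6 }   # remote port 8080: nginx end
 ' | sort | uniq -c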

Concerning the other response and the system's maximum number of open files,
I am not sure this is where our issue lies. The peak itself seems to be a
symptom of an issue: tomcat fds are around 1000 almost all the time, except
when a peak occurs. In such cases it can go up to 10000 or more.

Thomas



2015-04-20 15:41 GMT+02:00 Rainer Jung <ra...@kippdata.de>:

> Am 20.04.2015 um 14:11 schrieb Thomas Boniface:
>
>> Hi,
>>
>> I have tried to find help regarding an issue we experience with our
>> platform leading to random file descriptor peaks. This happens more often
>> on heavy load but can also happen on low traffic periods.
>>
>> Our application is using servlet 3.0 async features and an async
>> connector.
>> We noticed that a lot of issues regarding asynchronous feature were fixed
>> between our production version and the last stable build. We decided to
>> give it a try to see if it improves things or at least give clues on what
>> can cause the issue; Unfortunately it did neither.
>>
>> The file descriptor peaks and application blocking happens frequently with
>> this version when it only happens rarely on previous version (tomcat7
>> 7.0.28-4).
>>
>> Tomcat is behind an nginx server. The tomcat connector used is configured
>> as follows:
>>
>> We use an Nio connector:
>> <Connector port="8080" protocol="org.apache.coyote.
>> http11.Http11NioProtocol"
>>        selectorTimeout="1000"
>>        maxThreads="200"
>>        maxHttpHeaderSize="16384"
>>        address="127.0.0.1"
>>        redirectPort="8443"/>
>>
>> In catalina I can see some Broken pipe message that were not happening
>> with
>> previous version.
>>
>> I compared thread dumps from server with both the new and "old" version of
>> tomcat and both look similar from my stand point.
>>
>> My explanation may not be very clear, but I hope this gives an idea how
>> what we are experiencing. Any pointer would be welcomed.
>>
>
> If the peaks happen long enough and your platforms has the tools available
> you can use lsof to look for what those FDs are - or on Linux looking at
> "ls -l /proc/PID/fd/*" (PID is the process PID file) - or on Solaris use
> the pfiles command.
>
> If the result is what is expected, namely that by far the most FDs are
> coming from network connections for port 8080, then you can check via
> "netstat" in which connection state those are.
>
> If most are in ESTABLISHED state, then you/we need to further break down
> the strategy.
>
> Regards,
>
> Rainer
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: users-unsubscribe@tomcat.apache.org
> For additional commands, e-mail: users-help@tomcat.apache.org
>
>

Re: File descriptors peaks with latest stable build of Tomcat 7

Posted by Thomas Boniface <th...@stickyads.tv>.
Thanks for your time Rainer,

I get what you mean regarding the application getting slow. This server was
also logging the garbage collection activity and it looks normal even when
the problem is occurring; there is no big variation in the time taken by a
garbage collection operation.

I don't have a clear view of the server response times around the test I
made, so I can't tell if the application gets "slow" before the file
descriptor peak, but as mentioned before this also happens during low-traffic
periods (and in such periods there should be no reason to get slow). Also, it
feels unexpected that this version of tomcat makes the application slow down
more often than a server with the other version of tomcat.

Thomas


2015-04-20 16:32 GMT+02:00 Rainer Jung <ra...@kippdata.de>:

> Am 20.04.2015 um 15:41 schrieb Rainer Jung:
>
>> Am 20.04.2015 um 14:11 schrieb Thomas Boniface:
>>
>>> Hi,
>>>
>>> I have tried to find help regarding an issue we experience with our
>>> platform leading to random file descriptor peaks. This happens more often
>>> on heavy load but can also happen on low traffic periods.
>>>
>>> Our application is using servlet 3.0 async features and an async
>>> connector.
>>> We noticed that a lot of issues regarding asynchronous feature were fixed
>>> between our production version and the last stable build. We decided to
>>> give it a try to see if it improves things or at least give clues on what
>>> can cause the issue; Unfortunately it did neither.
>>>
>>> The file descriptor peaks and application blocking happens frequently
>>> with
>>> this version when it only happens rarely on previous version (tomcat7
>>> 7.0.28-4).
>>>
>>> Tomcat is behind an nginx server. The tomcat connector used is configured
>>> as follows:
>>>
>>> We use an Nio connector:
>>> <Connector port="8080" protocol="org.apache.coyote.
>>> http11.Http11NioProtocol"
>>>        selectorTimeout="1000"
>>>        maxThreads="200"
>>>        maxHttpHeaderSize="16384"
>>>        address="127.0.0.1"
>>>        redirectPort="8443"/>
>>>
>>> In catalina I can see some Broken pipe message that were not happening
>>> with
>>> previous version.
>>>
>>> I compared thread dumps from server with both the new and "old"
>>> version of
>>> tomcat and both look similar from my stand point.
>>>
>>> My explanation may not be very clear, but I hope this gives an idea how
>>> what we are experiencing. Any pointer would be welcomed.
>>>
>>
>> If the peaks happen long enough and your platforms has the tools
>> available you can use lsof to look for what those FDs are - or on Linux
>> looking at "ls -l /proc/PID/fd/*" (PID is the process PID file) - or on
>> Solaris use the pfiles command.
>>
>> If the result is what is expected, namely that by far the most FDs are
>> coming from network connections for port 8080, then you can check via
>> "netstat" in which connection state those are.
>>
>> If most are in ESTABLISHED state, then you/we need to further break down
>> the strategy.
>>
>
> One more thing: the connection peak might happen, if for some reason your
> application or the JVM (GC) gets slow. The reason doesn't have to still be
> there at the time when you take the thread dump.
>
> You might want to add "%D" to your Tomcat access log and ty to estimate,
> whether the connection peaks are due to (temporary) application slow down.
>
> The same holds for activating a GC log and check for long or many
> cumulative GC pauses.
>
>
> Regards,
>
> Rainer
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: users-unsubscribe@tomcat.apache.org
> For additional commands, e-mail: users-help@tomcat.apache.org
>
>

Re: File descriptors peaks with latest stable build of Tomcat 7

Posted by Rainer Jung <ra...@kippdata.de>.
Am 20.04.2015 um 15:41 schrieb Rainer Jung:
> Am 20.04.2015 um 14:11 schrieb Thomas Boniface:
>> Hi,
>>
>> I have tried to find help regarding an issue we experience with our
>> platform leading to random file descriptor peaks. This happens more often
>> on heavy load but can also happen on low traffic periods.
>>
>> Our application is using servlet 3.0 async features and an async
>> connector.
>> We noticed that a lot of issues regarding asynchronous feature were fixed
>> between our production version and the last stable build. We decided to
>> give it a try to see if it improves things or at least give clues on what
>> can cause the issue; Unfortunately it did neither.
>>
>> The file descriptor peaks and application blocking happens frequently
>> with
>> this version when it only happens rarely on previous version (tomcat7
>> 7.0.28-4).
>>
>> Tomcat is behind an nginx server. The tomcat connector used is configured
>> as follows:
>>
>> We use an Nio connector:
>> <Connector port="8080" protocol="org.apache.coyote.
>> http11.Http11NioProtocol"
>>        selectorTimeout="1000"
>>        maxThreads="200"
>>        maxHttpHeaderSize="16384"
>>        address="127.0.0.1"
>>        redirectPort="8443"/>
>>
>> In catalina I can see some Broken pipe message that were not happening
>> with
>> previous version.
>>
>> I compared thread dumps from server with both the new and "old"
>> version of
>> tomcat and both look similar from my stand point.
>>
>> My explanation may not be very clear, but I hope this gives an idea how
>> what we are experiencing. Any pointer would be welcomed.
>
> If the peaks happen long enough and your platforms has the tools
> available you can use lsof to look for what those FDs are - or on Linux
> looking at "ls -l /proc/PID/fd/*" (PID is the process PID file) - or on
> Solaris use the pfiles command.
>
> If the result is what is expected, namely that by far the most FDs are
> coming from network connections for port 8080, then you can check via
> "netstat" in which connection state those are.
>
> If most are in ESTABLISHED state, then you/we need to further break down
> the strategy.

One more thing: the connection peak might happen if for some reason 
your application or the JVM (GC) gets slow. The reason doesn't have to 
still be there at the time when you take the thread dump.

You might want to add "%D" to your Tomcat access log and try to estimate 
whether the connection peaks are due to a (temporary) application slowdown.
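
For example, a sketch of an AccessLogValve in server.xml with "%D" (the 
request duration in milliseconds) appended to the usual pattern; the other 
attribute values here are just the common defaults:

   <Valve className="org.apache.catalina.valves.AccessLogValve"
          directory="logs" prefix="localhost_access_log" suffix=".txt"
          pattern="%h %l %u %t &quot;%r&quot; %s %b %D" />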

The same holds for activating a GC log and checking for long or many 
cumulative GC pauses.
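
For instance, a GC log can be switched on with something along these lines 
in bin/setenv.sh (the log path is only a placeholder):

   CATALINA_OPTS="$CATALINA_OPTS -verbose:gc -XX:+PrintGCDetails"
   CATALINA_OPTS="$CATALINA_OPTS -XX:+PrintGCDateStamps -Xloggc:/var/log/tomcat7/gc.log"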

Regards,

Rainer

---------------------------------------------------------------------
To unsubscribe, e-mail: users-unsubscribe@tomcat.apache.org
For additional commands, e-mail: users-help@tomcat.apache.org


Re: File descriptors peaks with latest stable build of Tomcat 7

Posted by Neill Lima <ne...@visual-meta.com>.
Increasing the number of open file descriptors is an accepted fine-tuning
measure (*if your application is handling its threads properly*)

ulimit -n              # show the current limit
ulimit -n [new_value]  # raise the limit for the current shell/session
ulimit -n              # verify the new value

If the performance is still not adequate even after allowing more fds, some
sort of scaling (horizontal or vertical) is necessary.
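
To make a higher limit survive a restart it usually also has to be set for 
the user running Tomcat, e.g. in /etc/security/limits.conf (a sketch, 
assuming that user is called tomcat7; the value is only an example):

tomcat7   soft   nofile   65536
tomcat7   hard   nofile   65536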

On Mon, Apr 20, 2015 at 3:41 PM, Rainer Jung <ra...@kippdata.de>
wrote:

> Am 20.04.2015 um 14:11 schrieb Thomas Boniface:
>
>> Hi,
>>
>> I have tried to find help regarding an issue we experience with our
>> platform leading to random file descriptor peaks. This happens more often
>> on heavy load but can also happen on low traffic periods.
>>
>> Our application is using servlet 3.0 async features and an async
>> connector.
>> We noticed that a lot of issues regarding asynchronous feature were fixed
>> between our production version and the last stable build. We decided to
>> give it a try to see if it improves things or at least give clues on what
>> can cause the issue; Unfortunately it did neither.
>>
>> The file descriptor peaks and application blocking happens frequently with
>> this version when it only happens rarely on previous version (tomcat7
>> 7.0.28-4).
>>
>> Tomcat is behind an nginx server. The tomcat connector used is configured
>> as follows:
>>
>> We use an Nio connector:
>> <Connector port="8080" protocol="org.apache.coyote.
>> http11.Http11NioProtocol"
>>        selectorTimeout="1000"
>>        maxThreads="200"
>>        maxHttpHeaderSize="16384"
>>        address="127.0.0.1"
>>        redirectPort="8443"/>
>>
>> In catalina I can see some Broken pipe message that were not happening
>> with
>> previous version.
>>
>> I compared thread dumps from server with both the new and "old" version of
>> tomcat and both look similar from my stand point.
>>
>> My explanation may not be very clear, but I hope this gives an idea how
>> what we are experiencing. Any pointer would be welcomed.
>>
>
> If the peaks happen long enough and your platforms has the tools available
> you can use lsof to look for what those FDs are - or on Linux looking at
> "ls -l /proc/PID/fd/*" (PID is the process PID file) - or on Solaris use
> the pfiles command.
>
> If the result is what is expected, namely that by far the most FDs are
> coming from network connections for port 8080, then you can check via
> "netstat" in which connection state those are.
>
> If most are in ESTABLISHED state, then you/we need to further break down
> the strategy.
>
> Regards,
>
> Rainer
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: users-unsubscribe@tomcat.apache.org
> For additional commands, e-mail: users-help@tomcat.apache.org
>
>

Re: File descriptors peaks with latest stable build of Tomcat 7

Posted by Rainer Jung <ra...@kippdata.de>.
Am 20.04.2015 um 14:11 schrieb Thomas Boniface:
> Hi,
>
> I have tried to find help regarding an issue we experience with our
> platform leading to random file descriptor peaks. This happens more often
> on heavy load but can also happen on low traffic periods.
>
> Our application is using servlet 3.0 async features and an async connector.
> We noticed that a lot of issues regarding asynchronous feature were fixed
> between our production version and the last stable build. We decided to
> give it a try to see if it improves things or at least give clues on what
> can cause the issue; Unfortunately it did neither.
>
> The file descriptor peaks and application blocking happens frequently with
> this version when it only happens rarely on previous version (tomcat7
> 7.0.28-4).
>
> Tomcat is behind an nginx server. The tomcat connector used is configured
> as follows:
>
> We use an Nio connector:
> <Connector port="8080" protocol="org.apache.coyote.
> http11.Http11NioProtocol"
>        selectorTimeout="1000"
>        maxThreads="200"
>        maxHttpHeaderSize="16384"
>        address="127.0.0.1"
>        redirectPort="8443"/>
>
> In catalina I can see some Broken pipe message that were not happening with
> previous version.
>
> I compared thread dumps from server with both the new and "old" version of
> tomcat and both look similar from my stand point.
>
> My explanation may not be very clear, but I hope this gives an idea how
> what we are experiencing. Any pointer would be welcomed.

If the peaks last long enough and your platform has the tools 
available you can use lsof to look at what those FDs are - or on Linux 
look at "ls -l /proc/PID/fd/*" (PID is the Tomcat process id) - or on 
Solaris use the pfiles command.
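
A quick sketch on Linux (PID is a placeholder for the Tomcat process id):

   ls /proc/PID/fd | wc -l               # total number of open FDs
   ls -l /proc/PID/fd | grep -c socket   # how many of them are sockets
   lsof -p PID | grep -c TCP             # TCP connections only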

If the result is what is expected, namely that by far most FDs come from 
network connections on port 8080, then you can use "netstat" to check 
which connection state they are in.

If most are in ESTABLISHED state, then you/we need to further break down 
the strategy.

Regards,

Rainer


---------------------------------------------------------------------
To unsubscribe, e-mail: users-unsubscribe@tomcat.apache.org
For additional commands, e-mail: users-help@tomcat.apache.org


Re: File descriptors peaks with latest stable build of Tomcat 7

Posted by Neill Lima <ne...@visual-meta.com>.
Hello Christopher S.,

I know it won't. I just wanted to provide insight into Andre W.'s approach.

Thanks,

Neill

On Wed, Apr 22, 2015 at 4:58 PM, André Warnier <aw...@ice-sa.com> wrote:

> Christopher Schultz wrote:
>
>> -----BEGIN PGP SIGNED MESSAGE-----
>> Hash: SHA256
>>
>> Neill,
>>
>> On 4/22/15 9:12 AM, Neill Lima wrote:
>>
>>> If I am not wrong, if the application in question is monitored in
>>> VisualVM through JMX (https://visualvm.java.net/) you could trigger
>>> a Force GC from its monitoring console.
>>>
>>
>> You can do this, but it won't close any CLOSE_WAIT connections.
>> Tomcat's timeout must be reached. I suspect that the timeout(s) are
>> simply way too long.
>>
>>
> Just humor me..
> If it doesn't, it doesn't.  But it's easy to do, does not require a change
> of configuration nor a shutdown/restart of Tomcat, and it may show us
> something in principle unexpected.
>
>
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: users-unsubscribe@tomcat.apache.org
> For additional commands, e-mail: users-help@tomcat.apache.org
>
>

Re: File descriptors peaks with latest stable build of Tomcat 7

Posted by André Warnier <aw...@ice-sa.com>.
Christopher Schultz wrote:
> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA256
> 
> Neill,
> 
> On 4/22/15 9:12 AM, Neill Lima wrote:
>> If I am not wrong, if the application in question is monitored in
>> VisualVM through JMX (https://visualvm.java.net/) you could trigger
>> a Force GC from its monitoring console.
> 
> You can do this, but it won't close any CLOSE_WAIT connections.
> Tomcat's timeout must be reached. I suspect that the timeout(s) are
> simply way too long.
> 

Just humor me...
If it doesn't help, it doesn't.  But it's easy to do, does not require a configuration 
change nor a shutdown/restart of Tomcat, and it may show us something unexpected.



---------------------------------------------------------------------
To unsubscribe, e-mail: users-unsubscribe@tomcat.apache.org
For additional commands, e-mail: users-help@tomcat.apache.org


Re: File descriptors peaks with latest stable build of Tomcat 7

Posted by Thomas Boniface <th...@stickyads.tv>.
Thanks for your reply, we'll give your suggestions a try.

2015-04-29 23:15 GMT+02:00 Christopher Schultz <chris@christopherschultz.net
>:

> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA256
>
> Thomas,
>
> On 4/25/15 4:25 AM, Thomas Boniface wrote:
> > When talking about the strategy for our next test on the release we
> > checked at the tomcat connector configuration but we are unsure how
> > to applies your advices:
> >
> > 1. Check the nginx configuration. Specifically, the keep-alive and
> > timeout associated with the proxy configuration.
> >
> > 2. Make sure that Tomcat's timeouts are appropriate for those
> > matching settings in nginx.
> >
> > It seems were have 100 connections max keept alive at nginx level
> > ( keepalive), a timeout to connect to tomcat of 2s
> > (proxy_connect_timeout) and a timeout to read from tomcat of 10s
> > (proxy_read_timeout).
> >
> > On tomcat side we have a connector like follows:
> >
> > <Connector port="8080"
> > protocol="org.apache.coyote.http11.Http11NioProtocol"
> > selectorTimeout="1000" maxThreads="200" maxHttpHeaderSize="16384"
> > address="127.0.0.1" redirectPort="8443"/>
>
> It sounds like you need to add this to your <Connector> configuration:
>
>    connectionTimeout="10000"
>
> This matches your value for proxy_read_timeout. You should probably
> also set keepAliveTimeout if you think it needs to be different from
> connectionTimeout (keepAliveTimeout defaults to connectionTimeout).
>
> I'm not sure if Nginx's proxy_read_timeout is the same timeout used to
> terminate a connection to Tomcat if Nginx hasn't tried to send a
> request over that connection for a while, but if so, the
> connectionTimeout/keepAliveTimeout is what you want to set.
>
> I'm not sure that setting selectorTimeout to something other than the
> default helps you at all (1000ms is the default).
>
> The goal is to get both Nginx and Tomcat to close their connections at
> the same time when they decide that the connection is no loner
> necessary. If Nginx times-out more quickly than Tomcat, then re-opens
> a new connection to Tomcat, it will make Tomcat artificially run out
> of connections (and file descriptors) even though Tomcat is largely idle
> .
>
> - -chris
> -----BEGIN PGP SIGNATURE-----
> Version: GnuPG v2
> Comment: GPGTools - http://gpgtools.org
>
> iQIcBAEBCAAGBQJVQUnhAAoJEBzwKT+lPKRYzZwQAIgYxw6OuCgPeks/1S8x7bVP
> MdBdLddY9ruDNCRq9kLzKxEouo/WD5zuQW3kMRyTlX9I36HVRRcE6boaIwFBjiws
> LhoEMy6f5cZQj0FzRfstmyiyOFmZKtvAxwMVa8p1ykqkAhysDTU4fDKxmsKDk1fM
> fakJkqj4nRYP86ekFq/kIb/TNdMbzq+qx32QlevB/z+p0t7frR1DXadRK5KGXGVu
> dOHclY3Z29nzIGe+hdZULkZgpmAUDtk+Y7/bePeWv7ln6IBBoka7hYZGLj1+shdy
> PHrWs0ikTKTB9+kgS7OaipZD8r8x0yvtYYTEjZt3Jcsno0W2kKW600oTFI9YFJ2M
> XDu87+TUvb+E/NYLjJIPQICtDK71b0JpPt8ijQCx+91RFiFRYS8tuWNABcWbtRBb
> C2WlHmNilI/i+kAc7Syvao9gKO594jpao4nlPWhOXJK75QDw5K1szgo/ONgwujtU
> YRtpyZCVVB8UCUk8QIESL8WQT7zlP4MDlEpmeyRzhEGRcelCMoXEq22rZ4HVygAP
> iZg8KbkwUN/Ul7FMcwBbxoWOVE9iTBEj2nHuriAH5oKPnSJbuI2lfxOpxKSVMQaI
> NKV8Zb+yNby11UWWQxxI0QaStZB9IMVnCTLEMXT/M/okwd12xZKuChhh6RFaXKxL
> WIZLFHnxc4C5yWay7OPx
> =tLMj
> -----END PGP SIGNATURE-----
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: users-unsubscribe@tomcat.apache.org
> For additional commands, e-mail: users-help@tomcat.apache.org
>
>

Re: File descriptors peaks with latest stable build of Tomcat 7

Posted by Christopher Schultz <ch...@christopherschultz.net>.
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA256

Thomas,

On 4/25/15 4:25 AM, Thomas Boniface wrote:
> When talking about the strategy for our next test on the release we
> checked at the tomcat connector configuration but we are unsure how
> to applies your advices:
> 
> 1. Check the nginx configuration. Specifically, the keep-alive and 
> timeout associated with the proxy configuration.
> 
> 2. Make sure that Tomcat's timeouts are appropriate for those
> matching settings in nginx.
> 
> It seems were have 100 connections max keept alive at nginx level
> ( keepalive), a timeout to connect to tomcat of 2s
> (proxy_connect_timeout) and a timeout to read from tomcat of 10s
> (proxy_read_timeout).
> 
> On tomcat side we have a connector like follows:
> 
> <Connector port="8080"
> protocol="org.apache.coyote.http11.Http11NioProtocol" 
> selectorTimeout="1000" maxThreads="200" maxHttpHeaderSize="16384" 
> address="127.0.0.1" redirectPort="8443"/>

It sounds like you need to add this to your <Connector> configuration:

   connectionTimeout="10000"

This matches your value for proxy_read_timeout. You should probably
also set keepAliveTimeout if you think it needs to be different from
connectionTimeout (keepAliveTimeout defaults to connectionTimeout).
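
Roughly, a sketch of the resulting connector (the other attributes are kept 
from your current config; keepAliveTimeout is shown explicitly only for 
clarity, since it defaults to connectionTimeout anyway):

   <Connector port="8080" protocol="org.apache.coyote.http11.Http11NioProtocol"
         connectionTimeout="10000"
         keepAliveTimeout="10000"
         maxThreads="200"
         maxHttpHeaderSize="16384"
         address="127.0.0.1"
         redirectPort="8443"/>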

I'm not sure if Nginx's proxy_read_timeout is the same timeout used to
terminate a connection to Tomcat if Nginx hasn't tried to send a
request over that connection for a while, but if so, the
connectionTimeout/keepAliveTimeout is what you want to set.

I'm not sure that setting selectorTimeout to something other than the
default helps you at all (1000ms is the default).

The goal is to get both Nginx and Tomcat to close their connections at
the same time when they decide that the connection is no longer
necessary. If Nginx times out more quickly than Tomcat and then re-opens
a new connection to Tomcat, it will make Tomcat artificially run out
of connections (and file descriptors) even though Tomcat is largely idle.

- -chris
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2
Comment: GPGTools - http://gpgtools.org

iQIcBAEBCAAGBQJVQUnhAAoJEBzwKT+lPKRYzZwQAIgYxw6OuCgPeks/1S8x7bVP
MdBdLddY9ruDNCRq9kLzKxEouo/WD5zuQW3kMRyTlX9I36HVRRcE6boaIwFBjiws
LhoEMy6f5cZQj0FzRfstmyiyOFmZKtvAxwMVa8p1ykqkAhysDTU4fDKxmsKDk1fM
fakJkqj4nRYP86ekFq/kIb/TNdMbzq+qx32QlevB/z+p0t7frR1DXadRK5KGXGVu
dOHclY3Z29nzIGe+hdZULkZgpmAUDtk+Y7/bePeWv7ln6IBBoka7hYZGLj1+shdy
PHrWs0ikTKTB9+kgS7OaipZD8r8x0yvtYYTEjZt3Jcsno0W2kKW600oTFI9YFJ2M
XDu87+TUvb+E/NYLjJIPQICtDK71b0JpPt8ijQCx+91RFiFRYS8tuWNABcWbtRBb
C2WlHmNilI/i+kAc7Syvao9gKO594jpao4nlPWhOXJK75QDw5K1szgo/ONgwujtU
YRtpyZCVVB8UCUk8QIESL8WQT7zlP4MDlEpmeyRzhEGRcelCMoXEq22rZ4HVygAP
iZg8KbkwUN/Ul7FMcwBbxoWOVE9iTBEj2nHuriAH5oKPnSJbuI2lfxOpxKSVMQaI
NKV8Zb+yNby11UWWQxxI0QaStZB9IMVnCTLEMXT/M/okwd12xZKuChhh6RFaXKxL
WIZLFHnxc4C5yWay7OPx
=tLMj
-----END PGP SIGNATURE-----

---------------------------------------------------------------------
To unsubscribe, e-mail: users-unsubscribe@tomcat.apache.org
For additional commands, e-mail: users-help@tomcat.apache.org


Re: File descriptors peaks with latest stable build of Tomcat 7

Posted by Thomas Boniface <th...@stickyads.tv>.
Hi,

When talking about the strategy for our next test on the release, we looked
at the tomcat connector configuration, but we are unsure how to apply your
advice:

1. Check the nginx configuration. Specifically, the keep-alive and
timeout associated with the proxy configuration.

2. Make sure that Tomcat's timeouts are appropriate for those matching
settings in nginx.

It seems we have a maximum of 100 connections kept alive at the nginx level
(keepalive), a timeout of 2s to connect to tomcat (proxy_connect_timeout)
and a timeout of 10s to read from tomcat (proxy_read_timeout).
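
For reference, the nginx side as we understand it would look roughly like 
this (a sketch only; the upstream/location names and exact placement are 
assumptions, and upstream keep-alive only kicks in together with HTTP/1.1 
and a cleared Connection header):

   upstream tomcat {
       server 127.0.0.1:8080;
       keepalive 100;
   }

   location / {
       proxy_pass            http://tomcat;
       proxy_http_version    1.1;
       proxy_set_header      Connection "";
       proxy_connect_timeout 2s;
       proxy_read_timeout    10s;
   }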

On the tomcat side we have a connector configured as follows:

<Connector port="8080" protocol="org.apache.coyote.http11.Http11NioProtocol"
      selectorTimeout="1000"
      maxThreads="200"
      maxHttpHeaderSize="16384"
      address="127.0.0.1"
      redirectPort="8443"/>

Thomas

2015-04-23 14:50 GMT+02:00 Thomas Boniface <th...@stickyads.tv>:

> I just want to keep you updated and tell you that all your replies are
> very helpful. It give me clues on what to look for and sometimes confirm
> some of our suspicion.
>
> I have transmitted some of the element collected in this thread to our
> platform team but we were not able to setup new test so far due to other
> task keeping the team busy. I'll post an update when some new tests will be
> possible.
>
> Thanks again,
> Thomas
>
> 2015-04-22 17:19 GMT+02:00 Frederik Nosi <fr...@postecom.it>:
>
>> On 04/22/2015 05:15 PM, Christopher Schultz wrote:
>>
>>> -----BEGIN PGP SIGNED MESSAGE-----
>>> Hash: SHA256
>>>
>>> Frederik,
>>>
>>> On 4/22/15 10:53 AM, Frederik Nosi wrote:
>>>
>>>> Hi, On 04/22/2015 04:35 PM, Christopher Schultz wrote: Neill,
>>>>
>>>> On 4/22/15 9:12 AM, Neill Lima wrote:
>>>>
>>>>> If I am not wrong, if the application in question is
>>>>>>> monitored in VisualVM through JMX
>>>>>>> (https://visualvm.java.net/) you could trigger a Force GC
>>>>>>> from its monitoring console.
>>>>>>>
>>>>>> You can do this, but it won't close any CLOSE_WAIT connections.
>>>> Tomcat's timeout must be reached. I suspect that the timeout(s)
>>>> are simply way too long.
>>>>
>>>>> You can tune the network stack's timeout using sysctl, eg:
>>>>> net.ipv4.tcp_tw_reuse = 1 net.ipv4.tcp_tw_recycle = 1
>>>>> net.ipv4.tcp_fin_timeout = 3
>>>>>
>>>> This won't do anything, either. As far as the OS is concerned, the
>>> application (Tomcat) is still using that connection. Therefore it
>>> can't be cleaned up.
>>>
>>
>> Indeed you are right, tuning the network stack help with TIME_WAIT, not
>> CLOSE_WAIT, my bad.
>>
>>
>>> Tomcat has to actively hang up the connection, and the best way to do
>>> that is with synchronized timeouts between the reverse proxy and Tomcat.
>>>
>>> You can try all other kinds of tricks, but the fact of the matter is
>>> that the application is still trying to use the socket, so no other
>>> component can step-in and kill it.
>>>
>>
>> Probably the application is slow then, or the server overloaded.
>>
>>
>>
>>> - -chris
>>> -----BEGIN PGP SIGNATURE-----
>>> Version: GnuPG v2
>>> Comment: GPGTools - http://gpgtools.org
>>>
>>> iQIcBAEBCAAGBQJVN7sZAAoJEBzwKT+lPKRYRhkP/j0GBtPH/+/QU2YEgZxbRoJE
>>> z2lmWxDrbFNxiYFS5332SvN4bXhG/Khog83CeBM0bg0VLciSxKYqm5J8YziMlrlo
>>> omqk3gUiNeViyjsjO5SBW9hxT1qhC1PLdtx7uZ7xUiNmmE24wQ3Gi2edyjyvYDJ0
>>> pzLT+bEp8BjXgm0c6aOONO0PJ+PbyZPeF56PXq6iqn426IhebEUlDP8kxuSh3RwL
>>> LQW7tg05bg3yTuP1ZjiwH4gmBfbomJ+xpY6F+zwDkZgk7Cs4okp5/Tr0uTNhsHQM
>>> lgGaIZc9SCoqKaMFqWila3RaAnnpqDe1cdg2N44zluIaMkcO94kDSWBuT25t5dGe
>>> GBiFG2HGczwyo5MCrx0RgYgLtb2bQ0QZQ8nHzNis8wkNQdHWzziWsvsVQOCnCqL/
>>> 3FOkWUbbJTdmnB8lx84sRcuMsDYQ0BYOYW4W/F2WmSxzBnm7V4NixHG9dD4lZ3vJ
>>> fhIO/d0VNOpI+wesZyQg+pwWRHInbigZ0+5A3InOLHW84rWa2qX0wvt6a7rBb0YP
>>> gonBY4xbrPTHoXDHH7ZCs3JW+gwstA5avA/Obp45C5LessbduqRPtBvMUZizyZR5
>>> ByPtJcrCvHlFux1fwc7Idj/9seqaYvllyvO6evvhqgYVU3jV2tekOUNuFGDJ8KRt
>>> HmrzuiH3cmU1JpT6FSen
>>> =XyQw
>>> -----END PGP SIGNATURE-----
>>>
>>> ---------------------------------------------------------------------
>>> To unsubscribe, e-mail: users-unsubscribe@tomcat.apache.org
>>> For additional commands, e-mail: users-help@tomcat.apache.org
>>>
>>>
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: users-unsubscribe@tomcat.apache.org
>> For additional commands, e-mail: users-help@tomcat.apache.org
>>
>>
>

Re: File descriptors peaks with latest stable build of Tomcat 7

Posted by Thomas Boniface <th...@stickyads.tv>.
I just want to keep you updated and tell you that all your replies are very
helpful. They give me clues on what to look for and sometimes confirm some of
our suspicions.

I have transmitted some of the elements collected in this thread to our
platform team, but we have not been able to set up new tests so far due to
other tasks keeping the team busy. I'll post an update when new tests become
possible.

Thanks again,
Thomas

2015-04-22 17:19 GMT+02:00 Frederik Nosi <fr...@postecom.it>:

> On 04/22/2015 05:15 PM, Christopher Schultz wrote:
>
>> -----BEGIN PGP SIGNED MESSAGE-----
>> Hash: SHA256
>>
>> Frederik,
>>
>> On 4/22/15 10:53 AM, Frederik Nosi wrote:
>>
>>> Hi, On 04/22/2015 04:35 PM, Christopher Schultz wrote: Neill,
>>>
>>> On 4/22/15 9:12 AM, Neill Lima wrote:
>>>
>>>> If I am not wrong, if the application in question is
>>>>>> monitored in VisualVM through JMX
>>>>>> (https://visualvm.java.net/) you could trigger a Force GC
>>>>>> from its monitoring console.
>>>>>>
>>>>> You can do this, but it won't close any CLOSE_WAIT connections.
>>> Tomcat's timeout must be reached. I suspect that the timeout(s)
>>> are simply way too long.
>>>
>>>> You can tune the network stack's timeout using sysctl, eg:
>>>> net.ipv4.tcp_tw_reuse = 1 net.ipv4.tcp_tw_recycle = 1
>>>> net.ipv4.tcp_fin_timeout = 3
>>>>
>>> This won't do anything, either. As far as the OS is concerned, the
>> application (Tomcat) is still using that connection. Therefore it
>> can't be cleaned up.
>>
>
> Indeed you are right, tuning the network stack help with TIME_WAIT, not
> CLOSE_WAIT, my bad.
>
>
>> Tomcat has to actively hang up the connection, and the best way to do
>> that is with synchronized timeouts between the reverse proxy and Tomcat.
>>
>> You can try all other kinds of tricks, but the fact of the matter is
>> that the application is still trying to use the socket, so no other
>> component can step-in and kill it.
>>
>
> Probably the application is slow then, or the server overloaded.
>
>
>
>> - -chris
>> -----BEGIN PGP SIGNATURE-----
>> Version: GnuPG v2
>> Comment: GPGTools - http://gpgtools.org
>>
>> iQIcBAEBCAAGBQJVN7sZAAoJEBzwKT+lPKRYRhkP/j0GBtPH/+/QU2YEgZxbRoJE
>> z2lmWxDrbFNxiYFS5332SvN4bXhG/Khog83CeBM0bg0VLciSxKYqm5J8YziMlrlo
>> omqk3gUiNeViyjsjO5SBW9hxT1qhC1PLdtx7uZ7xUiNmmE24wQ3Gi2edyjyvYDJ0
>> pzLT+bEp8BjXgm0c6aOONO0PJ+PbyZPeF56PXq6iqn426IhebEUlDP8kxuSh3RwL
>> LQW7tg05bg3yTuP1ZjiwH4gmBfbomJ+xpY6F+zwDkZgk7Cs4okp5/Tr0uTNhsHQM
>> lgGaIZc9SCoqKaMFqWila3RaAnnpqDe1cdg2N44zluIaMkcO94kDSWBuT25t5dGe
>> GBiFG2HGczwyo5MCrx0RgYgLtb2bQ0QZQ8nHzNis8wkNQdHWzziWsvsVQOCnCqL/
>> 3FOkWUbbJTdmnB8lx84sRcuMsDYQ0BYOYW4W/F2WmSxzBnm7V4NixHG9dD4lZ3vJ
>> fhIO/d0VNOpI+wesZyQg+pwWRHInbigZ0+5A3InOLHW84rWa2qX0wvt6a7rBb0YP
>> gonBY4xbrPTHoXDHH7ZCs3JW+gwstA5avA/Obp45C5LessbduqRPtBvMUZizyZR5
>> ByPtJcrCvHlFux1fwc7Idj/9seqaYvllyvO6evvhqgYVU3jV2tekOUNuFGDJ8KRt
>> HmrzuiH3cmU1JpT6FSen
>> =XyQw
>> -----END PGP SIGNATURE-----
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: users-unsubscribe@tomcat.apache.org
>> For additional commands, e-mail: users-help@tomcat.apache.org
>>
>>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: users-unsubscribe@tomcat.apache.org
> For additional commands, e-mail: users-help@tomcat.apache.org
>
>

Re: File descriptors peaks with latest stable build of Tomcat 7

Posted by Frederik Nosi <fr...@postecom.it>.
On 04/22/2015 05:15 PM, Christopher Schultz wrote:
> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA256
>
> Frederik,
>
> On 4/22/15 10:53 AM, Frederik Nosi wrote:
>> Hi, On 04/22/2015 04:35 PM, Christopher Schultz wrote: Neill,
>>
>> On 4/22/15 9:12 AM, Neill Lima wrote:
>>>>> If I am not wrong, if the application in question is
>>>>> monitored in VisualVM through JMX
>>>>> (https://visualvm.java.net/) you could trigger a Force GC
>>>>> from its monitoring console.
>> You can do this, but it won't close any CLOSE_WAIT connections.
>> Tomcat's timeout must be reached. I suspect that the timeout(s)
>> are simply way too long.
>>> You can tune the network stack's timeout using sysctl, eg:
>>> net.ipv4.tcp_tw_reuse = 1 net.ipv4.tcp_tw_recycle = 1
>>> net.ipv4.tcp_fin_timeout = 3
> This won't do anything, either. As far as the OS is concerned, the
> application (Tomcat) is still using that connection. Therefore it
> can't be cleaned up.

Indeed you are right, tuning the network stack helps with TIME_WAIT, not 
CLOSE_WAIT, my bad.

>
> Tomcat has to actively hang up the connection, and the best way to do
> that is with synchronized timeouts between the reverse proxy and Tomcat.
>
> You can try all other kinds of tricks, but the fact of the matter is
> that the application is still trying to use the socket, so no other
> component can step-in and kill it.

Probably the application is slow then, or the server overloaded.

>
> - -chris
> -----BEGIN PGP SIGNATURE-----
> Version: GnuPG v2
> Comment: GPGTools - http://gpgtools.org
>
> iQIcBAEBCAAGBQJVN7sZAAoJEBzwKT+lPKRYRhkP/j0GBtPH/+/QU2YEgZxbRoJE
> z2lmWxDrbFNxiYFS5332SvN4bXhG/Khog83CeBM0bg0VLciSxKYqm5J8YziMlrlo
> omqk3gUiNeViyjsjO5SBW9hxT1qhC1PLdtx7uZ7xUiNmmE24wQ3Gi2edyjyvYDJ0
> pzLT+bEp8BjXgm0c6aOONO0PJ+PbyZPeF56PXq6iqn426IhebEUlDP8kxuSh3RwL
> LQW7tg05bg3yTuP1ZjiwH4gmBfbomJ+xpY6F+zwDkZgk7Cs4okp5/Tr0uTNhsHQM
> lgGaIZc9SCoqKaMFqWila3RaAnnpqDe1cdg2N44zluIaMkcO94kDSWBuT25t5dGe
> GBiFG2HGczwyo5MCrx0RgYgLtb2bQ0QZQ8nHzNis8wkNQdHWzziWsvsVQOCnCqL/
> 3FOkWUbbJTdmnB8lx84sRcuMsDYQ0BYOYW4W/F2WmSxzBnm7V4NixHG9dD4lZ3vJ
> fhIO/d0VNOpI+wesZyQg+pwWRHInbigZ0+5A3InOLHW84rWa2qX0wvt6a7rBb0YP
> gonBY4xbrPTHoXDHH7ZCs3JW+gwstA5avA/Obp45C5LessbduqRPtBvMUZizyZR5
> ByPtJcrCvHlFux1fwc7Idj/9seqaYvllyvO6evvhqgYVU3jV2tekOUNuFGDJ8KRt
> HmrzuiH3cmU1JpT6FSen
> =XyQw
> -----END PGP SIGNATURE-----
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: users-unsubscribe@tomcat.apache.org
> For additional commands, e-mail: users-help@tomcat.apache.org
>


---------------------------------------------------------------------
To unsubscribe, e-mail: users-unsubscribe@tomcat.apache.org
For additional commands, e-mail: users-help@tomcat.apache.org


Re: File descriptors peaks with latest stable build of Tomcat 7

Posted by Christopher Schultz <ch...@christopherschultz.net>.
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA256

Frederik,

On 4/22/15 10:53 AM, Frederik Nosi wrote:
> Hi, On 04/22/2015 04:35 PM, Christopher Schultz wrote: Neill,
> 
> On 4/22/15 9:12 AM, Neill Lima wrote:
>>>> If I am not wrong, if the application in question is
>>>> monitored in VisualVM through JMX
>>>> (https://visualvm.java.net/) you could trigger a Force GC
>>>> from its monitoring console.
> You can do this, but it won't close any CLOSE_WAIT connections. 
> Tomcat's timeout must be reached. I suspect that the timeout(s)
> are simply way too long.
>> You can tune the network stack's timeout using sysctl, eg:
> 
>> net.ipv4.tcp_tw_reuse = 1 net.ipv4.tcp_tw_recycle = 1 
>> net.ipv4.tcp_fin_timeout = 3

This won't do anything, either. As far as the OS is concerned, the
application (Tomcat) is still using that connection. Therefore it
can't be cleaned up.

Tomcat has to actively hang up the connection, and the best way to do
that is with synchronized timeouts between the reverse proxy and Tomcat.

You can try all other kinds of tricks, but the fact of the matter is
that the application is still trying to use the socket, so no other
component can step in and kill it.

- -chris


Re: File descriptors peaks with latest stable build of Tomcat 7

Posted by Frederik Nosi <fr...@postecom.it>.
Hi,
On 04/22/2015 04:35 PM, Christopher Schultz wrote:
>
> Neill,
>
> On 4/22/15 9:12 AM, Neill Lima wrote:
>> If I am not wrong, if the application in question is monitored in
>> VisualVM through JMX (https://visualvm.java.net/) you could trigger
>> a Force GC from its monitoring console.
> You can do this, but it won't close any CLOSE_WAIT connections.
> Tomcat's timeout must be reached. I suspect that the timeout(s) are
> simply way too long.
You can tune the network stack's timeout using sysctl, eg:

net.ipv4.tcp_tw_reuse = 1
net.ipv4.tcp_tw_recycle = 1
net.ipv4.tcp_fin_timeout = 3



> -chris

Cheers,
Frederik



Re: File descriptors peaks with latest stable build of Tomcat 7

Posted by Christopher Schultz <ch...@christopherschultz.net>.

Neill,

On 4/22/15 9:12 AM, Neill Lima wrote:
> If I am not wrong, if the application in question is monitored in
> VisualVM through JMX (https://visualvm.java.net/) you could trigger
> a Force GC from its monitoring console.

You can do this, but it won't close any CLOSE_WAIT connections.
Tomcat's timeout must be reached. I suspect that the timeout(s) are
simply way too long.

-chris


Re: File descriptors peaks with latest stable build of Tomcat 7

Posted by Neill Lima <ne...@visual-meta.com>.
Hi Andre,

If I am not wrong, if the application in question is monitored in VisualVM
through JMX (https://visualvm.java.net/) you could trigger a Force GC from
its monitoring console.

In order to do that, these startup params might be necessary in the Java
app side :

-Dcom.sun.management.jmxremote
-Dcom.sun.management.jmxremote.port=9010
-Dcom.sun.management.jmxremote.local.only=false
-Dcom.sun.management.jmxremote.authenticate=false
-Dcom.sun.management.jmxremote.ssl=false
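
If attaching VisualVM is not convenient, the same thing can also be done from
a shell on the box (this assumes a JDK 7+ jcmd on the PATH and a single
JVM owned by the tomcat7 user):

jcmd $(pgrep -u tomcat7) GC.run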

Thanks,

Neill

On Wed, Apr 22, 2015 at 3:02 PM, André Warnier <aw...@ice-sa.com> wrote:

> Rainer Jung wrote:
>
>> Am 22.04.2015 um 11:58 schrieb Thomas Boniface:
>>
>>> What concerns me the most is the CLOSE_WAIT on tomcat side because when
>>> an
>>> fd peak appears the web application appears to be stuck. It feels like
>>> all
>>> its connections are consumed and none can be established from nginx
>>> anymore. Shouldn't the CLOSE_WAIT connections be recycled to receive new
>>> connections from nginx?
>>>
>>
>> Just to clarify:
>>
>> Every connection has two ends. In netstat the "local" end is left, the
>> "remote" end is right. If a connection is between processes both on the
>> same system, it will be shown in netstat twice. Once for each endpoint
>> being the "local" side.
>>
>> CLOSE_WAIT for a connection between a (local) and b (remote) means, that
>> b has closed the connection but not a. There is no automatism for a closing
>> it because b has closed it. If CLOSE_WAIT pile up, then the idea of b and a
>> when a connection should no longer be used are disparate. E.g. they might
>> have very different idle timeouts (Keep Alive Timeout on HTTP speak), or
>> one observed a problem that the other didn't observe.
>>
>> When I did the counting for
>>
>>   Count           IP:Port ConnectionState
>>    8381    127.0.0.1:8080 CLOSE_WAIT
>>
>> the "127.0.0.1:8080" was left in netstat output, so "local". It means
>> the other side (whatever is the other side of the connection, likely nginx)
>> has closed the connection already, but not Tomcat.
>>
>> And the total number of those connections:
>>
>>   Count           IP:Port ConnectionState
>>    8381    127.0.0.1:8080 CLOSE_WAIT
>>    1650    127.0.0.1:8080 ESTABLISHED
>>
>> indeed sums up to the default maxConnections 10000 mentioned by Chris.
>>
>> What I do not understand is, that the same connections looked at from
>> nginx being the local end, show a totally different statistics:
>>
>>   Count           IP:Port ConnectionState
>>   20119    127.0.0.1:8080 SYN_SENT
>>    4692    127.0.0.1:8080 ESTABLISHED
>>     488    127.0.0.1:8080 FIN_WAIT2
>>     122    127.0.0.1:8080 TIME_WAIT
>>      13    127.0.0.1:8080 FIN_WAIT1
>>
>> But maybe that's a problem to solve after you fixed the CLOSE_WAIT (or
>> the 10000 limit) and redo the whole observation.
>>
>> Pretty big numbers you have ...
>>
>>
> Thomas,
> to elaborate on what Rainer is writing above :
>
> A TCP connection consists of 2 "pipes", one in each direction (client to
> server, server to client).
> From a TCP point of view, the "client" is the one which initially requests
> the connection.  The "server" is the one which "accepts" that connection.
> (This is different from the more general idea of "server", as in "Tomcat
> server".  When Tomcat accepts a HTTP connection, it acts as "server"; when
> a Tomcat webapp establishes a connection with an external HTTP server, the
> webapp (and by extension Tomcat) is the "client").
>
> These 2 pipes can be closed independently of one another, but both need to
> be closed for the connection to be considered as closed and able to
> "disappear".
> When the client wants to close the connection, it will send a "close
> request" packet on the client-to-server pipe.
> The server receives this, and knows then that the client will not send
> anything anymore onto that pipe.  For a server application reading that
> pipe, this would result in the equivalent of an "end of file" on that
> datastream.
> In response to the client close request, the server is supposed to react
> by not sending any more data onto the server-to-client pipe, and in turn to
> send a "close request" onto that pipe.
> Once these various close messages have been received and acknowledged by
> both sides of the connection, the connection is considered as closed, and
> the resources associated with it can be reclaimed/recycled/garbage
> collected etc.. ("closed" is like a virtual state; it means that there is
> no connection).
>
> But if one side fails to fulfill its part of that contract, the connection
> is still there, and it just remains there forever until something forceful
> terminates it.  And all the resources tied to that connection also remain
> tied to it, and are subtracted from the overall resources which the server
> has available to perform other tasks.
> From a server point of view, the "ideal" situation is when all connections
> are actually "active" and really being used to do something useful (sending
> or receiving data e.g.).
> The worst situation is when there are many "useless" connections :
> connections in some state or the other, not actually doing anything useful,
> but tying up resources nevertheless.  This can get to the point where some
> inherent limit is reached, and the server cannot accept any more
> connections, although in theory it still has enough other resources
> available which would allow it to process more useful transactions.
>
> Most of the "TCP states" that you see in the netstat output are transient,
> and last only a few milliseconds usually.  They are just part of the
> overall "TCP connection lifecycle" which is cast in stone and which you can
> do nothing about.
> But, for example, if there is a permanent very high number of connections
> in the CLOSE_WAIT state, that is not "normal".
>
> See here for an explanation of these TCP states, in particular CLOSE_WAIT :
>
> http://www.tcpipguide.com/free/t_TCPOperationalOverviewandtheTCPFiniteStateMachineF-2.htm
>
> According to Rainer's counts above, you have 1650 connections in the
> ESTABLISHED state (and for the time being, let's suppose that these are
> actually busy doing something useful).
> But you also have 8381 connections in the CLOSE_WAIT state.  These are not
> doing anything useful, but they are blocking resources on your server.  One
> essential resource which they are blocking, is that there is (currently) a
> maximum *total* of 10,000 connections which can be in existence at any one
> time, and these CLOSE_WAIT connections are occupying (uselessly) 8381 of
> these "slots" (84%).
>
> The precise reason why there are this many connections in that state is
> not clear to us, but my money is on either some misconfiguration of the
> nginx-tomcat connections, or some flaw in the application.
>
> One thing which you could try, and which might provide a clue, is to, in
> quick succession, do :
> 1) a "netstat" command to see how many connections are in CLOSE_WAIT state
> 2) /force/ a GC for Tomcat (*).
> 3) the same netstat command again, to check how many CLOSE_WAIT
> connections there are now
>
> (*) someone else here should be able to contribute the easiest way to
> achieve this

Re: File descriptors peaks with latest stable build of Tomcat 7

Posted by André Warnier <aw...@ice-sa.com>.
Rainer Jung wrote:
> Am 22.04.2015 um 11:58 schrieb Thomas Boniface:
>> What concerns me the most is the CLOSE_WAIT on tomcat side because 
>> when an
>> fd peak appears the web application appears to be stuck. It feels like 
>> all
>> its connections are consumed and none can be established from nginx
>> anymore. Shouldn't the CLOSE_WAIT connections be recycled to receive new
>> connections from nginx?
> 
> Just to clarify:
> 
> Every connection has two ends. In netstat the "local" end is left, the 
> "remote" end is right. If a connection is between processes both on the 
> same system, it will be shown in netstat twice. Once for each endpoint 
> being the "local" side.
> 
> CLOSE_WAIT for a connection between a (local) and b (remote) means, that 
> b has closed the connection but not a. There is no automatism for a 
> closing it because b has closed it. If CLOSE_WAIT pile up, then the idea 
> of b and a when a connection should no longer be used are disparate. 
> E.g. they might have very different idle timeouts (Keep Alive Timeout on 
> HTTP speak), or one observed a problem that the other didn't observe.
> 
> When I did the counting for
> 
>   Count           IP:Port ConnectionState
>    8381    127.0.0.1:8080 CLOSE_WAIT
> 
> the "127.0.0.1:8080" was left in netstat output, so "local". It means 
> the other side (whatever is the other side of the connection, likely 
> nginx) has closed the connection already, but not Tomcat.
> 
> And the total number of those connections:
> 
>   Count           IP:Port ConnectionState
>    8381    127.0.0.1:8080 CLOSE_WAIT
>    1650    127.0.0.1:8080 ESTABLISHED
> 
> indeed sums up to the default maxConnections 10000 mentioned by Chris.
> 
> What I do not understand is, that the same connections looked at from 
> nginx being the local end, show a totally different statistics:
> 
>   Count           IP:Port ConnectionState
>   20119    127.0.0.1:8080 SYN_SENT
>    4692    127.0.0.1:8080 ESTABLISHED
>     488    127.0.0.1:8080 FIN_WAIT2
>     122    127.0.0.1:8080 TIME_WAIT
>      13    127.0.0.1:8080 FIN_WAIT1
> 
> But maybe that's a problem to solve after you fixed the CLOSE_WAIT (or 
> the 10000 limit) and redo the whole observation.
> 
> Pretty big numbers you have ...
> 

Thomas,
to elaborate on what Rainer is writing above :

A TCP connection consists of 2 "pipes", one in each direction (client to server, server to 
client).
From a TCP point of view, the "client" is the one which initially requests the 
connection.  The "server" is the one which "accepts" that connection. (This is different 
from the more general idea of "server", as in "Tomcat server".  When Tomcat accepts a HTTP 
connection, it acts as "server"; when a Tomcat webapp establishes a connection with an 
external HTTP server, the webapp (and by extension Tomcat) is the "client").

These 2 pipes can be closed independently of one another, but both need to be closed for 
the connection to be considered as closed and able to "disappear".
When the client wants to close the connection, it will send a "close request" packet on 
the client-to-server pipe.
The server receives this, and knows then that the client will not send anything anymore 
onto that pipe.  For a server application reading that pipe, this would result in the 
equivalent of an "end of file" on that datastream.
In response to the client close request, the server is supposed to react by not sending 
any more data onto the server-to-client pipe, and in turn to send a "close request" onto 
that pipe.
Once these various close messages have been received and acknowledged by both sides of the 
connection, the connection is considered as closed, and the resources associated with it 
can be reclaimed/recycled/garbage collected etc.. ("closed" is like a virtual state; it 
means that there is no connection).

But if one side fails to fulfill its part of that contract, the connection is still there, 
and it just remains there forever until something forceful terminates it.  And all the 
resources tied to that connection also remain tied to it, and are subtracted from the 
overall resources which the server has available to perform other tasks.
From a server point of view, the "ideal" situation is when all connections are actually 
"active" and really being used to do something useful (sending or receiving data e.g.).
The worst situation is when there are many "useless" connections : connections in some 
state or the other, not actually doing anything useful, but tying up resources 
nevertheless.  This can get to the point where some inherent limit is reached, and the 
server cannot accept any more connections, although in theory it still has enough other 
resources available which would allow it to process more useful transactions.

Most of the "TCP states" that you see in the netstat output are transient, and last only a 
few milliseconds usually.  They are just part of the overall "TCP connection lifecycle" 
which is cast in stone and which you can do nothing about.
But, for example, if there is a permanent very high number of connections in the 
CLOSE_WAIT state, that is not "normal".

See here for an explanation of these TCP states, in particular CLOSE_WAIT :
http://www.tcpipguide.com/free/t_TCPOperationalOverviewandtheTCPFiniteStateMachineF-2.htm

According to Rainer's counts above, you have 1650 connections in the ESTABLISHED state 
(and for the time being, let's suppose that these are actually busy doing something useful).
But you also have 8381 connections in the CLOSE_WAIT state.  These are not doing anything 
useful, but they are blocking resources on your server.  One essential resource which they 
are blocking, is that there is (currently) a maximum *total* of 10,000 connections which 
can be in existence at any one time, and these CLOSE_WAIT connections are occupying 
(uselessly) 8381 of these "slots" (84%).

The precise reason why there are this many connections in that state is not clear to us, 
but my money is on either some misconfiguration of the nginx-tomcat connections, or some 
flaw in the application.

One thing which you could try, and which might provide a clue, is to, in quick succession, 
do :
1) a "netstat" command to see how many connections are in CLOSE_WAIT state
2) /force/ a GC for Tomcat (*).
3) the same netstat command again, to check how many CLOSE_WAIT connections there are now

(*) someone else here should be able to contribute the easiest way to achieve this
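
Roughly, and assuming a Linux box where the tomcat7 user runs a single JVM
and a JDK that ships jcmd, the whole sequence could look like this:

netstat -ant | grep -c CLOSE_WAIT
jcmd $(pgrep -u tomcat7) GC.run
netstat -ant | grep -c CLOSE_WAIT

If the second count drops sharply, the sockets were only being kept alive by
garbage (i.e. connections the webapp dropped without closing); if it stays
the same, something still holds live references to them.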








Re: File descriptors peaks with latest stable build of Tomcat 7

Posted by Rainer Jung <ra...@kippdata.de>.
Am 22.04.2015 um 11:58 schrieb Thomas Boniface:
> What concerns me the most is the CLOSE_WAIT on tomcat side because when an
> fd peak appears the web application appears to be stuck. It feels like all
> its connections are consumed and none can be established from nginx
> anymore. Shouldn't the CLOSE_WAIT connections be recycled to receive new
> connections from nginx?

Just to clarify:

Every connection has two ends. In netstat the "local" end is left, the 
"remote" end is right. If a connection is between processes both on the 
same system, it will be shown in netstat twice. Once for each endpoint 
being the "local" side.

CLOSE_WAIT for a connection between a (local) and b (remote) means that 
b has closed the connection but a has not. There is no automatism for a 
closing it just because b has closed it. If CLOSE_WAIT connections pile up, 
then a's and b's ideas of when a connection should no longer be used are 
disparate. E.g. they might have very different idle timeouts (keep-alive 
timeout in HTTP speak), or one observed a problem that the other didn't.

When I did the counting for

   Count           IP:Port ConnectionState
    8381    127.0.0.1:8080 CLOSE_WAIT

the "127.0.0.1:8080" was on the left in the netstat output, so "local". It means 
the other side (whatever is on the other side of the connection, likely 
nginx) has closed the connection already, but not Tomcat.

And the total number of those connections:

   Count           IP:Port ConnectionState
    8381    127.0.0.1:8080 CLOSE_WAIT
    1650    127.0.0.1:8080 ESTABLISHED

indeed sums up to the default maxConnections 10000 mentioned by Chris.

What I do not understand is that the same connections, looked at with 
nginx being the local end, show totally different statistics:

   Count           IP:Port ConnectionState
   20119    127.0.0.1:8080 SYN_SENT
    4692    127.0.0.1:8080 ESTABLISHED
     488    127.0.0.1:8080 FIN_WAIT2
     122    127.0.0.1:8080 TIME_WAIT
      13    127.0.0.1:8080 FIN_WAIT1

But maybe that's a problem to solve after you fixed the CLOSE_WAIT (or 
the 10000 limit) and redo the whole observation.

Pretty big numbers you have ...
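
(For reference, a per-state count like the ones above can be produced with
something along these lines -- the netstat column layout is assumed to be
the usual Linux one, local address in column 4, remote address in column 5,
state in column 6:

netstat -ant | awk '$4 == "127.0.0.1:8080" {print $4, $6}' | sort | uniq -c | sort -rn
netstat -ant | awk '$5 == "127.0.0.1:8080" {print $5, $6}' | sort | uniq -c | sort -rn

The first line counts states for connections where Tomcat's port is the
local end, the second for connections where it is the remote end.)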

Regards,

Rainer



Re: File descriptors peaks with latest stable build of Tomcat 7

Posted by Thomas Boniface <th...@stickyads.tv>.
What concerns me the most is the CLOSE_WAIT on the Tomcat side, because when an
fd peak appears the web application appears to be stuck. It feels like all
its connections are consumed and none can be established from nginx
anymore. Shouldn't the CLOSE_WAIT connections be recycled to receive new
connections from nginx?

I am less concerned by the webapp-to-external-servers connections in
CLOSE_WAIT state; these connections are handled using httpAsyncClient with
a pool that has a defined size and an eviction strategy (if too many
connections are open, new connection requests will only wait for, say, 100
ms before failing).

We will look into the configuration you advise checking on the nginx and
Tomcat side to see how it is set up.

Thanks
Thomas

2015-04-22 11:38 GMT+02:00 André Warnier <aw...@ice-sa.com>:

> Rainer Jung wrote:
>
>> Am 22.04.2015 um 00:08 schrieb André Warnier:
>> ...
>>
>>> The OP has a complex setup, where we are not even sure that the various
>>> connections in various states are even related directly to Tomcat or not.
>>> Graphically, we have this :
>>>
>>> client <-- TCP --> nginx <-- TCP --> Tomcat <--> webapp <-- TCP -->
>>> external servers
>>>
>>> The output of netstat shows all the connections and their state, at the
>>> OS level.  Even assuming that nginx runs on a separate host, that still
>>> leaves the possibility that most of the connections in CLOSE_WAIT state
>>> for example, would be connections between the webapps and external
>>> servers, having not much to do with Tomcat per se.
>>> But of course they use fd's and resources, just like the others. And for
>>> "lsof", they would appear as "belonging" to the Tomcat process.
>>>
>>
>> See my response from 1.5 days ago which contains the individual
>> statistics for each of the above three "TCP" parts.
>>
>>
> Yes, sorry Rainer, I did not read that as carefully as I should have.
>
> And I do agree that the two main things which the OP should have a good
> look at, are
> - the nginx settings for "keep-alive" (client-nginx side)
> - the various webapp-to-external servers connections in CLOSE_WAIT state
> Collectively, these things must be using a lot of resources on the server,
> and probably slow it down significantly.
>
>
>

Re: File descriptors peaks with latest stable build of Tomcat 7

Posted by Rainer Jung <ra...@kippdata.de>.
Am 22.04.2015 um 11:38 schrieb André Warnier:
> Rainer Jung wrote:
>> See my response from 1.5 days ago which contains the individual
>> statistics for each of the above three "TCP" parts.
>>
>
> Yes, sorry Rainer, I did not read that as carefully as I should have.

No worries at all. Lots of stuff going back and forth.

Rainer




Re: File descriptors peaks with latest stable build of Tomcat 7

Posted by André Warnier <aw...@ice-sa.com>.
Rainer Jung wrote:
> Am 22.04.2015 um 00:08 schrieb André Warnier:
> ...
>> The OP has a complex setup, where we are not even sure that the various
>> connections in various states are even related directly to Tomcat or not.
>> Graphically, we have this :
>>
>> client <-- TCP --> nginx <-- TCP --> Tomcat <--> webapp <-- TCP -->
>> external servers
>>
>> The output of netstat shows all the connections and their state, at the
>> OS level.  Even assuming that nginx runs on a separate host, that still
>> leaves the possibility that most of the connections in CLOSE_WAIT state
>> for example, would be connections between the webapps and external
>> servers, having not much to do with Tomcat per se.
>> But of course they use fd's and resources, just like the others. And for
>> "lsof", they would appear as "belonging" to the Tomcat process.
> 
> See my response from 1.5 days ago which contains the individual 
> statistics for each of the above three "TCP" parts.
> 

Yes, sorry Rainer, I did not read that as carefully as I should have.

And I do agree that the two main things which the OP should have a good look at, are
- the nginx settings for "keep-alive" (client-nginx side)
- the various webapp-to-external servers connections in CLOSE_WAIT state
Collectively, these things must be using a lot of resources on the server, and probably 
slow it down significantly.




Re: File descriptors peaks with latest stable build of Tomcat 7

Posted by Rainer Jung <ra...@kippdata.de>.
Am 22.04.2015 um 00:08 schrieb André Warnier:
...
> The OP has a complex setup, where we are not even sure that the various
> connections in various states are even related directly to Tomcat or not.
> Graphically, we have this :
>
> client <-- TCP --> nginx <-- TCP --> Tomcat <--> webapp <-- TCP -->
> external servers
>
> The output of netstat shows all the connections and their state, at the
> OS level.  Even assuming that nginx runs on a separate host, that still
> leaves the possibility that most of the connections in CLOSE_WAIT state
> for example, would be connections between the webapps and external
> servers, having not much to do with Tomcat per se.
> But of course they use fd's and resources, just like the others. And for
> "lsof", they would appear as "belonging" to the Tomcat process.

See my response from 1.5 days ago which contains the individual 
statistics for each of the above three "TCP" parts.

Regards,

Rainer



Re: File descriptors peaks with latest stable build of Tomcat 7

Posted by André Warnier <aw...@ice-sa.com>.
Christopher Schultz wrote:
> 
> André,
> 
> On 4/21/15 10:56 AM, André Warnier wrote:
>> Thomas Boniface wrote:
>>> The file descriptor peaks show up in our monitoring application.
>>> We have some charts showing the number of file descriptors owned
>>> by the tomcat process (ls /proc/$(pgrep -u tomcat7)/fd/ | wc
>>> -l).
>>>
>>> The catalina.out log shows errors, the most frequent being a
>>> java.io.IOException: Broken pipe.
>>>
>> [..]
>>
>> A "broken pipe", from the server perspective while sending a
>> response to the client, is a rather usual thing.  It usually means
>> that the (human) client got tired of waiting for a response, and
>> clicked somewhere else in the browser (maybe a "cancel" button;
>> maybe he closed the window; etc..).
> 
> In this case, though, the client is nginx and not a human at a browser.
> 
> If the browser severs the connection to nginx, I'm not sure what nginx
> does with the connection to Tomcat. 

Nginx has no way to know that the client dropped the connection (the client-receiving part 
of it), until Nginx tries to send some data (presumably coming from Tomcat) to the client 
browser and finds no listener anymore.  When that is the case, presumably Nginx closes its 
own receiving part connected to Tomcat, which propagates the error to Tomcat.
(Buffering of all kinds neglected here).

> I would expect that it either
> cleans it up nicely (e.g. drains the bytes from the connection, then
> closes), or just drops the connection to the back-end Tomcat (which
> might be more efficient if Tomcat is expected to send relatively large
> responses).
> 
> I don't know how nginx works when acting as a proxy. Does it use HTTP
> keep-alive and process many requests through a single connection
> (possibly not all from the same end user), or does it make and close
> many connections?
> 

I don't know how Nginx works precisely, but it must have all kinds of settings to tune 
such behaviour depending on the circumstances.  If the back-end Tomcat application 
works under a Windows NTLM-like authentication mechanism, for example, then using different 
connections for the same client (or vice-versa, sharing some connections between different 
clients) would play havoc with said AAA mechanism, which is connection-oriented.

This seems to say that Nginx, by default, buffers the entire back-end server response 
before starting to send it to the client : 
http://nginx.com/resources/admin-guide/reverse-proxy/
But it also says that this can be tuned, and even disabled.

It also hints at the fact that even if the client specifies keep-alive with Nginx, nginx 
itself, when dealing with the back-end server, disables the keep-alive (Connection: close).
This probably makes sense, in a scenario where the client may think that all responses 
come from the same back-end server, but Nginx in the middle distributes the requests to 
several back-end servers.  It would make no sense in that case to use keep-alive with the 
back-end servers, which may only ever see one request each from that client.
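
(For completeness: nginx *can* be told to keep upstream connections alive.
Roughly, and untested in this particular setup, it takes an upstream block
with a keepalive pool plus HTTP/1.1 towards the back end:

upstream tomcat {
    server 127.0.0.1:8080;
    keepalive 32;
}
...
location / {
    proxy_http_version 1.1;
    proxy_set_header Connection "";
    proxy_pass http://tomcat;
}

Whether that is desirable here is a separate question, but it changes how
many connections nginx opens and closes towards Tomcat, and therefore how
many sockets churn through TIME_WAIT/CLOSE_WAIT.)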

> If it makes and closes many connections, Tomcat won't hang up the
> phone unless some kind of timeout occurs.
> 
> Thomas, I'd advise you to do the following:
> 
> 1. Check the nginx configuration. Specifically, the keep-alive and
> timeout associated with the proxy configuration.
> 
> 2. Make sure that Tomcat's timeouts are appropriate for those matching
> settings in nginx.
> 
> It's common for users to misconfigure httpd+Tomcat by setting
> different timeouts on either side of the connection, and the result is
> many broken pipe or similar errors on the Tomcat side.
> 
I'll +1 all that in any case.

The OP has a complex setup, where we are not even sure that the various connections in 
various states are even related directly to Tomcat or not.
Graphically, we have this :

client <-- TCP --> nginx <-- TCP --> Tomcat <--> webapp <-- TCP --> external servers

The output of netstat shows all the connections and their state, at the OS level.  Even 
assuming that nginx runs on a separate host, that still leaves the possibility that most 
of the connections in CLOSE_WAIT state for example, would be connections between the 
webapps and external servers, having not much to do with Tomcat per se.
But of course they use fd's and resources, just like the others. And for "lsof", they 
would appear as "belonging" to the Tomcat process.




Re: File descriptors peaks with latest stable build of Tomcat 7

Posted by Christopher Schultz <ch...@christopherschultz.net>.

André,

On 4/21/15 10:56 AM, André Warnier wrote:
> Thomas Boniface wrote:
>> The file descriptor peaks show up in our monitoring application.
>> We have some charts showing the number of file descriptors owned
>> by the tomcat process (ls /proc/$(pgrep -u tomcat7)/fd/ | wc
>> -l).
>> 
>> The catalina.out log shows errors, the most frequent being a 
>> java.io.IOException: Broken pipe.
>> 
> [..]
> 
> A "broken pipe", from the server perspective while sending a
> response to the client, is a rather usual thing.  It usually means
> that the (human) client got tired of waiting for a response, and
> clicked somewhere else in the browser (maybe a "cancel" button;
> maybe he closed the window; etc..).

In this case, though, the client is nginx and not a human at a browser.

If the browser severs the connection to nginx, I'm not sure what nginx
does with the connection to Tomcat. I would expect that it either
cleans it up nicely (e.g. drains the bytes from the connection, then
closes), or just drops the connection to the back-end Tomcat (which
might be more efficient if Tomcat is expected to send relatively large
responses).

I don't know how nginx works when acting as a proxy. Does it use HTTP
keep-alive and process many requests through a single connection
(possibly not all from the same end user), or does it make and close
many connections?

If it makes and closes many connections, Tomcat won't hang up the
phone unless some kind of timeout occurs.

Thomas, I'd advise you to do the following:

1. Check the nginx configuration. Specifically, the keep-alive and
timeout associated with the proxy configuration.

2. Make sure that Tomcat's timeouts are appropriate for those matching
settings in nginx.

It's common for users to misconfigure httpd+Tomcat by setting
different timeouts on either side of the connection, and the result is
many broken pipe or similar errors on the Tomcat side.
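
As a sketch only (the directive and attribute names are real, the values are
invented), the settings to line up are the nginx proxy timeouts, e.g.

proxy_connect_timeout 5s;
proxy_send_timeout    60s;
proxy_read_timeout    60s;

against connectionTimeout / keepAliveTimeout on the Tomcat <Connector>, so
that neither side keeps waiting on a connection the other has already given
up on.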

-chris


Re: File descriptors peaks with latest stable build of Tomcat 7

Posted by Thomas Boniface <th...@stickyads.tv>.
I guess I get what you mean. Do you know if the same kind of explanation
applies to these?

Apr 20, 2015 12:11:05 AM org.apache.coyote.AbstractProcessor setErrorState
INFO: An error occurred in processing while on a non-container thread. The
connection will be closed immediately
java.nio.channels.AsynchronousCloseException
        at
java.nio.channels.spi.AbstractInterruptibleChannel.end(AbstractInterruptibleChannel.java:205)
        at sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:496)
        at org.apache.tomcat.util.net.NioChannel.write(NioChannel.java:128)
        at
org.apache.tomcat.util.net.NioBlockingSelector.write(NioBlockingSelector.java:101)
        at
org.apache.tomcat.util.net.NioSelectorPool.write(NioSelectorPool.java:174)
        at
org.apache.coyote.http11.InternalNioOutputBuffer.writeToSocket(InternalNioOutputBuffer.java:163)
        at
org.apache.coyote.http11.InternalNioOutputBuffer.flushBuffer(InternalNioOutputBuffer.java:242)
        at
org.apache.coyote.http11.InternalNioOutputBuffer.endRequest(InternalNioOutputBuffer.java:121)
        at
org.apache.coyote.http11.AbstractHttp11Processor.action(AbstractHttp11Processor.java:762)
        at org.apache.coyote.Response.action(Response.java:174)
        at org.apache.coyote.Response.finish(Response.java:274)
        at
org.apache.catalina.connector.OutputBuffer.close(OutputBuffer.java:319)
        at
org.apache.catalina.connector.CoyoteWriter.close(CoyoteWriter.java:112)
        at
networkComm.commands.HttpCommand.sendResponse(HttpCommand.java:224)
        at
com.stickyadstv.adex.AuctioneerResponseWriter.respondToClient(AuctioneerResponseWriter.java:322)
        at
com.stickyadstv.adex.BidSerializationListener.checkSerializationIsComplete(BidSerializationListener.java:70)
        at
com.stickyadstv.adex.BidSerializationListener.completed(BidSerializationListener.java:53)
        at
com.stickyadstv.adex.bidder.marketplace.MarketPlaceBidSerializationWriter.respondToClient(MarketPlaceBidSerializationWriter.java:92)
        at
com.stickyadstv.adex.BidSerializationListener.checkSerializationIsComplete(BidSerializationListener.java:70)
        at
com.stickyadstv.adex.BidSerializationListener.completed(BidSerializationListener.java:53)
        at
com.stickyadstv.adex.BidSerializationListener.completed(BidSerializationListener.java:24)
        at
org.apache.http.concurrent.BasicFuture.completed(BasicFuture.java:115)
        at
com.stickyadstv.adex.bidder.internal.InternalBid.serializeAsVAST(InternalBid.java:56)
        at
com.stickyadstv.adex.bidder.marketplace.MarketPlaceBid.serializeAsVAST(MarketPlaceBid.java:151)
        at
com.stickyadstv.adex.AuctioneerResponseWriter.completed(AuctioneerResponseWriter.java:120)
        at com.stickyadstv.adex.Auctioneer.bangTheGavel(Auctioneer.java:521)
        at com.stickyadstv.adex.Auctioneer.close(Auctioneer.java:236)
        at
com.stickyadstv.adex.Auctioneer.checkCompletion(Auctioneer.java:195)
        at
com.stickyadstv.adex.Auctioneer.registerBuyerPlatformResponses(Auctioneer.java:178)
        at
com.stickyadstv.adex.bidder.openrtb.OpenRTBBuyerRequest.flagAllAsNotParticipating(OpenRTBBuyerRequest.java:428)
        at
com.stickyadstv.adex.bidder.openrtb.OpenRTBBuyerRequest.setBidResponse(OpenRTBBuyerRequest.java:139)
        at
com.stickyadstv.adex.bidder.openrtb.OpenRTBBidRequestAsyncListener.completed(OpenRTBBidRequestAsyncListener.java:27)
        at
com.stickyadstv.adex.bidder.openrtb.OpenRTBBidRequestAsyncListener.completed(OpenRTBBidRequestAsyncListener.java:12)
        at
org.apache.http.concurrent.BasicFuture.completed(BasicFuture.java:115)
        at
org.apache.http.impl.nio.client.DefaultClientExchangeHandlerImpl.responseCompleted(DefaultClientExchangeHandlerImpl.java:173)
        at
org.apache.http.nio.protocol.HttpAsyncRequestExecutor.processResponse(HttpAsyncRequestExecutor.java:355)
        at
org.apache.http.nio.protocol.HttpAsyncRequestExecutor.responseReceived(HttpAsyncRequestExecutor.java:230)
        at
org.apache.http.impl.nio.client.LoggingAsyncRequestExecutor.responseReceived(LoggingAsyncRequestExecutor.java:112)
        at
org.apache.http.impl.nio.DefaultNHttpClientConnection.consumeInput(DefaultNHttpClientConnection.java:254)
        at
org.apache.http.impl.nio.client.InternalIODispatch.onInputReady(InternalIODispatch.java:73)
        at
org.apache.http.impl.nio.client.InternalIODispatch.onInputReady(InternalIODispatch.java:37)
        at
org.apache.http.impl.nio.reactor.AbstractIODispatch.inputReady(AbstractIODispatch.java:113)
        at
org.apache.http.impl.nio.reactor.BaseIOReactor.readable(BaseIOReactor.java:159)
        at
org.apache.http.impl.nio.reactor.AbstractIOReactor.processEvent(AbstractIOReactor.java:338)
        at
org.apache.http.impl.nio.reactor.AbstractIOReactor.processEvents(AbstractIOReactor.java:316)
        at
org.apache.http.impl.nio.reactor.AbstractIOReactor.execute(AbstractIOReactor.java:277)
        at
org.apache.http.impl.nio.reactor.BaseIOReactor.execute(BaseIOReactor.java:105)
        at
org.apache.http.impl.nio.reactor.AbstractMultiworkerIOReactor$Worker.run(AbstractMultiworkerIOReactor.java:584)
        at java.lang.Thread.run(Thread.java:745)

Also, I'm glad to get a lot of feedback that helps confirm my opinions or
gives me new paths to follow. Unfortunately, so far nothing seems to explain
why the peaks occur a lot in the latest release of Tomcat but only
occasionally in the previous one we used.

Thomas

2015-04-21 16:56 GMT+02:00 André Warnier <aw...@ice-sa.com>:

> Thomas Boniface wrote:
>
>> The file descriptor peaks show up in our monitoring application. We have
>> some charts showing the number of file descriptors owned by the tomcat
>> process (ls /proc/$(pgrep -u tomcat7)/fd/ | wc -l).
>>
>> The catalina.out log shows errors, the most frequent being a
>> java.io.IOException: Broken pipe.
>>
>>  [..]
>
> A "broken pipe", from the server perspective while sending a response to
> the client, is a rather usual thing.  It usually means that the (human)
> client got tired of waiting for a response, and clicked somewhere else in
> the browser (maybe a "cancel" button; maybe he closed the window; etc..).
> The browser would then immediately close the connection with the server,
> and when the server eventually tries to write anything else to that
> connection, the "broken pipe" exception would be the result.
> With the numbers you quoted previously regarding the number of
> simultaneous client sessions, it doesn't look extraordinary that this would
> happen regularly.
> Maybe the thing to investigate here is whether your server is really so
> slow in answering clients, that a significant portion of them do get tired
> of waiting and get an irresistible urge to click elsewhere..
>
> Apart from the human client, browsers themselves have a built-in timeout
> for waiting for a server response, and will themselves give up after a
> while.  That is on the order of 4-5 minutes after sending the request and
> not receiving anything from the server in response.
> Some applications are such that they can sometimes take more than that to
> be able to send a response.  In such cases, to avoid the browser timeout
> (and connection close), there are "tricks" to use, to send intermediate
> kind of "wait message" to the browser, so that it does not hang up.
>
>
>
>

Re: File descriptors peaks with latest stable build of Tomcat 7

Posted by André Warnier <aw...@ice-sa.com>.
Thomas Boniface wrote:
> The file descriptor peaks show up in our monitoring application. We have
> some charts showing the number of file descriptors owned by the tomcat
> process (ls /proc/$(pgrep -u tomcat7)/fd/ | wc -l).
> 
> The catalina.out log shows errors, the most frequent being a
> java.io.IOException: Broken pipe.
> 
[..]

A "broken pipe", from the server perspective while sending a response to the client, is a 
rather usual thing.  It usually means that the (human) client got tired of waiting for a 
response, and clicked somewhere else in the browser (maybe a "cancel" button; maybe he 
closed the window; etc..).  The browser would then immediately close the connection with 
the server, and when the server eventually tries to write anything else to that 
connection, the "broken pipe" exception would be the result.
With the numbers you quoted previously regarding the number of simultaneous client 
sessions, it doesn't look extraordinary that this would happen regularly.
Maybe the thing to investigate here is whether your server is really so slow in answering 
clients, that a significant portion of them do get tired of waiting and get an 
irresistible urge to click elsewhere..

Apart from the human client, browsers themselves have a built-in timeout for waiting for a 
server response, and will themselves give up after a while.  That is on the order of 4-5 
minutes after sending the request and not receiving anything from the server in response.
Some applications are such that they can sometimes take more than that to be able to send 
a response.  In such cases, to avoid the browser timeout (and connection close), there are 
"tricks" to use, to send intermediate kind of "wait message" to the browser, so that it 
does not hang up.





Re: File descriptors peaks with latest stable build of Tomcat 7

Posted by Thomas Boniface <th...@stickyads.tv>.
The file descriptor peaks show up in our monitoring application. We have
some charts showing the number of file descriptors owned by the tomcat
process (ls /proc/$(pgrep -u tomcat7)/fd/ | wc -l).
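
(For what it's worth, the same descriptors can be broken down by type --
sockets vs. regular files -- with lsof; this assumes a single tomcat7-owned
process and the stock Linux lsof column layout, where TYPE is the 5th column:

lsof -p $(pgrep -u tomcat7) | awk 'NR>1 {print $5}' | sort | uniq -c | sort -rn

If the peaks are connection-related, the IPv4/IPv6/sock entries are the ones
that should dominate.)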

The catalina.out log shows errors, the most frequent being a
java.io.IOException: Broken pipe.

Apr 20, 2015 12:11:02 AM org.apache.coyote.AbstractProcessor setErrorState
INFO: An error occurred in processing while on a non-container thread. The
connection will be closed immediately
java.io.IOException: Broken pipe
        at sun.nio.ch.FileDispatcherImpl.write0(Native Method)
        at sun.nio.ch.SocketDispatcher.write(SocketDispatcher.java:47)
        at sun.nio.ch.IOUtil.writeFromNativeBuffer(IOUtil.java:93)
        at sun.nio.ch.IOUtil.write(IOUtil.java:65)
        at sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:487)
        at org.apache.tomcat.util.net.NioChannel.write(NioChannel.java:128)
        at
org.apache.tomcat.util.net.NioBlockingSelector.write(NioBlockingSelector.java:101)
        at
org.apache.tomcat.util.net.NioSelectorPool.write(NioSelectorPool.java:174)
        at
org.apache.coyote.http11.InternalNioOutputBuffer.writeToSocket(InternalNioOutputBuffer.java:163)
        at
org.apache.coyote.http11.InternalNioOutputBuffer.flushBuffer(InternalNioOutputBuffer.java:242)
        at
org.apache.coyote.http11.InternalNioOutputBuffer.flush(InternalNioOutputBuffer.java:94)
        at
org.apache.coyote.http11.AbstractHttp11Processor.action(AbstractHttp11Processor.java:801)
        at org.apache.coyote.Response.action(Response.java:172)
        at
org.apache.catalina.connector.OutputBuffer.doFlush(OutputBuffer.java:363)
        at
org.apache.catalina.connector.OutputBuffer.flush(OutputBuffer.java:331)
        at
org.apache.catalina.connector.CoyoteWriter.flush(CoyoteWriter.java:98)
        at
networkComm.commands.HttpCommand.sendResponse(HttpCommand.java:223)
        at
com.stickyadstv.adex.AuctioneerResponseWriter.respondToClient(AuctioneerResponseWriter.java:322)
        at
com.stickyadstv.adex.BidSerializationListener.checkSerializationIsComplete(BidSerializationListener.java:70)
        at
com.stickyadstv.adex.BidSerializationListener.completed(BidSerializationListener.java:53)
        at
com.stickyadstv.adex.BidSerializationListener.completed(BidSerializationListener.java:24)
        at
org.apache.http.concurrent.BasicFuture.completed(BasicFuture.java:115)
        at
com.stickyadstv.adex.bidder.openrtb.OpenRTBBid.serializeAdm(OpenRTBBid.java:158)
        at
com.stickyadstv.adex.bidder.openrtb.OpenRTBBid$AdmReceived.completed(OpenRTBBid.java:310)
        at
com.stickyadstv.adex.bidder.openrtb.OpenRTBBid$AdmReceived.completed(OpenRTBBid.java:254)
        at
org.apache.http.concurrent.BasicFuture.completed(BasicFuture.java:115)
        at
org.apache.http.impl.nio.client.DefaultClientExchangeHandlerImpl.responseCompleted(DefaultClientExchangeHandlerImpl.java:173)
        at
org.apache.http.nio.protocol.HttpAsyncRequestExecutor.processResponse(HttpAsyncRequestExecutor.java:355)
        at
org.apache.http.nio.protocol.HttpAsyncRequestExecutor.inputReady(HttpAsyncRequestExecutor.java:242)
        at
org.apache.http.impl.nio.client.LoggingAsyncRequestExecutor.inputReady(LoggingAsyncRequestExecutor.java:87)
        at
org.apache.http.impl.nio.DefaultNHttpClientConnection.consumeInput(DefaultNHttpClientConnection.java:264)
        at
org.apache.http.impl.nio.client.InternalIODispatch.onInputReady(InternalIODispatch.java:73)
        at
org.apache.http.impl.nio.client.InternalIODispatch.onInputReady(InternalIODispatch.java:37)
        at
org.apache.http.impl.nio.reactor.AbstractIODispatch.inputReady(AbstractIODispatch.java:113)
        at
org.apache.http.impl.nio.reactor.BaseIOReactor.readable(BaseIOReactor.java:159)
        at
org.apache.http.impl.nio.reactor.AbstractIOReactor.processEvent(AbstractIOReactor.java:338)
        at
org.apache.http.impl.nio.reactor.AbstractIOReactor.processEvents(AbstractIOReactor.java:316)
        at
org.apache.http.impl.nio.reactor.AbstractIOReactor.execute(AbstractIOReactor.java:277)
        at
org.apache.http.impl.nio.reactor.BaseIOReactor.execute(BaseIOReactor.java:105)
        at
org.apache.http.impl.nio.reactor.AbstractMultiworkerIOReactor$Worker.run(AbstractMultiworkerIOReactor.java:584)
        at java.lang.Thread.run(Thread.java:745)

Here is a count made on "Exception":
    304 java.io.InvalidClassException: com.stickyadstv.web.commands.request.RequestData; incompatible types for field time
   6477 java.io.IOException: Broken pipe
      1 java.io.IOException: Connection reset by peer
      3 java.lang.NullPointerException
    821 java.nio.channels.AsynchronousCloseException
     12 java.nio.channels.ClosedChannelException
      1     - locked <0x00000000a503aa38> (a java.lang.NumberFormatException)
     21 org.apache.catalina.connector.CoyoteAdapter$RecycleRequiredException
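
(For anyone who wants to reproduce this kind of breakdown from their own
catalina.out, something like the following gets close, modulo the multi-line
entries:

grep Exception catalina.out | sort | uniq -c | sort -rn
)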

I have uploaded a portion of the catalina log during the test I made:
http://www.filedropper.com/catalinatoday11


Thomas

2015-04-21 16:02 GMT+02:00 Christopher Schultz <chris@christopherschultz.net>:

>
> Thomas,
>
> On 4/20/15 8:11 AM, Thomas Boniface wrote:
> > I have tried to find help regarding an issue we experience with
> > our platform leading to random file descriptor peaks. This happens
> > more often on heavy load but can also happen on low traffic
> > periods.
> >
> > Our application is using servlet 3.0 async features and an async
> > connector. We noticed that a lot of issues regarding asynchronous
> > feature were fixed between our production version and the last
> > stable build. We decided to give it a try to see if it improves
> > things or at least give clues on what can cause the issue;
> > Unfortunately it did neither.
> >
> > The file descriptor peaks and application blocking happens
> > frequently with this version when it only happens rarely on
> > previous version (tomcat7 7.0.28-4).
> >
> > Tomcat is behind an nginx server. The tomcat connector used is
> > configured as follows:
> >
> > We use an Nio connector: <Connector port="8080"
> > protocol="org.apache.coyote. http11.Http11NioProtocol"
> > selectorTimeout="1000" maxThreads="200" maxHttpHeaderSize="16384"
> > address="127.0.0.1" redirectPort="8443"/>
>
> The default maxConnections is 10000, so nginx can open that many
> connections before Tomcat starts to refuse them.
>
> How are you observing the "file descriptor peak"? Are you getting errors?
>
> -chris

Re: File descriptors peaks with latest stable build of Tomcat 7

Posted by Christopher Schultz <ch...@christopherschultz.net>.

Thomas,

On 4/20/15 8:11 AM, Thomas Boniface wrote:
> I have tried to find help regarding an issue we experience with
> our platform leading to random file descriptor peaks. This happens
> more often on heavy load but can also happen on low traffic
> periods.
> 
> Our application is using servlet 3.0 async features and an async
> connector. We noticed that a lot of issues regarding asynchronous
> feature were fixed between our production version and the last
> stable build. We decided to give it a try to see if it improves
> things or at least give clues on what can cause the issue;
> Unfortunately it did neither.
> 
> The file descriptor peaks and application blocking happens
> frequently with this version when it only happens rarely on
> previous version (tomcat7 7.0.28-4).
> 
> Tomcat is behind an nginx server. The tomcat connector used is
> configured as follows:
> 
> We use an Nio connector: <Connector port="8080"
> protocol="org.apache.coyote. http11.Http11NioProtocol" 
> selectorTimeout="1000" maxThreads="200" maxHttpHeaderSize="16384" 
> address="127.0.0.1" redirectPort="8443"/>

The default maxConnections is 10000, so nginx can open that many
connections before Tomcat starts to refuse them.
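
(If you want Tomcat to push back sooner, that limit can be set explicitly on
the connector; the value below is only an example:

<Connector ... maxConnections="2000" ... />
)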

How are you observing the "file descriptor peak"? Are you getting errors?

-chris